Deploying a Multi-Tenant AI SaaS on AWS Fargate with ECR and ALB

Infrastructure as Code (or Just AWS Fundamentals)

Deploying a full RAG platform means tying together React, Python, vector databases, and LLM endpoints. Instead of relying on a black-box PaaS, we run VegaRAG directly on AWS ECS Fargate.

1. Containerizing the Stack

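We have a monolithic Next.js frontend and a FastAPI backend, each built into its own container image. NEXT_PUBLIC_API_URL is baked into the frontend during the build stage, because Next.js inlines NEXT_PUBLIC_* variables into the client bundle at build time; changing the URL means rebuilding the image. The backend runs uvicorn on port 8000.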

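# Frontend Dockerfile snippet
FROM node:18-alpine
WORKDIR /app
# Install dependencies first so this layer is cached between builds
COPY package*.json ./
RUN npm ci
COPY . .
# NEXT_PUBLIC_* values are inlined at build time, so the URL arrives as a build arg
ARG NEXT_PUBLIC_API_URL
ENV NEXT_PUBLIC_API_URL=$NEXT_PUBLIC_API_URL
RUN npm run build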
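
Because the API URL is a build argument, it has to be supplied when the image is built, not when the container starts. The following is a minimal sketch of a helper that assembles the docker build command; the function name, the image tag, and the example URL are illustrative assumptions, not values taken from the VegaRAG setup.

```python
import shlex


def frontend_build_command(api_url: str, image_tag: str, context: str = ".") -> list[str]:
    """Return the argv for building the frontend image with the API URL baked in."""
    # Fail early: a relative or empty URL would be inlined into the bundle as-is.
    if not api_url.startswith(("http://", "https://")):
        raise ValueError(f"api_url must be an absolute URL, got {api_url!r}")
    return [
        "docker", "build",
        "--build-arg", f"NEXT_PUBLIC_API_URL={api_url}",
        "-t", image_tag,
        context,
    ]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    cmd = frontend_build_command("https://vegarag.com/api", "vegarag-frontend:latest")
    print(shlex.join(cmd))
```

Returning a list instead of a single string means the command can be passed straight to subprocess.run without going through a shell.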

2. ECR and the ALB Routing Pattern

We push both container images to Amazon Elastic Container Registry (ECR). The routing itself happens at the Application Load Balancer (ALB). We map a single domain (e.g. vegarag.com) to the ALB and configure path-based Listener Rules:

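If the URL path matches /api/*, forward traffic to the Backend Fargate Target Group (Port 8000). This rule needs the higher priority (the lower priority number) so that it is evaluated first.
If the URL path matches /*, forward traffic to the Frontend Fargate Target Group (Port 3000). This catch-all can also be the listener's default action.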

Because the browser loads the UI and calls the API from the same origin, no CORS configuration is needed, and a single TLS certificate on the ALB covers both services.
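
The rule evaluation above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the ALB's actual implementation: the target group names and priority numbers are hypothetical, and fnmatchcase only approximates ALB path patterns (it also gives special meaning to square brackets). The second helper builds the keyword arguments that boto3's elbv2 create_rule call accepts, without calling AWS.

```python
from fnmatch import fnmatchcase

BACKEND_TG = "backend-tg"    # hypothetical target group names
FRONTEND_TG = "frontend-tg"

# (priority, path pattern, target group); the lowest priority number is checked first.
RULES = [
    (10, "/api/*", BACKEND_TG),
    (20, "/*", FRONTEND_TG),
]


def route(path, rules=RULES, default=FRONTEND_TG):
    """Return the target group of the first rule, in priority order, that matches."""
    for _priority, pattern, target in sorted(rules):
        if fnmatchcase(path, pattern):
            return target
    return default


def create_rule_kwargs(listener_arn, priority, pattern, target_group_arn):
    """Build keyword arguments for elbv2.create_rule (nothing is sent to AWS here)."""
    return {
        "ListenerArn": listener_arn,
        "Priority": priority,
        "Conditions": [
            {"Field": "path-pattern", "PathPatternConfig": {"Values": [pattern]}}
        ],
        "Actions": [{"Type": "forward", "TargetGroupArn": target_group_arn}],
    }


if __name__ == "__main__":
    for p in ("/api/chat", "/dashboard", "/"):
        print(p, "->", route(p))
```

If the two priorities were swapped, the /* rule would match every request first and the backend would never receive traffic, which is why the order matters.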

3. Zero-Key Security with IAM Task Roles

A common mistake when deploying AWS apps is passing AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY to containers as Docker environment variables. We attach an ECS Task IAM Role instead. ECS supplies temporary, automatically rotated credentials to each task, and boto3 picks them up through its default credential chain, so our Fargate containers can call DynamoDB, S3, and Bedrock without any hardcoded secrets.
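
A minimal sketch of what this looks like in practice, assuming Python and boto3. The trust policy is the standard one that lets ECS tasks assume a role; the helper names, the default region, and the idea of scanning a container's environment for static keys are illustrative assumptions, not VegaRAG's actual code.

```python
import json

STATIC_KEY_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def ecs_task_trust_policy():
    """Trust policy that allows ECS tasks to assume the task role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "ecs-tasks.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def find_static_keys(container_env):
    """Return the names of static credential variables set in a container's env."""
    return [name for name in STATIC_KEY_VARS if container_env.get(name)]


def make_clients(region="us-east-1"):  # hypothetical default region
    """Create clients with no explicit credentials; the task role supplies them."""
    import boto3  # imported lazily so the helpers above work without boto3 installed

    services = ("dynamodb", "s3", "bedrock-runtime")
    return {name: boto3.client(name, region_name=region) for name in services}


if __name__ == "__main__":
    print(json.dumps(ecs_task_trust_policy(), indent=2))
    print(find_static_keys({"AWS_ACCESS_KEY_ID": "AKIA-example", "PORT": "8000"}))
```

Note that make_clients passes no keys at all. A check like find_static_keys could run in CI against a task definition's environment section to catch credentials before they ship.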
