Containers on Managed Services: ECS, Cloud Run, Fargate
What This Concept Is
Running a container in production takes more than docker run. You need an orchestrator that schedules containers onto hosts, restarts them when they fail, rolls them forward on new image versions, and connects them to networking, storage, and observability.
Managed container services hand most of that to the provider:
- Amazon ECS (Elastic Container Service) - AWS's own orchestrator; you define Task Definitions and Services, and ECS schedules them.
- AWS Fargate - a launch mode for ECS (and EKS) where the provider runs the hosts entirely; you only see tasks, never instances.
- Google Cloud Run - serverless containers: you push an image, and the platform scales it from zero to many instances in response to request load, with per-request billing and automatic TLS.
- Azure Container Apps / Container Instances - similar envelope: containers without managing a cluster. Container Apps adds Dapr, KEDA-based scaling, and revisions; Container Instances is the "just run one container" primitive.
The distinction from Kubernetes (Module 3) is that these services are opinionated and cluster-less from your perspective. You do not design control planes, node groups, or system pods. You ship an image and a service definition.
Everything these services do, you could do yourself on VMs with systemd --user and a load balancer. The point is that the provider has already done it, charges a small premium for the scheduler, and spares you the operational work.
Why It Matters Here
Most teams moving to the cloud after 2018 land on managed containers, not raw VMs and not Kubernetes. They are the new "default" for request/response services that do not fit serverless:
- long-running workers with 10+ minute execution
- anything needing WebSockets, gRPC streaming, or persistent connections
- services with custom runtimes, binaries, or heavy dependencies
- apps that outgrow the 15-minute, 10 GB limits of serverless functions
- polyglot shops where standardizing on one PaaS runtime is unrealistic
Understanding ECS, Fargate, and Cloud Run gives you a vocabulary before Module 3 teaches Kubernetes, and it tells you when Kubernetes is the wrong answer. "Do we actually need an EKS cluster?" is a better question than "which distro of Kubernetes do we use?" - and it is often answered no.
Concrete Example
You containerize a Flask API. Three deployment paths:
AWS ECS with Fargate:
- build image, push to ECR: aws ecr get-login-password | docker login ... && docker push
- define a Task Definition: container image, CPU 1024 (1 vCPU), memory 2048 MB, port 8080, IAM task role
- define a Service: desired count 3, across 3 AZs, registered with an ALB target group
- Fargate provisions the hosts, starts the tasks, wires up the ENIs
- scaling: target tracking on ALB request count per target
// task-definition.json (abridged)
{
  "family": "flask-api",
  "requiresCompatibilities": ["FARGATE"],
  "networkMode": "awsvpc",
  "cpu": "1024",
  "memory": "2048",
  "taskRoleArn": "arn:aws:iam::...:role/flask-api-task",
  "containerDefinitions": [{
    "name": "app",
    "image": "123.dkr.ecr.us-east-1.amazonaws.com/flask-api:sha-abc123",
    "portMappings": [{ "containerPort": 8080 }],
    "logConfiguration": { "logDriver": "awslogs", "options": { "awslogs-group": "/ecs/flask-api" } },
    "healthCheck": { "command": ["CMD-SHELL", "curl -f http://localhost:8080/healthz || exit 1"] }
  }]
}
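A task definition like the one above is easy to get subtly wrong, so a small pre-deploy check can help. The following is only a sketch: the function name lint_task_definition and the warning texts are made up for illustration, and the size table covers only the smaller Fargate tiers as transcribed from AWS's published CPU/memory combinations (check the current docs before relying on it).

```python
# Allowed Fargate memory (MiB) per CPU setting, smaller tiers only.
# Assumption: transcribed from AWS's published table; verify against current docs.
FARGATE_MEMORY_MIB = {
    256: [512, 1024, 2048],
    512: list(range(1024, 4096 + 1, 1024)),
    1024: list(range(2048, 8192 + 1, 1024)),
    2048: list(range(4096, 16384 + 1, 1024)),
    4096: list(range(8192, 30720 + 1, 1024)),
}


def lint_task_definition(task_def):
    """Return a list of warnings for a Fargate task definition (hypothetical helper)."""
    warnings = []
    if "FARGATE" not in task_def.get("requiresCompatibilities", []):
        return warnings  # the checks below only apply to Fargate tasks

    cpu = int(task_def.get("cpu", 0))
    memory = int(task_def.get("memory", 0))
    if memory not in FARGATE_MEMORY_MIB.get(cpu, []):
        warnings.append(f"cpu={cpu} / memory={memory} is not a listed Fargate size")

    if task_def.get("networkMode") != "awsvpc":
        warnings.append("Fargate tasks require networkMode 'awsvpc'")

    uses_awslogs = any(
        c.get("logConfiguration", {}).get("logDriver") == "awslogs"
        for c in task_def.get("containerDefinitions", [])
    )
    exec_role = task_def.get("executionRoleArn")
    if uses_awslogs and not exec_role:
        warnings.append("no executionRoleArn: Fargate needs one to write awslogs output")
    if exec_role and exec_role == task_def.get("taskRoleArn"):
        warnings.append("task role and execution role are the same role")
    return warnings


# The abridged example from this section, reduced to the fields the lint reads.
flask_api = {
    "family": "flask-api",
    "requiresCompatibilities": ["FARGATE"],
    "networkMode": "awsvpc",
    "cpu": "1024",
    "memory": "2048",
    "taskRoleArn": "arn:aws:iam::...:role/flask-api-task",
    "containerDefinitions": [
        {"name": "app", "logConfiguration": {"logDriver": "awslogs"}}
    ],
}

for warning in lint_task_definition(flask_api):
    print(warning)
```

Run against the abridged example, the only warning is the missing execution role, which is expected: the listing is abridged, and a real Fargate definition would carry both roles.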
Google Cloud Run:
gcloud run deploy flask-api \
--image us-central1-docker.pkg.dev/proj/app/flask-api:sha-abc123 \
--region us-central1 --concurrency 80 --cpu 1 --memory 512Mi \
--min-instances 0 --max-instances 40 \
--service-account flask-api@proj.iam.gserviceaccount.com
- one command gives you an HTTPS endpoint with a managed cert
- platform auto-scales from 0 to N based on concurrency setting (default 80 requests per container)
- you pay only while a container is handling a request, plus a small idle charge if you use minInstances > 0
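To get a feel for how the concurrency setting translates into instance counts, a rough estimate based on Little's law (requests in flight = arrival rate x average latency) is enough. This is a back-of-the-envelope model, not Cloud Run's actual autoscaler, which also reacts to CPU utilization and startup behaviour; the function name and the traffic numbers are invented for illustration.

```python
import math


def estimate_instances(rps, avg_latency_s, concurrency=80,
                       min_instances=0, max_instances=40):
    """Rough instance count for a request rate (hypothetical model, not the real autoscaler)."""
    in_flight = rps * avg_latency_s               # Little's law: L = lambda * W
    needed = math.ceil(in_flight / concurrency)   # instances needed at full concurrency
    return max(min_instances, min(needed, max_instances))


# 1000 req/s at 0.5 s each means 500 requests in flight.
print(estimate_instances(1000, 0.5))                 # concurrency 80
print(estimate_instances(1000, 0.5, concurrency=1))  # one request at a time per instance
```

With the defaults from the deploy command, the first call needs 7 instances; with concurrency 1 the same traffic wants 500 and is clamped to the max-instances limit of 40, which is where queueing and errors would start.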
Amazon ECS on EC2 (not Fargate):
- you manage a cluster of EC2 instances registered with the ECS agent
- you gain per-container cost control and GPU support
- you take on host patching, AMI rolling, and scaling the capacity provider
For a team of 3 engineers shipping a single service, Cloud Run or Fargate is almost always the right answer. For a team running a fleet of 40 microservices with tight cost targets, EC2-backed ECS or Kubernetes starts to pull ahead.
Common Confusion / Misconception
"Fargate is Kubernetes without clusters." Fargate is a compute launch mode, not an orchestrator. It works under ECS and EKS. ECS is the AWS-native orchestrator; EKS is AWS's managed Kubernetes. You can run ECS tasks on Fargate, ECS tasks on EC2, EKS pods on Fargate, or EKS pods on EC2. Different combinations, different tradeoffs.
"Cloud Run is Lambda." Both scale from zero and bill per request, but Cloud Run runs full containers (up to 60-minute requests, 32 GB memory, HTTP/2, WebSockets), whereas Lambda runs functions with tighter limits (15 min, 10 GB, more rigid runtime). Cloud Run's cold start for a small container is usually sub-second; Lambda is similar for small runtimes, worse for heavy ones.
"Managed containers are always cheaper than VMs." At steady utilization, raw EC2 can be cheaper because you pay for the whole box. Managed container pricing embeds a premium for the scheduler, the networking, and the operational work they do. Price containers against a realistic utilization scenario, not against 100% CPU.
"Task role and execution role are the same thing." On ECS they are not. The execution role is what the Fargate agent uses to pull the image and write the logs. The task role is what your application uses to call AWS APIs. Mixing them up leads to either over-privileged agents or apps that cannot read from S3.
Gotchas:
- Fargate ENIs consume VPC IPs. A Service with desired=100 eats 100 IPs in its subnets. If your subnets are small (a /24 has 251 usable IPs), you will run out. Plan subnet sizes with this in mind.
- Cloud Run's default concurrency is 80 per container. If your app is single-threaded Python, it will serialize those 80 requests and tail latency will collapse. Tune concurrency to your runtime's actual parallelism.
- ECS service deployments by default use rolling updates with minimumHealthyPercent=100, maximumPercent=200, which requires capacity for up to 2x the desired count during a deploy. If your subnet IP count is tight, deploys will stall.
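The IP arithmetic behind the first and third gotchas can be sketched in a few lines. The assumptions are stated in the comments: 5 addresses reserved per subnet (consistent with the 251 figure above), one IP per task, tasks spread evenly across subnets (ECS does not guarantee this), and nothing else living in those subnets unless you pass other_ips. All function names are hypothetical.

```python
import math

AWS_RESERVED_IPS_PER_SUBNET = 5  # matches the "/24 has 251 usable IPs" figure


def usable_ips(prefix_len):
    """Usable addresses in an AWS subnet with the given prefix length."""
    return 2 ** (32 - prefix_len) - AWS_RESERVED_IPS_PER_SUBNET


def peak_task_count(desired, maximum_percent=200):
    """Upper bound on running tasks during a rolling deploy (integer floor assumed)."""
    return desired * maximum_percent // 100


def smallest_prefix(desired, subnets=1, maximum_percent=200, other_ips=0):
    """Longest prefix (smallest subnet) that fits each subnet's share of peak tasks."""
    per_subnet = math.ceil(peak_task_count(desired, maximum_percent) / subnets) + other_ips
    for prefix_len in range(28, 15, -1):  # AWS subnets range from /28 to /16
        if usable_ips(prefix_len) >= per_subnet:
            return prefix_len
    raise ValueError("no single subnet size is large enough")


print(usable_ips(24))                   # 251
print(smallest_prefix(100))             # 24: 200 peak tasks fit in 251 IPs
print(smallest_prefix(130))             # 23: 260 peak tasks do not fit in a /24
print(smallest_prefix(100, subnets=3))  # 25: 67 tasks per subnet, a /26 has only 59
```

The third call is the case that catches people: a desired count of 130 looks comfortable in a /24 until the deploy doubles it.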
How To Use It
For a containerized workload:
- Decide whether the workload fits Cloud Run/Fargate's per-request limits (timeouts, memory, request body sizes).
- If yes, start there; you save weeks of cluster management.
- Build your image reproducibly (multi-stage Dockerfile, pinned base image, CI-built, scanned) and push to the provider's registry (ECR, Artifact Registry, ACR).
- Attach a workload identity (IAM task role or Cloud Run service account) - never use static credentials. On ECS, remember both roles (execution + task).
- Put the service behind a load balancer (or the platform's built-in ingress) with health checks and TLS termination.
- Set concurrency/CPU/memory based on measurement; overprovision slightly early, tighten once stable.
- Tag the image with the Git SHA so every deploy is traceable back to a commit (a small Pro Git habit that becomes a production-debugging superpower).
- Configure graceful shutdown (SIGTERM -> drain in-flight requests -> exit within the platform's grace period: 30 s by default on ECS, 10 s on Cloud Run).
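The graceful-shutdown step in the list above is the one most often skipped, so here is a minimal sketch of the pattern using only the standard library. It models a worker loop rather than a real HTTP server (production servers such as gunicorn implement their own SIGTERM handling), and the names shutdown_requested, install_handler, and serve are illustrative, not part of any framework.

```python
import signal
import threading

shutdown_requested = threading.Event()


def handle_sigterm(signum, frame):
    """Signal handler: only flip a flag; the serving loop does the actual draining."""
    shutdown_requested.set()


def install_handler():
    """Register the handler. Call once at startup, from the main thread.

    The process must actually receive SIGTERM, i.e. run as PID 1 in the
    container or sit behind an init process that forwards signals.
    """
    signal.signal(signal.SIGTERM, handle_sigterm)


def serve(requests, handle):
    """Process queued requests until shutdown is requested.

    The request already being handled always finishes (the drain); anything
    not yet started is returned so the caller can log or requeue it.
    """
    results = []
    pending = list(requests)
    while pending and not shutdown_requested.is_set():
        results.append(handle(pending.pop(0)))
    return results, pending
```

The design choice worth noting is that the handler does nothing but set a flag. Doing real work inside a signal handler is fragile; letting the main loop notice the flag keeps the drain logic in ordinary code, where it can be tested and bounded by the platform's grace period.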
Check Yourself
- What is the difference between ECS and Fargate, and why is "ECS vs Fargate" a confusing phrasing?
- When would Cloud Run be a worse choice than Fargate?
- Why do managed containers not remove the need for image scanning and supply-chain hygiene?
- On ECS Fargate, what is the difference between the execution role and the task role, and what breaks if you conflate them?
- Your Cloud Run service's p99 latency spikes every morning at 9am. Concurrency is 80. What is probably happening, and what knob addresses it?
Mini Drill or Application
In 15 minutes, take the Flask API from the Example and write a one-page deployment sketch on Fargate and on Cloud Run. Include: image source, CPU/memory, concurrency/service count, identity, how TLS is terminated, and how scaling is controlled. Note one operational concern each option hides and one it exposes.
Extension: look up the smallest subnet size that would safely host your Fargate service at its peak desired count, accounting for deploy-time doubling and VPC-reserved IPs. Compute the answer - most engineers get it wrong by at least one bit.
Read This Only If Stuck
- AWS Fargate for Amazon ECS - task sizing, networking, pricing model
- Amazon ECS: Task definitions - authoritative task definition schema
- Google Cloud Run documentation - serverless containers and auto-scaling behavior
- Google Cloud Run: Container runtime contract - what Cloud Run expects from a container (port, signals, startup)
- Azure Container Apps overview - Container Apps model, revisions, KEDA scaling
- Linux Command Line: Processes - how a containerized process is still a process, and why PID 1 matters for signal handling
- Pro Git: Tagging - tagging releases so image tags line up with source commits