Data Egress and Region Boundaries in Cost and Compliance

What This Concept Is

"Egress" is data leaving the provider's boundary; "cross-region" is data moving between the provider's regions. Both are metered. Unlike ingress (free almost everywhere), egress is one of the larger and least-predictable line items on a cloud bill.

Three boundaries matter:

  • Inside one AZ - usually free or a tiny per-GB fee.
  • Between AZs in the same region - metered (AWS typical: $0.01/GB each way, so $0.02/GB round-trip for many services).
  • Between regions or out to the public internet - the big one. AWS typical: $0.02/GB for cross-region, $0.05-$0.09/GB for public-internet egress, tiered. Cross-cloud (AWS -> GCP) traffic crosses two meters and is almost always the most expensive path.
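
As a rough illustration of the boundary tiers above, the sketch below classifies a traffic path by the widest boundary it crosses. The `Endpoint` type, the `classify_crossing` function, and the sample endpoints are hypothetical names invented for this example; they are not part of any provider SDK.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Endpoint:
    """Where one end of a data flow lives (hypothetical helper type)."""
    provider: str                 # e.g. "aws", "gcp", or "internet"
    region: Optional[str] = None
    az: Optional[str] = None


def classify_crossing(src: Endpoint, dst: Endpoint) -> str:
    """Return the widest boundary that traffic from src to dst crosses."""
    if "internet" in (src.provider, dst.provider):
        return "public-internet"
    if src.provider != dst.provider:
        return "cross-cloud"
    if src.region != dst.region:
        return "cross-region"
    if src.az != dst.az:
        return "cross-az"
    return "same-az"


# Illustrative endpoints only.
app = Endpoint("aws", "us-east-1", "us-east-1a")
db = Endpoint("aws", "us-east-1", "us-east-1b")
replica = Endpoint("aws", "us-west-2", "us-west-2a")
warehouse = Endpoint("gcp", "us-central1", "us-central1-a")

print(classify_crossing(app, db))         # cross-az
print(classify_crossing(app, replica))    # cross-region
print(classify_crossing(app, warehouse))  # cross-cloud
```

Checking the widest boundary first keeps the function honest: a cross-cloud hop is never mislabeled as merely cross-region.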

Compliance adds a parallel boundary: some data (EU personal data under GDPR, HIPAA PHI, UK/AU data-residency laws) is legally constrained to specific regions. Moving a byte across a boundary may be a legal event, not just a billing event.

This is the "supporting" concept in the cluster because it is rarely the headline goal of a design - but it shapes the cost and legal envelope of every other choice. Storage class, database placement, CDN origin, and log destination all depend on this.

Why It Matters Here

"We just copy the data to the analytics region" is the start of most surprise bills:

  • a 2 TB daily sync between regions is ~$40/day = $1200/month in data transfer alone
  • a chatty microservice in us-east-1 calling a database in us-west-2 can egress tens of GB per hour
  • S3 replication across regions doubles your storage cost and charges per-GB replicated
  • logs shipped to a SaaS observability vendor are egress to the public internet, billed per GB
  • a misplaced NAT Gateway charges "data processed" per GB on top of egress

And it is mostly invisible until the monthly bill arrives - the Cost Explorer default view often hides data-transfer under each service line, so you have to explicitly group by usage type.

Concrete Example

A SaaS with its primary stack in us-east-1 decides to add a DR copy in eu-west-1.

Costs to model:

  • S3 cross-region replication: roughly $0.02/GB for the transfer + storage in both regions. 10 TB/month of new uploads replicated = ~$200/month in transfer + doubled storage.
  • RDS cross-region read replica: per-GB transfer for the replication traffic + one extra RDS instance. Often 30-50% uplift over single-region cost.
  • inter-region API calls during failover: inbound is free; outbound from eu-west-1 back to us-east-1 is egress.
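
A minimal sketch of the S3 line item above, assuming the $0.02/GB transfer rate from the text. The $0.023/GB-month storage rate and the 50 TB of existing data are assumptions added for illustration; check current pricing for your regions. `dr_monthly_cost` is a hypothetical helper name.

```python
def dr_monthly_cost(new_gb_per_month, stored_gb,
                    transfer_per_gb=0.02,         # rate quoted in the text
                    storage_per_gb_month=0.023):  # assumption: standard-tier list price
    """Rough monthly cost of keeping a replicated copy in a second region."""
    transfer = new_gb_per_month * transfer_per_gb
    extra_storage = stored_gb * storage_per_gb_month
    return {
        "transfer": round(transfer, 2),
        "extra_storage": round(extra_storage, 2),
        "total": round(transfer + extra_storage, 2),
    }


# 10 TB/month of new uploads; 50 TB already stored (hypothetical figure).
cost = dr_monthly_cost(new_gb_per_month=10_000, stored_gb=50_000)
print(cost)
```

In this sketch the second copy's storage outweighs the transfer fee, which is why the "doubled storage" term deserves its own row in the model.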

Compliance layer:

  • some EU customers require data residency; pushing their data into us-east-1 for the DR path violates that contract
  • the fix is per-tenant region assignment, not a global cross-region copy
  • audit tooling: set SCPs or GCP organization policies that deny replication to disallowed regions; don't rely on dev discipline alone
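
Per-tenant region assignment can also be checked in application code before any replication is configured. The sketch below is one possible shape, not a substitute for SCPs or org policies; the tenant names, the allow-list, and both function names are hypothetical.

```python
# Hypothetical allow-list: tenant id -> regions its contract permits.
TENANT_ALLOWED_REGIONS = {
    "acme-eu": {"eu-west-1", "eu-central-1"},
    "globex-us": {"us-east-1", "us-west-2", "eu-west-1"},
}


def replication_allowed(tenant: str, dest_region: str) -> bool:
    """True only if dest_region is on the tenant's allow-list.

    Unknown tenants are denied by default.
    """
    return dest_region in TENANT_ALLOWED_REGIONS.get(tenant, set())


def plan_dr_targets(tenant: str, candidates: list) -> list:
    """Filter candidate DR regions down to the ones the tenant permits."""
    return [r for r in candidates if replication_allowed(tenant, r)]


print(plan_dr_targets("acme-eu", ["us-east-1", "eu-central-1"]))   # ['eu-central-1']
print(plan_dr_targets("globex-us", ["us-east-1", "eu-west-1"]))    # ['us-east-1', 'eu-west-1']
```

Denying unknown tenants by default mirrors the intent of a deny-based policy: a missing entry should block replication, not permit it.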

Common "oops" cases:

  • putting a NAT Gateway in the wrong AZ so all traffic from 1b and 1c hops via 1a first (adds an inter-AZ charge to every byte on top of NAT data processing)
  • hosting a CDN origin in one region when users are global (every cache miss for a distant user is fetched across boundaries)
  • letting a Kubernetes cluster span regions "for HA" and sending pod-to-pod traffic across them
  • uploading container images to a registry in one region and pulling from nodes in another (an image pull is many GB of egress per node, per deploy)

Back-of-envelope egress estimator (shell):

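# approximate monthly S3 cross-region transfer cost given GB/day
# DAY_GB and PRICE are passed to Python through the environment
DAY_GB=200 PRICE=0.02 python3 - <<'PY'
import os
day_gb = float(os.environ["DAY_GB"])   # GB replicated per day
price = float(os.environ["PRICE"])     # USD per GB transferred
print(f"Monthly: ${day_gb * 30 * price:.2f}")
PY
# -> Monthly: $120.00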

Every architecture doc should contain an estimate like this for each cross-region flow. It is the quickest way to force the discussion "do we actually need this cross-region copy?"

Common Confusion / Misconception

"Data in the cloud is free to move." Data in the cloud is cheap to ingress and cheap to hold. Moving it is where the invoice comes from.

"Within one provider is all free." Only within one AZ (and only for some services). Between AZs, between regions, and out to the internet all cost. Even intra-region traffic can be metered: requests from a private subnet to a public endpoint (like S3 without a Gateway Endpoint) route through the NAT Gateway and are billed as NAT data processing.

"Compliance and cost are the same problem." They rhyme: both care about where data sits and moves. But compliance is a legal constraint (data must stay here) and cost is a financial constraint (data should stay here). They can conflict: the cheapest path may cross a legal boundary you cannot cross.

"A CDN eliminates egress." A CDN shifts where egress is billed and often lowers it per GB (CloudFront < S3-to-internet for popular paths), but cache fills and long-tail requests still egress from the origin. Budget for both.

Gotcha: VPC peering and Transit Gateway do not remove inter-AZ or cross-region egress charges. They change the route but still meter the bytes.

Second gotcha: Cross-AZ traffic to a NAT Gateway costs both the NAT data-processing fee and the inter-AZ fee. If your instance in 1b uses a NAT in 1a to reach S3 (without a VPC endpoint), you are paying two per-GB meters, plus the NAT's hourly charge, for traffic that a gateway endpoint would carry at no per-GB cost.
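
To make the stacked meters concrete, here is a hedged comparison of per-GB charges for reaching S3 by three paths. The $0.045/GB NAT data-processing rate is an assumption (a commonly quoted list price; verify for your region), the inter-AZ rate is the $0.01/GB figure from earlier in the text, and `s3_path_cost` is a hypothetical helper. The NAT's hourly charge is ignored.

```python
NAT_PROCESSING_PER_GB = 0.045  # assumption: verify against current regional pricing
INTER_AZ_PER_GB = 0.01         # per-GB inter-AZ figure quoted in the text


def s3_path_cost(gb: float, via_nat: bool, nat_in_same_az: bool = True) -> float:
    """Per-GB charges for same-region S3 traffic; NAT hourly fee not included."""
    if not via_nat:
        # Gateway endpoint path: no NAT processing and no per-GB endpoint fee.
        return 0.0
    cost = gb * NAT_PROCESSING_PER_GB
    if not nat_in_same_az:
        # Instance and NAT sit in different AZs, so the hop is metered too.
        cost += gb * INTER_AZ_PER_GB
    return round(cost, 2)


monthly_gb = 1_000
print(s3_path_cost(monthly_gb, via_nat=True, nat_in_same_az=False))  # NAT in another AZ
print(s3_path_cost(monthly_gb, via_nat=True, nat_in_same_az=True))   # NAT in same AZ
print(s3_path_cost(monthly_gb, via_nat=False))                       # gateway endpoint
```

The numbers are small at 1 TB/month, but they scale linearly, and the endpoint path stays at zero.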

How To Use It

For any architecture:

  1. Draw the data flow and mark every boundary crossing (AZ, region, public internet, provider).
  2. Estimate bytes per month at each crossing; multiply by the per-GB rate for a quick bill projection.
  3. Flag boundary crossings that are compliance-sensitive and verify each one against legal/contractual rules.
  4. Prefer keeping chatty components (app ↔ database, app ↔ cache) in the same AZ.
  5. Use VPC endpoints for S3 and DynamoDB so that traffic does not need a NAT path. Use PrivateLink for partner/SaaS data paths.
  6. Review egress dashboards monthly; data-transfer cost drift is a real signal.
  7. For log and metric shipping, aggregate before egress. Ship compressed batches, not per-event HTTPS.
  8. Set SCPs / GCP org policies denying replication/copy into regions that violate residency rules.
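
Steps 1 to 3 can be sketched as a small projection over data-flow edges. The rates are the illustrative per-GB figures used earlier in this section, not a price list; the `Edge` type, the `project` function, and the sample volumes are hypothetical.

```python
from dataclasses import dataclass

# Illustrative per-GB rates taken from the figures quoted in this section.
RATES_PER_GB = {
    "same-az": 0.0,
    "cross-az": 0.01,
    "cross-region": 0.02,
    "public-internet": 0.09,
}


@dataclass
class Edge:
    name: str
    crossing: str                    # key into RATES_PER_GB
    gb_per_month: float
    residency_sensitive: bool = False


def project(edges):
    """Return (name, crossing, monthly_usd, sensitive) rows, most expensive first."""
    rows = [
        (e.name, e.crossing,
         round(e.gb_per_month * RATES_PER_GB[e.crossing], 2),
         e.residency_sensitive)
        for e in edges
    ]
    return sorted(rows, key=lambda r: r[2], reverse=True)


edges = [
    Edge("app -> db", "cross-az", 20_000),
    Edge("s3 -> analytics region", "cross-region", 60_000, residency_sensitive=True),
    Edge("logs -> SaaS vendor", "public-internet", 3_000),
]

for name, crossing, usd, sensitive in project(edges):
    flag = "  <- verify residency" if sensitive else ""
    print(f"{name:26} {crossing:16} ${usd:>9.2f}{flag}")
```

Sorting by projected cost puts the heaviest edge at the top of the review, and the residency flag keeps the legal question attached to the same row as the billing one.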

Check Yourself

  1. Why is "free ingress" not a real benefit when you are planning an architecture?
  2. Name two ways a Kubernetes cluster can accidentally generate inter-AZ egress.
  3. How does data residency interact with disaster recovery?
  4. A CloudWatch alarm fires on NAT-Gateway data-processed. Name three likely causes and the primitive that would eliminate each.
  5. You replicate 10 TB/month of S3 objects from us-east-1 to ap-southeast-2. Estimate the monthly transfer cost (order of magnitude) and propose one architectural change to cut it by 50%.

Mini Drill or Application

Pick an architecture you already know (or one from Cluster 3). In fifteen minutes, mark each data-flow edge with AZ, region, and whether it crosses to the internet. For the three heaviest edges, estimate monthly GB and rough cost. Propose one change that cuts at least 30% off that transfer cost.

Extension: enable the Cost and Usage Report (or GCP billing export to BigQuery) and write a SQL query that filters for usage_type LIKE '%DataTransfer%'. Group by service and region, and see what the top three actually are. You will be surprised at least once.

Read This Only If Stuck