Module 1: Cloud Platform Fundamentals: Case Studies

These case studies turn cloud primitives into design judgment: responsibility boundaries, failure domains, managed-service tradeoffs, network shape, identity, and cost.

Case Study 1: Shared Responsibility Misread

Scenario: A team runs a web app on managed compute and assumes the provider handles all security. A public storage bucket and an overbroad instance role expose customer exports.

Source anchor: AWS's Shared Responsibility Model, which explains security and compliance as shared between AWS and the customer.

Module concepts: shared responsibility, IAM, storage policy, managed service limits.

Wrong Approach

"Managed service means managed security."

Better Approach

Write responsibility per layer:

Provider:
  facility, hardware, managed service control plane

Customer:
  IAM policy, data classification, bucket policy, network exposure, app code
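The customer-side rows above can be checked mechanically. The following is a minimal sketch, assuming the bucket policy is available as a parsed JSON document in the standard AWS policy shape; the function name, bucket name, and account ID are hypothetical, and the check deliberately ignores `Condition` blocks, which can narrow a wildcard principal.

```python
def find_public_statements(policy: dict) -> list[dict]:
    """Return Allow statements whose Principal is the wildcard "*".

    Simplification: Condition keys are not evaluated, so a flagged
    statement still needs a human look before it is called public.
    """
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may appear unwrapped
        statements = [statements]

    public = []
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal")
        if isinstance(principal, dict):
            principal = principal.get("AWS")
        # The AWS principal may be a single string or a list of strings.
        values = principal if isinstance(principal, list) else [principal]
        if "*" in values:
            public.append(stmt)
    return public


# Hypothetical policy for illustration only.
example_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "PublicRead", "Effect": "Allow", "Principal": "*",
         "Action": "s3:GetObject",
         "Resource": "arn:aws:s3:::example-exports/*"},
        {"Sid": "AppRead", "Effect": "Allow",
         "Principal": {"AWS": "arn:aws:iam::111122223333:role/example-app"},
         "Action": "s3:GetObject",
         "Resource": "arn:aws:s3:::example-exports/*"},
    ],
}

flagged = [s["Sid"] for s in find_public_statements(example_policy)]
print(flagged)
```

A check like this belongs on the customer side of the matrix: the provider keeps the storage service running, but only the customer can decide who is allowed to read the bucket.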

Tradeoff Table

Choice	Gain	Cost
managed platform	less infrastructure ops	still own configuration and data
custom VMs	control	patching/hardening burden
broad IAM	fewer permission errors	breach blast radius
least privilege	reduced blast radius	policy design work

Required Artifact

Create a shared-responsibility matrix for one workload.

Case Study 2: Single-AZ Architecture Called Highly Available

Scenario: A product deploys its web servers, database, and NAT gateway in one availability zone. The design doc says "high availability" because there are two app instances.

Source anchor: AWS Well-Architected Availability, which frames availability as a measurable resiliency objective.

Module concepts: region, availability zone, failure domain, multi-AZ, dependency blast radius.

Wrong Approach

Count instances instead of failure domains.

Better Approach

Map failure domains:

Web tier:
  at least two AZs

Database:
  multi-AZ or explicit recovery objective

NAT/load balancer:
  no single-zone choke point
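The mapping above amounts to grouping components by tier and counting distinct zones rather than instances. A minimal sketch, assuming placements are recorded as (tier, zone) pairs; the tier names and zone labels are hypothetical.

```python
from collections import defaultdict


def single_zone_tiers(placements: list[tuple[str, str]]) -> list[str]:
    """Return the tiers whose components all sit in one availability zone."""
    zones_by_tier: dict[str, set[str]] = defaultdict(set)
    for tier, zone in placements:
        zones_by_tier[tier].add(zone)
    # Two instances in the same zone still count as one failure domain.
    return sorted(tier for tier, zones in zones_by_tier.items() if len(zones) < 2)


# The scenario's layout: two app instances, but everything in one zone.
scenario = [
    ("web", "az-a"), ("web", "az-a"),
    ("database", "az-a"),
    ("nat", "az-a"),
]

# A layout that spreads every tier across two zones.
spread = [
    ("web", "az-a"), ("web", "az-b"),
    ("database", "az-a"), ("database", "az-b"),
    ("nat", "az-a"), ("nat", "az-b"),
]

print(single_zone_tiers(scenario))
print(single_zone_tiers(spread))
```

The first layout reports every tier as a single-zone risk even though the web tier has two instances, which is the gap the design doc missed.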

Tradeoff Table

Choice	Gain	Cost
single AZ	cheap/simple	AZ failure outage
multi-AZ app	survives compute/AZ loss	cross-AZ cost/complexity
multi-AZ database	stronger availability	cost and failover behavior
multi-region	regional resilience	much higher complexity

Required Artifact

Draw a failure-domain diagram with RTO/RPO and what survives one AZ loss.

Case Study 3: Public Subnet By Accident

Scenario: A database is launched with a public IP because the default VPC made networking easy. Security groups restrict access today, but the exposure is unnecessary.

Source anchor: AWS VPC docs describe public and private subnets, route tables, internet gateways, and NAT gateways. See AWS VPC route tables.

Module concepts: VPC, subnet, route table, public IP, NAT, defense in depth.

Wrong Approach

"The security group blocks access, so public placement is fine."

Better Approach

Use network layers intentionally:

Public subnet:
  load balancer / bastion only if needed

Private app subnet:
  application tasks

Private data subnet:
  database, no route to internet gateway
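The layering above can be turned into a review check: any subnet meant to be private should have no default route pointing at an internet gateway. A minimal sketch, assuming route tables have been exported into plain dictionaries; the field names and resource IDs are hypothetical, and the `igw-` prefix test relies on AWS's ID convention for internet gateways.

```python
DEFAULT_ROUTES = {"0.0.0.0/0", "::/0"}


def unjustified_public_routes(subnets: list[dict]) -> list[str]:
    """List default routes to an internet gateway on subnets intended to be private."""
    findings = []
    for subnet in subnets:
        if subnet["intended"] == "public":
            continue  # public subnets are expected to route to the gateway
        for destination, target in subnet["routes"]:
            if destination in DEFAULT_ROUTES and target.startswith("igw-"):
                findings.append(f'{subnet["name"]}: {destination} -> {target}')
    return findings


# Hypothetical export of three subnets and their routes.
subnets = [
    {"name": "public-lb", "intended": "public",
     "routes": [("10.0.0.0/16", "local"), ("0.0.0.0/0", "igw-example")]},
    {"name": "private-app", "intended": "private",
     "routes": [("10.0.0.0/16", "local"), ("0.0.0.0/0", "nat-example")]},
    {"name": "private-data", "intended": "private",
     "routes": [("10.0.0.0/16", "local"), ("0.0.0.0/0", "igw-example")]},
]

findings = unjustified_public_routes(subnets)
print(findings)
```

Here the data subnet is flagged because its default route reaches the internet gateway directly, while the app subnet's NAT route is treated as acceptable egress.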

Tradeoff Table

Choice	Gain	Cost
public DB	easy admin	unnecessary exposure
private DB	smaller attack surface	access path required
NAT egress	outbound updates	cost and dependency
VPC endpoints	private service access	endpoint setup

Required Artifact

Create a subnet/route-table review with every public route justified.

Case Study 4: Serverless Billing Surprise

Scenario: A serverless image-processing function looks cheap at launch. A marketing campaign triggers millions of invocations, high memory use, and expensive data egress.

Source anchor: AWS Lambda pricing and comparable cloud provider pricing pages tie cost to request count, duration, configured memory, and data transfer. See AWS Lambda pricing.

Module concepts: serverless, unit economics, egress, cost model, scaling.

Wrong Approach

"Serverless is cheaper."

Better Approach

Model cost per operation:

invocations/month:
average duration:
memory:
storage read/write:
egress:
retry rate:
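The template above becomes useful once the fields are multiplied out. A minimal sketch, assuming a simple linear pricing model; every price below is a placeholder chosen for easy arithmetic, not a real provider rate, and the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ServerlessCostModel:
    invocations_per_month: int
    avg_duration_s: float
    memory_gb: float
    storage_ops_per_invocation: int
    egress_gb: float
    retry_rate: float  # fraction of invocations that run a second time

    # Placeholder unit prices; replace with the provider's published rates.
    price_per_million_requests: float = 0.20
    price_per_gb_second: float = 0.00002
    price_per_thousand_storage_ops: float = 0.005
    price_per_egress_gb: float = 0.09

    def attempts(self) -> float:
        # Retries are billed like any other invocation.
        return self.invocations_per_month * (1 + self.retry_rate)

    def monthly_cost(self) -> float:
        requests = self.attempts() / 1_000_000 * self.price_per_million_requests
        compute = (self.attempts() * self.avg_duration_s
                   * self.memory_gb * self.price_per_gb_second)
        storage = (self.attempts() * self.storage_ops_per_invocation
                   / 1_000 * self.price_per_thousand_storage_ops)
        egress = self.egress_gb * self.price_per_egress_gb
        return requests + compute + storage + egress

    def cost_per_operation(self) -> float:
        # Divide by successful operations, so retries raise the unit cost.
        return self.monthly_cost() / self.invocations_per_month


model = ServerlessCostModel(
    invocations_per_month=10_000_000,
    avg_duration_s=0.5,
    memory_gb=1.0,
    storage_ops_per_invocation=2,
    egress_gb=1_000,
    retry_rate=0.1,
)
print(f"monthly: {model.monthly_cost():.2f}")
print(f"per operation: {model.cost_per_operation():.8f}")
```

Rerunning the model with the campaign's projected invocation count and egress volume shows which term dominates before the bill arrives.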

Tradeoff Table

Choice	Gain	Cost
serverless	scales to zero, quick to launch	cost spikes with volume/duration
containers	predictable baseline	pay for idle capacity
batch workers	throughput control	latency
CDN/cache	lower compute/egress	invalidation complexity

Required Artifact

Write a monthly cost model and alert threshold for one cloud workload.

Case Study 5: IAM User In Production Automation

Scenario: A CI job deploys using a long-lived IAM user access key stored as a secret. The key leaks through logs.

Source anchor: AWS IAM docs recommend roles and temporary credentials for workloads. See AWS IAM roles.

Module concepts: IAM role, temporary credentials, workload identity, least privilege.

Wrong Approach

Use long-lived IAM user access keys because they are easy to paste into CI.

Better Approach

Use workload identity:

CI identity:
  OIDC trust to cloud role

Role policy:
  only deploy target resources

Controls:
  short-lived credentials
  environment scoping
  audit logs
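The trust relationship above can be written down and linted before it is deployed. A minimal sketch, assuming GitHub Actions as the CI system (the source does not name one) and its OIDC issuer host; the account ID, repository, environment, and function names are hypothetical.

```python
# Assumption: GitHub Actions is the CI provider; other providers use other issuers.
OIDC_HOST = "token.actions.githubusercontent.com"


def build_trust_policy(account_id: str, repo: str, environment: str) -> dict:
    """Build a role trust policy that only one repo environment can assume."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {
                "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{OIDC_HOST}"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {"StringEquals": {
                f"{OIDC_HOST}:aud": "sts.amazonaws.com",
                f"{OIDC_HOST}:sub": f"repo:{repo}:environment:{environment}",
            }},
        }],
    }


def trust_policy_findings(policy: dict) -> list[str]:
    """Flag statements that do not pin the token's audience and subject exactly."""
    findings = []
    for stmt in policy["Statement"]:
        pinned = stmt.get("Condition", {}).get("StringEquals", {})
        if not any(key.endswith(":sub") for key in pinned):
            findings.append("subject claim is not pinned with StringEquals")
        if not any(key.endswith(":aud") for key in pinned):
            findings.append("audience claim is not pinned with StringEquals")
    return findings


scoped = build_trust_policy("111122223333", "example-org/example-app", "production")

# A looser variant that matches any workflow in the organization.
loose = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {
            "Federated": f"arn:aws:iam::111122223333:oidc-provider/{OIDC_HOST}"
        },
        "Action": "sts:AssumeRoleWithWebIdentity",
        "Condition": {"StringLike": {f"{OIDC_HOST}:sub": "repo:example-org/*"}},
    }],
}

print(trust_policy_findings(scoped))
print(trust_policy_findings(loose))
```

The trust policy only controls who may assume the role; the role's permission policy still has to be limited to the deploy target resources, as the list above requires.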

Tradeoff Table

Choice	Gain	Cost
IAM user key	simple	secret leakage and rotation burden
role + temporary creds	safer	trust-policy setup
broad deploy role	fewer failures	high blast radius
scoped role per env	lower blast radius	more policy work

Required Artifact

Write an IAM deployment-role policy review: principal, trust policy, allowed actions, denied actions, and audit trail.

Source Map

Source	Use it for
AWS Shared Responsibility Model	provider/customer responsibility boundaries
AWS Well-Architected Availability	availability and resiliency objectives
AWS VPC route tables	public/private routing and subnet design
AWS Lambda pricing	serverless unit economics
AWS IAM roles	temporary credentials and workload roles

Completion Standard

At least three artifacts are completed.
At least one artifact maps shared responsibility.
At least one artifact maps failure domains.
At least one artifact includes cost math.
