Module 1: Cloud Platform Fundamentals
Primary texts: official cloud provider documentation (AWS, GCP, Azure), treated as the source of truth
Selective support: The DevOps Handbook, Software Engineering at Google, the Well-Architected Framework whitepapers, and the Twelve-Factor App for application patterns
This guide is the primary teacher. You do not need to read the provider docs front-to-back. You do need to leave this module able to read an AWS/GCP/Azure page directly, name the building blocks a production system relies on, and explain the tradeoffs in plain language to someone who has never touched a cloud console.
Scope of This Module
The cloud is not "someone else's computer." It is a rental market for compute, storage, networking, identity, and managed services, governed by a shared-responsibility model and priced by the unit. Get this foundation wrong and everything in the rest of the semester (IaC, Kubernetes, CI/CD, security, observability) is built on sand.
What it covers in depth:
- what a cloud platform actually rents you, and what remains your job
- regions, availability zones, and how failure domains shape architecture
- the abstraction ladder: IaaS, PaaS, serverless, and when each one fits
- compute primitives: VMs with autoscaling, managed containers, serverless functions
- networking primitives: VPCs, subnets, route tables, NAT, load balancers, DNS
- storage primitives: object, block, file storage, and managed databases
- identity: principals, policies, roles vs users, and why least privilege is non-negotiable
- account structure: organizations, landing zones, and multi-account strategy
- billing fundamentals: unit economics, tagging, budgets, and data egress costs
What it deliberately does not cover:
- Infrastructure as Code workflows (Module 2)
- Kubernetes and container orchestration (Module 3)
- CI/CD pipelines and release engineering (Module 4)
- observability, secrets, and cloud security deep dive (Module 5)
This is a foundations module. Treat it as provider-agnostic reasoning anchored in concrete AWS examples, with GCP and Azure cross-references where the vocabulary differs sharply.
Before You Start
Answer closed-book:
- What is the difference between a region and an availability zone? Which one survives a power failure?
- If you run your application on a "managed service," which failures are your responsibility and which are the provider's?
- A load balancer has a public IP and forwards HTTPS to your servers. Is that an L4 or L7 concern?
- Why is `s3://my-bucket/object.json` cheaper per GB-month than an equivalent file on EBS?
- What does an IAM role provide that an IAM user does not, and why do production workloads prefer roles?
Diagnostic Interpretation
- 4-5 solid answers: ready for the full path.
- 2-3 solid answers: continue, but plan extra time in Clusters 3 (networking) and 5 (identity).
- 0-1 solid answers: skim Cluster 1 before attempting any lab work; the later clusters only make sense once the shared-responsibility model is clear.
What This Module Is For
Every later module in this semester assumes you can name and reason about these primitives. Throughout the program you will be asked:
- can we deploy this safely with a smaller blast radius?
- where should this workload run: VM, container, or serverless?
- who is allowed to read this bucket, and how do we prove it?
- why did our bill triple when we added a second region?
- which failures would take our service down, and at what granularity?
This module builds the reasoning needed for:
- Infrastructure as Code (S9M2), which automates these primitives
- CI/CD and release engineering (S9M4), which ship into these primitives
- cloud security and observability (S9M5), which defend and watch these primitives
- architecture decisions from S7 that now hit a concrete cost and failure model
You are learning the minimum vocabulary needed to design, deploy, and operate production workloads without bluffing.
Concept Map
How To Use This Module
Work in order. Cluster 5 (identity and accounts) only makes sense once you have seen how compute, networking, and storage primitives are scoped.
Cluster 1: What a Cloud Platform Is
| Order | Concept | Type | Focus |
|---|---|---|---|
| 1 | The Shared-Responsibility Model and What the Cloud Actually Rents You | PRIMARY | Where the provider's job ends and yours begins |
| 2 | Regions, Availability Zones, and Failure Domains | PRIMARY | Geography, isolation, and blast-radius reasoning |
| 3 | IaaS, PaaS, and Serverless: the Abstraction Ladder | PRIMARY | How much control versus how much operational burden |
Cluster mastery check: Given a workload description, can you state which layer of the ladder fits and which failure domains it must survive?
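To make the failure-domain half of this check concrete, here is a rough sketch of the arithmetic behind single-AZ versus multi-AZ deployments. It assumes zone failures are independent and uses a made-up per-zone availability of 99.5%; neither assumption comes from any provider's SLA, and the function names are illustrative.

```python
HOURS_PER_YEAR = 24 * 365


def availability_any_zone_up(per_zone: float, zones: int) -> float:
    """Probability that at least one of `zones` independent zones is up."""
    if not 0.0 <= per_zone <= 1.0:
        raise ValueError("per_zone must be a probability between 0 and 1")
    if zones < 1:
        raise ValueError("zones must be at least 1")
    # All zones must be down at once for the service to be down.
    return 1.0 - (1.0 - per_zone) ** zones


def expected_downtime_hours(availability: float) -> float:
    """Expected hours of downtime per year at the given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    PER_ZONE = 0.995  # hypothetical figure, chosen only to make the numbers readable
    for zones in (1, 2, 3):
        a = availability_any_zone_up(PER_ZONE, zones)
        print(f"{zones} zone(s): availability={a:.6f}, "
              f"downtime={expected_downtime_hours(a):.2f} h/yr")
```

Treat the result as an upper bound. Correlated failures (a region-wide outage, a bad deploy rolled out to every zone at once) break the independence assumption, which is why the check asks about failure domains and not only about zone count.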
Cluster 2: Compute
| Order | Concept | Type | Focus |
|---|---|---|---|
| 4 | VMs: AMIs, Instance Types, and Autoscaling Groups | PRIMARY | The baseline compute primitive and how it scales |
| 5 | Containers on Managed Services: ECS, Cloud Run, Fargate | PRIMARY | Serverless containers and cluster-less orchestration |
| 6 | Serverless Functions: Lambda, Cloud Functions, Cold Starts, Limits | PRIMARY | Event-driven compute and its hard edges |
Cluster mastery check: Given four workloads (batch, request/response API, event pipeline, long-running worker), can you pick a compute primitive and defend it?
Cluster 3: Networking
| Order | Concept | Type | Focus |
|---|---|---|---|
| 7 | VPCs, Subnets, Route Tables, NAT | PRIMARY | Private network topology in the cloud |
| 8 | Load Balancers: L4 vs L7, Health Checks, TLS Termination | PRIMARY | How traffic is spread, checked, and terminated |
| 9 | DNS, Private Endpoints, and Service Discovery | PRIMARY | Naming services and reaching them privately |
Cluster mastery check: Can you sketch a 3-tier app in a VPC with public ALB, private app subnet, and database subnet, and name the routing required?
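One possible starting point for this sketch, expressed as data instead of a drawing, is shown here. It uses Python's standard `ipaddress` module to carve a VPC range into one subnet per tier per zone and lists the routes each tier would need. The CIDR range, zone names, and target labels such as `internet-gateway` are placeholders, not real resource identifiers.

```python
import ipaddress

VPC_CIDR = ipaddress.ip_network("10.0.0.0/16")  # hypothetical address range
ZONES = ["az-a", "az-b"]                         # placeholder zone names
TIERS = ["public", "app", "db"]


def plan_subnets(vpc, tiers, zones, new_prefix=24):
    """Assign one subnet per (tier, zone) pair, carved in order from the VPC range."""
    blocks = vpc.subnets(new_prefix=new_prefix)  # yields non-overlapping subnets
    return {(tier, zone): next(blocks) for tier in tiers for zone in zones}


def route_table(tier, zone, vpc):
    """Return the routes a subnet in this tier needs, as destination -> target."""
    routes = {str(vpc): "local"}  # every subnet can reach the rest of the VPC
    if tier == "public":
        routes["0.0.0.0/0"] = "internet-gateway"
    elif tier == "app":
        # Outbound-only internet access through the NAT gateway in the same zone.
        routes["0.0.0.0/0"] = f"nat-gateway-{zone}"
    # db tier: no default route, so no path to or from the internet.
    return routes


if __name__ == "__main__":
    for (tier, zone), subnet in plan_subnets(VPC_CIDR, TIERS, ZONES).items():
        print(tier, zone, subnet, route_table(tier, zone, VPC_CIDR))
```

The NAT gateway itself sits in a public subnet. Pointing each app subnet at the NAT gateway in its own zone is a design choice in this sketch: it keeps one zone's failure from cutting off outbound traffic for the others.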
Cluster 4: Storage and Databases
| Order | Concept | Type | Focus |
|---|---|---|---|
| 10 | Object, Block, and File Storage: When Each Is Right | PRIMARY | S3 vs EBS vs EFS semantics and costs |
| 11 | Managed Databases: RDS, Aurora, DynamoDB, Cloud SQL | PRIMARY | Relational vs NoSQL managed offerings |
| 12 | Data Egress and Region Boundaries in Cost and Compliance | SUPPORTING | The invisible tax on "just move the data" |
Cluster mastery check: For a given workload, can you pick the storage service, justify the cost, and predict where data egress charges would appear?
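A small sketch of how the egress prediction in this check might be written down. The per-GB rates below are invented placeholders, not any provider's pricing, and the `Flow` type and boundary names are illustrative. The point is the shape of the calculation: every flow that crosses a zone, region, or internet boundary gets its own line.

```python
from dataclasses import dataclass

# Placeholder per-GB rates for illustration only.
# Look up real prices on the provider's own pricing pages.
RATE_PER_GB = {
    "same-zone": 0.00,
    "cross-zone": 0.01,
    "cross-region": 0.02,
    "internet": 0.09,
}


@dataclass(frozen=True)
class Flow:
    name: str
    boundary: str        # which boundary the traffic crosses; a key of RATE_PER_GB
    gb_per_month: float


def monthly_transfer_cost(flows):
    """Return {flow name: monthly cost} using the placeholder rates above."""
    return {f.name: f.gb_per_month * RATE_PER_GB[f.boundary] for f in flows}


if __name__ == "__main__":
    flows = [
        Flow("app to database", "same-zone", 5000.0),
        Flow("app to app across zones", "cross-zone", 2000.0),
        Flow("replication to second region", "cross-region", 1000.0),
        Flow("responses to users", "internet", 500.0),
    ]
    for name, cost in monthly_transfer_cost(flows).items():
        print(f"{name}: {cost:.2f}")
```

Listing flows by boundary, not by service, tends to surface the charges that are easy to miss, such as replication traffic that appears only after a second region is added.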
Cluster 5: Identity and Accounts
| Order | Concept | Type | Focus |
|---|---|---|---|
| 13 | IAM: Principals, Policies, Roles vs Users | PRIMARY | The core access-control model of the cloud |
| 14 | Account Structure, Organizations, and Landing Zones | PRIMARY | Multi-account strategy and guardrails at scale |
| 15 | Billing Fundamentals: Units of Spend, Tagging, Budgets | SUPPORTING | Reading the bill before it reads you |
Cluster mastery check: Given a new team joining an organization, can you describe the accounts they get, the baseline IAM, the tagging policy, and the budget guardrails?
Then work these practice pages:
| Order | Practice path | Focus |
|---|---|---|
| 1 | Account and Networking Lab | Set up an account, plan a VPC, draw the topology |
| 2 | Compute and Storage Workshop | Pick compute and storage for realistic workloads |
| 3 | IAM Least-Privilege Clinic | Write, critique, and tighten policies |
| 4 | Cloud Katas | Timed drills across IAM, networking, compute, and accounts |
Use Module Quiz after the concept and practice path. Use Reference and Learning Resources for targeted reinforcement.
Learning Objectives
By the end of this module you should be able to:
- Draw the shared-responsibility boundary for IaaS, PaaS, and serverless workloads and name which failures belong to whom.
- Reason about regions, availability zones, and the failure-domain implications of single-AZ vs multi-AZ deployments.
- Pick a compute primitive (VM, container, serverless) for a given workload and defend it against the alternatives.
- Sketch a VPC with public and private subnets, routing, and NAT for a 3-tier application.
- Choose between an L4 and an L7 load balancer and explain what TLS termination implies for security and observability.
- Use DNS, private endpoints, and service discovery to reach internal services without crossing the public internet.
- Pick among object, block, and file storage and between relational and NoSQL managed databases with cost and access-pattern reasoning.
- Predict where data-egress charges appear in an architecture and articulate the compliance implications of crossing region boundaries.
- Write a narrowly scoped IAM policy in JSON and explain every element (Effect, Principal, Action, Resource, Condition).
- Describe an organization with multiple accounts, a landing zone, a tagging policy, and budget guardrails.
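As an illustration of the IAM objective above, here is a minimal sketch that builds an AWS-style bucket policy as a Python dictionary and flags overly broad elements. The bucket name, prefix, role name, and helper functions are hypothetical; the account ID is a documentation-style example value. `Principal` appears because this is a resource-based policy; an identity-based policy would omit it.

```python
import json


def build_read_only_bucket_policy(bucket: str, prefix: str, role_arn: str) -> dict:
    """Allow one role to read objects under one prefix, over TLS only."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadReportsOverTlsOnly",
                "Effect": "Allow",                       # Allow or Deny
                "Principal": {"AWS": role_arn},          # who the statement applies to
                "Action": ["s3:GetObject"],              # what they may do
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",  # on which objects
                "Condition": {"Bool": {"aws:SecureTransport": "true"}},  # and when
            }
        ],
    }


def _as_list(value):
    """Flatten a policy element (string, list, or dict of either) into a list."""
    if value is None:
        return []
    if isinstance(value, dict):
        out = []
        for v in value.values():
            out.extend(_as_list(v))
        return out
    if isinstance(value, str):
        return [value]
    return list(value)


def broad_elements(policy: dict) -> list:
    """Return (Sid, element, value) for every '*' or 'service:*' style value."""
    findings = []
    for stmt in policy["Statement"]:
        for key in ("Principal", "Action", "Resource"):
            for v in _as_list(stmt.get(key)):
                if v == "*" or v.endswith(":*"):
                    findings.append((stmt.get("Sid", "<no Sid>"), key, v))
    return findings


if __name__ == "__main__":
    policy = build_read_only_bucket_policy(
        "example-reports-bucket", "reports",
        "arn:aws:iam::111122223333:role/report-reader",
    )
    print(json.dumps(policy, indent=2))
    print(broad_elements(policy))
```

The wildcard check is deliberately crude. It catches the obvious cases and is no substitute for reading the policy and defending each element.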
Outputs
- one annotated diagram of a 3-tier architecture in a VPC with public ALB, private app subnet, database subnet, NAT, and an internet gateway
- a written IAM policy for a hypothetical bucket, reviewed and tightened at least twice
- a compute-decision memo for four workloads comparing VM, container, and serverless with cost and cold-start reasoning
- a one-page landing-zone sketch naming at least four account types (management, security, shared-services, workload)
- a tagging policy listing at least six required tags (`owner`, `environment`, `cost-center`, `service`, `data-classification`, `compliance-scope`) and the enforcement mechanism
- a cost-surprise log naming at least six places data-egress or region-boundary costs hide
- a mistake journal with at least eight cloud-foundations errors (`0.0.0.0/0` on a security group, NAT gateway in the wrong AZ, IAM user with static keys in CI, no `Block Public Access` on a bucket, etc.)
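Two of the outputs above lend themselves to small automated checks. The sketch below tests a resource's tags against the six required keys and flags ingress rules open to `0.0.0.0/0`. The rule dictionaries with `port` and `cidr` keys are a made-up shape for illustration, not any provider's API format, and treating port 443 as the only acceptable public port is an assumption.

```python
REQUIRED_TAGS = (
    "owner",
    "environment",
    "cost-center",
    "service",
    "data-classification",
    "compliance-scope",
)


def missing_tags(resource_tags: dict) -> list:
    """Return required tag keys that are absent or blank, in policy order."""
    return [k for k in REQUIRED_TAGS if not str(resource_tags.get(k, "")).strip()]


def open_ingress_rules(rules: list, allowed_public_ports=(443,)) -> list:
    """Return rules open to the whole internet on ports not meant to be public."""
    return [
        r for r in rules
        if r.get("cidr") == "0.0.0.0/0" and r.get("port") not in allowed_public_ports
    ]


if __name__ == "__main__":
    print(missing_tags({"owner": "team-a", "environment": "prod"}))
    print(open_ingress_rules([
        {"port": 22, "cidr": "0.0.0.0/0"},     # SSH open to the world: flagged
        {"port": 443, "cidr": "0.0.0.0/0"},    # public HTTPS: allowed here
        {"port": 5432, "cidr": "10.0.2.0/24"}, # database reachable from app subnet only
    ]))
```

A check like this could run in a pipeline as one enforcement mechanism. Providers also offer native policy features for tags and guardrails, and the provider's own documentation is the place to confirm how those behave.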
Completion Standard
You have completed Module 1 when all of these are true:
- you can read an AWS, GCP, or Azure doc page directly without a translator
- you can draw the shared-responsibility boundary for any compute primitive on a whiteboard
- you can sketch a VPC with correct routing in under ten minutes
- you can write an IAM policy with `Effect`, `Principal` (where appropriate), `Action`, `Resource`, and a `Condition`, and defend each element
- you can name where data-egress charges appear in a multi-region design
- you can describe a landing zone well enough that a junior engineer could create their first account under it
If your workload "works in the console" but you cannot say which IAM role it assumes, which subnet it runs in, or what happens when an availability zone fails, the module is not complete.
Reading Policy
- Concept pages are the main path.
- Because no local book chunks exist for cloud-platform fundamentals, the escalation target for every concept is an official documentation page from AWS, GCP, or Azure.
- `See also (external)` means "if the concept page is not enough, go here next." It is not a reading assignment.
- Prefer the provider's own docs over third-party blog posts for exact behavior, quotas, pricing, and API shapes.
- Treat blog posts and articles as commentary, never as ground truth.
Suggested Weekly Flow
| Day | Work |
|---|---|
| 1 | Concepts 1-3 and one shared-responsibility diagram for a workload you already know |
| 2 | Concepts 4-6 and a compute-choice memo for four workloads |
| 3 | Concepts 7-8 and a VPC + load-balancer diagram |
| 4 | Concept 9 and a service-discovery sketch for a two-service app |
| 5 | Concepts 10-12 and a storage/database decision record |
| 6 | Concepts 13-14, write an IAM policy, sketch a landing zone |
| 7 | Concept 15, Practice pages 1-2 |
| 8 | Practice 3 (IAM clinic), quiz, and mistake-journal cleanup |
Reference
If you need external-source links grouped by concept, use Reference.
Rich Learning Pages
Worked Examples | Guided Labs | Case Studies | Mistake Clinic | Reading Guide | Capstone Thread