Learning Resources

This module remains official-docs-first, but it now ties into the local semester library. Use Building Secure and Reliable Systems and Software Engineering at Google as the local support layer, then escalate to the official docs and canonical essays below for exact cloud, telemetry, and security guidance.

All URLs on this page were validated at module-write time.

Source Stack

Source	Role	How to use it in this module
OWASP (Threat Modeling community + Cheat Sheet Series)	Primary security reference	Canonical threat-modeling framing and STRIDE guidance, plus checklists for logging and many other areas
NIST SP 800-207	Primary identity / Zero Trust reference	The authoritative definition of Zero Trust and deployment models
AWS / Google Cloud / Azure Well-Architected (Security Pillars)	Primary cloud-specific security reference	Concrete services, patterns, and checklists for each cloud
HashiCorp Vault docs	Primary secrets reference	Dynamic secrets, auth methods, secret engines, leases
Google Cloud KMS envelope encryption docs	Primary encryption reference	Clearest short explanation of DEK/KEK envelope encryption
SLSA and Sigstore	Primary supply-chain reference	SLSA levels, provenance, signing, transparency log
OpenTelemetry docs	Primary observability reference	Signals, semantic conventions, sampling
Prometheus docs (instrumentation, naming)	Primary metrics reference	Naming, cardinality, metric types
Google SRE Book	Primary ops reference	Monitoring principles, golden signals, symptom-based alerting
Grafana Labs / Honeycomb / charity.wtf	Selective support	Cardinality in practice, observability definitions, 3 a.m.-on-call reality
Building Secure and Reliable Systems	Local support	The best local bridge between reliability engineering, security review, and incident response
Software Engineering at Google	Local support	Long-lived engineering systems, review culture, and operational quality habits
Local shell/Git books	Selective support	Shell and Git basics that sharpen operational habits

Resource Map by Cluster

Cluster 1: Cloud Security Foundations

Need	Best external source	Why
Threat-modeling framing	OWASP: Threat Modeling	Canonical four-question framing
STRIDE checklist	OWASP Cheat Sheet: Threat Modeling	Compact step-by-step reference
STRIDE origin and tooling	Microsoft Learn: Threat Modeling Tool	The source of STRIDE as a practical tool
Zero Trust definition	NIST SP 800-207	Authoritative reference used across the industry
Layered security / defense in depth (AWS)	AWS Well-Architected: Security Pillar	Concrete AWS guidance
Layered security (GCP)	Google Cloud Well-Architected: Security	Zero-trust aligned layered patterns
Layered security (Azure)	Azure Well-Architected: Security	Checklists and maturity models

Cluster 2: Secrets, Keys, and Data

Need	Best external source	Why
Secret management architecture	HashiCorp Vault docs	Canonical reference on dynamic secrets, auth methods, leases
Envelope encryption explained	Google Cloud KMS: Envelope encryption	Clearest DEK/KEK walkthrough with best practices
Encryption in a cloud context	AWS Well-Architected: Security Pillar	Data protection patterns tied to AWS KMS
What not to log (keeps classification honest)	OWASP Logging Cheat Sheet	Directly relevant to data minimization in logs

Cluster 3: Network and Runtime Security

Need	Best external source	Why
Network moat patterns	AWS Well-Architected: Security Pillar	Canonical SG/NACL/VPC-endpoint patterns
Network moat (GCP)	Google Cloud Well-Architected: Security	Firewall rules, VPC Service Controls
Supply-chain framework	SLSA	SLSA levels and provenance concepts
Signing and verification	Sigstore	Canonical OSS signing with cosign and Rekor

Cluster 4: Observability Pillars in Cloud

Need	Best external source	Why
OpenTelemetry model overview	OpenTelemetry Concepts	Signals, context, semantic conventions in one page
Traces and span model	OpenTelemetry Traces	Span model, attributes, status, kinds
Sampling strategies	OpenTelemetry Sampling	Head vs tail sampling with trade-offs
OTel project status	CNCF: OpenTelemetry	Community maturity and case studies
Metric naming and cardinality	Prometheus: Metric and Label Naming	Canonical guidance on labels and units
Instrumentation shape	Prometheus: Instrumentation	USE/RED-style patterns and metric types
Cardinality failure modes	Grafana Labs: Cardinality Spikes	How cardinality blows up in real systems
Observability definitions	Honeycomb: Observability Glossary	Working definitions of dashboards, alerts, SLOs
Log pipeline design	OWASP Logging Cheat Sheet	What to log, what not to log, protect the pipeline
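The cardinality rows above are easier to act on with the arithmetic in hand: for a single metric, the number of time series is bounded by the product of the distinct values of each label. The following is a minimal sketch of that bound, not something taken from the Prometheus or Grafana docs; the label names and counts are hypothetical, and `max_series` is an illustrative helper.

```python
from math import prod


def max_series(label_value_counts: dict[str, int]) -> int:
    """Upper bound on time series for one metric.

    Each distinct combination of label values is its own series, so the
    bound is the product of the distinct-value counts per label.
    """
    return prod(label_value_counts.values())


# Hypothetical label sets for an HTTP request counter.
bounded = {"method": 5, "status_class": 5, "route": 40}

# Same metric with one unbounded label added (assumed 100,000 users).
unbounded = {**bounded, "user_id": 100_000}

print(max_series(bounded))    # 1000
print(max_series(unbounded))  # 100000000
```

One unbounded label multiplies the whole series count, which is why identifiers such as user IDs usually belong in logs or trace attributes rather than in metric labels.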

Cluster 5: Operating Under Observation

Need	Best external source	Why
Monitoring principles and golden signals	Google SRE Book: Monitoring Distributed Systems	Canonical four-golden-signals chapter
Practical alerting	Google SRE Book: Practical Alerting	Symptom-based alerting, alert noise as a cost
Whole-book navigation	Google SRE Book: Table of Contents	Free online edition; use for incident response, post-mortems
Observability realism	charity.wtf: Observability is a Many-Splendored Definition	Why metrics alone are not observability, grounded in operator experience

Local Book Chunks (Loosely Relevant)

The books under library/raw/semester-09-cloud-devops/books/ are included for Git and Linux shell fluency, which sharpen operational habits. They are not the primary teachers for this module.

The Linux Command Line -- useful for runbook commands, log hygiene, shell-level habits.
Pro Git and Git from the Bottom Up -- useful for runbook-as-code discipline and for the reviewability habit that security and observability both need.

Open them only if a runbook or a pipeline task exposes a shell or Git gap, not for the security or observability material itself.

Use Rules

For security topics, open OWASP / NIST / the relevant cloud provider's Well-Architected security pillar first.
For observability topics, open the OpenTelemetry docs or the Google SRE Book first.
For supply chain, SLSA and Sigstore are the primary sources.
Use essays (Honeycomb, charity.wtf, Grafana blog) for intuition, not as authoritative references.
If you cannot find the answer in one official doc in 5 minutes, stop and write the gap question in plain words before continuing -- it is almost always a definition mismatch, not a research gap.
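The ordering in the rules above can be read as a small lookup table. This is only an illustrative sketch: the topic keys and the `first_sources` helper are hypothetical names, not part of the module.

```python
# Illustrative encoding of the Use Rules. The topic keys and the
# first_sources() helper are hypothetical names chosen for this sketch.
FIRST_SOURCES = {
    "security": ["OWASP", "NIST", "cloud provider Well-Architected security pillar"],
    "observability": ["OpenTelemetry docs", "Google SRE Book"],
    "supply chain": ["SLSA", "Sigstore"],
}

# Essays build intuition; they are never returned as a first source.
INTUITION_ONLY = ["Honeycomb", "charity.wtf", "Grafana blog"]


def first_sources(topic: str) -> list[str]:
    """Return the sources to open first for a topic, per the rules above."""
    key = topic.strip().lower()
    if key not in FIRST_SOURCES:
        # Mirrors the last rule: stop and state the gap in plain words.
        raise ValueError(f"no rule for topic {topic!r}; write the gap question down first")
    return FIRST_SOURCES[key]


print(first_sources("Security"))
```

An unknown topic raises an error rather than falling back to an essay, which matches the rule that essays are for intuition and not for authoritative answers.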
