Cumulative Review (Semesters 0–9)


[TECHNIQUE 7: Interleaved Review] Mix every prior semester with Semester 9, viewed through a cloud and operations lens. Answer closed-book first, then verify against notes and mark misses for spaced repetition. Fifty-five questions.

Instructions

Set a three-hour window, which works out to a little over three minutes per question. Work in one sitting if you can. Questions deliberately jump between semesters; resist the urge to re-sort. For every answer you get wrong or can only partially defend, write the question onto a new spaced-repetition card before moving on.

Questions

[S0] Describe the learning loop this program relies on (concept -> practice -> retrieval -> output). Where in your Semester 9 workflow did this loop break down most often, and what did you change about your study cadence?
[S0] "If the learner only read the books, the semester is not complete." Translate that into three observable behaviors a reviewer would see in your project repo today.
[S1] Define Big-O, Big-Theta, and Big-Omega. Use each one correctly in a sentence about a real part of your Semester 9 system (for example, request routing, pod scheduling, or log ingestion).
[S1] You need to estimate how many concurrent users your api can hold before latency degrades. Set up the Little's Law calculation (L = λW) with a concrete example from your project, and state one assumption that would invalidate the result.
[S1] A log-scale dashboard and a linear-scale dashboard tell different stories for the same traffic. Using what you know about logarithms, explain when each one misleads and which you would choose for a p99 latency panel.
[S2] Compare hash-based and tree-based maps under a workload where keys arrive sorted (for example, timestamped events). Which would you prefer for an in-memory cache in your worker service, and why?
[S2] Design a rate limiter for your api. Pick a data structure (token bucket, sliding window, leaky bucket), state its time and space complexity, and explain how it behaves under a burst of 10× normal traffic.
[S2] Describe a real algorithmic choice you made in the project (sorting, deduplication, retry scheduling) and walk through the complexity analysis.
[S2] Given an unsorted stream of 100 million events per day, you need the top-10 most frequent keys in near-real-time. Name two algorithms that solve this, their tradeoffs, and which one is easier to operate in a Kubernetes worker.
[S2] Explain amortized analysis using the dynamic-array-resize example. Then connect it to a real cloud operation: how a container registry's garbage collection "costs nothing" on most pushes but is expensive on one.
[S3] Identify two SOLID violations likely to appear in a service that bolts on cloud clients (S3, DynamoDB, SQS) directly in its business logic. Refactor one with a sketch.
[S3] Apply the Dependency Inversion Principle to the way your worker talks to its queue. What abstraction boundary did you pick, and what does it buy you when you want to swap SQS for Kafka?
[S3] Pick a design pattern you implicitly used in Semester 9 (factory for cloud clients, adapter for the DB, strategy for retry policies) and argue whether it earned its complexity.
[S3] You receive a PR that "makes the code cleaner" by introducing an abstract CloudProvider interface with AWS, GCP, and Azure implementations. Critique it using the Rule of Three and the cost of speculative generality.
[S3] What is a code smell you found and fixed in your own Semester 9 repo, and what heuristic told you to fix it?
[S4] Draw what happens, at the process and file-descriptor level, when a container starts: execve, PID 1, stdout/stderr handling, and why a misconfigured PID 1 leads to zombie reaping problems in Kubernetes.
[S4] Explain virtual memory and how container memory limits interact with the kernel OOM killer. What does "OOMKilled" actually mean in kubectl describe pod?
[S4] You notice your Go service pinned at 100% CPU on one pod but not others. Walk through the diagnosis path, starting at kubectl top and working down to profiling. What system-level mechanisms let you inspect a process running inside a container?
[S4] A container image is 1.2 GB; a stripped-down equivalent is 80 MB. Explain where that difference comes from (layers, base image, dev tooling) and how it affects cold-start and security posture.
[S4] Define an inode and describe how mounting an object store as a filesystem (e.g., with a FUSE driver) compares to native access via the S3 API in terms of semantics and failure modes.
[S5] Walk through what happens from typing curl https://api.example.com/orders on your laptop to bytes arriving at your api pod: DNS, TCP, TLS, load balancer, Kubernetes service, and pod networking.
[S5] A managed database is in a private subnet; your worker pods need to reach it. Describe the full path: security groups, route tables, NAT (or absence thereof), and DNS. Where does this call fail silently if one link is wrong?
[S5] Explain the difference between an L4 and an L7 load balancer with an example where the choice matters (e.g., sticky sessions, TLS termination, per-URL routing).
[S5] What is the TCP three-way handshake, and why does a misconfigured security group cause a "connection timed out" rather than a "connection refused"?
[S5] Describe TLS termination at the load balancer versus mTLS inside the cluster. For your project, where does TLS start and stop, and what would it take to enforce end-to-end encryption?
[S5] DNS is cached at multiple layers. Name three of them and describe one production incident pattern caused by stale DNS after failover.
[S5] What is a file descriptor leak at the syscall level, and how does it present in a long-running worker process?
[S6] Your managed Postgres has one primary in AZ-a. Product demands "no read downtime during an AZ failure and no data loss." Compare multi-AZ standby, read replicas, and multi-region failover, and pick one with its consistency and cost implications.
[S6] Define eventual consistency precisely. In your project, where do you already have eventual consistency whether you wanted it or not (cache, replica lag, queue delivery), and what user-visible behavior could surprise you?
[S6] Design a schema for an audit log that your cloud services write to. Relational or NoSQL? What indexes? How will you keep the write path fast without killing query performance?
[S6] An index lookup hits the buffer pool, then the disk, then returns. Map each of those three stages onto the equivalent in a cloud-managed RDS instance, including where EBS provisioned IOPS come into play.
[S6] Explain CAP with a concrete scenario from your project. Which leg are you sacrificing during an AZ partition, and is that choice documented anywhere outside your head?
[S6] Describe how you would run a schema migration on a production managed database without downtime. Include rollback strategy.
[S6] Compare snapshot-based backups and PITR (point-in-time recovery). For your project, how long would it take to restore to 20 minutes before an incident, and have you tested it?
[S7] Pick one module of your Semester 9 project and describe its bounded context: language, data, and external collaborators. Is the boundary clean or are there leaks into other modules?
[S7] Your api publishes a message shape; your worker consumes it. Is that relationship customer/supplier, conformist, or shared kernel? What ADR would you write to capture the decision, and what compatibility rules follow from it?
[S7] Rewrite one quality attribute for your project ("the api should be fast") as a measurable scenario, including source, stimulus, environment, response, and measure.
[S7] You are introducing a second team that will own the worker. Draw the context map and identify at least one place you would add an anti-corruption layer.
[S7] A senior engineer says "this modular monolith was the right call, but we should be microservices by next year." Write the three bullets you would put into an ADR to capture the current decision and its review triggers.
[S7] Your architecture review packet exists. Walk through how you would use it to onboard a new engineer in half a day, and what sections you would update after this semester.
[S7] Where does your architecture explicitly accept a risk rather than mitigate it? Name two risks and the evidence that would cause you to reopen the decision.
[S8] Define SLI, SLO, SLA, and error budget. For your api service, write one of each precisely, including how you measure it.
[S8] You are at 50% of your monthly error budget on day 10. What policy changes should that trigger for the team, per the SRE framing?
[S8] Design the decomposition for a second service that reads from your database. Will it share the database, take over a schema, or get its own? Justify using S7 boundary thinking and S6 data-ownership reasoning.
[S8] Describe a failure scenario in which your system violates its SLO despite every dashboard being green. How would you detect this, and what would you change to close the gap?
[S8] Write the three bullets you would put in a one-page strategy memo explaining why your team should not adopt a new container runtime this quarter, aimed at a non-engineering stakeholder.
[S8] You are asked to scale your system 10×. For each tier (ingress, compute, cache, database), name what breaks first and one mitigation.
[S8] Walk through how you would lead an architecture review of a peer team's design that you disagree with. What artifacts do you ask for, and what is your one sentence of written feedback?
[S9] Describe the Terraform plan/apply lifecycle in your own words, including state, locking, drift, and why -auto-approve on a production branch is almost always wrong.
[S9] Apply STRIDE to your CI/CD pipeline. Name one threat per category and the existing control, or honestly mark where you have no mitigation.
[S9] Trace one request through your observability stack from log line -> trace span -> metric. What correlation identifier ties them together, and what breaks when a service forgets to propagate it?
[S9] You replace a long-lived AWS access key in CI with OIDC federation. Describe the cloud-side trust relationship and the workflow change, and how you verify the old key is dead.
[S9] A Kubernetes Deployment is stuck in ImagePullBackOff. List five causes from most to least likely and the kubectl command that distinguishes each.
[S9] Define the four DORA metrics and describe a behavior that improves each metric without actually improving delivery. How do you guard against gaming them on your team?
[S9] Your monthly cloud bill doubled. Walk through your diagnosis: where you look first, what tags you rely on, and the three usual suspects (NAT traffic, data egress, idle managed services).
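
For the open-notes verification pass on the [S1] Little's Law question, a small arithmetic check can catch unit mistakes (milliseconds versus seconds, per-minute versus per-second rates). The sketch below is only a calculator for L = λW. The function names and every number in it are hypothetical placeholders, not values from any project, and it deliberately leaves the "which assumption invalidates this" part of the question to you.

```python
def concurrent_requests(arrival_rate_per_s: float, mean_latency_s: float) -> float:
    """Little's Law, L = lambda * W: mean number of requests in flight."""
    return arrival_rate_per_s * mean_latency_s


def max_arrival_rate(concurrency_limit: float, mean_latency_s: float) -> float:
    """Little's Law rearranged, lambda = L / W: arrival rate a given concurrency supports."""
    return concurrency_limit / mean_latency_s


# Hypothetical placeholder numbers; substitute measurements from your own project.
lam = 200.0  # arrival rate in requests per second
w = 0.25     # mean time in system in seconds (250 ms)

print(concurrent_requests(lam, w))  # 50.0 requests in flight on average
print(max_arrival_rate(64, w))      # 256.0 requests per second at 64 in-flight slots
```

Keeping both rate and latency in seconds is the main design choice here; mixing units is the most common way this calculation goes wrong.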

Final Readiness Check

Before treating this review as done:

Count your correct closed-book answers out of 55. Aim for at least 40. A score below 35 means you should repeat the most-missed semester before the checkpoint gate.
For every missed question, create or update a spaced-repetition card and note the semester tag.
Pick the two weakest semesters and schedule a short focused review week before the Semester 10 capstone.
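
If you log misses by semester tag, the checklist above can be tallied with a few lines of Python. This is a minimal sketch under stated assumptions: the function name and the verdict labels are illustrative, "weakest" is taken to mean the highest raw miss count (semesters have different numbers of questions, so a miss-rate ranking could differ), and the label for scores from 35 to 39 is my own, since the checklist only defines the 40 target and the below-35 rule.

```python
from collections import Counter

TOTAL_QUESTIONS = 55
TARGET_CORRECT = 40  # the "aim for" score from the checklist
REPEAT_BELOW = 35    # below this, repeat the most-missed semester


def readiness(missed_tags):
    """Summarize a review attempt.

    missed_tags holds one semester tag (for example "S5") per missed question.
    Returns (correct_count, verdict, two_most_missed_semesters).
    """
    misses = Counter(missed_tags)
    correct = TOTAL_QUESTIONS - sum(misses.values())
    if correct >= TARGET_CORRECT:
        verdict = "on target"
    elif correct >= REPEAT_BELOW:
        verdict = "below target"  # assumed label; the checklist does not name this band
    else:
        verdict = "repeat most-missed semester"
    # Rank semesters by raw miss count and keep the top two.
    weakest = [tag for tag, _ in misses.most_common(2)]
    return correct, verdict, weakest


# Hypothetical attempt: 16 misses spread over four semesters.
example = ["S5"] * 6 + ["S6"] * 4 + ["S2"] * 3 + ["S9"] * 3
print(readiness(example))  # (39, 'below target', ['S5', 'S6'])
```

The two semesters it returns are the candidates for the focused review week; the spaced-repetition cards still have to be written by hand.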

Answer Key

Write your own. This review is most useful when you attempt each question, grade yourself with notes and books open, then record the corrected answer in your own words. A "key" written by someone else short-circuits the retrieval practice this review exists to create. For a reviewed answer to count, it should:

be in your own wording, not a copy of a doc or textbook,
cite the semester and module or concept page it draws from,
name at least one tradeoff or failure mode where applicable.

If you cannot meet those three criteria, the card goes back into the deck.
