Semester 6 Exam

Required Output Classification

Required output | Classification | Public/private guidance
Timed written answers, diagrams, code snippets, and design responses | Checkpoint evidence | Keep raw exam work private so it remains useful for assessment and retake calibration.
Post-exam review notes, missed-answer repairs, and Feynman explanations | Practice artifact | Use for spaced review; publish only rewritten explanations that no longer reveal exam solutions wholesale.
Capstone-defense or architecture-defense packets created from exam prompts | Portfolio candidate | Polish publicly only when they are original to your project, sanitized, and framed as engineering rationale rather than exam answers.

This exam checks whether you can reason across the full data stack: schema, storage, transactions, and distributed tradeoffs. Treat it as closed-book on the first pass.


Rules

  • Suggested duration: 2.5 hours
  • Pass 1: closed book
  • Pass 2: open notes only for self-correction and citation repair
  • Allowed artifacts during grading: your semester notes, local book chunks, schema drafts, and project evidence

Section A: Short Answer and Definitions

Answer in 4-8 sentences each.

  1. Explain why relational modeling is still valuable even in systems that later add caches, queues, or search indexes.
  2. Define index selectivity and explain why it matters to query performance.
  3. Compare replication for availability with partitioning for scale. What different problems are they solving?
  4. Explain one transaction anomaly and one way a database system prevents it.
  5. Why is partial failure the default mental model for distributed systems rather than an edge case?

Section B: Applied Data Modeling and SQL

Use a commerce, ticketing, or learning-platform domain of your choice.

  1. Design a relational schema with at least five tables, appropriate keys, and at least three integrity constraints. Explain two modeling decisions that reduce future bugs.
  2. Write:
    • one join-heavy query
    • one aggregate query
    • one query that benefits from a carefully chosen index

For each query, explain what you expect the engine to do and what would make the query slow.


Section C: Storage Engines and Indexing

  1. You have a workload with heavy writes, periodic compaction pressure, and large range scans for analytics. Compare a B-tree-oriented approach with an LSM-oriented approach and defend your choice.
  2. A team wants to add indexes to fix performance complaints. Describe the process you would use to decide whether the problem is indexing, schema shape, query formulation, or workload mismatch.

Section D: Transactions and Consistency

  1. A booking system occasionally double-allocates seats during peak load. Describe:
    • the likely class of anomaly
    • two possible fixes
    • the cost of each fix
  2. Your product team wants read-your-writes behavior after profile updates, but the system uses replicated reads. Explain the issue and propose a design approach that satisfies the requirement.

Section E: Distributed Systems Reasoning

  1. A service is partitioned across regions. During a network event, each side can still serve some traffic. Explain the tradeoffs between continuing with reduced guarantees and halting writes until coordination is restored.
  2. Describe how time, retries, and duplicate delivery can interact to produce correctness bugs in a distributed workflow.

Section F: Cross-Cutting Engineering

  1. Define a test strategy for a schema migration that touches application code, background jobs, and reporting queries.
  2. List the observability and security controls you would require before trusting a production data service with sensitive records.

Scoring Rubric

Section | Max points
A | 20
B | 20
C | 15
D | 15
E | 15
F | 15

Suggested interpretation:

  • 85-100: strong control of Semester 6 material
  • 70-84: ready to proceed with a focused repair list
  • Below 70: review modules before advancing
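
The rubric arithmetic can be sketched in a few lines of Python. This is only an illustration for self-grading, not an official grading tool: the section maxima and the score bands come from the tables above, while the function names (`total_score`, `interpret`) and the choice to count an unscored section as 0 are assumptions made for the example.

```python
# Maximum points per section, copied from the scoring rubric table.
MAX_POINTS = {"A": 20, "B": 20, "C": 15, "D": 15, "E": 15, "F": 15}


def total_score(scores):
    """Sum section scores, rejecting unknown sections and out-of-range values.

    Assumption: a section missing from `scores` counts as 0 points.
    """
    for section, points in scores.items():
        if section not in MAX_POINTS:
            raise ValueError(f"unknown section: {section}")
        if not 0 <= points <= MAX_POINTS[section]:
            raise ValueError(f"section {section} score out of range: {points}")
    return sum(scores.get(section, 0) for section in MAX_POINTS)


def interpret(total):
    """Map a total out of 100 to the suggested interpretation bands."""
    if total >= 85:
        return "strong control of Semester 6 material"
    if total >= 70:
        return "ready to proceed with a focused repair list"
    return "review modules before advancing"


if __name__ == "__main__":
    # Hypothetical self-graded scores, for illustration only.
    example = {"A": 16, "B": 15, "C": 11, "D": 12, "E": 10, "F": 12}
    total = total_score(example)
    print(total, "->", interpret(total))
```

The section maxima sum to 100, so the total can be read directly against the bands without rescaling.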

What Strong Answers Show

  • mechanism, not slogans
  • explicit tradeoffs, not one-sided claims
  • evidence that storage, correctness, and operations are connected
  • language another engineer could review and trust

Mastery Rubric

Level | Evidence
Beginner pass | Can answer direct questions and complete familiar exercises with light notes.
Solid pass | Can solve new variants, explain choices, and connect the work to Semester 5 Operating Systems and Networking.
Strong pass | Can defend tradeoffs, identify failure modes, and produce clean evidence in the portfolio artifact.
Not ready | Relies on copied solutions, cannot explain mistakes, or lacks durable artifacts.

Retake and Repair Rule

If a section is weak, rereading alone is not enough. Repair it by producing new evidence: a corrected solution, a fresh implementation, a rewritten proof, a benchmark, a diagram, a runbook, or a short teaching note.


Answer-Quality Examples

Use these examples when grading written answers or spoken explanations.

Quality | Example pattern
Weak | Names a concept but gives no example, constraint, or failure case.
Acceptable | Defines the concept and applies it to a familiar exercise.
Strong | Applies the concept to a new variant and explains why an alternative would fail.
Portfolio-ready | Connects the concept to Semester 5 Operating Systems and Networking, current project evidence, and a future capstone decision.

Interleaving Prompt

For any missed answer, add one sentence starting with: "This depends on an earlier skill because..."

Calibration Materials

Use these learner-visible calibration materials before self-grading or requesting review: