Build Your Own Web Browser Engine
"The pipeline of a browser engine — parse → style → layout → paint — is the most elegant architecture in software." — Lin Clark (paraphrased)
A browser engine is one of the most architecturally interesting projects you can build. It is a multi-pass compiler whose output is a 2D image. The pipeline — HTML parser → DOM → CSS parser → style tree → layout tree → display list → paint — is a textbook example of staged transformation.
1. Overview & motivation
A browser engine takes HTML + CSS as input and produces a pixel grid as output, through a series of intermediate representations:
HTML text → [HTML parser] → DOM tree
CSS text → [CSS parser] → stylesheet (rules)
DOM + CSS → [style] → styled tree (each node has computed styles)
Styled tree → [layout] → layout tree (each node has a box: x, y, w, h)
Layout tree → [paint] → display list (rectangles, text, images to draw)
Display list → [render] → pixel grid
What you can only learn by building one:
- Why the DOM is a tree — and why that single design decision shapes everything that follows.
- Why CSS specificity has to be a precisely defined algorithm, not an informal tie-breaker.
- Why block and inline layout are conceptually different and historically combined.
- Why flexbox and grid were necessary upgrades.
- Why browser engines are slow to develop — Servo took years just to be a research prototype.
2. Where this fits in the degree
- Phase: Architecture
- Semester: 7 (Architecture and DDD)
- Modules deepened: Module 1 (architecture fundamentals: quality attributes show up at every layer), Module 2 (modular architecture: the rendering pipeline is a textbook case of independently replaceable stages).
Cross-phase relevance:
- Direct application of the Compiler tutorial — parsing, multi-pass transformation.
- HTML and CSS parsing reuse techniques from the Interpreter tutorial.
3. Prerequisites
- Complete the Interpreter tutorial — parsing is the gateway technology.
- Rust or Python (most BYO-X tutorials use one of these).
- A graphics library: Cairo, Skia, or tiny-skia for actual rendering. For the tutorial, ASCII art is often used to skip painting.
4. Theory & research
Required reading
- Tali Garsiel & Paul Irish, "How Browsers Work" (html5rocks.com/en/tutorials/internals/howbrowserswork) — the definitive overview. ⭐ start here.
- Pavel Panchekha & Chris Harrelson, "Web Browser Engineering" (browser.engineering). Free online book that builds a basic but complete browser in a couple thousand lines of Python. ⭐ recommended primary.
Strongly recommended
- Matt Brubeck, "Let's build a browser engine!" (limpet.net/mbrubeck/2014/08/08/toy-layout-engine-1.html). 8-part Rust series written by a Servo developer (the toy engine is a separate teaching project, not part of Servo). Concise and excellent. ⭐ alternative primary.
- Servo project (servo.org) — modern Rust browser engine. Source is illuminating.
- MDN Web Docs — for HTML and CSS reference.
CSS specifically
- CSS specifications (w3.org/Style/CSS). Modular; pick the relevant ones (selectors, box model, flexbox).
- CSS 2.1, Chapter 8 (box model) and Chapter 9 (visual formatting model). Chapter 9 is the most readable normative text on what "block" and "inline" mean.
5. Curated tutorial list (from BYO-X)
- Rust: Let's build a browser engine — Matt Brubeck, limpet.net/mbrubeck/ ⭐ recommended primary (Rust)
- Python: Browser Engineering — browser.engineering ⭐ recommended primary (Python)
(Two excellent tutorials. Pick the one in the language you prefer.)
6. Recommended primary path
Both are excellent, but they differ in scope as well as language.
Path A: Matt Brubeck's "Let's build a browser engine!" (Rust)
8 blog posts. Builds a toy engine in ~1,000 lines of Rust:
- Getting started.
- HTML parser.
- CSS parser.
- Style.
- Boxes (the box model and the layout tree).
- Block layout.
- Painting.
- Where to go from here.
The output is a PNG of a rendered HTML+CSS document. Just block layout — no inline, no flex, no JavaScript.
Path B: Panchekha & Harrelson's "Web Browser Engineering" (Python)
A free online book. Goes much further than Brubeck:
- HTML parser, CSS parser, layout, paint (same as Brubeck).
- JavaScript execution (uses an embedded JS engine).
- Forms.
- Security (same-origin policy).
- Animations and visual effects.
Substantially larger project; closer to a real browser.
For this degree: Path A first (1 month), Path B if you want depth (3+ months).
7. Implementation milestones (Brubeck's path)
Milestone 1: HTML parser
A minimal HTML parser. Handles tags, text, attributes. Builds a DOM tree.
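```rust
use std::collections::HashMap;

pub struct Node { pub children: Vec<Node>, pub node_type: NodeType }
pub enum NodeType { Element(ElementData), Text(String) }
pub struct ElementData { pub tag_name: String, pub attributes: AttrMap }
pub type AttrMap = HashMap<String, String>;

// The Parser struct and its helper methods (starts_with, consume_char,
// parse_tag_name, parse_attributes, parse_nodes, parse_text) are omitted here.
impl Parser {
    /// Parse a single node: an element if the input starts with '<', otherwise text.
    fn parse_node(&mut self) -> Node {
        if self.starts_with("<") { self.parse_element() }
        else { self.parse_text() }
    }

    /// Parse one element: opening tag, contents, and the matching closing tag.
    fn parse_element(&mut self) -> Node {
        // Opening tag.
        assert_eq!(self.consume_char(), '<');
        let tag_name = self.parse_tag_name();
        let attrs = self.parse_attributes();
        assert_eq!(self.consume_char(), '>');

        // Contents (recursive: children may themselves be elements).
        let children = self.parse_nodes();

        // Closing tag, which must name the same element as the opening tag.
        assert_eq!(self.consume_char(), '<');
        assert_eq!(self.consume_char(), '/');
        assert_eq!(self.parse_tag_name(), tag_name);
        assert_eq!(self.consume_char(), '>');

        Node { children, node_type: NodeType::Element(ElementData { tag_name, attributes: attrs }) }
    }
}
```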
Real HTML is a horror show — implicit closing tags, error recovery, parser modes. The tutorial parser is strict and minimal. State the limitation.
Evidence: Parse <html><body><h1>Hello</h1></body></html> into a tree of three nested elements with a single text leaf.
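The excerpt above relies on helper methods that are not shown. The following is a minimal, self-contained sketch of a strict parser that can pass the evidence check. It is an assumption-laden toy, not Brubeck's code: attributes are not parsed (the map is always empty), tag names are ASCII alphanumerics only, and the `parse` and `depth` functions are hypothetical names introduced for illustration.

```rust
use std::collections::HashMap;

pub type AttrMap = HashMap<String, String>;

#[derive(Debug, PartialEq)]
pub struct ElementData { pub tag_name: String, pub attributes: AttrMap }

#[derive(Debug, PartialEq)]
pub enum NodeType { Element(ElementData), Text(String) }

#[derive(Debug, PartialEq)]
pub struct Node { pub children: Vec<Node>, pub node_type: NodeType }

pub struct Parser { pos: usize, input: String }

impl Parser {
    fn eof(&self) -> bool { self.pos >= self.input.len() }

    fn starts_with(&self, s: &str) -> bool { self.input[self.pos..].starts_with(s) }

    // Peek at the next character without consuming it.
    fn next_char(&self) -> char {
        self.input[self.pos..].chars().next().expect("unexpected end of input")
    }

    // Consume one character, advancing by its UTF-8 length.
    fn consume_char(&mut self) -> char {
        let c = self.next_char();
        self.pos += c.len_utf8();
        c
    }

    // Consume characters for as long as `test` holds.
    fn consume_while(&mut self, test: impl Fn(char) -> bool) -> String {
        let mut out = String::new();
        while !self.eof() && test(self.next_char()) {
            out.push(self.consume_char());
        }
        out
    }

    fn parse_tag_name(&mut self) -> String {
        self.consume_while(|c| c.is_ascii_alphanumeric())
    }

    // Parse siblings until end of input or a closing tag.
    fn parse_nodes(&mut self) -> Vec<Node> {
        let mut nodes = Vec::new();
        while !self.eof() && !self.starts_with("</") {
            nodes.push(self.parse_node());
        }
        nodes
    }

    fn parse_node(&mut self) -> Node {
        if self.starts_with("<") { self.parse_element() } else { self.parse_text() }
    }

    fn parse_text(&mut self) -> Node {
        let text = self.consume_while(|c| c != '<');
        Node { children: Vec::new(), node_type: NodeType::Text(text) }
    }

    fn parse_element(&mut self) -> Node {
        assert_eq!(self.consume_char(), '<');
        let tag_name = self.parse_tag_name();
        assert_eq!(self.consume_char(), '>'); // attributes are not supported in this sketch
        let children = self.parse_nodes();
        assert_eq!(self.consume_char(), '<');
        assert_eq!(self.consume_char(), '/');
        assert_eq!(self.parse_tag_name(), tag_name);
        assert_eq!(self.consume_char(), '>');
        Node { children, node_type: NodeType::Element(ElementData { tag_name, attributes: AttrMap::new() }) }
    }
}

/// Parse a document that has exactly one root element.
pub fn parse(source: &str) -> Node {
    let mut nodes = Parser { pos: 0, input: source.to_string() }.parse_nodes();
    assert_eq!(nodes.len(), 1, "expected a single root element");
    nodes.remove(0)
}

/// Depth in nodes, counting text leaves.
pub fn depth(node: &Node) -> usize {
    1 + node.children.iter().map(depth).max().unwrap_or(0)
}
```

Counting the text leaf, the evidence document has a depth of four nodes: html, body, h1, and the text "Hello". Any malformed input panics, which is acceptable for a strict toy parser as long as the limitation is stated.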
Milestone 2: CSS parser
selector { property: value; property: value; }
Parse selectors (tag, class, id), properties, values. No specificity yet.
```rust
pub struct Stylesheet { pub rules: Vec<Rule> }
pub struct Rule { pub selectors: Vec<Selector>, pub declarations: Vec<Declaration> }
pub struct Declaration { pub name: String, pub value: Value }
pub enum Value { Keyword(String), Length(f32, Unit), ColorValue(Color) }
// Selector, Unit, and Color are not shown here.
```
Evidence: Parse h1 { color: red; font-size: 20px; } correctly.
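To make the value side of the parser concrete, here is a small sketch that parses the declarations between the braces. It is a simplification that splits on `;` and `:` rather than scanning character by character as the tutorial's parser does. The `Unit` and `Color` definitions, and the function names `parse_value` and `parse_declarations`, are assumptions for illustration. Only `px` lengths and `#rrggbb` colors are recognized; a named color such as `red` stays a keyword.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Unit { Px }

#[derive(Debug, PartialEq, Clone, Copy)]
pub struct Color { pub r: u8, pub g: u8, pub b: u8, pub a: u8 }

#[derive(Debug, PartialEq, Clone)]
pub enum Value { Keyword(String), Length(f32, Unit), ColorValue(Color) }

#[derive(Debug, PartialEq, Clone)]
pub struct Declaration { pub name: String, pub value: Value }

/// Parse one value token: `20px`, `#rrggbb`, or a bare keyword.
pub fn parse_value(token: &str) -> Value {
    let token = token.trim();
    // Length in pixels, e.g. "20px".
    if let Some(number) = token.strip_suffix("px") {
        if let Ok(n) = number.parse::<f32>() {
            return Value::Length(n, Unit::Px);
        }
    }
    // Six-digit hex color, e.g. "#ff8000".
    if let Some(hex) = token.strip_prefix('#') {
        if hex.len() == 6 && hex.chars().all(|c| c.is_ascii_hexdigit()) {
            let byte = |i: usize| u8::from_str_radix(&hex[i..i + 2], 16).unwrap();
            return Value::ColorValue(Color { r: byte(0), g: byte(2), b: byte(4), a: 255 });
        }
    }
    // Anything else is kept as a keyword.
    Value::Keyword(token.to_string())
}

/// Parse the text between `{` and `}` into declarations.
/// Fragments without a colon (such as trailing whitespace) are skipped.
pub fn parse_declarations(body: &str) -> Vec<Declaration> {
    body.split(';')
        .filter_map(|decl| {
            let (name, value) = decl.split_once(':')?;
            Some(Declaration { name: name.trim().to_string(), value: parse_value(value) })
        })
        .collect()
}
```

Splitting on delimiters breaks as soon as values can contain `;` or `:` (URLs, strings), so treat this as a stepping stone toward a real tokenizer.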
Milestone 3: Style tree
For each DOM node, find the matching CSS rules, sort by specificity, compute the styled node.
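```rust
use std::collections::HashMap;

// Map from CSS property name to its specified value.
pub type PropertyMap = HashMap<String, Value>;

pub struct StyledNode<'a> {
    pub node: &'a Node,                 // the DOM node being styled
    pub specified_values: PropertyMap,
    pub children: Vec<StyledNode<'a>>,
}
```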
Evidence: Each styled node has the right specified values. Specificity disputes resolve correctly.
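The match, sort, and apply steps can be sketched as follows. This is a simplified illustration under stated assumptions: selectors are simple (tag, id, classes, no combinators), property values are plain strings rather than the `Value` enum, and `Element`, `Rule`, and the function names are hypothetical. Specificity is modeled as an (ids, classes, tags) tuple, which Rust compares lexicographically.

```rust
use std::collections::HashMap;

/// A simple selector such as `h1`, `.note`, or `div#main.note`.
#[derive(Debug, Default, Clone)]
pub struct SimpleSelector {
    pub tag_name: Option<String>,
    pub id: Option<String>,
    pub class: Vec<String>,
}

/// (id count, class count, tag count); tuples compare left to right.
pub type Specificity = (usize, usize, usize);

pub fn specificity(sel: &SimpleSelector) -> Specificity {
    (usize::from(sel.id.is_some()), sel.class.len(), usize::from(sel.tag_name.is_some()))
}

/// The parts of an element that selectors can test (hypothetical helper type).
pub struct Element<'a> {
    pub tag_name: &'a str,
    pub id: Option<&'a str>,
    pub classes: Vec<&'a str>,
}

pub struct Rule {
    pub selector: SimpleSelector,
    pub declarations: Vec<(String, String)>,
}

/// Every component present in the selector must match the element.
pub fn matches(elem: &Element, sel: &SimpleSelector) -> bool {
    if sel.tag_name.as_deref().map_or(false, |t| t != elem.tag_name) {
        return false;
    }
    if sel.id.as_deref().map_or(false, |id| Some(id) != elem.id) {
        return false;
    }
    sel.class.iter().all(|c| elem.classes.iter().any(|ec| *ec == c.as_str()))
}

/// Apply matching rules from lowest to highest specificity, so that
/// declarations from stronger rules overwrite weaker ones. The sort is
/// stable, so among equal specificities the later rule wins.
pub fn specified_values(elem: &Element, rules: &[Rule]) -> HashMap<String, String> {
    let mut matched: Vec<&Rule> = rules.iter().filter(|rule| matches(elem, &rule.selector)).collect();
    matched.sort_by_key(|rule| specificity(&rule.selector));
    let mut values = HashMap::new();
    for rule in matched {
        for (name, value) in &rule.declarations {
            values.insert(name.clone(), value.clone());
        }
    }
    values
}
```

For an `<h1 id="title" class="big">`, a `#title` rule beats a `.big` rule, which beats an `h1` rule, regardless of the order in which they appear in the stylesheet.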
Milestone 4: Box generation and block layout
Each styled node becomes a Box. Each Box has a BoxType (block, inline, anonymous). Compute width, height, x, y.
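CSS box model: each box has content, padding, border, margin. Horizontally, for a block-level box in normal flow, content width + left and right padding + left and right border widths + left and right margins must add up to the containing block's width. The left and right values may differ, so the sum is not simply each value doubled.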
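```rust
pub struct LayoutBox<'a> {
    pub dimensions: Dimensions,
    pub box_type: BoxType<'a>,
    pub children: Vec<LayoutBox<'a>>,
}

pub enum BoxType<'a> { BlockNode(&'a StyledNode<'a>), InlineNode(&'a StyledNode<'a>), AnonymousBlock }

pub struct Dimensions {
    pub content: Rect,       // position and size of the content area
    pub padding: EdgeSizes,  // surrounding edges, innermost first
    pub border: EdgeSizes,
    pub margin: EdgeSizes,
}

pub struct Rect { pub x: f32, pub y: f32, pub width: f32, pub height: f32 }
pub struct EdgeSizes { pub left: f32, pub right: f32, pub top: f32, pub bottom: f32 }
```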
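The nesting of content, padding, border, and margin areas is easy to get wrong by a sign. A small sketch of helpers that grow the content rectangle outward, one layer at a time, is shown here. The field layouts of `Rect` and `EdgeSizes` and the helper names are assumptions chosen for illustration.

```rust
#[derive(Debug, Clone, Copy, Default, PartialEq)]
pub struct Rect { pub x: f32, pub y: f32, pub width: f32, pub height: f32 }

#[derive(Debug, Clone, Copy, Default, PartialEq)]
pub struct EdgeSizes { pub left: f32, pub right: f32, pub top: f32, pub bottom: f32 }

#[derive(Debug, Clone, Copy, Default, PartialEq)]
pub struct Dimensions {
    pub content: Rect,
    pub padding: EdgeSizes,
    pub border: EdgeSizes,
    pub margin: EdgeSizes,
}

impl Rect {
    /// Grow the rectangle outward by the given edge sizes.
    pub fn expanded_by(self, edge: EdgeSizes) -> Rect {
        Rect {
            x: self.x - edge.left,
            y: self.y - edge.top,
            width: self.width + edge.left + edge.right,
            height: self.height + edge.top + edge.bottom,
        }
    }
}

impl Dimensions {
    /// Content area plus padding.
    pub fn padding_box(self) -> Rect { self.content.expanded_by(self.padding) }
    /// Content area plus padding and border.
    pub fn border_box(self) -> Rect { self.padding_box().expanded_by(self.border) }
    /// Content area plus padding, border, and margin.
    pub fn margin_box(self) -> Rect { self.border_box().expanded_by(self.margin) }
}
```

With a 100 by 50 content area at (30, 30), padding 10, border 5, and margin 15 on every side, the margin box starts at (0, 0) and measures 160 by 110. Backgrounds are painted over the border box, while sibling stacking uses the margin box.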
The block-layout algorithm is straightforward but full of subtle width-resolution rules. Read CSS 2.1 §10, especially §10.3.3 (block-level, non-replaced elements in normal flow).
Evidence: A document with nested blocks gets correct box sizes and positions.
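The width-resolution rules of CSS 2.1 §10.3.3 can be sketched as one function. This is a simplified illustration: `None` stands for the CSS value `auto`, padding and border widths are passed pre-summed, and the type and function names are hypothetical.

```rust
/// Horizontal inputs for one block; `None` means the CSS value `auto`.
#[derive(Debug, Clone, Copy)]
pub struct HorizontalInputs {
    pub width: Option<f32>,
    pub margin_left: Option<f32>,
    pub margin_right: Option<f32>,
    /// padding-left + padding-right + border-left-width + border-right-width
    pub padding_and_border: f32,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct HorizontalUsed { pub width: f32, pub margin_left: f32, pub margin_right: f32 }

pub fn resolve_block_width(containing_width: f32, input: HorizontalInputs) -> HorizontalUsed {
    let width = input.width;
    let (mut ml, mut mr) = (input.margin_left, input.margin_right);

    // Sum of everything that is not `auto`.
    let total = width.unwrap_or(0.0) + ml.unwrap_or(0.0) + mr.unwrap_or(0.0) + input.padding_and_border;

    // A fixed width that already overflows turns auto margins into zero.
    if width.is_some() && total > containing_width {
        ml = ml.or(Some(0.0));
        mr = mr.or(Some(0.0));
    }

    // Space left over (negative when the box overflows).
    let underflow = containing_width - total;

    match (width, ml, mr) {
        // Over-constrained: margin-right absorbs the difference.
        (Some(w), Some(l), Some(r)) => HorizontalUsed { width: w, margin_left: l, margin_right: r + underflow },
        // Exactly one auto margin takes the leftover space.
        (Some(w), Some(l), None) => HorizontalUsed { width: w, margin_left: l, margin_right: underflow },
        (Some(w), None, Some(r)) => HorizontalUsed { width: w, margin_left: underflow, margin_right: r },
        // Both margins auto: the box is centered.
        (Some(w), None, None) => HorizontalUsed { width: w, margin_left: underflow / 2.0, margin_right: underflow / 2.0 },
        // Width auto: auto margins become zero and the width fills the rest.
        (None, l, r) => {
            let (l, r) = (l.unwrap_or(0.0), r.unwrap_or(0.0));
            if underflow >= 0.0 {
                HorizontalUsed { width: underflow, margin_left: l, margin_right: r }
            } else {
                // Width cannot be negative; margin-right absorbs the overflow.
                HorizontalUsed { width: 0.0, margin_left: l, margin_right: r + underflow }
            }
        }
    }
}
```

In an 800px container, `width: auto` with 20px of padding and border resolves to 780px, and `width: 400px` with both margins `auto` resolves to 200px margins on each side, which is how `margin: 0 auto` centers a block.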
Milestone 5: Paint
Convert the layout tree into a flat list of paint commands. Then rasterize to a pixel buffer.
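```rust
pub enum DisplayCommand {
    SolidColor(Color, Rect),
}

pub struct Canvas { pub pixels: Vec<Color>, pub width: usize, pub height: usize }

impl Canvas {
    /// Fill the rectangle of a display command, clipped to the canvas.
    /// Assumes `Color: Copy` and that the fields of `Rect` are `f32`.
    pub fn paint(&mut self, item: &DisplayCommand) {
        match item {
            DisplayCommand::SolidColor(color, rect) => {
                // Clamp to the canvas so off-screen boxes cannot index out of bounds.
                let x0 = rect.x.clamp(0.0, self.width as f32) as usize;
                let y0 = rect.y.clamp(0.0, self.height as f32) as usize;
                let x1 = (rect.x + rect.width).clamp(0.0, self.width as f32) as usize;
                let y1 = (rect.y + rect.height).clamp(0.0, self.height as f32) as usize;
                for y in y0..y1 {
                    for x in x0..x1 {
                        self.pixels[y * self.width + x] = *color;
                    }
                }
            }
        }
    }
}
```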
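Producing the flat list from the layout tree is a pre-order walk: a box's own background is emitted before its children, so children paint on top. The sketch below uses a deliberately reduced, hypothetical node type (`LaidOutBox` with a precomputed border box and an optional background color) rather than the full `LayoutBox`.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Color { pub r: u8, pub g: u8, pub b: u8, pub a: u8 }

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Rect { pub x: f32, pub y: f32, pub width: f32, pub height: f32 }

#[derive(Debug, PartialEq)]
pub enum DisplayCommand { SolidColor(Color, Rect) }

/// Hypothetical, simplified layout node for this sketch.
pub struct LaidOutBox {
    pub border_box: Rect,
    pub background: Option<Color>,
    pub children: Vec<LaidOutBox>,
}

pub fn build_display_list(root: &LaidOutBox) -> Vec<DisplayCommand> {
    let mut list = Vec::new();
    render_box(&mut list, root);
    list
}

fn render_box(list: &mut Vec<DisplayCommand>, node: &LaidOutBox) {
    // Emit this box's background first; boxes without one emit nothing.
    if let Some(color) = node.background {
        list.push(DisplayCommand::SolidColor(color, node.border_box));
    }
    // Children come later in the list, so they are painted on top.
    for child in &node.children {
        render_box(list, child);
    }
}
```

Keeping the display list separate from rasterization means the same list can later be replayed into a different backend without touching layout.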
Brubeck's tutorial outputs a PNG.
Evidence: Open a test HTML+CSS file with nested colored boxes. Output PNG matches a hand-drawn reference.
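If you want an image file before adding a PNG dependency, one option is to swap in plain-text PPM (the `P3` format), which needs only the standard library and opens in many image viewers. The sketch below is that substitution, not what Brubeck's tutorial does; `Canvas::new` and `to_ppm` are hypothetical names, and alpha is ignored.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Color { pub r: u8, pub g: u8, pub b: u8, pub a: u8 }

pub struct Canvas { pub pixels: Vec<Color>, pub width: usize, pub height: usize }

impl Canvas {
    /// Create a canvas filled with opaque white.
    pub fn new(width: usize, height: usize) -> Canvas {
        let white = Color { r: 255, g: 255, b: 255, a: 255 };
        Canvas { pixels: vec![white; width * height], width, height }
    }

    /// Serialize as plain-text PPM: a header, then one text line per pixel row.
    pub fn to_ppm(&self) -> String {
        let mut out = format!("P3\n{} {}\n255\n", self.width, self.height);
        // `max(1)` avoids a panic in `chunks` for a zero-width canvas.
        for row in self.pixels.chunks(self.width.max(1)) {
            let line: Vec<String> = row.iter().map(|c| format!("{} {} {}", c.r, c.g, c.b)).collect();
            out.push_str(&line.join(" "));
            out.push('\n');
        }
        out
    }
}
```

Writing the result is then a single call such as `std::fs::write("out.ppm", canvas.to_ppm())`. Because the format is text, snapshot tests can compare it with an ordinary string equality check.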
Milestone 6: Inline layout (text!)
Brubeck doesn't cover this. Adding inline layout is the single biggest leap.
You need:
- Text breaking (where can lines wrap?).
- Font metrics (height, baseline, kerning).
- Line boxes containing one or more inline boxes.
Once you have text, you have a real toy browser.
Evidence: A paragraph of text wraps correctly at the document width.
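A first version of line breaking can be sketched with a greedy algorithm. The sketch assumes a monospace font where every character is `char_width` wide and that lines break only at whitespace; real inline layout needs per-glyph metrics from a font library. The function name `wrap_text` is hypothetical.

```rust
/// Greedy line breaking: keep adding words to the current line until the
/// next word would exceed `max_width`, then start a new line.
/// A single word wider than `max_width` is placed on its own line and overflows.
pub fn wrap_text(text: &str, max_width: f32, char_width: f32) -> Vec<String> {
    let mut lines = Vec::new();
    let mut current = String::new();

    for word in text.split_whitespace() {
        // Length in characters if `word` were appended (plus one space if needed).
        let candidate_len = if current.is_empty() {
            word.chars().count()
        } else {
            current.chars().count() + 1 + word.chars().count()
        };

        // Too wide: finish the current line and start a fresh one.
        if !current.is_empty() && candidate_len as f32 * char_width > max_width {
            lines.push(std::mem::take(&mut current));
        }

        if !current.is_empty() {
            current.push(' ');
        }
        current.push_str(word);
    }

    if !current.is_empty() {
        lines.push(current);
    }
    lines
}
```

Each returned string corresponds to one line box; its vertical position is the line index multiplied by the line height.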
Milestone 7 (browser.engineering path): JavaScript
Embed a JavaScript engine (V8 or QuickJS; from Python, DukPy, which browser.engineering uses, or js2py). Wire DOM access (document.getElementById, element.style).
This is when "browser engine" becomes "browser." Substantial.
Milestone 8 (optional): Networking
HTTP fetcher. Now your browser can open("http://example.com").
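A minimal fetcher can be sketched with only the standard library. The assumptions are significant: plain `http://` only (no TLS), HTTP/1.0 so the server closes the connection and does not use chunked encoding, no redirects, and a UTF-8 response. All function names (`split_url`, `build_request`, `body_of`, `fetch`) are hypothetical.

```rust
use std::io::{Read, Write};
use std::net::TcpStream;

/// Split `http://host[:port]/path` into (host, port, path).
/// Returns None for any other scheme or an empty host.
pub fn split_url(url: &str) -> Option<(String, u16, String)> {
    let rest = url.strip_prefix("http://")?;
    let (authority, path) = match rest.find('/') {
        Some(i) => (&rest[..i], &rest[i..]),
        None => (rest, "/"),
    };
    let (host, port) = match authority.split_once(':') {
        Some((h, p)) => (h, p.parse::<u16>().ok()?),
        None => (authority, 80),
    };
    if host.is_empty() {
        return None;
    }
    Some((host.to_string(), port, path.to_string()))
}

/// Build a minimal GET request.
pub fn build_request(host: &str, path: &str) -> String {
    format!("GET {} HTTP/1.0\r\nHost: {}\r\nConnection: close\r\n\r\n", path, host)
}

/// Everything after the first blank line is the body.
pub fn body_of(response: &str) -> &str {
    response.split_once("\r\n\r\n").map_or("", |(_, body)| body)
}

/// Fetch a URL and return the response body as text.
pub fn fetch(url: &str) -> std::io::Result<String> {
    let (host, port, path) = split_url(url).ok_or_else(|| {
        std::io::Error::new(std::io::ErrorKind::InvalidInput, "unsupported URL")
    })?;
    let mut stream = TcpStream::connect((host.as_str(), port))?;
    stream.write_all(build_request(&host, &path).as_bytes())?;
    let mut response = String::new();
    stream.read_to_string(&mut response)?; // reads until the server closes the connection
    Ok(body_of(&response).to_string())
}
```

The body returned by `fetch` is what you would hand to the HTML parser. Status codes and headers are ignored here, which is the first thing to fix once this works.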
8. Tests & evidence
| Test | How |
|---|---|
| HTML parser | Round-trip simple documents |
| CSS parser | Parse all selectors and values from a small stylesheet |
| Specificity | #id .class rule wins over .class rule |
| Box dimensions | Content width + horizontal padding, borders, and margins equals the containing block width |
| Layout positions | Sibling blocks stack vertically; nested blocks offset by parent |
| Paint | Output PNG matches a saved reference image |
| End-to-end | Render a tiny static page (your own test case, not a real website; real pages need far more spec coverage) |
The strongest evidence: a side-by-side comparison of your output and Firefox/Chrome for the same simple test HTML. Document the differences honestly.
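For the round-trip test in the table, the missing half is a serializer that turns a DOM tree back into text. A minimal sketch follows, assuming a reduced node type and a `BTreeMap` for attributes so that the output order is deterministic (a `HashMap` would make the test flaky). Text is not escaped, and the function name `to_html` is hypothetical.

```rust
use std::collections::BTreeMap;

pub enum NodeType {
    Element { tag_name: String, attributes: BTreeMap<String, String> },
    Text(String),
}

pub struct Node { pub children: Vec<Node>, pub node_type: NodeType }

/// Serialize a node and its subtree back to HTML text.
pub fn to_html(node: &Node) -> String {
    match &node.node_type {
        NodeType::Text(text) => text.clone(),
        NodeType::Element { tag_name, attributes } => {
            // Attributes in sorted key order, each preceded by a space.
            let attrs: String = attributes
                .iter()
                .map(|(k, v)| format!(" {}=\"{}\"", k, v))
                .collect();
            // Children are serialized recursively and concatenated.
            let children: String = node.children.iter().map(to_html).collect();
            format!("<{0}{1}>{2}</{0}>", tag_name, attrs, children)
        }
    }
}
```

The round-trip check is then that serializing the output of your Milestone 1 parser reproduces the input, for documents written in the canonical form the serializer emits.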
9. Common pitfalls
- Trying to parse real HTML. Real HTML uses implicit closing tags, error recovery, parser modes. A toy strict parser is the right scope. State the limitation.
- Specificity off-by-one. CSS specificity rules are well-defined (W3C). Hand-trace examples.
- Forgetting display: none. Skip these elements in layout.
- Inline before you understand it. Block layout is hard enough. Skip inline on your first pass.
- Trying to render fonts. Use a library (cairo, skia, fontdue). Don't write a font rasterizer for your first browser.
- No idea where to stop. Brubeck stops at block layout + paint. browser.engineering goes much further. Decide your scope before starting.
- Treating the engine as one big file. Each pipeline stage is a clear module. Keep them separate.
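The display: none pitfall is cheapest to handle while building the layout tree: return nothing for such a node, and its whole subtree disappears with it. A minimal sketch, assuming string-valued properties, a hypothetical `name` field for readability, and no anonymous-box wrapping of inline children:

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Display { Block, Inline, None }

pub struct StyledNode {
    pub name: String, // hypothetical label, only for this demo
    pub specified_values: HashMap<String, String>,
    pub children: Vec<StyledNode>,
}

impl StyledNode {
    /// Convenience constructor that sets only the `display` property.
    pub fn new(name: &str, display: &str, children: Vec<StyledNode>) -> StyledNode {
        let mut specified_values = HashMap::new();
        specified_values.insert("display".to_string(), display.to_string());
        StyledNode { name: name.to_string(), specified_values, children }
    }

    /// Read the `display` property; anything unrecognized falls back to
    /// inline, which is the CSS initial value.
    pub fn display(&self) -> Display {
        match self.specified_values.get("display").map(String::as_str) {
            Some("block") => Display::Block,
            Some("none") => Display::None,
            _ => Display::Inline,
        }
    }
}

pub struct LayoutBox {
    pub name: String,
    pub display: Display,
    pub children: Vec<LayoutBox>,
}

/// Build the layout tree, returning None for `display: none` so that the
/// node and all of its descendants generate no boxes.
pub fn build_layout_tree(node: &StyledNode) -> Option<LayoutBox> {
    let display = node.display();
    if display == Display::None {
        return None;
    }
    Some(LayoutBox {
        name: node.name.clone(),
        display,
        children: node.children.iter().filter_map(build_layout_tree).collect(),
    })
}
```

Because the check happens before recursion, a visible child inside a hidden parent is never visited, which matches how browsers treat display: none.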
10. Extensions
- Inline layout — text wrapping. Huge step.
- JavaScript — embed an engine. Wire to DOM.
- Forms and events — click handlers, input fields.
- Flexbox / grid — modern CSS layout. Significant.
- Networking — fetch resources over HTTP.
- Incremental rendering — restyle and re-layout only changed subtrees.
- GPU acceleration — composite layers; offload to GPU.
A full browser engine is a multi-year project. Production engines such as Gecko (Firefox) and Blink (Chrome) are maintained by large, long-standing teams. Know where to stop.
11. Module integration
| Module | What the browser engine deepens |
|---|---|
| Sem 7 Module 1 — Architecture fundamentals | The pipeline embodies "single responsibility per stage." |
| Sem 7 Module 2 — Architecture patterns | Multi-pass compiler is the canonical layered architecture. |
| Compiler tutorial | Same multi-pass shape. HTML → DOM → styled → laid out → painted is the same pattern as source → AST → typed → IR → assembly. |
| Interpreter tutorial | Parsing HTML and CSS uses the same techniques. |
12. Portfolio framing
What to publish:
- Source organized by pipeline stage: html/, css/, style/, layout/, paint/.
- A set of test HTML files with their rendered PNGs (yours and a reference).
- A README with:
  - Pipeline diagram.
  - Supported HTML/CSS features (small list).
  - Skipped features (long list).
  - A comparison rendering against Firefox/Chrome for the same input.
Reviewer entry points:
- html/parser.rs: DOM construction.
- layout/block.rs: block layout algorithm.
- tests/snapshots/: input HTML + expected PNG.
- README must include the pipeline diagram and the side-by-side rendering.
A browser engine is an unusually striking portfolio project. Even a partial one — Brubeck's 8-post version — demonstrates serious architectural thinking and parsing fluency.
Source
This tutorial draws from the BYO-X catalog "Web Browser" section. Matt Brubeck's blog series and Panchekha & Harrelson's "Web Browser Engineering" book are the canonical primary references.