Build Your Own Web Browser Engine

"The pipeline of a browser engine — parse → style → layout → paint — is the most elegant architecture in software." — Lin Clark (paraphrased)

A browser engine is one of the most architecturally interesting projects you can build. It is a multi-pass compiler whose output is a 2D image. The pipeline — HTML parser → DOM → CSS parser → style tree → layout tree → display list → paint — is a textbook example of staged transformation.

1. Overview & motivation

A browser engine takes HTML + CSS as input and produces a pixel grid as output, through a series of intermediate representations:

HTML text  → [HTML parser] → DOM tree
CSS text   → [CSS parser]  → stylesheet (rules)
DOM + CSS  → [style]       → styled tree (each node has computed styles)
Styled tree → [layout]     → layout tree (each node has a box: x, y, w, h)
Layout tree → [paint]      → display list (rectangles, text, images to draw)
Display list → [render]    → pixel grid

What you can only learn by building one:

Why the DOM is a tree — and why that single design decision shapes everything that follows.
Why CSS specificity is a precisely defined algorithm rather than an ad hoc tie-breaker.
Why block and inline layout are conceptually different and historically combined.
Why flexbox and grid were necessary upgrades.
Why browser engines are slow to develop: Servo spent years as a research prototype.

2. Where this fits in the degree

Phase: Architecture
Semester: 7 (Architecture and DDD)
Modules deepened: Module 1 (architecture fundamentals: quality attributes show up at every layer), Module 2 (modular architecture: the pipeline is a canonical example of staged, modular decomposition).

Cross-phase relevance:

Direct application of the Compiler tutorial — parsing, multi-pass transformation.
HTML and CSS parsing reuse techniques from the Interpreter tutorial.

3. Prerequisites

Complete the Interpreter tutorial — parsing is the gateway technology.
Rust or Python (the two recommended tutorials use these).
A graphics library: Cairo, Skia, or tiny-skia for actual rendering. For a first pass you can print the layout tree or draw ASCII art and defer real painting.

4. Theory & research

Required reading

Tali Garsiel & Paul Irish, "How Browsers Work" (html5rocks.com/en/tutorials/internals/howbrowserswork) — the definitive overview. ⭐ start here.
Pavel Panchekha & Chris Harrelson, "Web Browser Engineering" (browser.engineering): free online book. Builds a basic but complete browser in a couple thousand lines of Python. ⭐ recommended primary.

Strongly recommended

Matt Brubeck, "Let's build a browser engine!" (limpet.net/mbrubeck/2014/08/08/toy-layout-engine-1.html): 7-part Rust series that builds a toy engine called robinson. The toy engine is a separate teaching project, not a predecessor of Servo. Concise and excellent. ⭐ alternative primary.
Servo project (servo.org) — modern Rust browser engine. Source is illuminating.
MDN Web Docs — for HTML and CSS reference.

CSS specifically

CSS specifications (w3.org/Style/CSS). Modular; pick the relevant ones (selectors, box model, flexbox).
CSS 2.1 for boxes: Chapter 8 defines the box model; Chapter 9 (visual formatting model) is the most readable normative text on what "block" and "inline" mean.

5. Curated tutorial list (from BYO-X)

Rust: Let's build a browser engine — Matt Brubeck, limpet.net/mbrubeck/ ⭐ recommended primary (Rust)
Python: Browser Engineering — browser.engineering ⭐ recommended primary (Python)

(Two excellent tutorials. Pick the one in the language you prefer.)

6. Recommended primary path

Both paths are excellent. They differ in language and in scope.

Path A: Matt Brubeck's "Let's build a browser engine!" (Rust)

7 blog posts. Builds a toy engine in ~1,000 lines of Rust:

Getting started.
HTML parser.
CSS parser.
Style.
Boxes (building the layout tree).
Block layout.
Painting.

The output is a PNG of a rendered HTML+CSS document. Just block layout — no inline, no flex, no JavaScript.

Path B: Panchekha & Harrelson's "Web Browser Engineering" (Python)

A free online book. Goes much further than Brubeck:

HTML parser, CSS parser, layout, paint (same as Brubeck).
JavaScript execution (uses an embedded JS engine).
Forms.
Security (same-origin policy).
Animations and visual effects.

Substantially larger project; closer to a real browser.

For this degree: Path A first (1 month), Path B if you want depth (3+ months).

7. Implementation milestones (Brubeck's path)

Milestone 1: HTML parser

A minimal HTML parser. Handles tags, text, attributes. Builds a DOM tree.

use std::collections::HashMap;

// A DOM node is either an element (tag name plus attributes) or a run of text.
pub struct Node { pub children: Vec<Node>, pub node_type: NodeType }
pub enum NodeType { Element(ElementData), Text(String) }
pub struct ElementData { pub tag_name: String, pub attributes: AttrMap }
pub type AttrMap = HashMap<String, String>;

impl Parser {
    // Dispatch on the next character: '<' opens an element, anything else is text.
    fn parse_node(&mut self) -> Node {
        if self.starts_with("<") { self.parse_element() }
        else { self.parse_text() }
    }
    fn parse_element(&mut self) -> Node {
        // Opening tag.
        assert!(self.consume_char() == '<');
        let tag_name = self.parse_tag_name();
        let attrs = self.parse_attributes();
        assert!(self.consume_char() == '>');
        // Contents: child nodes are parsed recursively until the closing tag.
        let children = self.parse_nodes();
        // Closing tag; it must name the same element that was opened.
        assert!(self.consume_char() == '<');
        assert!(self.consume_char() == '/');
        assert!(self.parse_tag_name() == tag_name);
        assert!(self.consume_char() == '>');
        Node { children, node_type: NodeType::Element(ElementData { tag_name, attributes: attrs }) }
    }
}

Real HTML is a horror show — implicit closing tags, error recovery, parser modes. The tutorial parser is strict and minimal. State the limitation.

Evidence: Parse <html><body><h1>Hello</h1></body></html> into a tree of three nested elements (html, body, h1) with a single text node as the leaf.
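The excerpt above leaves out its helper methods, so here is a minimal, self-contained sketch of the same recursive-descent idea. It is not Brubeck's code: the names `Parser`, `expect`, `take_while`, `parse`, and `depth` are hypothetical, and it returns `Result` instead of asserting so malformed input is reported rather than causing a panic. It assumes strict input: every element is explicitly closed, attribute values are double-quoted, and there are no comments, entities, or void elements.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
pub enum Node {
    Element { tag: String, attrs: HashMap<String, String>, children: Vec<Node> },
    Text(String),
}

pub struct Parser { input: Vec<char>, pos: usize }

impl Parser {
    pub fn new(src: &str) -> Self {
        Parser { input: src.chars().collect(), pos: 0 }
    }
    fn peek(&self) -> Option<char> { self.input.get(self.pos).copied() }
    fn starts_with(&self, s: &str) -> bool {
        s.chars().enumerate().all(|(i, c)| self.input.get(self.pos + i) == Some(&c))
    }
    // Consume one expected character or report where parsing went wrong.
    fn expect(&mut self, c: char) -> Result<(), String> {
        if self.peek() == Some(c) { self.pos += 1; Ok(()) }
        else { Err(format!("expected '{}' at offset {}", c, self.pos)) }
    }
    // Consume characters while `keep` holds and return them.
    fn take_while(&mut self, keep: impl Fn(char) -> bool) -> String {
        let start = self.pos;
        while self.peek().map_or(false, |c| keep(c)) { self.pos += 1; }
        self.input[start..self.pos].iter().collect()
    }
    // Parse siblings until end of input or the enclosing element's closing tag.
    fn parse_nodes(&mut self) -> Result<Vec<Node>, String> {
        let mut nodes = Vec::new();
        while self.peek().is_some() && !self.starts_with("</") {
            nodes.push(self.parse_node()?);
        }
        Ok(nodes)
    }
    fn parse_node(&mut self) -> Result<Node, String> {
        if self.peek() == Some('<') { self.parse_element() }
        else { Ok(Node::Text(self.take_while(|c| c != '<'))) }
    }
    fn parse_element(&mut self) -> Result<Node, String> {
        self.expect('<')?;
        let tag = self.take_while(|c| c.is_ascii_alphanumeric());
        let mut attrs = HashMap::new();
        loop {
            self.take_while(char::is_whitespace);
            if self.peek() == Some('>') || self.peek().is_none() { break; }
            let name = self.take_while(|c| c.is_ascii_alphanumeric() || c == '-');
            self.expect('=')?;
            self.expect('"')?;
            let value = self.take_while(|c| c != '"');
            self.expect('"')?;
            attrs.insert(name, value);
        }
        self.expect('>')?;
        let children = self.parse_nodes()?; // recursion happens here
        self.expect('<')?;
        self.expect('/')?;
        let close = self.take_while(|c| c.is_ascii_alphanumeric());
        if close != tag { return Err(format!("<{}> closed by </{}>", tag, close)); }
        self.expect('>')?;
        Ok(Node::Element { tag, attrs, children })
    }
}

/// Parses a document that consists of exactly one root node.
pub fn parse(src: &str) -> Result<Node, String> {
    let mut p = Parser::new(src);
    let mut nodes = p.parse_nodes()?;
    if nodes.len() == 1 && p.peek().is_none() { Ok(nodes.remove(0)) }
    else { Err("expected exactly one root node".to_string()) }
}

/// Element nesting depth; text nodes count as 0.
pub fn depth(node: &Node) -> usize {
    match node {
        Node::Element { children, .. } => 1 + children.iter().map(depth).max().unwrap_or(0),
        Node::Text(_) => 0,
    }
}
```

Calling `parse` on the evidence document and passing the result to `depth` gives 3, which is one way to automate the milestone check.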

Milestone 2: CSS parser

selector { property: value; property: value; }

Parse selectors (tag, class, id), properties, values. No specificity yet.

pub struct Stylesheet { pub rules: Vec<Rule> }
pub struct Rule { pub selectors: Vec<Selector>, pub declarations: Vec<Declaration> }
pub struct Declaration { pub name: String, pub value: Value }
pub enum Value { Keyword(String), Length(f32, Unit), ColorValue(Color) }

Evidence: Parse h1 { color: red; font-size: 20px; } correctly.
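As a rough sketch of what this milestone involves, the following parser handles only the toy subset: comma-separated simple selectors and `name: value` declarations. It is an assumption-laden shortcut rather than the tutorial's character-by-character parser: it splits on `}`, `{`, `;`, and `:`, so it breaks on strings, comments, or nested blocks. The type and function names are hypothetical and simpler than the structs above (for example, `Value::Px` stands in for `Length(f32, Unit)`).

```rust
#[derive(Debug, PartialEq, Default)]
pub struct SimpleSelector {
    pub tag: Option<String>,
    pub id: Option<String>,
    pub classes: Vec<String>,
}

#[derive(Debug, PartialEq)]
pub enum Value { Keyword(String), Px(f32), Color(u8, u8, u8) }

#[derive(Debug, PartialEq)]
pub struct Declaration { pub name: String, pub value: Value }

#[derive(Debug, PartialEq)]
pub struct Rule { pub selectors: Vec<SimpleSelector>, pub declarations: Vec<Declaration> }

fn parse_selector(text: &str) -> SimpleSelector {
    // Cut "div#main.note" into pieces that each begin at '#', '.', or the start.
    let mut pieces = Vec::new();
    let mut piece = String::new();
    for c in text.trim().chars() {
        if (c == '#' || c == '.') && !piece.is_empty() {
            pieces.push(std::mem::take(&mut piece));
        }
        piece.push(c);
    }
    if !piece.is_empty() { pieces.push(piece); }

    let mut sel = SimpleSelector::default();
    for p in pieces {
        match p.chars().next() {
            Some('#') => sel.id = Some(p[1..].to_string()),
            Some('.') => sel.classes.push(p[1..].to_string()),
            _ if p == "*" => {} // universal selector: no constraint
            _ => sel.tag = Some(p),
        }
    }
    sel
}

fn parse_value(text: &str) -> Value {
    let t = text.trim();
    if let Some(num) = t.strip_suffix("px") {
        if let Ok(n) = num.trim().parse::<f32>() { return Value::Px(n); }
    }
    if let Some(hex) = t.strip_prefix('#') {
        if hex.len() == 6 {
            if let Ok(rgb) = u32::from_str_radix(hex, 16) {
                return Value::Color((rgb >> 16) as u8, (rgb >> 8) as u8, rgb as u8);
            }
        }
    }
    // Anything else is kept as a keyword, e.g. "red" or "block".
    Value::Keyword(t.to_string())
}

pub fn parse_stylesheet(src: &str) -> Vec<Rule> {
    let mut rules = Vec::new();
    for block in src.split('}') {
        // Each block looks like "selectors { declarations"; the final piece is empty.
        if let Some((selectors, body)) = block.split_once('{') {
            let selectors = selectors.split(',').map(parse_selector).collect();
            let declarations = body
                .split(';')
                .filter_map(|d| d.split_once(':'))
                .map(|(name, value)| Declaration {
                    name: name.trim().to_string(),
                    value: parse_value(value),
                })
                .collect();
            rules.push(Rule { selectors, declarations });
        }
    }
    rules
}
```

The split-based approach keeps the sketch short, but a real implementation would tokenize the input so that error recovery and quoted strings can be handled properly.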

Milestone 3: Style tree

For each DOM node, find the matching CSS rules, sort by specificity, compute the styled node.

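// Map from CSS property name to the value specified for it.
pub type PropertyMap = HashMap<String, Value>;

// A DOM node paired with the CSS values that apply to it.
pub struct StyledNode<'a> {
    pub node: &'a Node,
    pub specified_values: PropertyMap,
    pub children: Vec<StyledNode<'a>>,
}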

Evidence: Each styled node has the right specified values. When two rules set the same property, the more specific rule wins.
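Matching and specificity can be sketched in a few lines for simple selectors. In this illustration, specificity is a tuple (id count, class count, tag count); Rust compares tuples left to right, which is the ordering needed here. The `Element` view, the flat `(selector, property, value)` rule list, and the `cascade` function are hypothetical simplifications, and combinators such as `#id .class` are out of scope.

```rust
/// (ids, classes, tags). Tuples compare lexicographically, so ids dominate.
pub type Specificity = (usize, usize, usize);

#[derive(Debug, Clone, Default)]
pub struct SimpleSelector {
    pub tag: Option<String>,
    pub id: Option<String>,
    pub classes: Vec<String>,
}

/// Hypothetical read-only view of the parts of a DOM element that selectors inspect.
pub struct Element<'a> {
    pub tag: &'a str,
    pub id: Option<&'a str>,
    pub classes: Vec<&'a str>,
}

impl SimpleSelector {
    pub fn specificity(&self) -> Specificity {
        (self.id.iter().count(), self.classes.len(), self.tag.iter().count())
    }

    /// A selector matches when every constraint it states holds for the element.
    pub fn matches(&self, el: &Element) -> bool {
        self.tag.as_deref().map_or(true, |t| t == el.tag)
            && self.id.as_deref().map_or(true, |i| Some(i) == el.id)
            && self.classes.iter().all(|c| el.classes.contains(&c.as_str()))
    }
}

/// Picks the value for `property` from the most specific matching rule.
/// `max_by_key` returns the last maximum, so later rules win ties (source order).
pub fn cascade<'r>(
    rules: &'r [(SimpleSelector, &'r str, &'r str)],
    el: &Element,
    property: &str,
) -> Option<&'r str> {
    rules
        .iter()
        .filter(|(sel, name, _)| *name == property && sel.matches(el))
        .max_by_key(|(sel, _, _)| sel.specificity())
        .map(|(_, _, value)| *value)
}
```

Hand-tracing a few elements through `cascade` is a quick way to check the specificity ordering before wiring it into the style tree.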

Milestone 4: Box generation and block layout

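Each styled node becomes a LayoutBox, except nodes with display: none, which produce no box. Each LayoutBox has a BoxType (block, inline, anonymous). Compute width, height, x, y.

CSS box model: each box has content, padding, border, margin. For a block-level box, total (margin-box) width = content width + left and right padding + left and right border + left and right margin.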

pub struct LayoutBox<'a> {
    pub dimensions: Dimensions,
    pub box_type: BoxType<'a>,
    pub children: Vec<LayoutBox<'a>>,
}

pub struct Dimensions {
    pub content: Rect,
    pub padding: EdgeSizes,
    pub border: EdgeSizes,
    pub margin: EdgeSizes,
}

The block-layout algorithm is straightforward but full of subtle width-resolution rules. Read CSS 2.1 §10.

Evidence: A document with nested blocks gets correct box sizes and positions.
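The box-model arithmetic and the vertical stacking of sibling blocks can be sketched as follows. The `Rect`, `EdgeSizes`, and `Dimensions` types mirror the structs above with `f32` fields assumed; `stack_children` is a hypothetical helper. It assumes every child has `width: auto` (so it fills the container), that edge sizes and content heights are already known, and it ignores margin collapsing.

```rust
#[derive(Debug, Clone, Copy, Default, PartialEq)]
pub struct Rect { pub x: f32, pub y: f32, pub width: f32, pub height: f32 }

#[derive(Debug, Clone, Copy, Default, PartialEq)]
pub struct EdgeSizes { pub left: f32, pub right: f32, pub top: f32, pub bottom: f32 }

#[derive(Debug, Clone, Copy, Default, PartialEq)]
pub struct Dimensions {
    pub content: Rect,
    pub padding: EdgeSizes,
    pub border: EdgeSizes,
    pub margin: EdgeSizes,
}

impl Rect {
    /// Grows the rectangle outward by the given edge sizes.
    pub fn expanded_by(self, e: EdgeSizes) -> Rect {
        Rect {
            x: self.x - e.left,
            y: self.y - e.top,
            width: self.width + e.left + e.right,
            height: self.height + e.top + e.bottom,
        }
    }
}

impl Dimensions {
    pub fn padding_box(self) -> Rect { self.content.expanded_by(self.padding) }
    pub fn border_box(self) -> Rect { self.padding_box().expanded_by(self.border) }
    pub fn margin_box(self) -> Rect { self.border_box().expanded_by(self.margin) }
}

/// Places children one below another inside `container` (the parent's content rect)
/// and returns the total height they occupy.
pub fn stack_children(container: Rect, children: &mut [Dimensions]) -> f32 {
    let mut cursor_y = container.y;
    for d in children.iter_mut() {
        // The content area starts inside the margin, border, and padding.
        d.content.x = container.x + d.margin.left + d.border.left + d.padding.left;
        d.content.y = cursor_y + d.margin.top + d.border.top + d.padding.top;
        // width: auto, so the margin box spans the full container width.
        d.content.width = container.width
            - d.margin.left - d.margin.right
            - d.border.left - d.border.right
            - d.padding.left - d.padding.right;
        // The next sibling starts below this child's margin box.
        cursor_y += d.margin_box().height;
    }
    cursor_y - container.y
}
```

The real width-resolution rules in CSS 2.1 §10 also cover explicit widths, `auto` margins, and over-constrained boxes, none of which this sketch attempts.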

Milestone 5: Paint

Convert the layout tree into a flat list of paint commands. Then rasterize to a pixel buffer.

pub enum DisplayCommand {
    SolidColor(Color, Rect),
}

pub struct Canvas { pub pixels: Vec<Color>, pub width: usize, pub height: usize }

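impl Canvas {
    pub fn paint(&mut self, item: &DisplayCommand) {
        match item {
            DisplayCommand::SolidColor(color, rect) => {
                // Clip the rectangle to the canvas so out-of-range boxes cannot index past the buffer.
                // Assumes Rect fields are f32 (as in the layout code) and Color is Copy.
                let x0 = rect.x.clamp(0.0, self.width as f32) as usize;
                let y0 = rect.y.clamp(0.0, self.height as f32) as usize;
                let x1 = (rect.x + rect.width).clamp(0.0, self.width as f32) as usize;
                let y1 = (rect.y + rect.height).clamp(0.0, self.height as f32) as usize;
                for y in y0..y1 {
                    for x in x0..x1 {
                        self.pixels[y * self.width + x] = *color;
                    }
                }
            }
        }
    }
}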

Brubeck's tutorial outputs a PNG.

Evidence: Open a test HTML+CSS file with nested colored boxes. Output PNG matches a hand-drawn reference.
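The first half of this milestone, flattening the layout tree into paint commands, can be sketched like this. `PaintBox` is a hypothetical stand-in for a laid-out node that already knows its border box and background color. The traversal emits a box's background before its children, so children end up drawn on top of their parent.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Color { pub r: u8, pub g: u8, pub b: u8, pub a: u8 }

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Rect { pub x: f32, pub y: f32, pub width: f32, pub height: f32 }

#[derive(Debug, PartialEq)]
pub enum DisplayCommand {
    SolidColor(Color, Rect),
}

/// Hypothetical, simplified layout node: a border box, an optional background, children.
pub struct PaintBox {
    pub border_box: Rect,
    pub background: Option<Color>,
    pub children: Vec<PaintBox>,
}

/// Flattens the tree into a list in back-to-front paint order.
pub fn build_display_list(root: &PaintBox) -> Vec<DisplayCommand> {
    let mut list = Vec::new();
    render_box(&mut list, root);
    list
}

fn render_box(list: &mut Vec<DisplayCommand>, b: &PaintBox) {
    // Boxes without a background are transparent and emit nothing themselves.
    if let Some(color) = b.background {
        list.push(DisplayCommand::SolidColor(color, b.border_box));
    }
    for child in &b.children {
        render_box(list, child);
    }
}
```

Keeping the display list separate from rasterization means the same list can later feed a different backend, or be inspected directly in tests without comparing pixels.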

Milestone 6: Inline layout (text!)

Brubeck doesn't cover this. Adding inline layout is the single biggest leap.

You need:

Text breaking (where can lines wrap?).
Font metrics (height, baseline, kerning).
Line boxes containing one or more inline boxes.

Once you have text, you have a real toy browser.

Evidence: A paragraph of text wraps correctly at the document width.
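To make the line-box idea concrete, here is a sketch of greedy line breaking. It makes strong simplifying assumptions that a real engine cannot: every character has the same advance width (`char_width`), lines break only at whitespace, and a word longer than the line simply overflows. `LineBox` and `layout_lines` are hypothetical names; real font metrics would replace the `measure` closure.

```rust
#[derive(Debug, PartialEq)]
pub struct LineBox {
    pub y: f32,
    pub width: f32,
    pub text: String,
}

/// Greedy wrapping: keep adding words to the current line until the next one would not fit.
pub fn layout_lines(text: &str, max_width: f32, char_width: f32, line_height: f32) -> Vec<LineBox> {
    // Stand-in for font metrics: a fixed advance per character.
    let measure = |s: &str| s.chars().count() as f32 * char_width;

    let mut lines: Vec<String> = Vec::new();
    let mut current = String::new();
    for word in text.split_whitespace() {
        // The line as it would look with this word appended after one space.
        let candidate = if current.is_empty() {
            word.to_string()
        } else {
            format!("{} {}", current, word)
        };
        if !current.is_empty() && measure(&candidate) > max_width {
            // The word does not fit: close the current line and start a new one with it.
            lines.push(std::mem::replace(&mut current, word.to_string()));
        } else {
            current = candidate;
        }
    }
    if !current.is_empty() {
        lines.push(current);
    }

    // Stack the finished lines vertically, one line height apart.
    lines
        .into_iter()
        .enumerate()
        .map(|(i, text)| LineBox { y: i as f32 * line_height, width: measure(&text), text })
        .collect()
}
```

With a 10-character line (`max_width` 80.0, `char_width` 8.0), the text "the quick brown fox" wraps into "the quick" and "brown fox".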

Milestone 7 (browser.engineering path): JavaScript

Embed a JavaScript engine (V8, QuickJS, or from Python js2py or DukPy; the book itself uses DukPy). Wire DOM access (document.getElementById, element.style).

This is when "browser engine" becomes "browser." Substantial.

Milestone 8 (optional): Networking

HTTP fetcher. Now your browser can open("http://example.com").
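A minimal fetcher needs little more than the standard library. The sketch below is an assumption-heavy illustration: plain `http://` only, port 80, no redirects, no TLS, no chunked encoding, and a UTF-8 response. The function names (`split_url`, `build_request`, `parse_response`, `fetch`) are hypothetical. It sends an HTTP/1.0 request so the server closes the connection when the body ends.

```rust
use std::io::{Read, Write};
use std::net::TcpStream;

/// Splits "http://host/path" into (host, path). Only plain http:// is handled.
pub fn split_url(url: &str) -> Option<(&str, &str)> {
    let rest = url.strip_prefix("http://")?;
    match rest.find('/') {
        Some(i) => Some((&rest[..i], &rest[i..])),
        None => Some((rest, "/")),
    }
}

/// Builds the request text; HTTP requires CRLF line endings and a blank line at the end.
pub fn build_request(host: &str, path: &str) -> String {
    format!("GET {} HTTP/1.0\r\nHost: {}\r\nConnection: close\r\n\r\n", path, host)
}

/// Returns (status code, body) from a raw response, or None if it is malformed.
pub fn parse_response(raw: &str) -> Option<(u16, &str)> {
    let (head, body) = raw.split_once("\r\n\r\n")?;
    let status_line = head.lines().next()?;
    let code = status_line.split_whitespace().nth(1)?.parse::<u16>().ok()?;
    Some((code, body))
}

/// Performs the request over a TCP socket and returns the raw response text.
pub fn fetch(url: &str) -> std::io::Result<String> {
    let (host, path) = split_url(url).ok_or_else(|| {
        std::io::Error::new(std::io::ErrorKind::InvalidInput, "unsupported URL")
    })?;
    let mut stream = TcpStream::connect((host, 80))?;
    stream.write_all(build_request(host, path).as_bytes())?;
    let mut raw = String::new();
    // With "Connection: close" the server ends the stream after the body.
    stream.read_to_string(&mut raw)?;
    Ok(raw)
}
```

Keeping URL splitting and response parsing as pure functions lets them be unit-tested without a network connection; only `fetch` touches the socket.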

8. Tests & evidence

Test	How
HTML parser	Round-trip simple documents
CSS parser	Parse all selectors and values from a small stylesheet
Specificity	`#id .class` rule wins over `.class` rule
Box dimensions	Margin-box width equals content width plus left/right padding, border, and margin
Layout positions	Sibling blocks stack vertically; nested blocks offset by parent
Paint	Output PNG matches a reference image, pixel for pixel or within a stated tolerance
End-to-end on a small page	Render a tiny static page (your own test case, not a real website; real pages need far more spec coverage)

The strongest evidence: a side-by-side comparison of your output and Firefox/Chrome for the same simple test HTML. Document the differences honestly.
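One way to make the paint comparison mechanical is to count mismatching pixels between your output and a reference. This sketch assumes both images have already been decoded into RGBA pixel arrays of the same dimensions; `count_mismatches` and its `tolerance` parameter are hypothetical, and decoding the PNG files themselves is left to an image library.

```rust
/// Counts pixels where any RGBA channel differs by more than `tolerance`.
/// Returns None when the buffers have different lengths and cannot be compared.
pub fn count_mismatches(a: &[[u8; 4]], b: &[[u8; 4]], tolerance: u8) -> Option<usize> {
    if a.len() != b.len() {
        return None;
    }
    let mismatches = a
        .iter()
        .zip(b)
        .filter(|(p, q)| p.iter().zip(q.iter()).any(|(x, y)| x.abs_diff(*y) > tolerance))
        .count();
    Some(mismatches)
}
```

A tolerance of 0 demands an exact match. A small nonzero tolerance can absorb differences such as anti-aliasing when comparing against Firefox or Chrome screenshots, and whatever value you choose belongs in the write-up.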

9. Common pitfalls

Trying to parse real HTML. Real HTML uses implicit closing tags, error recovery, parser modes. A toy strict parser is the right scope. State the limitation.
Specificity off-by-one. CSS specificity rules are well-defined (W3C). Hand-trace examples.
Forgetting display: none. Skip these elements in layout.
Attempting inline layout too early. Block layout is hard enough. Skip inline on your first pass.
Trying to render fonts. Use a library (cairo, skia, fontdue). Don't write a font rasterizer for your first browser.
No idea where to stop. Brubeck stops at block layout + paint. browser.engineering goes much further. Decide your scope before starting.
Treating the engine as one big file. Each pipeline stage is a clear module. Keep them separate.

10. Extensions

Inline layout — text wrapping. Huge step.
JavaScript — embed an engine. Wire to DOM.
Forms and events — click handlers, input fields.
Flexbox / grid — modern CSS layout. Significant.
Networking — fetch resources over HTTP.
Incremental rendering — restyle and re-layout only changed subtrees.
GPU acceleration — composite layers; offload to GPU.

A full browser engine is a multi-year project. Production engines such as Gecko (Firefox) and Blink (Chrome) are maintained by hundreds of engineers. Know where to stop.

11. Module integration

Module	What the browser engine deepens
Sem 7 Module 1: Architecture fundamentals	The pipeline embodies "single responsibility per stage."
Sem 7 Module 2: Architecture patterns	The multi-pass pipeline is a canonical pipes-and-filters architecture: each stage consumes one representation and produces the next.
Compiler tutorial	Same multi-pass shape. HTML → DOM → styled → laid out → painted is the same pattern as source → AST → typed → IR → assembly.
Interpreter tutorial	Parsing HTML and CSS uses the same techniques.

12. Portfolio framing

What to publish:

Source organized by pipeline stage: html/, css/, style/, layout/, paint/.
A set of test HTML files with their rendered PNGs (yours and a reference).
A README with:
- Pipeline diagram.
- Supported HTML/CSS features (small list).
- Skipped features (long list).
- A comparison rendering against Firefox/Chrome for the same input.

Reviewer entry points:

html/parser.rs — DOM construction.
layout/block.rs — block layout algorithm.
tests/snapshots/ — input HTML + expected PNG.
README.md, which must include the pipeline diagram and the side-by-side rendering.

A browser engine is an unusually striking portfolio project. Even a partial one, such as Brubeck's 7-post version, demonstrates serious architectural thinking and parsing fluency.

Source

This tutorial draws from the BYO-X catalog "Web Browser" section. Matt Brubeck's blog series and Panchekha & Harrelson's "Web Browser Engineering" book are the canonical primary references.
