Build Your Own Web Browser Engine

"The pipeline of a browser engine — parse → style → layout → paint — is the most elegant architecture in software." — Lin Clark (paraphrased)

A browser engine is one of the most architecturally interesting projects you can build. It is a multi-pass compiler whose output is a 2D image. The pipeline — HTML parser → DOM → CSS parser → style tree → layout tree → display list → paint — is a textbook example of staged transformation.


1. Overview & motivation

A browser engine takes HTML + CSS as input and produces a pixel grid as output, through a series of intermediate representations:

HTML text  → [HTML parser] → DOM tree
CSS text → [CSS parser] → stylesheet (rules)
DOM + CSS → [style] → styled tree (each node has computed styles)
Styled tree → [layout] → layout tree (each node has a box: x, y, w, h)
Layout tree → [paint] → display list (rectangles, text, images to draw)
Display list → [render] → pixel grid
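
One way to read the diagram above is as a chain of functions whose types force the stage order. The following is only a sketch: every type and function name is hypothetical, and every stage is a stub that returns placeholder data. The point is the shape of `run_pipeline`, not the stage bodies.

```rust
// Hypothetical placeholder types: each stage's output is the next stage's input.
pub struct Dom { pub source: String }
pub struct Stylesheet { pub source: String }
pub struct StyledTree { pub node_count: usize }
pub struct LayoutTree { pub boxes: Vec<(f32, f32, f32, f32)> } // x, y, w, h
pub struct DisplayList { pub commands: Vec<(f32, f32, f32, f32)> }
pub struct PixelGrid { pub width: usize, pub height: usize, pub pixels: Vec<u32> }

// Stubbed stages. Real implementations are the milestones; only the signatures matter here.
pub fn parse_html(src: &str) -> Dom { Dom { source: src.to_string() } }
pub fn parse_css(src: &str) -> Stylesheet { Stylesheet { source: src.to_string() } }
pub fn style(_dom: &Dom, _css: &Stylesheet) -> StyledTree { StyledTree { node_count: 1 } }
pub fn layout(styled: &StyledTree, viewport_width: f32) -> LayoutTree {
    // Stub: one full-width, zero-height box per styled node.
    LayoutTree { boxes: vec![(0.0, 0.0, viewport_width, 0.0); styled.node_count] }
}
pub fn paint(tree: &LayoutTree) -> DisplayList {
    DisplayList { commands: tree.boxes.clone() }
}
pub fn render(_list: &DisplayList, width: usize, height: usize) -> PixelGrid {
    // Stub: a blank white canvas of the requested size.
    PixelGrid { width, height, pixels: vec![0xFFFF_FFFF; width * height] }
}

// The whole engine is the composition of the stages, in order.
pub fn run_pipeline(html: &str, css: &str, width: usize, height: usize) -> PixelGrid {
    let dom = parse_html(html);
    let sheet = parse_css(css);
    let styled = style(&dom, &sheet);
    let tree = layout(&styled, width as f32);
    let list = paint(&tree);
    render(&list, width, height)
}
```

Because each stage only accepts the previous stage's output type, the compiler rejects a pipeline that, say, paints before layout.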

What you can only learn by building one:

  • Why the DOM is a tree — and why that single design decision shapes everything that follows.
  • Why CSS specificity is a precisely defined algorithm, not a loose convention.
  • Why block and inline layout are conceptually different and historically combined.
  • Why flexbox and grid were necessary upgrades.
  • Why browser engines are slow to develop — Servo took years just to be a research prototype.

2. Where this fits in the degree

  • Phase: Architecture
  • Semester: 7 (Architecture and DDD)
  • Modules deepened: Module 1 (architecture fundamentals — quality attributes show up at every layer), Module 2 (modular architecture — the pipeline is the canonical modular pipeline).

Cross-phase relevance:


3. Prerequisites

  • Complete the Interpreter tutorial — parsing is the gateway technology.
  • Rust or Python (most BYO-X tutorials use one of these).
  • A graphics library: Cairo, Skia, or tiny-skia for actual rendering. For the tutorial, ASCII art is often used to skip painting.

4. Theory & research

Required reading

  • Matt Brubeck, "Let's build a browser engine!" (limpet.net/mbrubeck/2014/08/08/toy-layout-engine-1.html): 8-part Rust series. Its toy engine, robinson, is a teaching project and not a predecessor of Servo, which predates it. Concise and excellent. ⭐ alternative primary.
  • Servo project (servo.org) — modern Rust browser engine. Source is illuminating.
  • MDN Web Docs — for HTML and CSS reference.

CSS specifically

  • CSS specifications (w3.org/Style/CSS). Modular; pick the relevant ones (selectors, box model, flexbox).
  • CSS spec for the box model: CSS 2.1, Chapter 9 — most readable normative text on what "block" and "inline" mean.

5. Curated tutorial list (from BYO-X)

(Two excellent tutorials. Pick the one in the language you prefer.)


Path A: Matt Brubeck's "Let's build a browser engine!" (Rust)

8 blog posts. Builds a toy engine in ~1,000 lines of Rust:

  1. Getting started.
  2. HTML parser.
  3. CSS parser.
  4. Style.
  5. Boxes (box model and layout tree).
  6. Block layout.
  7. Painting.
  8. Where to go from here.

The output is a PNG of a rendered HTML+CSS document. Just block layout — no inline, no flex, no JavaScript.

Path B: Panchekha & Harrelson's "Web Browser Engineering" (Python)

A free online book. Goes much further than Brubeck:

  • HTML parser, CSS parser, layout, paint (same as Brubeck).
  • JavaScript execution (uses an embedded JS engine).
  • Forms.
  • Security (same-origin policy).
  • Animations and visual effects.

Substantially larger project; closer to a real browser.

For this degree: Path A first (1 month), Path B if you want depth (3+ months).


7. Implementation milestones (Brubeck's path)

Milestone 1: HTML parser

A minimal HTML parser. Handles tags, text, attributes. Builds a DOM tree.

use std::collections::HashMap;

pub struct Node { pub children: Vec<Node>, pub node_type: NodeType }
pub enum NodeType { Element(ElementData), Text(String) }
pub struct ElementData { pub tag_name: String, pub attributes: AttrMap }
pub type AttrMap = HashMap<String, String>;

// Helper methods (starts_with, consume_char, parse_tag_name, parse_attributes,
// parse_nodes, parse_text) are omitted here.
impl Parser {
    // Dispatch on the next character: '<' starts an element, anything else is text.
    fn parse_node(&mut self) -> Node {
        if self.starts_with("<") { self.parse_element() }
        else { self.parse_text() }
    }

    fn parse_element(&mut self) -> Node {
        // Opening tag.
        assert_eq!(self.consume_char(), '<');
        let tag_name = self.parse_tag_name();
        let attrs = self.parse_attributes();
        assert_eq!(self.consume_char(), '>');

        // Contents: child nodes, parsed recursively.
        let children = self.parse_nodes();

        // Closing tag; its name must match the opening tag.
        assert_eq!(self.consume_char(), '<');
        assert_eq!(self.consume_char(), '/');
        assert_eq!(self.parse_tag_name(), tag_name);
        assert_eq!(self.consume_char(), '>');

        Node { children, node_type: NodeType::Element(ElementData { tag_name, attributes: attrs }) }
    }
}

Real HTML is a horror show — implicit closing tags, error recovery, parser modes. The tutorial parser is strict and minimal. State the limitation.

Evidence: Parse <html><body><h1>Hello</h1></body></html> into a tree three elements deep (html → body → h1) with a single text leaf.
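
To make the milestone concrete, here is a self-contained sketch of a strict toy parser. It is not Brubeck's code: the `Node` enum shape, the `Result`-based error handling (instead of assertions), and the helper names `parse`, `parse_name`, `expect`, and `element_depth` are all assumptions made for illustration. It only accepts well-nested tags with double-quoted attributes.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
pub enum Node {
    Element { tag: String, attrs: HashMap<String, String>, children: Vec<Node> },
    Text(String),
}

pub struct Parser { input: Vec<char>, pos: usize }

impl Parser {
    pub fn new(src: &str) -> Self {
        Parser { input: src.chars().collect(), pos: 0 }
    }
    fn eof(&self) -> bool { self.pos >= self.input.len() }
    fn starts_with(&self, s: &str) -> bool {
        let needle: Vec<char> = s.chars().collect();
        self.input[self.pos..].starts_with(&needle)
    }
    // Consume characters while `test` holds; stops safely at end of input.
    fn consume_while(&mut self, test: impl Fn(char) -> bool) -> String {
        let mut out = String::new();
        while !self.eof() && test(self.input[self.pos]) {
            out.push(self.input[self.pos]);
            self.pos += 1;
        }
        out
    }
    // Consume exactly `c`, or report what was found instead.
    fn expect(&mut self, c: char) -> Result<(), String> {
        match self.input.get(self.pos) {
            Some(&got) if got == c => { self.pos += 1; Ok(()) }
            Some(&got) => Err(format!("expected '{}', found '{}'", c, got)),
            None => Err(format!("expected '{}', found end of input", c)),
        }
    }
    fn parse_name(&mut self) -> String {
        self.consume_while(|c| c.is_ascii_alphanumeric() || c == '-')
    }
    // Parse siblings until end of input or a closing tag.
    pub fn parse_nodes(&mut self) -> Result<Vec<Node>, String> {
        let mut nodes = Vec::new();
        while !self.eof() && !self.starts_with("</") {
            let node = if self.starts_with("<") {
                self.parse_element()?
            } else {
                Node::Text(self.consume_while(|c| c != '<'))
            };
            nodes.push(node);
        }
        Ok(nodes)
    }
    fn parse_element(&mut self) -> Result<Node, String> {
        self.expect('<')?;
        let tag = self.parse_name();
        let mut attrs = HashMap::new();
        loop {
            self.consume_while(char::is_whitespace);
            if self.eof() || self.starts_with(">") { break; }
            let name = self.parse_name();
            if name.is_empty() { return Err(format!("bad attribute in <{}>", tag)); }
            self.expect('=')?;
            self.expect('"')?;
            let value = self.consume_while(|c| c != '"');
            self.expect('"')?;
            attrs.insert(name, value);
        }
        self.expect('>')?;
        let children = self.parse_nodes()?; // recursion happens here
        self.expect('<')?;
        self.expect('/')?;
        let close = self.parse_name();
        if close != tag { return Err(format!("</{}> does not close <{}>", close, tag)); }
        self.expect('>')?;
        Ok(Node::Element { tag, attrs, children })
    }
}

pub fn parse(src: &str) -> Result<Vec<Node>, String> {
    let mut parser = Parser::new(src);
    let nodes = parser.parse_nodes()?;
    if parser.eof() { Ok(nodes) } else { Err("unexpected closing tag".to_string()) }
}

// Depth counted in elements; text nodes add nothing.
pub fn element_depth(node: &Node) -> usize {
    match node {
        Node::Text(_) => 0,
        Node::Element { children, .. } => {
            1 + children.iter().map(element_depth).max().unwrap_or(0)
        }
    }
}
```

Returning `Result` rather than asserting means a malformed document such as `<a></b>` produces an error message instead of a panic, which makes the "strict and minimal" limitation easier to test.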

Milestone 2: CSS parser

selector { property: value; property: value; }

Parse selectors (tag, class, id), properties, values. No specificity yet.

pub struct Stylesheet { pub rules: Vec<Rule> }
pub struct Rule { pub selectors: Vec<Selector>, pub declarations: Vec<Declaration> }
pub struct Declaration { pub name: String, pub value: Value }
pub enum Value { Keyword(String), Length(f32, Unit), ColorValue(Color) }

Evidence: Parse h1 { color: red; font-size: 20px; } correctly.
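
A deliberately small sketch of this milestone follows. It splits on `{`, `}`, `;` and `:` rather than scanning character by character as Brubeck's parser does, keeps selectors as plain strings, and handles only keyword and `px` values (colors are left out). The names `parse_stylesheet`, `parse_declarations`, and `parse_value` are hypothetical.

```rust
#[derive(Debug, PartialEq)]
pub enum Unit { Px }

#[derive(Debug, PartialEq)]
pub enum Value { Keyword(String), Length(f32, Unit) }

#[derive(Debug, PartialEq)]
pub struct Declaration { pub name: String, pub value: Value }

#[derive(Debug, PartialEq)]
pub struct Rule { pub selectors: Vec<String>, pub declarations: Vec<Declaration> }

// "20px" becomes Length(20.0, Px); anything else is kept as a keyword.
fn parse_value(text: &str) -> Value {
    if let Some(number) = text.strip_suffix("px") {
        if let Ok(n) = number.trim().parse::<f32>() {
            return Value::Length(n, Unit::Px);
        }
    }
    Value::Keyword(text.to_string())
}

// Parse the text between '{' and '}' into name/value pairs.
fn parse_declarations(body: &str) -> Result<Vec<Declaration>, String> {
    let mut out = Vec::new();
    for decl in body.split(';').map(str::trim).filter(|d| !d.is_empty()) {
        let (name, value) = decl
            .split_once(':')
            .ok_or_else(|| format!("missing ':' in `{}`", decl))?;
        out.push(Declaration {
            name: name.trim().to_string(),
            value: parse_value(value.trim()),
        });
    }
    Ok(out)
}

// Parse a sequence of `selectors { declarations }` rules.
pub fn parse_stylesheet(src: &str) -> Result<Vec<Rule>, String> {
    let mut rules = Vec::new();
    let mut rest = src.trim();
    while !rest.is_empty() {
        let (prelude, after_open) = rest.split_once('{').ok_or("expected '{'")?;
        let (body, after_close) = after_open.split_once('}').ok_or("expected '}'")?;
        let selectors = prelude.split(',').map(|s| s.trim().to_string()).collect();
        rules.push(Rule { selectors, declarations: parse_declarations(body)? });
        rest = after_close.trim();
    }
    Ok(rules)
}
```

Splitting is enough for the evidence case above, but it breaks as soon as a value contains `;` or `}` (for example inside a quoted string), which is one reason a character-level parser is the better long-term structure.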

Milestone 3: Style tree

For each DOM node, find the matching CSS rules, sort by specificity, compute the styled node.

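pub type PropertyMap = HashMap<String, Value>; // property name → specified value

pub struct StyledNode<'a> {
    pub node: &'a Node,                 // the DOM node this style belongs to
    pub specified_values: PropertyMap,
    pub children: Vec<StyledNode<'a>>,
}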

Evidence: Each styled node has the right computed styles. Specificity disputes resolve correctly.
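
The match, sort, and apply steps can be sketched as follows. This assumes simple selectors only (one optional tag, one optional id, any number of classes) and string-valued declarations; the `Element`, `Rule`, and `SimpleSelector` shapes and the `Rule::new` helper are hypothetical. Specificity is the triple (ids, classes, tags), compared left to right.

```rust
use std::collections::HashMap;

#[derive(Debug, Default, Clone)]
pub struct SimpleSelector {
    pub tag: Option<String>,
    pub id: Option<String>,
    pub classes: Vec<String>,
}

// (id count, class count, tag count); tuples compare lexicographically.
pub type Specificity = (usize, usize, usize);

pub struct Element { pub tag: String, pub id: Option<String>, pub classes: Vec<String> }

pub struct Rule { pub selector: SimpleSelector, pub declarations: Vec<(String, String)> }

impl Rule {
    pub fn new(selector: SimpleSelector, decls: &[(&str, &str)]) -> Rule {
        let declarations = decls.iter().map(|(n, v)| (n.to_string(), v.to_string())).collect();
        Rule { selector, declarations }
    }
}

impl SimpleSelector {
    pub fn specificity(&self) -> Specificity {
        (self.id.is_some() as usize, self.classes.len(), self.tag.is_some() as usize)
    }

    // Every component the selector names must be present on the element.
    pub fn matches(&self, elem: &Element) -> bool {
        if self.tag.as_ref().map_or(false, |t| *t != elem.tag) { return false; }
        if self.id.as_ref().map_or(false, |i| Some(i) != elem.id.as_ref()) { return false; }
        self.classes.iter().all(|c| elem.classes.contains(c))
    }
}

pub fn specified_values(elem: &Element, rules: &[Rule]) -> HashMap<String, String> {
    let mut matched: Vec<&Rule> = rules.iter().filter(|r| r.selector.matches(elem)).collect();
    // Stable sort, lowest specificity first: ties keep source order,
    // so later rules still override earlier ones of equal specificity.
    matched.sort_by_key(|r| r.selector.specificity());
    let mut values = HashMap::new();
    for rule in matched {
        for (name, value) in &rule.declarations {
            values.insert(name.clone(), value.clone()); // higher specificity overwrites
        }
    }
    values
}
```

Hand-tracing one case: for `<p id="main" class="note">`, the rules `#main` (1,0,0), `.note` (0,1,0), and `p` (0,0,1) all match, and the `#main` declaration wins regardless of where it appears in the stylesheet.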

Milestone 4: Box generation and block layout

Each styled node becomes a Box. Each Box has a BoxType (block, inline, anonymous). Compute width, height, x, y.

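CSS box model: each box has content, padding, border, margin. For a block-level box in normal flow, margin-left + border-left + padding-left + width + padding-right + border-right + margin-right = width of the containing block (CSS 2.1 §10.3.3). With equal left and right edges, that is content + 2*padding + 2*border + 2*margin.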

pub struct LayoutBox<'a> {
    pub dimensions: Dimensions,
    pub box_type: BoxType<'a>,
    pub children: Vec<LayoutBox<'a>>,
}

pub struct Dimensions {
    pub content: Rect,       // position and size of the content area
    pub padding: EdgeSizes,  // surrounding edges, innermost first
    pub border: EdgeSizes,
    pub margin: EdgeSizes,
}

The block-layout algorithm is straightforward but full of subtle width-resolution rules. Read CSS 2.1 §10.

Evidence: A document with nested blocks gets correct box sizes and positions.
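
The width-resolution rules are the part most worth sketching. The function below is a simplified reading of CSS 2.1 §10.3.3 for left-to-right text, under two assumptions: padding and border are already resolved and passed in as one summed `edges` value, and only `width` and the two horizontal margins may be `auto`. The names `Size`, `UsedWidths`, and `resolve_width` are hypothetical.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Size { Auto, Px(f32) }

impl Size {
    // `auto` contributes nothing when summing the box's total width.
    fn px_or_zero(self) -> f32 {
        match self { Size::Px(v) => v, Size::Auto => 0.0 }
    }
}

#[derive(Debug, PartialEq)]
pub struct UsedWidths { pub margin_left: f32, pub width: f32, pub margin_right: f32 }

/// `edges` is the sum of left and right padding and border widths.
pub fn resolve_width(
    containing: f32,
    width: Size,
    margin_left: Size,
    margin_right: Size,
    edges: f32,
) -> UsedWidths {
    let (mut ml, mut mr) = (margin_left, margin_right);
    let total = width.px_or_zero() + ml.px_or_zero() + mr.px_or_zero() + edges;

    // A fixed width that overflows the container turns auto margins into 0.
    if width != Size::Auto && total > containing {
        if ml == Size::Auto { ml = Size::Px(0.0); }
        if mr == Size::Auto { mr = Size::Px(0.0); }
    }

    // Space left over (negative when the box overflows).
    let underflow = containing - total;

    match (width, ml, mr) {
        // Over-constrained: the right margin absorbs the difference.
        (Size::Px(w), Size::Px(l), Size::Px(r)) =>
            UsedWidths { margin_left: l, width: w, margin_right: r + underflow },
        // Exactly one auto margin takes all the leftover space.
        (Size::Px(w), Size::Px(l), Size::Auto) =>
            UsedWidths { margin_left: l, width: w, margin_right: underflow },
        (Size::Px(w), Size::Auto, Size::Px(r)) =>
            UsedWidths { margin_left: underflow, width: w, margin_right: r },
        // Both margins auto: split the leftover space, centering the box.
        (Size::Px(w), Size::Auto, Size::Auto) =>
            UsedWidths { margin_left: underflow / 2.0, width: w, margin_right: underflow / 2.0 },
        // Auto width: auto margins become 0 and the width fills the rest.
        (Size::Auto, l, r) => {
            let (l, r) = (l.px_or_zero(), r.px_or_zero());
            if underflow >= 0.0 {
                UsedWidths { margin_left: l, width: underflow, margin_right: r }
            } else {
                // Width cannot go negative; push the overflow into the right margin.
                UsedWidths { margin_left: l, width: 0.0, margin_right: r + underflow }
            }
        }
    }
}
```

For example, a 400px-wide box with both margins `auto` inside an 800px container gets 200px of margin on each side, which is the familiar `margin: 0 auto` centering trick.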

Milestone 5: Paint

Convert the layout tree into a flat list of paint commands. Then rasterize to a pixel buffer.

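pub enum DisplayCommand {
    SolidColor(Color, Rect),
}

pub struct Canvas { pub pixels: Vec<Color>, pub width: usize, pub height: usize }

impl Canvas {
    // Assumes Color is Copy and Rect has f32 fields (x, y, width, height).
    pub fn paint(&mut self, item: &DisplayCommand) {
        match item {
            DisplayCommand::SolidColor(color, rect) => {
                // Clip to the canvas so indexing cannot go out of bounds.
                let x0 = rect.x.clamp(0.0, self.width as f32) as usize;
                let y0 = rect.y.clamp(0.0, self.height as f32) as usize;
                let x1 = (rect.x + rect.width).clamp(0.0, self.width as f32) as usize;
                let y1 = (rect.y + rect.height).clamp(0.0, self.height as f32) as usize;
                for y in y0..y1 {
                    for x in x0..x1 {
                        self.pixels[y * self.width + x] = *color;
                    }
                }
            }
        }
    }
}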

Brubeck's tutorial outputs a PNG.

Evidence: Open a test HTML+CSS file with nested colored boxes. Output PNG matches a hand-drawn reference.
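
The first half of this milestone, flattening the layout tree into paint commands, can be sketched separately from rasterizing. The `PaintBox` type below is a hypothetical stand-in for a finished layout box (it carries only a border-box rectangle and an optional background), and `build_display_list` and `render_box` are illustrative names.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Color { pub r: u8, pub g: u8, pub b: u8, pub a: u8 }

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Rect { pub x: f32, pub y: f32, pub width: f32, pub height: f32 }

// Hypothetical stand-in for a laid-out box.
pub struct PaintBox {
    pub border_box: Rect,
    pub background: Option<Color>,
    pub children: Vec<PaintBox>,
}

#[derive(Debug, PartialEq)]
pub enum DisplayCommand { SolidColor(Color, Rect) }

pub fn build_display_list(root: &PaintBox) -> Vec<DisplayCommand> {
    let mut list = Vec::new();
    render_box(&mut list, root);
    list
}

// Pre-order walk: a parent's background is emitted before its children,
// so children are painted on top of it.
fn render_box(list: &mut Vec<DisplayCommand>, layout_box: &PaintBox) {
    if let Some(color) = layout_box.background {
        list.push(DisplayCommand::SolidColor(color, layout_box.border_box));
    }
    for child in &layout_box.children {
        render_box(list, child);
    }
}
```

Keeping the display list as plain data makes it easy to test painting order without producing any pixels.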

Milestone 6: Inline layout (text!)

Brubeck doesn't cover this. Adding inline layout is the single biggest leap.

You need:

  • Text breaking (where can lines wrap?).
  • Font metrics (height, baseline, kerning).
  • Line boxes containing one or more inline boxes.

Once you have text, you have a real toy browser.

Evidence: A paragraph of text wraps correctly at the document width.
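
As a first approximation of line breaking, a greedy algorithm is enough to meet the evidence above. The sketch below makes two loud simplifications: lines may break only at whitespace, and every character has the same advance width (`char_width`), which stands in for real font metrics. `wrap_text` is a hypothetical name.

```rust
/// Greedy line breaking: add words to the current line until the next
/// word would exceed `max_width`, then start a new line.
pub fn wrap_text(text: &str, max_width: f32, char_width: f32) -> Vec<String> {
    let mut lines = Vec::new();
    let mut current = String::new();
    for word in text.split_whitespace() {
        let word_len = word.chars().count();
        // Length of the line if this word were appended (plus one space).
        let candidate_len = if current.is_empty() {
            word_len
        } else {
            current.chars().count() + 1 + word_len
        };
        // Break before the word if it does not fit. A word longer than the
        // line is placed alone on its own line and allowed to overflow.
        if !current.is_empty() && candidate_len as f32 * char_width > max_width {
            lines.push(std::mem::take(&mut current));
        }
        if !current.is_empty() {
            current.push(' ');
        }
        current.push_str(word);
    }
    if !current.is_empty() {
        lines.push(current);
    }
    lines
}
```

Each returned string corresponds to one line box. Replacing the fixed `char_width` with per-glyph advances from a font library is the step that turns this into real inline layout.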

Milestone 7 (browser.engineering path): JavaScript

Embed a JavaScript engine (V8, QuickJS, or, in Python, DukPy, which browser.engineering itself uses; js2py is another option). Wire DOM access (document.getElementById, element.style).

This is when "browser engine" becomes "browser." Substantial.

Milestone 8 (optional): Networking

HTTP fetcher. Now your browser can open("http://example.com").
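
A minimal fetcher needs only three pieces: split the URL, format a request, and read the response from a socket. The sketch below uses only the standard library and makes several assumptions: plain `http://` only (no TLS), HTTP/1.0 so the server closes the connection when done, and a UTF-8 response. `Url`, `parse_url`, `build_request`, and `fetch` are hypothetical names, and the response is returned raw, headers included.

```rust
use std::io::{Read, Write};
use std::net::TcpStream;

#[derive(Debug, PartialEq)]
pub struct Url { pub host: String, pub port: u16, pub path: String }

// Split "http://host[:port][/path]" into its parts.
pub fn parse_url(url: &str) -> Result<Url, String> {
    let rest = url.strip_prefix("http://").ok_or("only http:// URLs are supported")?;
    let (authority, path) = match rest.find('/') {
        Some(i) => (&rest[..i], &rest[i..]),
        None => (rest, "/"),
    };
    let (host, port) = match authority.split_once(':') {
        Some((h, p)) => (h, p.parse::<u16>().map_err(|e| e.to_string())?),
        None => (authority, 80),
    };
    if host.is_empty() {
        return Err("missing host".to_string());
    }
    Ok(Url { host: host.to_string(), port, path: path.to_string() })
}

// Format a bare-bones GET request; the Host header carries non-default ports.
pub fn build_request(url: &Url) -> String {
    let host = if url.port == 80 {
        url.host.clone()
    } else {
        format!("{}:{}", url.host, url.port)
    };
    format!("GET {} HTTP/1.0\r\nHost: {}\r\n\r\n", url.path, host)
}

// Send the request and read until the server closes the connection.
pub fn fetch(url: &Url) -> std::io::Result<String> {
    let mut stream = TcpStream::connect((url.host.as_str(), url.port))?;
    stream.write_all(build_request(url).as_bytes())?;
    let mut response = String::new();
    stream.read_to_string(&mut response)?; // fails on non-UTF-8 bodies
    Ok(response)
}
```

Keeping `parse_url` and `build_request` as pure functions means most of the fetcher can be tested without a network connection.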


8. Tests & evidence

Test | How
HTML parser | Round-trip simple documents
CSS parser | Parse all selectors and values from a small stylesheet
Specificity | A "#id .class" rule wins over a ".class" rule
Box dimensions | Sum of content+padding+border+margin matches the spec
Layout positions | Sibling blocks stack vertically; nested blocks offset by parent
Paint | Output PNG looks correct
Quality on a real page | Render a tiny static page (your own test case, not a real website; real pages need far more spec coverage)

The strongest evidence: a side-by-side comparison of your output and Firefox/Chrome for the same simple test HTML. Document the differences honestly.
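
"Looks correct" can be made measurable by comparing pixel buffers. The sketch below assumes both images are already decoded into RGBA pixels of the same dimensions (decoding PNGs would need an external crate and is left out); `diff_ratio` and its `tolerance` parameter are hypothetical.

```rust
/// Fraction of pixels whose RGBA channels differ by more than `tolerance`.
/// Returns None when the buffers are empty or have different lengths.
pub fn diff_ratio(actual: &[[u8; 4]], expected: &[[u8; 4]], tolerance: u8) -> Option<f64> {
    if actual.len() != expected.len() || actual.is_empty() {
        return None;
    }
    let differing = actual
        .iter()
        .zip(expected)
        .filter(|(p, q)| {
            // A pixel differs if any single channel is off by more than the tolerance.
            p.iter().zip(q.iter()).any(|(x, y)| x.abs_diff(*y) > tolerance)
        })
        .count();
    Some(differing as f64 / actual.len() as f64)
}
```

A small non-zero tolerance is useful when comparing against a real browser, since anti-aliasing and color management produce slightly different channel values even for visually identical output.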


9. Common pitfalls

  • Trying to parse real HTML. Real HTML uses implicit closing tags, error recovery, parser modes. A toy strict parser is the right scope. State the limitation.
  • Specificity off-by-one. CSS specificity rules are well-defined (W3C). Hand-trace examples.
  • Forgetting display: none. Skip these elements in layout.
  • Inline before you understand it. Block layout is hard enough. Skip inline on your first pass.
  • Trying to render fonts. Use a library (cairo, skia, fontdue). Don't write a font rasterizer for your first browser.
  • No idea where to stop. Brubeck stops at block layout + paint. browser.engineering goes much further. Decide your scope before starting.
  • Treating the engine as one big file. Each pipeline stage is a clear module. Keep them separate.
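
The display: none pitfall above is easiest to avoid by dropping such nodes while building the layout tree, so later stages never see them. A sketch, with hypothetical `StyledNode`, `LayoutBox`, and `build_layout_tree` shapes that keep only a name and children:

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Display { Block, Inline, None }

pub struct StyledNode { pub name: String, pub display: Display, pub children: Vec<StyledNode> }

#[derive(Debug, PartialEq)]
pub struct LayoutBox { pub name: String, pub children: Vec<LayoutBox> }

// Returns None for `display: none`, which removes the node and its whole subtree.
pub fn build_layout_tree(node: &StyledNode) -> Option<LayoutBox> {
    if node.display == Display::None {
        return None;
    }
    Some(LayoutBox {
        name: node.name.clone(),
        // filter_map silently skips children that produced no box.
        children: node.children.iter().filter_map(build_layout_tree).collect(),
    })
}
```

Note that descendants of a hidden node are skipped even if their own display value is block or inline, which matches how display: none behaves in CSS.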

10. Extensions

  • Inline layout — text wrapping. Huge step.
  • JavaScript — embed an engine. Wire to DOM.
  • Forms and events — click handlers, input fields.
  • Flexbox / grid — modern CSS layout. Significant.
  • Networking — fetch resources over HTTP.
  • Incremental rendering — restyle and re-layout only changed subtrees.
  • GPU acceleration — composite layers; offload to GPU.

A full browser engine is a multi-year project. Production engines such as Gecko (Firefox) and Blink (Chrome) are maintained by hundreds of engineers. Know where to stop.


11. Module integration

Module | What the browser engine deepens
Sem 7 Module 1: Architecture fundamentals | The pipeline embodies "single responsibility per stage."
Sem 7 Module 2: Architecture patterns | Multi-pass compiler is the canonical layered architecture.
Compiler tutorial | Same multi-pass shape. HTML → DOM → styled → laid out → painted is the same pattern as source → AST → typed → IR → assembly.
Interpreter tutorial | Parsing HTML and CSS uses the same techniques.

12. Portfolio framing

What to publish:

  • Source organized by pipeline stage: html/, css/, style/, layout/, paint/.
  • A set of test HTML files with their rendered PNGs (yours and a reference).
  • A README with:
    • Pipeline diagram.
    • Supported HTML/CSS features (small list).
    • Skipped features (long list).
    • A comparison rendering against Firefox/Chrome for the same input.

Reviewer entry points:

  • html/parser.rs — DOM construction.
  • layout/block.rs — block layout algorithm.
  • tests/snapshots/ — input HTML + expected PNG.
  • README must include the pipeline diagram and the side-by-side rendering.

A browser engine is an unusually striking portfolio project. Even a partial one — Brubeck's 8-post version — demonstrates serious architectural thinking and parsing fluency.


Source

This tutorial draws from the BYO-X catalog "Web Browser" section. Matt Brubeck's blog series and Panchekha & Harrelson's "Web Browser Engineering" book are the canonical primary references.