Build Your Own MLIR Compiler Pipeline
This is an advanced compiler-infrastructure project, not a first compiler. Build it after you have implemented an interpreter, a bytecode compiler, or a small native-code path. The goal is to understand modern multi-level IR: dialects, operations, verifiers, rewrites, passes, lowering, bufferization, GPU/AI-oriented pipelines, and reproducible compiler testing.
1. Overview & motivation
Build a small MLIR-based compiler pipeline for a tiny tensor or arithmetic language:
source or Python builder -> custom dialect -> canonicalization
-> structured dialects -> bufferization
-> LLVM / EmitC / runtime target
The project teaches why production compilers no longer use a single IR for every abstraction level.
2. Where this fits in the degree
- Phase: Systems
- Semester: 4, as an advanced compiler extension; also useful in the Production AI specialization
- Modules deepened: Module 3 (machine model), Module 5 (abstraction and interpretation)
Cross-phase relevance:
- Builds on Compiler and Source Code to Machine Code.
- Extends the Foundations LLM and neural-network work when exploring tensor compilation, quantization, and deployment.
- Connects to production engineering through pinned toolchains, CI, test suites, and upgrade playbooks.
3. Local source backbone
Primary local source:
- Mastering MLIR (build-your-own/mlir-oren-davis)
| Local chunks | Use them for | Project milestone |
|---|---|---|
| 001-009 | MLIR foundations, multi-level IR, operations, types, regions, reproducible toolchain | Environment and conceptual map |
| 010-016 | Dialects, ODS/TableGen, custom ops, attributes, verifiers, traits | Custom dialect and verifier |
| 017-024 | Pattern rewriting, PDLL, folding, greedy rewrite behavior | Canonicalization and rewrite passes |
| 025-031 | Pass manager, nested pipelines, timing/statistics, testing with lit/FileCheck | Pipeline registration and tests |
| 032-044 | Linalg, tensor, vector, bufferization, control flow, transform dialect | Lowering through structured computation |
| 045-061 | GPU, sparse tensors, quantization | Optional AI/compiler specialization path |
| 062-075 | Framework interop, end-to-end pipelines, bytecode, runtimes, debugging/profiling | Deployment and runtime evidence |
| 076-086 | APIs, Python/C bindings, xDSL, reproducibility, CI, upgrades, case studies | Tooling, automation, and professional practice |
4. Implementation milestones
Milestone 1: Reproducible toolchain
Pin LLVM/MLIR versions and document the build path. A Docker or devcontainer setup is acceptable.
Evidence: clean setup command, version report, and one mlir-opt smoke test.
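A version report is easier to trust when a script compares it against the pin. The following is a minimal sketch, assuming the tool's version banner contains a line of the form `LLVM version X.Y.Z`; the pinned value and the sample banner are hypothetical and stand in for your own toolchain.

```python
import re
from typing import Optional

# Hypothetical pin for illustration; record your own release or commit.
PINNED_LLVM_VERSION = "17.0.6"

# Illustrative banner text (an assumption about the format), standing in
# for the captured output of `mlir-opt --version`.
SAMPLE_OUTPUT = "LLVM (http://llvm.org/):\n  LLVM version 17.0.6\n  Optimized build."


def parse_llvm_version(version_output: str) -> Optional[str]:
    """Extract 'X.Y.Z' from a version banner, or None if it is absent."""
    match = re.search(r"LLVM version (\d+\.\d+\.\d+)", version_output)
    return match.group(1) if match else None


def check_pin(version_output: str, pinned: str = PINNED_LLVM_VERSION) -> bool:
    """Return True only when the reported version equals the pinned one."""
    return parse_llvm_version(version_output) == pinned


print("pin matches:", check_pin(SAMPLE_OUTPUT))
```

In CI, the same check can run on the real banner captured with `subprocess.run`, failing the job when the pin and the installed toolchain disagree.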
Milestone 2: Custom dialect
Define a tiny dialect with at least three operations and one custom type or attribute.
Evidence: invalid IR fails verification with a stable diagnostic.
Milestone 3: Parser/builder path
Either parse a tiny source language into MLIR or construct IR through Python/C++ builders.
Evidence: source or builder input produces textual MLIR.
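The builder route can start as plain string construction before you adopt real bindings. This sketch is not the MLIR Python API: `Builder` and its methods are hypothetical helpers that print operations in MLIR's generic operation syntax, and the `toy` ops are assumed to be scalar `f32` for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Builder:
    """Collects operations and prints them in generic op syntax."""
    lines: list = field(default_factory=list)
    next_id: int = 0

    def _fresh(self) -> str:
        # SSA value names are numbered in creation order: %0, %1, ...
        name = f"%{self.next_id}"
        self.next_id += 1
        return name

    def const(self, value: float) -> str:
        result = self._fresh()
        self.lines.append(
            f'{result} = "toy.const"() {{value = {value:e} : f32}} : () -> f32')
        return result

    def add(self, lhs: str, rhs: str) -> str:
        result = self._fresh()
        self.lines.append(
            f'{result} = "toy.add"({lhs}, {rhs}) : (f32, f32) -> f32')
        return result

    def print_(self, value: str) -> None:
        self.lines.append(f'"toy.print"({value}) : (f32) -> ()')

    def module(self) -> str:
        body = "\n".join("  " + line for line in self.lines)
        return "module {\n" + body + "\n}"


b = Builder()
total = b.add(b.const(1.0), b.const(2.0))
b.print_(total)
print(b.module())
```

Text produced this way is only a fixture. Until the dialect is registered, a tool would have to be told to accept unregistered operations, and no verifier rules apply.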
Milestone 4: Rewrites and canonicalization
Implement constant folding or algebraic simplification.
Evidence: x + 0, x * 1, or equivalent patterns rewrite predictably under tests.
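Before writing patterns against the real rewrite infrastructure, it can help to pin down the expected behavior on a toy expression tree. The sketch below is a plain-Python model, not MLIR's greedy rewrite driver; the tuple encoding is an assumption made for illustration, and integer arithmetic is assumed so that `x + 0 -> x` is unconditionally valid.

```python
def simplify(expr):
    """Apply x + 0 -> x, x * 1 -> x, and constant folding, bottom-up.

    Expressions are tuples: ("const", n), ("var", name),
    ("add", lhs, rhs), or ("mul", lhs, rhs).
    """
    kind = expr[0]
    if kind in ("const", "var"):
        return expr

    # Simplify operands first so patterns see canonical children.
    lhs, rhs = simplify(expr[1]), simplify(expr[2])

    # Constant folding: both operands are known.
    if lhs[0] == "const" and rhs[0] == "const":
        value = lhs[1] + rhs[1] if kind == "add" else lhs[1] * rhs[1]
        return ("const", value)

    # Identity elimination: 0 for add, 1 for mul, on either side.
    identity = ("const", 0 if kind == "add" else 1)
    if rhs == identity:
        return lhs
    if lhs == identity:
        return rhs

    return (kind, lhs, rhs)


x = ("var", "x")
print(simplify(("mul", ("add", x, ("const", 0)), ("const", 1))))
```

A useful property to test is idempotence: running `simplify` on its own output should change nothing, which is the same stability you want FileCheck to confirm for canonical forms.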
Milestone 5: Pass pipeline
Register a pass pipeline and run it through mlir-opt.
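Evidence: the pipeline reports timing and statistics (for example through mlir-opt's -mlir-timing and -mlir-pass-statistics flags) and is covered by FileCheck tests.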
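The idea of an ordered pipeline with per-pass timing and statistics can be modelled in a few lines. This is a conceptual sketch only: `PassManager` here is a hypothetical Python class, not MLIR's pass manager, and a "program" is reduced to a list of operation names.

```python
import time


class PassManager:
    """Runs passes in insertion order and records simple statistics."""

    def __init__(self):
        self.passes = []  # (name, function) pairs
        self.report = []  # (name, seconds, ops_before, ops_after)

    def add(self, name, fn):
        self.passes.append((name, fn))
        return self  # allow chaining

    def run(self, ops):
        for name, fn in self.passes:
            start = time.perf_counter()
            result = fn(list(ops))  # pass a copy so passes cannot alias input
            elapsed = time.perf_counter() - start
            self.report.append((name, elapsed, len(ops), len(result)))
            ops = result
        return ops


def drop_nops(ops):
    # "toy.nop" is a made-up op used only to give this pass work to do.
    return [op for op in ops if op != "toy.nop"]


def lower_add(ops):
    return ["arith.addf" if op == "toy.add" else op for op in ops]


pm = PassManager().add("drop-nops", drop_nops).add("lower-add", lower_add)
final_ops = pm.run(["toy.const", "toy.nop", "toy.add"])
for name, seconds, before, after in pm.report:
    print(f"{name}: {before} -> {after} ops in {seconds:.6f}s")
```

The report is the part worth keeping in mind: a reviewer should be able to see which pass ran, in which order, and what it changed.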
Milestone 6: Lowering
Lower your dialect into upstream structured dialects (such as linalg, tensor, and scf), then toward LLVM, EmitC, or a runtime boundary.
Evidence: before/after IR snapshots at each lowering level.
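A lowering step is complete only when nothing from the source dialect is left behind. The sketch below models that legality check over a list of operation names; the mapping table and the set of legal dialects are assumptions for illustration, and it does not use MLIR's conversion framework.

```python
# Hypothetical one-to-one lowering table for illustration.
LOWERING = {"toy.add": "arith.addf", "toy.mul": "arith.mulf"}

# Dialects considered legal after this step (an assumption of the sketch).
LEGAL_DIALECTS = {"arith", "func"}


def lower(ops):
    """Return (lowered_ops, illegal_ops) for a list of operation names."""
    lowered = [LOWERING.get(op, op) for op in ops]
    # Anything whose dialect prefix is not legal stays visible as a failure.
    illegal = [op for op in lowered if op.split(".")[0] not in LEGAL_DIALECTS]
    return lowered, illegal


before = ["toy.add", "toy.print", "func.return"]
after, leftover = lower(before)
print("before:", before)
print("after: ", after)
print("illegal after lowering:", leftover)
```

Reporting leftover ops explicitly, instead of silently passing them through, is what makes an unsupported operation show up in the snapshot rather than in a later pass.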
Milestone 7: Runtime or execution
Run a small program end to end, even if the runtime is minimal.
Evidence: input IR or source produces executable behavior and checked output.
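Checked output needs something to check against. One option is a tiny reference evaluator whose result is compared with whatever the compiled path prints. The program encoding below, a list of `(result, op, args)` triples over flat Python lists, is a hypothetical format chosen for this sketch.

```python
def run(program):
    """Execute (result, op, args) triples and return what toy.print emitted."""
    env = {}      # SSA name -> value
    printed = []  # observable output, in program order
    for result, op, args in program:
        if op == "toy.const":
            env[result] = args[0]
        elif op == "toy.add":
            lhs, rhs = env[args[0]], env[args[1]]
            env[result] = [a + b for a, b in zip(lhs, rhs)]
        elif op == "toy.print":
            printed.append(env[args[0]])
        else:
            # Fail loudly instead of skipping ops the evaluator cannot model.
            raise ValueError(f"unsupported op: {op}")
    return printed


program = [
    ("%0", "toy.const", [[1, 2, 3]]),
    ("%1", "toy.const", [[10, 20, 30]]),
    ("%2", "toy.add", ["%0", "%1"]),
    (None, "toy.print", ["%2"]),
]
print(run(program))
```

The end-to-end test then asserts that the lowered program and the reference evaluator agree on the same input.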
5. Tests & evidence
| Test | Evidence |
|---|---|
| Dialect verifier | invalid operations produce stable errors |
| Rewrites | FileCheck confirms canonical forms |
| Pass pipeline | one command runs the full pipeline |
| Lowering | snapshots show each abstraction level |
| Reproducibility | pinned LLVM/MLIR version and CI command |
| Runtime | small end-to-end example executes or emits usable target code |
6. Pipeline design contract
An MLIR project is judged by the clarity of the pipeline, not by the size of the language. Each stage should preserve or intentionally lower information:
toy.tensor: high-level domain ops
-> linalg: structured computation
-> tensor: explicit tensor values
-> memref: bufferized memory
-> scf/cf: structured or unstructured control flow
-> llvm/emitc: target boundary
For every lowering step, write:
- what abstraction is removed
- what information must be preserved
- which verifier catches invalid IR
- which tests prove the conversion
Dialect scope
A good first dialect has only a few operations:
| Operation | Purpose | Verifier rule |
|---|---|---|
| toy.const | literal scalar/tensor | attribute type matches result type |
| toy.add | elementwise addition | operands and result have compatible shape/type |
| toy.matmul | matrix multiplication | inner dimensions agree |
| toy.print | observable output | accepts only supported scalar/tensor types |
The verifier is not decoration. It is what prevents invalid IR from leaking into later passes where the error is harder to explain.
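Two of the rules in the table above can be written down as executable checks before they are ported to the dialect's verifier. This is a plain-Python model under stated assumptions: shapes are static tuples, `toy.matmul` operands are rank 2, and the diagnostic wording is hypothetical.

```python
def verify(op_name, operand_shapes, result_shape):
    """Return None when the op is valid, otherwise a diagnostic string."""
    if op_name == "toy.add":
        lhs, rhs = operand_shapes
        if not (lhs == rhs == result_shape):
            return (f"'toy.add' op operand and result shapes must match, "
                    f"got {lhs} and {rhs} -> {result_shape}")
    elif op_name == "toy.matmul":
        (m, k_lhs), (k_rhs, n) = operand_shapes
        if k_lhs != k_rhs:
            return (f"'toy.matmul' op inner dimensions must agree, "
                    f"got {k_lhs} and {k_rhs}")
        if result_shape != (m, n):
            return (f"'toy.matmul' op result shape must be {(m, n)}, "
                    f"got {result_shape}")
    else:
        return f"unknown op '{op_name}'"
    return None


print(verify("toy.matmul", [(2, 3), (3, 4)], (2, 4)))
print(verify("toy.matmul", [(2, 3), (4, 5)], (2, 5)))
```

Keeping the message text fixed matters as much as the rule itself, because negative tests match on the diagnostic string.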
Testing discipline
Use FileCheck-style tests as the main evidence:
// CHECK: toy.add
// CHECK-NOT: toy.add
// CHECK: arith.addf
The portfolio should include negative tests for verifier diagnostics and positive tests for each rewrite/lowering.
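Read as one sequence, the three directives above require a `toy.add`, then an `arith.addf`, with no further `toy.add` in between. The sketch below models just that ordering rule with plain substring matching. It is a simplified stand-in written for illustration, not the real FileCheck tool, which also supports patterns, variables, and other directive kinds.

```python
def file_check(output, directives):
    """Check ("CHECK", text) / ("CHECK-NOT", text) directives in order.

    CHECK must match after the previous match. CHECK-NOT text must not
    appear between the surrounding CHECK matches (or up to the end of
    the output when no CHECK follows).
    """
    pos = 0
    pending_not = []
    for kind, text in directives:
        if kind == "CHECK-NOT":
            pending_not.append(text)
            continue
        found = output.find(text, pos)
        if found == -1:
            return False
        # Region between the previous match and this one must be clean.
        region = output[pos:found]
        if any(forbidden in region for forbidden in pending_not):
            return False
        pending_not = []
        pos = found + len(text)
    # Trailing CHECK-NOTs apply to the remainder of the output.
    return not any(forbidden in output[pos:] for forbidden in pending_not)


directives = [("CHECK", "toy.add"), ("CHECK-NOT", "toy.add"), ("CHECK", "arith.addf")]
print(file_check("%0 = toy.add\n%1 = arith.addf", directives))
print(file_check("%0 = toy.add\n%1 = toy.add\n%2 = arith.addf", directives))
```

For a fully lowered file, dropping the first directive and keeping only the `CHECK-NOT` and the final `CHECK` expresses that no `toy.add` may survive at all.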
7. Required artifacts
| Artifact | Why it matters |
|---|---|
| Dialect reference | Makes operations, types, attributes, and invariants inspectable |
| Pipeline diagram | Shows how abstraction is progressively lowered |
| Before/after IR snapshots | Lets reviewers audit transformations |
| Verifier tests | Proves invalid programs fail early |
| Rewrite tests | Proves canonicalization is deterministic |
| Toolchain lock | Prevents LLVM version drift from making the project unreproducible |
| Upgrade note | Explains what changed when MLIR APIs moved |
8. Common failure modes
- Starting with GPU lowering. Build a CPU/textual pipeline first; GPU makes every mistake harder.
- No verifier. Invalid IR should fail at the dialect boundary.
- Too many operations. Four well-tested ops beat twenty shallow ones.
- Only screenshot evidence. Compiler infrastructure needs commands and FileCheck tests.
- Version drift. MLIR changes quickly; pin versions and record exact commits.
- Confusing MLIR with LLVM IR. MLIR's value is multiple abstraction levels, not just another syntax for low-level code.
9. Portfolio framing
This should be framed as compiler infrastructure, not as a toy language alone.
Publish dialect reference, pipeline diagram, pass and rewrite tests, before/after IR examples, setup/version pinning, and upgrade notes if the LLVM version changes.
Reviewer entry point: README -> dialect definition -> pass pipeline -> FileCheck tests -> end-to-end sample.
10. Deep project spec
Project contract
Build a small MLIR pipeline for a tiny tensor or arithmetic language. The project must define the source language or builder input, custom dialect scope, operations/types, verifier rules, pass pipeline, lowering targets, test strategy, and pinned LLVM/MLIR version. It is acceptable to stop at textual MLIR plus lowering to upstream dialects if the pipeline is clear and testable.
Source-backed reading map
| Source ID | Use for | Required output |
|---|---|---|
| build-your-own/mlir-oren-davis | MLIR concepts, dialects, operations, passes, lowering, bufferization, AI/compiler pipelines | pinned toolchain, dialect spec, pass tests |
Milestone map
| Milestone | Deliverable | Tests | Failure case |
|---|---|---|---|
| Toolchain | pinned build/devcontainer | mlir-opt smoke test | version mismatch note |
| Input language | parser or builder | input-to-MLIR fixture | invalid input rejected |
| Dialect | ops, types, parser/printer | round-trip textual MLIR | verifier catches bad op |
| Rewrite pass | canonicalization/lowering | FileCheck fixtures | illegal rewrite rejected |
| Lowering | path to upstream dialects or the LLVM dialect | pipeline golden output | unsupported op remains visible |
| Execution/demo | interpreter/JIT/translated output if feasible | end-to-end smoke test | unsupported backend documented |
Test matrix
| Test type | Required examples |
|---|---|
| Golden | textual MLIR before/after each pass |
| Verifier | invalid ops fail with helpful diagnostics |
| Pass | mlir-opt plus FileCheck fixtures |
| Pipeline | one command runs all passes in order |
| Reproducibility | exact LLVM/MLIR commit or release pinned |
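Golden tests reduce to comparing the textual IR a pass produced against a stored snapshot and showing a readable diff on mismatch. A minimal sketch using the standard library's `difflib` follows; the IR strings are made-up fixtures, and in a real project they would be read from snapshot files.

```python
import difflib


def golden_diff(expected: str, actual: str) -> list:
    """Return unified-diff lines; an empty list means the snapshot matches."""
    return list(difflib.unified_diff(
        expected.splitlines(),
        actual.splitlines(),
        fromfile="golden",
        tofile="actual",
        lineterm="",
    ))


# Made-up fixtures for illustration.
golden = "%0 = toy.const\n%1 = arith.addf"
produced = "%0 = toy.const\n%1 = toy.add"

for line in golden_diff(golden, produced):
    print(line)
```

Failing with the diff, and not only a boolean, is what lets a reviewer see which pass output drifted after a toolchain upgrade.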
Design notes required
- dialect.md: operations, types, attributes, regions, invariants.
- pipeline.md: pass order and why each abstraction level exists.
- toolchain.md: pinned version, build command, known API drift.
Portfolio evidence
Publish the dialect spec, MLIR before/after examples, pass tests, pinned setup, and a short explanation of why MLIR was useful compared with direct LLVM IR.
Source
This project is based on the local Mastering MLIR chunks and belongs after the basic interpreter/compiler projects.