Build Your Own MLIR Compiler Pipeline

This is an advanced compiler-infrastructure project, not a first compiler. Build it after you have implemented an interpreter, a bytecode compiler, or a small native-code path. The goal is to understand modern multi-level IR: dialects, operations, verifiers, rewrites, passes, lowering, bufferization, GPU/AI-oriented pipelines, and reproducible compiler testing.

1. Overview & motivation

Build a small MLIR-based compiler pipeline for a tiny tensor or arithmetic language:

source or Python builder -> custom dialect -> canonicalization
                         -> structured dialects -> bufferization
                         -> LLVM / EmitC / runtime target

The project teaches why production compilers increasingly use several IRs at different abstraction levels instead of a single IR for everything.

2. Where this fits in the degree

Phase: Systems
Semester: 4, as an advanced compiler extension; also useful in the Production AI specialization
Modules deepened: Module 3 (machine model), Module 5 (abstraction and interpretation)

Cross-phase relevance:

Builds on Compiler and Source Code to Machine Code.
Extends the Foundations LLM and neural-network work when exploring tensor compilation, quantization, and deployment.
Connects to production engineering through pinned toolchains, CI, test suites, and upgrade playbooks.

3. Local source backbone

Primary local source:

Mastering MLIR (build-your-own/mlir-oren-davis)

Local chunks	Use them for	Project milestone
`001`-`009`	MLIR foundations, multi-level IR, operations, types, regions, reproducible toolchain	Environment and conceptual map
`010`-`016`	Dialects, ODS/TableGen, custom ops, attributes, verifiers, traits	Custom dialect and verifier
`017`-`024`	Pattern rewriting, PDLL, folding, greedy rewrite behavior	Canonicalization and rewrite passes
`025`-`031`	Pass manager, nested pipelines, timing/statistics, testing with lit/FileCheck	Pipeline registration and tests
`032`-`044`	Linalg, tensor, vector, bufferization, control flow, transform dialect	Lowering through structured computation
`045`-`061`	GPU, sparse tensors, quantization	Optional AI/compiler specialization path
`062`-`075`	Framework interop, end-to-end pipelines, bytecode, runtimes, debugging/profiling	Deployment and runtime evidence
`076`-`086`	APIs, Python/C bindings, xDSL, reproducibility, CI, upgrades, case studies	Tooling, automation, and professional practice

4. Implementation milestones

Milestone 1: Reproducible toolchain

Pin LLVM/MLIR versions and document the build path. A Docker or devcontainer setup is acceptable.

Evidence: clean setup command, version report, and one mlir-opt smoke test.
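A version report is easier to trust when a script compares it against the pin. The following is a minimal sketch, assuming the tool's version output contains a line of the form `LLVM version X.Y.Z`; the pinned value and the sample text are hypothetical stand-ins, not output captured from a real build.

```python
import re

# Hypothetical pin; in a real project this would live in a lock file.
PINNED_LLVM_VERSION = "17.0.6"


def parse_llvm_version(version_output: str) -> str:
    """Extract 'X.Y.Z' from text such as the output of `mlir-opt --version`."""
    match = re.search(r"LLVM version (\d+\.\d+\.\d+)", version_output)
    if match is None:
        raise ValueError("no 'LLVM version X.Y.Z' line found")
    return match.group(1)


def check_pin(version_output: str, pinned: str = PINNED_LLVM_VERSION) -> None:
    """Raise if the reported version differs from the pinned one."""
    found = parse_llvm_version(version_output)
    if found != pinned:
        raise RuntimeError(f"toolchain drift: pinned {pinned}, found {found}")


# Sample text standing in for real tool output (assumed format).
sample = "LLVM (http://llvm.org/):\n  LLVM version 17.0.6\n  Optimized build.\n"
check_pin(sample)
print("toolchain matches pin:", parse_llvm_version(sample))
```

In CI, the sample string would be replaced by the captured output of the real tool, so a drifted toolchain fails the build before any test runs.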

Milestone 2: Custom dialect

Define a tiny dialect with at least three operations and one custom type or attribute.

Evidence: invalid IR fails verification with a stable diagnostic.

Milestone 3: Parser/builder path

Either parse a tiny source language into MLIR or construct IR through Python/C++ builders.

Evidence: source or builder input produces textual MLIR.
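The builder path can be prototyped without any MLIR bindings. The sketch below is a hypothetical plain-Python `Builder` that prints ops in the style of MLIR's generic operation form; it is not the MLIR Python API, it supports only `f64` scalars, and its output has not been run through `mlir-opt` here. The op names follow the dialect table in section 6.

```python
from dataclasses import dataclass, field


@dataclass
class Builder:
    """Collects ops and prints them in the style of MLIR's generic op form."""
    lines: list = field(default_factory=list)
    next_id: int = 0

    def _emit(self, name, operands, attrs="", has_result=True):
        operand_text = ", ".join(operands)
        signature = f'"{name}"({operand_text})'
        if attrs:
            signature += f" {{{attrs}}}"
        # Every value is an f64 scalar in this sketch.
        in_types = ", ".join(["f64"] * len(operands))
        if not has_result:
            self.lines.append(f"{signature} : ({in_types}) -> ()")
            return None
        result = f"%{self.next_id}"
        self.next_id += 1
        self.lines.append(f"{result} = {signature} : ({in_types}) -> f64")
        return result

    def const(self, value: float):
        return self._emit("toy.const", [], attrs=f"value = {value:e} : f64")

    def add(self, lhs, rhs):
        return self._emit("toy.add", [lhs, rhs])

    def print_(self, value):
        self._emit("toy.print", [value], has_result=False)

    def text(self) -> str:
        return "\n".join(self.lines)


b = Builder()
total = b.add(b.const(2.0), b.const(0.0))
b.print_(total)
print(b.text())
```

Returning the SSA name from each method keeps call sites close to how the IR reads, which makes the input-to-MLIR fixture easy to review.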

Milestone 4: Rewrites and canonicalization

Implement constant folding or algebraic simplification.

Evidence: x + 0, x * 1, or equivalent patterns rewrite predictably under tests.
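To see what "rewrite predictably" means before touching MLIR's pattern infrastructure, here is a minimal sketch of the same rules on a plain expression tree. The `Const`, `Var`, `BinOp`, and `simplify` names are hypothetical. It uses integers on purpose: for IEEE floats, `x + 0.0` is not an identity when `x` is `-0.0`, so a real float rewrite needs more care.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class BinOp:
    op: str  # "add" or "mul"
    lhs: object
    rhs: object


def simplify(expr):
    """Bottom-up rewrite: fold constants, then apply x + 0 -> x and x * 1 -> x."""
    if not isinstance(expr, BinOp):
        return expr
    lhs, rhs = simplify(expr.lhs), simplify(expr.rhs)
    # Constant folding: both operands are known.
    if isinstance(lhs, Const) and isinstance(rhs, Const):
        if expr.op == "add":
            return Const(lhs.value + rhs.value)
        return Const(lhs.value * rhs.value)
    # Canonical operand order: move a constant to the right.
    # This is safe because both ops are commutative.
    if isinstance(lhs, Const):
        lhs, rhs = rhs, lhs
    if expr.op == "add" and rhs == Const(0):
        return lhs
    if expr.op == "mul" and rhs == Const(1):
        return lhs
    return BinOp(expr.op, lhs, rhs)


# (x + 0) * (1 + 0) simplifies to x.
expr = BinOp("mul", BinOp("add", Var("x"), Const(0)), BinOp("add", Const(1), Const(0)))
print(simplify(expr))
```

Putting constants on the right first means each identity needs only one rule, and running `simplify` a second time changes nothing. That idempotence is the property the tests should pin down.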

Milestone 5: Pass pipeline

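Register your rewrites and lowerings as passes and compose them into a pipeline that one command can run.

Evidence: the pipeline reports timing/statistics and is covered by FileCheck tests.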
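The shape of a pass pipeline with timing and statistics can be modelled in a few lines. This is a sketch only: `PassManager` here is a hypothetical plain-Python class, not MLIR's pass manager, a module is just a list of op names, and `toy.nop` is an invented op used to give the first pass something to remove.

```python
import time


class PassManager:
    """Runs passes in order and records timing and op counts for each."""

    def __init__(self):
        self.passes = []
        self.report = []

    def add(self, name, fn):
        self.passes.append((name, fn))
        return self  # allow chained registration

    def run(self, module):
        for name, fn in self.passes:
            start = time.perf_counter()
            result = fn(module)
            elapsed = time.perf_counter() - start
            self.report.append({
                "pass": name,
                "seconds": elapsed,
                "ops_before": len(module),
                "ops_after": len(result),
            })
            module = result
        return module


def drop_nops(module):
    # `toy.nop` is a hypothetical op that has no effect.
    return [op for op in module if op != "toy.nop"]


def lower_add(module):
    return ["arith.addf" if op == "toy.add" else op for op in module]


pm = PassManager().add("drop-nops", drop_nops).add("lower-add", lower_add)
out = pm.run(["toy.const", "toy.nop", "toy.add"])
for entry in pm.report:
    print(entry["pass"], entry["ops_before"], "->", entry["ops_after"])
```

Recording statistics per pass, not once for the whole run, is what lets a reviewer see which stage removed or introduced ops.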

Milestone 6: Lowering

Lower your dialect into standard structured dialects, then toward LLVM, EmitC, or a runtime boundary.

Evidence: before/after IR snapshots at each lowering level.
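Snapshots are most useful when the same run also reports which ops were left unconverted. The sketch below models that with op names only; the stage name `toy-to-arith` and the helper functions are hypothetical, and a real conversion would rewrite operands and types as well as names.

```python
# One lowering stage: which source op becomes which target op.
# `toy.print` is deliberately missing to show an unsupported op.
TOY_TO_ARITH = {"toy.const": "arith.constant", "toy.add": "arith.addf"}


def lower(ops, mapping):
    """Rename every op that has a lowering; leave the rest untouched."""
    return [mapping.get(op, op) for op in ops]


def illegal_ops(ops, illegal_prefix):
    """Ops that should no longer exist after this stage."""
    return [op for op in ops if op.startswith(illegal_prefix)]


def run_lowering(ops, stages):
    """Apply stages in order, keeping a snapshot after each one."""
    snapshots = [("input", list(ops))]
    for name, mapping in stages:
        ops = lower(ops, mapping)
        snapshots.append((name, list(ops)))
    return snapshots


snapshots = run_lowering(
    ["toy.const", "toy.const", "toy.add", "toy.print"],
    [("toy-to-arith", TOY_TO_ARITH)],
)
for name, ops in snapshots:
    print(f"// ----- {name}")
    print("\n".join(ops))

remaining = illegal_ops(snapshots[-1][1], "toy.")
print("still illegal:", remaining)
```

Reporting `toy.print` as still illegal makes the gap explicit, where a silent pass-through would let it leak into later stages.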

Milestone 7: Runtime or execution

Run a small program end to end, even if the runtime is minimal.

Evidence: input IR or source produces executable behavior and checked output.
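If no backend is available yet, a reference interpreter is one way to get checked output. This is a minimal sketch under stated assumptions: the program encoding (result name, op name, arguments) and the `run` function are hypothetical, and only scalar `toy.const`, `toy.add`, and `toy.print` are handled.

```python
def run(program):
    """Interpret a list of (result_name_or_None, op_name, args) tuples.

    Returns the lines that `toy.print` would have written.
    """
    values = {}
    output = []
    for result, op, args in program:
        if op == "toy.const":
            values[result] = args[0]
        elif op == "toy.add":
            values[result] = values[args[0]] + values[args[1]]
        elif op == "toy.print":
            output.append(str(values[args[0]]))
        else:
            # Unsupported ops fail loudly instead of being skipped.
            raise NotImplementedError(f"unsupported op: {op}")
    return output


program = [
    ("%0", "toy.const", [2.0]),
    ("%1", "toy.const", [3.0]),
    ("%2", "toy.add", ["%0", "%1"]),
    (None, "toy.print", ["%2"]),
]
print(run(program))
```

The same expected output can later be reused to check a JIT or translated build, so the interpreter doubles as the oracle for the end-to-end smoke test.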

5. Tests & evidence

Test	Evidence
Dialect verifier	invalid operations produce stable errors
Rewrites	FileCheck confirms canonical forms
Pass pipeline	one command runs the full pipeline
Lowering	snapshots show each abstraction level
Reproducibility	pinned LLVM/MLIR version and CI command
Runtime	small end-to-end example executes or emits usable target code

6. Pipeline design contract

An MLIR project is judged by the clarity of the pipeline, not by the size of the language. Each stage should preserve or intentionally lower information:

toy.tensor       high-level domain ops
  -> linalg      structured computation
  -> tensor      explicit tensor values
  -> memref      bufferized memory
  -> scf/cf      structured or unstructured control flow
  -> llvm/emitc  target boundary

For every lowering step, write:

what abstraction is removed
what information must be preserved
which verifier catches invalid IR
which tests prove the conversion

Dialect scope

A good first dialect has only a few operations:

Operation	Purpose	Verifier rule
`toy.const`	literal scalar/tensor	attribute type matches result type
`toy.add`	elementwise addition	operands and result have compatible shape/type
`toy.matmul`	matrix multiplication	inner dimensions agree
`toy.print`	observable output	accepts only supported scalar/tensor types

The verifier is not decoration. It is what prevents invalid IR from leaking into later passes where the error is harder to explain.
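The verifier rules in the table can be prototyped as plain functions before they are written in ODS or C++. The following is a sketch, not MLIR's verifier API: `TensorType`, `VerificationError`, and the diagnostic wording are all hypothetical. The point is that each rule yields one fixed message that a negative test can match.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TensorType:
    shape: tuple
    element: str = "f64"

    def __str__(self):
        dims = "".join(f"{d}x" for d in self.shape)
        return f"tensor<{dims}{self.element}>"


class VerificationError(Exception):
    pass


def verify_add(lhs, rhs, result):
    """toy.add: operands and result must have the same type."""
    if not (lhs == rhs == result):
        raise VerificationError(
            f"'toy.add' op requires matching types, got ({lhs}, {rhs}) -> {result}")


def verify_matmul(lhs, rhs, result):
    """toy.matmul: rank-2 operands whose inner dimensions agree."""
    if len(lhs.shape) != 2 or len(rhs.shape) != 2:
        raise VerificationError("'toy.matmul' op requires rank-2 operands")
    if lhs.shape[1] != rhs.shape[0]:
        raise VerificationError(
            f"'toy.matmul' op inner dimensions disagree: {lhs.shape[1]} vs {rhs.shape[0]}")
    expected = TensorType((lhs.shape[0], rhs.shape[1]), lhs.element)
    if result != expected:
        raise VerificationError(
            f"'toy.matmul' op result must be {expected}, got {result}")


verify_matmul(TensorType((2, 3)), TensorType((3, 4)), TensorType((2, 4)))
try:
    verify_matmul(TensorType((2, 3)), TensorType((4, 5)), TensorType((2, 5)))
except VerificationError as err:
    print("error:", err)
```

Keeping the message text fixed is what makes the diagnostic "stable": the negative test matches the string, so any rewording shows up as a test failure.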

Testing discipline

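Use FileCheck-style tests as the main evidence. A rewrite test checks that the expected op is present in the output; a lowering test checks that the source op is gone and the target op appears:

// Rewrite test: the op is still present after canonicalization.
// CHECK: toy.add

// Lowering test: no toy.add may appear before the lowered op.
// CHECK-NOT: toy.add
// CHECK: arith.addf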

The portfolio should include negative tests for verifier diagnostics and positive tests for each rewrite/lowering.
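It helps to know what the two directives above actually assert. The sketch below is a simplified model of `CHECK` and `CHECK-NOT` using plain substring matching. It is not FileCheck itself, which also supports regexes, variables, and further directives, and the error strings are this sketch's own.

```python
def run_checks(directives, text):
    """Apply ("CHECK" | "CHECK-NOT", pattern) pairs to text; return error strings.

    CHECK patterns must appear in order. A CHECK-NOT pattern must not appear
    between the previous CHECK match and the next one (or the end of input).
    """
    errors = []
    pos = 0
    pending_not = []
    for kind, pattern in directives:
        if kind == "CHECK-NOT":
            pending_not.append(pattern)
            continue
        found = text.find(pattern, pos)
        if found == -1:
            errors.append(f"CHECK: not found: {pattern}")
            return errors
        for banned in pending_not:
            if banned in text[pos:found]:
                errors.append(f"CHECK-NOT: found: {banned}")
        pending_not = []
        pos = found + len(pattern)
    # Trailing CHECK-NOT directives cover the rest of the input.
    for banned in pending_not:
        if banned in text[pos:]:
            errors.append(f"CHECK-NOT: found: {banned}")
    return errors


lowering_test = [("CHECK-NOT", "toy.add"), ("CHECK", "arith.addf")]
lowered = "%0 = arith.addf %a, %b : f64\n"
half_lowered = "%0 = toy.add %a, %b : f64\n%1 = arith.addf %0, %b : f64\n"

print(run_checks(lowering_test, lowered))
print(run_checks(lowering_test, half_lowered))
```

The second input fails because a `toy.add` survives before the first `arith.addf`, which is the "unsupported op remains visible" case the lowering tests are meant to catch.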

7. Required artifacts

Artifact	Why it matters
Dialect reference	Makes operations, types, attributes, and invariants inspectable
Pipeline diagram	Shows how abstraction is progressively lowered
Before/after IR snapshots	Lets reviewers audit transformations
Verifier tests	Show that invalid programs fail early
Rewrite tests	Show that canonicalization produces the same output on every run
Toolchain lock	Prevents LLVM version drift from making the project unreproducible
Upgrade note	Explains what changed when MLIR APIs moved

8. Common failure modes

Starting with GPU lowering. Build a CPU/textual pipeline first; GPU makes every mistake harder.
No verifier. Invalid IR should fail at the dialect boundary.
Too many operations. Four well-tested ops beat twenty shallow ones.
Only screenshot evidence. Compiler infrastructure needs commands and FileCheck tests.
Version drift. MLIR changes quickly; pin versions and record exact commits.
Confusing MLIR with LLVM IR. MLIR's value is multiple abstraction levels, not just another syntax for low-level code.

9. Portfolio framing

This should be framed as compiler infrastructure, not as a toy language alone.

Publish dialect reference, pipeline diagram, pass and rewrite tests, before/after IR examples, setup/version pinning, and upgrade notes if the LLVM version changes.

Reviewer entry point: README -> dialect definition -> pass pipeline -> FileCheck tests -> end-to-end sample.

10. Deep project spec

Project contract

Build a small MLIR pipeline for a tiny tensor or arithmetic language. The project must define the source language or builder input, custom dialect scope, operations/types, verifier rules, pass pipeline, lowering targets, test strategy, and pinned LLVM/MLIR version. It is acceptable to stop at textual MLIR plus lowering to standard dialects if the pipeline is clear and testable.

Source-backed reading map

Source ID	Use for	Required output
`build-your-own/mlir-oren-davis`	MLIR concepts, dialects, operations, passes, lowering, bufferization, AI/compiler pipelines	pinned toolchain, dialect spec, pass tests

Milestone map

Milestone	Deliverable	Tests	Failure case
Toolchain	pinned build/devcontainer	`mlir-opt` smoke test	version mismatch note
Input language	parser or builder	input-to-MLIR fixture	invalid input rejected
Dialect	ops, types, parser/printer	round-trip textual MLIR	verifier catches bad op
Rewrite pass	canonicalization/lowering	`FileCheck` fixtures	illegal rewrite rejected
Lowering	path to standard/LLVM dialect	pipeline golden output	unsupported op remains visible
Execution/demo	interpreter/JIT/translated output if feasible	end-to-end smoke test	unsupported backend documented

Test matrix

Test type	Required examples
Golden	textual MLIR before/after each pass
Verifier	invalid ops fail with helpful diagnostics
Pass	`mlir-opt` plus `FileCheck` fixtures
Pipeline	one command runs all passes in order
Reproducibility	exact LLVM/MLIR commit or release pinned

Design notes required

dialect.md: operations, types, attributes, regions, invariants.
pipeline.md: pass order and why each abstraction level exists.
toolchain.md: pinned version, build command, known API drift.

Portfolio evidence

Publish the dialect spec, MLIR before/after examples, pass tests, pinned setup, and a short explanation of why MLIR was useful compared with direct LLVM IR.

Source

This project is based on the local Mastering MLIR chunks and belongs after the basic interpreter/compiler projects.
