Build Your Own Blockchain

A blockchain is a distributed, append-only, hash-chained log with a Sybil-resistant agreement protocol on top.

Building a blockchain is the easiest way to make cryptographic hashing, Merkle trees, peer-to-peer networking, and consensus concrete. The interesting part is not "crypto" in the financial sense -- it is the data structure and the protocol that makes a decentralized ledger work.


1. Overview & motivation

A blockchain has four layers:

  1. Block format -- a header (previous hash, Merkle root, timestamp, nonce) plus a list of transactions.
  2. Hash chain -- each block contains the hash of the previous, so tampering invalidates everything after.
  3. Proof of work / proof of stake -- Sybil resistance: making it costly to propose blocks.
  4. Peer-to-peer gossip -- nodes flood new blocks and transactions to neighbours.

What you can only learn by building one:

  • Why Merkle trees make light clients possible (verify membership with O(log n) hashes).
  • Why the longest-chain rule (strictly, the chain with the most cumulative work) is the elegant heart of Nakamoto consensus.
  • Why mining difficulty has to adjust dynamically and how.
  • Why double-spend prevention falls out almost for free once every node agrees on a single hash-chained ordering of transactions.
  • Why blockchains do not solve most engineering problems -- most of what they offer can be done with Postgres + signatures.

2. Where this fits in the degree

  • Phase: Architecture
  • Semester: 6 (Databases and Distributed Systems)
  • Modules deepened: Module 3 (replication: blockchain is a single-leader-by-PoW replicated log), Module 5 (distributed fundamentals: Nakamoto consensus as an alternative to Paxos/Raft). Also touches Sem 2 Module 5 (Merkle tree as an advanced structure) as a Foundations carry-over.

Cross-phase relevance:

  • Builds intuition for content-addressable storage (Git uses similar ideas -- see Git tutorial)
  • Direct contrast with the Consensus / Raft tutorial -- different consensus families for different threat models

3. Prerequisites

  • Hashing: SHA-256 as a black-box function bytes -> 32 bytes.
  • Public-key cryptography: at the API level only -- signing and verifying.
  • HTTP or sockets: enough to send JSON between two processes (the Web Server tutorial or Network Stack tutorial gives more than enough).

You do not need to understand the math behind SHA-256 or ECDSA. Use libraries (hashlib, cryptography in Python; crypto/sha256 in Go).


4. Theory & research

Required reading

  • Andreas Antonopoulos, Mastering Bitcoin (2nd edition) -- Chapters 7-10. Free on GitHub.
  • Princeton's "Bitcoin and Cryptocurrency Technologies" (coursera.org) -- free course with strong CS foundations.

Useful Merkle-tree references

  • Ralph Merkle's original 1987 paper -- for completeness.
  • The Certificate Transparency RFC (RFC 6962) -- Merkle trees as used by real systems.

What to skip (for the first pass)

  • Ethereum / smart contracts / Solidity. Bitcoin's data structure is the right teaching target.
  • Cryptocurrency speculation, "tokenomics", and similar non-CS material.

5. Curated tutorial list (from BYO-X)

  • ATS: Functional Blockchain
  • Crystal: Write your own blockchain and PoW algorithm using Crystal
  • Go: Building Blockchain in Go -- Jeiwan
  • Go: Code your own blockchain in less than 200 lines of Go -- Mycoralhealth
  • Java: Creating Your First Blockchain with Java -- Kass
  • JavaScript: A cryptocurrency implementation in less than 1500 lines of code
  • JavaScript: Build your own Blockchain in JavaScript, Learn & Build a JavaScript Blockchain, Creating a blockchain with JavaScript
  • JavaScript: How To Launch Your Own Production-Ready Cryptocurrency
  • JavaScript: Writing a Blockchain in Node.js
  • Kotlin: Let's implement a cryptocurrency in Kotlin
  • Python: Learn Blockchains by Building One -- Daniel van Flymen ⭐ recommended primary
  • Python: Build your own blockchain: a Python tutorial
  • Python: A Practical Introduction to Blockchain with Python
  • Python: Let's Build the Tiniest Blockchain
  • Ruby: Programming Blockchains Step-by-Step (Manuscripts Book Edition)
  • Scala: How to build a simple actor-based blockchain
  • TypeScript: Naivecoin: a tutorial for building a cryptocurrency -- lhartikk/naivecoin -- substantial, ~500 lines
  • TypeScript: NaivecoinStake -- proof of stake variant
  • Rust: Building A Blockchain in Rust & Substrate

Beginner: van Flymen's "Learn Blockchains by Building One" (Python, ~200 lines). One sitting. You will have a working blockchain with mining, transactions, and a simple HTTP API by the end.

Intermediate: Jeiwan's "Building Blockchain in Go" (7-part series). Adds persistence (BoltDB), Merkle trees, P2P networking, and a CLI. This is the version most worth completing for a portfolio.

Advanced: lhartikk's "Naivecoin" (TypeScript). The most complete tutorial in the catalog. Includes a proper UTXO model, wallets, transactions, and a web UI.

For this degree: Jeiwan's Go series, because the Go ecosystem makes the networking and persistence parts natural, and the series is comprehensive without being overwhelming.


7. Implementation milestones

Milestone 1: Block and chain (Day 1, ~50 lines)

A block has an index, timestamp, data, previous hash, nonce (unused until Milestone 2), and its own hash. The hash is SHA-256 over every field except the hash itself. A chain is a list of blocks where each block's prev_hash matches the previous block's hash.

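import hashlib
import json
import time


class Block:
    def __init__(self, index, prev_hash, data, timestamp=None):
        self.index = index
        # "is None" rather than "or", so an explicit timestamp of 0 is respected.
        self.timestamp = time.time() if timestamp is None else timestamp
        self.data = data
        self.prev_hash = prev_hash
        self.nonce = 0
        self.hash = self.compute_hash()

    def compute_hash(self):
        # Hash every field except self.hash; sort_keys keeps the encoding reproducible.
        block_str = json.dumps(
            {"i": self.index, "t": self.timestamp, "d": self.data,
             "p": self.prev_hash, "n": self.nonce},
            sort_keys=True,
        )
        return hashlib.sha256(block_str.encode()).hexdigest()


def genesis():
    # Fixed timestamp so every node computes the same genesis hash.
    return Block(0, "0" * 64, "genesis", timestamp=0)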

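Evidence: Build a chain of 3 blocks. Tamper with block 1's data. Show that block 1's stored hash no longer matches its recomputed hash, and that if block 1 is re-hashed, block 2's prev_hash no longer matches, so the break propagates to every later block.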
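One way to produce this evidence is sketched below. It restates the Block class in compact form so the snippet runs on its own; `build_chain` and `first_invalid_index` are hypothetical helper names, not part of any tutorial in the list.

```python
import hashlib
import json
import time


class Block:
    """Compact restatement of the Milestone 1 block so this sketch is self-contained."""

    def __init__(self, index, prev_hash, data, timestamp=None):
        self.index = index
        self.timestamp = time.time() if timestamp is None else timestamp
        self.data = data
        self.prev_hash = prev_hash
        self.nonce = 0
        self.hash = self.compute_hash()

    def compute_hash(self):
        fields = {"i": self.index, "t": self.timestamp, "d": self.data,
                  "p": self.prev_hash, "n": self.nonce}
        return hashlib.sha256(json.dumps(fields, sort_keys=True).encode()).hexdigest()


def build_chain(payloads):
    """Build a genesis block plus one block per payload (hypothetical helper)."""
    chain = [Block(0, "0" * 64, "genesis")]
    for data in payloads:
        prev = chain[-1]
        chain.append(Block(prev.index + 1, prev.hash, data))
    return chain


def first_invalid_index(chain):
    """Return the index of the first block that fails validation, or None."""
    for i, block in enumerate(chain):
        if block.hash != block.compute_hash():
            return i  # stored hash no longer matches the block's contents
        if i > 0 and block.prev_hash != chain[i - 1].hash:
            return i  # link to the previous block is broken
    return None


chain = build_chain(["a pays b 5", "b pays c 2"])
print(first_invalid_index(chain))        # None: the untouched chain validates

chain[1].data = "a pays b 500"           # tamper without re-hashing
print(first_invalid_index(chain))        # 1: the stored hash is stale

chain[1].hash = chain[1].compute_hash()  # attacker re-hashes block 1
print(first_invalid_index(chain))        # 2: block 2 still points at the old hash
```

The third case is the one worth showing in the write-up: fixing up the tampered block only moves the failure one block later.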

Milestone 2: Proof of work (Day 2)

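A block is valid only if its hash starts with N zero hex digits (that is, 4N zero bits), where N is the difficulty. To produce such a hash, vary the nonce and re-hash until one qualifies. Each extra digit makes this 16 times harder on average.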

def mine(block, difficulty):
    target = "0" * difficulty
    while not block.hash.startswith(target):
        block.nonce += 1
        block.hash = block.compute_hash()
    return block

Evidence: Mine 5 blocks at difficulty 4. Record nonce values and time per block.
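The `mine` loop above compares a hex prefix, so difficulty can only move in steps of 16x. A finer-grained variant, sketched here on the assumption that difficulty is counted in leading zero bits, treats the hash as an integer and compares it against a target. `meets_target` and `mine_bits` are hypothetical names, and the payload string stands in for a serialized block.

```python
import hashlib
import time


def meets_target(hash_hex, bits):
    """True if the 256-bit hash has at least `bits` leading zero bits."""
    return int(hash_hex, 16) < (1 << (256 - bits))


def mine_bits(payload, bits):
    """Try nonces 0, 1, 2, ... until SHA-256(payload:nonce) meets the target."""
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{payload}:{nonce}".encode()).hexdigest()
        if meets_target(digest, bits):
            return nonce, digest
        nonce += 1


for bits in (8, 12, 16):
    start = time.perf_counter()
    nonce, digest = mine_bits("block-data", bits)
    elapsed = time.perf_counter() - start
    # Expected attempts are about 2**bits; the actual nonce depends on the payload.
    print(f"bits={bits:2d} nonce={nonce:7d} time={elapsed:.3f}s hash={digest[:16]}...")
```

Recording the nonce and elapsed time per run, as here, gives the raw data the evidence step asks for.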

Milestone 3: Transactions and Merkle tree

A block now contains a list of transactions. The header stores the Merkle root -- a hash tree over the transactions. This lets light clients verify "transaction X is in block Y" with O(log n) hashes.

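def merkle_root(txs):
    if not txs:
        return ""
    # Leaves: the hash of each transaction string.
    level = [hashlib.sha256(t.encode()).hexdigest() for t in txs]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # odd count: pair the last hash with itself
        # Parents: hash each adjacent pair of hex digests.
        level = [hashlib.sha256((a + b).encode()).hexdigest()
                 for a, b in zip(level[::2], level[1::2])]
    return level[0]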

Evidence: Build the Merkle tree, produce a Merkle proof for one transaction, and write a verifier that checks the proof without the full block.

Milestone 4: Wallets and signatures

Each user has an ECDSA key pair (secp256k1 is Bitcoin's curve; for a tutorial, any curve works). Transactions are signed by the sender. Receivers / validators verify the signature.

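from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec

private_key = ec.generate_private_key(ec.SECP256K1())
public_key = private_key.public_key()

# Placeholder payload; in the real chain this is the canonical serialization of a transaction.
transaction_bytes = b'{"amount": 5, "from": "alice", "to": "bob"}'

signature = private_key.sign(transaction_bytes, ec.ECDSA(hashes.SHA256()))
# verify() returns None on success and raises cryptography.exceptions.InvalidSignature on failure.
public_key.verify(signature, transaction_bytes, ec.ECDSA(hashes.SHA256()))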

Evidence: Show that altering a signed transaction fails verification.

Milestone 5: Networking (P2P gossip)

Each node has a list of peers. When a node mines a new block, it broadcasts it. When a node receives a block, it validates it and rebroadcasts.
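The flood-and-rebroadcast behaviour only terminates if nodes drop blocks they have already seen. The sketch below models that with in-memory objects instead of sockets; the `Node` class and its fields are assumptions for illustration, and real validation is reduced to a comment.

```python
class Node:
    """In-memory stand-in for a networked node (no sockets)."""

    def __init__(self, name):
        self.name = name
        self.peers = []
        self.seen = set()      # block hashes already processed
        self.received = 0      # how many deliveries reached this node

    def connect(self, other):
        # Idempotent, symmetric peer registration.
        if other is not self and other not in self.peers:
            self.peers.append(other)
            other.connect(self)

    def receive(self, block_hash):
        self.received += 1
        if block_hash in self.seen:
            return             # duplicate: drop it, do not rebroadcast
        self.seen.add(block_hash)
        # A real node would validate the block here before forwarding it.
        for peer in self.peers:
            peer.receive(block_hash)


a, b, c = Node("a"), Node("b"), Node("c")
a.connect(b)
b.connect(c)
c.connect(a)                   # triangle topology

a.receive("block-1")           # node a "mines" a block and floods it
print([n.name for n in (a, b, c) if "block-1" in n.seen])  # ['a', 'b', 'c']
print(a.received, b.received, c.received)  # duplicates arrive but are not re-sent
```

Without the `seen` check, the triangle would forward the same block forever.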

A simple HTTP-based version works:

  • POST /blocks -- receive a new block
  • GET /blocks -- return the current chain
  • POST /peers -- register a new peer
  • POST /resolve -- apply longest-chain rule among known peers

Evidence: Run three nodes on three ports. Mine on node 1. Show that nodes 2 and 3 receive the block. Then mine simultaneously on 2 and 3 to create a fork, and show that longest-chain wins.
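The logic behind a `/resolve` endpoint can be written as a pure function and tested without any network. The sketch below assumes blocks are plain dicts and omits the proof-of-work check for brevity; `block_hash`, `make_block`, `is_valid_chain`, and `resolve` are hypothetical names.

```python
import hashlib
import json


def block_hash(block):
    """Hash every field except the stored hash itself."""
    body = {k: v for k, v in block.items() if k != "hash"}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def make_block(index, prev_hash, data):
    block = {"index": index, "prev_hash": prev_hash, "data": data}
    block["hash"] = block_hash(block)
    return block


def is_valid_chain(chain, genesis_hash):
    if not chain or chain[0]["hash"] != genesis_hash:
        return False                       # must start from the agreed genesis
    for i, cur in enumerate(chain):
        if cur["hash"] != block_hash(cur):
            return False                   # contents were altered
        if i > 0 and cur["prev_hash"] != chain[i - 1]["hash"]:
            return False                   # broken link
    return True                            # (a full node would also check PoW here)


def resolve(local_chain, peer_chains, genesis_hash):
    """Adopt the longest valid candidate; keep the local chain on ties."""
    best = local_chain
    for candidate in peer_chains:
        if len(candidate) > len(best) and is_valid_chain(candidate, genesis_hash):
            best = candidate
    return best


genesis = make_block(0, "0" * 64, "genesis")
b1 = make_block(1, genesis["hash"], "mined by node 1")
fork2 = [genesis, b1, make_block(2, b1["hash"], "mined by node 2")]
b2 = make_block(2, b1["hash"], "mined by node 3")
fork3 = [genesis, b1, b2, make_block(3, b2["hash"], "node 3 again")]

winner = resolve(fork2, [fork3], genesis["hash"])
print(len(winner), winner[-1]["data"])     # 4 node 3 again
```

Keeping the local chain on ties is a design choice: it avoids two nodes swapping equal-length forks back and forth until one of them extends its tip.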

Milestone 6: Difficulty adjustment

Bitcoin's rule: every 2016 blocks, the network adjusts difficulty so that average block time stays ~10 minutes. In your toy version: adjust every 10 blocks so block time stays at ~5 seconds.

Evidence: Plot block time over 50 blocks under varying mining hash rate. Difficulty should track.
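A minimal sketch of the toy adjustment rule follows. It assumes difficulty is measured in leading zero bits (so one step doubles the work) rather than the hex digits used by `mine`, and it borrows Bitcoin's practice of clamping each correction to a factor of 4. The constants and the function name are illustrative.

```python
import math

TARGET_BLOCK_SECONDS = 5      # toy target from the text
ADJUST_EVERY = 10             # blocks per adjustment window


def next_difficulty_bits(current_bits, window_timestamps):
    """Return the difficulty (in leading zero bits) for the next window.

    window_timestamps holds the timestamps of the ADJUST_EVERY + 1 blocks that
    span the last window, so there are ADJUST_EVERY intervals between them.
    """
    actual = window_timestamps[-1] - window_timestamps[0]
    expected = TARGET_BLOCK_SECONDS * (len(window_timestamps) - 1)
    # Clamp the measured time so one window can change the work by at most 4x.
    actual = min(max(actual, expected / 4), expected * 4)
    # Each extra bit doubles expected work, so the correction is log2 of the ratio.
    delta = round(math.log2(expected / actual))
    return max(1, current_bits + delta)


on_target = [i * 5 for i in range(ADJUST_EVERY + 1)]
too_fast = [i * 1.25 for i in range(ADJUST_EVERY + 1)]
too_slow = [i * 20 for i in range(ADJUST_EVERY + 1)]

print(next_difficulty_bits(16, on_target))   # 16: no change
print(next_difficulty_bits(16, too_fast))    # 18: blocks came 4x too fast
print(next_difficulty_bits(16, too_slow))    # 14: blocks came 4x too slow
```

The clamp also guards against a zero-length window (identical timestamps), which would otherwise divide by zero.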

Milestone 7 (optional): UTXO model

Bitcoin doesn't have "accounts" -- it has unspent transaction outputs. Each transaction consumes UTXOs and produces new ones. This is conceptually richer than account-based models (Ethereum).
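The consume-and-produce idea can be sketched with a dictionary of unspent outputs. This is a simplification under stated assumptions: ownership is a plain name comparison standing in for signature verification, there are no fees, and the `UTXOSet` class and its method names are hypothetical.

```python
class UTXOSet:
    """Unspent outputs keyed by (tx_id, output_index) -> (owner, amount)."""

    def __init__(self):
        self.unspent = {}

    def add_coinbase(self, tx_id, owner, amount):
        # Newly minted coins: one output, no inputs.
        self.unspent[(tx_id, 0)] = (owner, amount)

    def apply(self, tx_id, sender, inputs, outputs):
        """Spend `inputs` (list of keys) and create `outputs` (list of (owner, amount))."""
        if len(set(inputs)) != len(inputs):
            raise ValueError("input listed twice")
        total_in = 0
        for key in inputs:
            if key not in self.unspent:
                raise ValueError(f"unknown or already spent output: {key}")
            owner, amount = self.unspent[key]
            if owner != sender:
                raise ValueError(f"{sender} does not own {key}")
            total_in += amount
        if sum(amount for _, amount in outputs) > total_in:
            raise ValueError("outputs exceed inputs")
        # All checks passed: consume the inputs, then create the new outputs.
        for key in inputs:
            del self.unspent[key]
        for i, out in enumerate(outputs):
            self.unspent[(tx_id, i)] = out

    def balance(self, owner):
        return sum(amt for who, amt in self.unspent.values() if who == owner)


utxos = UTXOSet()
utxos.add_coinbase("tx0", "alice", 10)
utxos.apply("tx1", "alice", [("tx0", 0)], [("bob", 7), ("alice", 3)])  # 3 is change
print(utxos.balance("alice"), utxos.balance("bob"))   # 3 7

try:
    utxos.apply("tx2", "alice", [("tx0", 0)], [("carol", 10)])  # same output again
except ValueError as err:
    print("rejected:", err)
```

Double-spend rejection here is just a dictionary lookup: once an output is consumed it no longer exists, so a second transaction that references it fails validation.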


8. Tests & evidence

| Test | How |
| --- | --- |
| Hash chain validation | Tamper test (above) |
| Mining correctness | Random data must produce a hash with N leading zeros |
| Merkle proof verification | Generate proof, alter one byte, verify it now fails |
| Signature verification | Sign, verify, then alter and re-verify (must fail) |
| P2P gossip | 3-node fork scenario (above) |
| Difficulty regulation | Block time stays near target across hash-rate change |
| Double-spend prevention | Submit two transactions spending the same UTXO; second must be rejected |

9. Common pitfalls

  • Serialization inconsistency. Hash inputs must be reproducible. json.dumps(..., sort_keys=True) is the simplest fix. Without this, two nodes compute different hashes for the same block.
  • Including the hash field in the data being hashed. Classic bug. The hash is computed over everything except itself.
  • No genesis block agreement. All nodes must agree on the genesis block. Hardcode it.
  • Naive requests calls in P2P broadcast. Without timeouts, one slow peer blocks the network. Use timeouts and async.
  • Trusting incoming blocks. Always validate: hash matches, PoW satisfies difficulty, all transactions are signed and don't double-spend.
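  • Difficulty too high for development. With the hex-prefix check in mine(), difficulty 4 needs about 16^4 = 65,536 hashes on average (seconds or less in Python), and every extra digit multiplies the work by 16. Start at 4; difficulty 20 would need about 16^20 hashes and would never finish.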
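The serialization pitfall at the top of this list is easy to reproduce. The sketch below shows two dicts with the same content producing different JSON unless keys are sorted; `canonical_bytes` is a hypothetical helper, and the compact `separators` setting is an extra choice not required by the text.

```python
import hashlib
import json


def canonical_bytes(obj):
    """Deterministic encoding: sorted keys and no optional whitespace."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode()


a = {"index": 1, "data": "x"}
b = {"data": "x", "index": 1}     # same content, different insertion order

print(json.dumps(a) == json.dumps(b))              # False: key order leaks into the bytes
print(canonical_bytes(a) == canonical_bytes(b))    # True

hash_a = hashlib.sha256(canonical_bytes(a)).hexdigest()
hash_b = hashlib.sha256(canonical_bytes(b)).hexdigest()
print(hash_a == hash_b)                            # True
```

Storing the output of such a helper as a golden fixture makes it obvious when a later change alters the hash inputs.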

10. Extensions

  • Proof of stake -- NaivecoinStake tutorial covers it. Replace mine() with stake-weighted selection.
  • Persistent storage -- LevelDB, BoltDB, or SQLite. The Jeiwan tutorial uses BoltDB.
  • CLI wallet -- send, balance, mine commands.
  • Smart contracts -- Ethereum-style stack-based VM. This is essentially the Interpreter project applied to a different domain.
  • Sharding -- partition the blockchain by transaction key.

11. Module integration

| Module | What the blockchain deepens |
| --- | --- |
| Sem 2 Module 5 -- Advanced structures | Merkle tree: a specific advanced structure with strong real-world use. |
| Sem 6 Module 2 -- Storage & indexing | Append-only hash-chained log is a textbook storage pattern. |
| Sem 6 Module 3 -- Replication | Blockchain is a replicated log with conflict resolution by longest chain. |
| Sem 6 Module 5 -- Distributed fundamentals | Nakamoto consensus is a probabilistic alternative to Paxos/Raft. |
| Sem 7 Module 4 -- Contract design | Transaction formats and signature schemes are protocol contracts. |
| Cross-link: Consensus / Raft | Direct contrast -- different consensus families. |
| Cross-link: Git tutorial | Git is content-addressable; blockchain is content-addressable. Same fundamental idea applied to two different problems. |

12. Portfolio framing

What to publish:

  • The chain implementation with clear block.py, chain.py, mining.py, wallet.py, network.py.
  • A README that honestly explains what the project does and what it does not (no production claims).
  • A blog post comparing your toy to Bitcoin's actual implementation, naming what's simplified.

What to keep private:

  • Private keys (obviously).
  • Speculative or financial framing (this is a CS project, not an investment thesis).

Reviewer entry points:

  • chain.py -- chain validation.
  • mining.py -- PoW + difficulty adjustment.
  • network.py -- P2P broadcast.
  • README must include the 3-node fork demonstration.

What this should not become:

  • A pitch deck.
  • An ICO.
  • An NFT marketplace.

Keep the project as a CS artifact. The value is in understanding the mechanism, not in playing financial markets.


13. Local source backbone

Use these local chunks to make the blockchain path more rigorous and less tutorial-shaped:

  • Build Your Own Blockchain (build-your-own/blockchain-hellwig)
  • Learn Blockchain Programming with JavaScript (build-your-own/blockchain-javascript-traub)

| Local chunks | Use them for | Add to this project |
| --- | --- | --- |
| Hellwig 003-010 | DLT terminology, blocks, ledger propagation, Merkle trees, first blockchain build | Add a concept map before implementation: block, transaction, ledger, hash, peer, validator. |
| Hellwig 011-023 | Double spending, mining, PoW/PoS/PoA, CAP, Byzantine generals, consensus mechanisms | Add a consensus comparison note: PoW chain selection versus Raft-style leadership. |
| Hellwig 024-029 | Ethereum, gas, smart contracts, ABI, sample deployment | Keep as optional extension; do not mix smart contracts into the base chain. |
| Hellwig 030-050 | Anonymity, privacy, cryptographic primitives, signatures, elliptic curves | Add a cryptography caveat section: what this toy implements and what it deliberately does not. |
| Traub 005-018 | JavaScript blockchain object, blocks, transactions, SHA-256, PoW, genesis block | Add constructor/data-model tests for every chain primitive. |
| Traub 019-035 | Express API, mining endpoint, multi-node registration, transaction sync | Add a three-node API demo with peer registration and transaction broadcast. |
| Traub 036-048 | Chain validation, consensus endpoint, block explorer, address lookup, improvement queue | Add /consensus, /block/:hash, /transaction/:id, and /address/:address as optional API milestones. |

Extra checkpoints from the book chunks

  1. Validation checkpoint: corrupt one transaction, one Merkle proof, one block hash, and one previous-hash pointer; all must fail differently.
  2. Consensus checkpoint: run two nodes that mine competing tips, then a third node that resolves by the chosen rule.
  3. API checkpoint: expose read-only explorer endpoints without letting clients mutate chain state directly.
  4. Security checkpoint: document why this toy is not production crypto and list every shortcut.

14. Deep project spec

Project contract

Build a toy blockchain as a distributed-systems mechanism, not a finance product. The base contract is blocks, transactions, hashes, previous-hash links, Merkle roots or transaction commitments, chain validation, peer synchronization, and one consensus rule. Wallets, smart contracts, and explorers are extensions.

Source-backed reading map

| Source ID | Use for | Required output |
| --- | --- | --- |
| build-your-own/blockchain-hellwig | DLT concepts, blocks, propagation, Merkle trees, consensus comparison, cryptography caveats | concept map, validation tests, consensus note |
| build-your-own/blockchain-javascript-traub | JavaScript blockchain object, Express API, mining, peer registration, explorer endpoints | API demo and multi-node transcript |

Milestone map

| Milestone | Deliverable | Tests | Failure case |
| --- | --- | --- | --- |
| Data model | block and transaction schema | serialization/hash fixtures | non-canonical serialization |
| Validation | block, transaction, chain checks | corrupted-chain fixtures | invalid previous hash |
| Mining/consensus | PoW or explicit alternative | difficulty and chain-selection tests | competing tips |
| Merkle/signature layer | commitment or signature proof | proof/signature fixtures | tampered transaction |
| Network API | peer registration and broadcast | three-node demo | duplicate peer/transaction |
| Explorer | read-only lookup endpoints | endpoint fixtures | missing block/transaction |
| Security caveats | threat model and shortcuts | review checklist | production-safety disclaimer missing |

Test matrix

| Test type | Required examples |
| --- | --- |
| Golden | block hash, transaction ID, Merkle root |
| Negative | corrupt transaction, block hash, previous pointer, consensus state |
| Integration | three nodes broadcast transaction and resolve chain |
| Property | chain validator rejects any single-field mutation |
| API | read-only explorer and mutation endpoints separated |

Design notes required

  • chain-model.md: block schema, transaction schema, hash inputs.
  • consensus.md: fork-choice rule, difficulty, and failure modes.
  • network.md: peer protocol, idempotency, broadcast behavior.
  • security.md: cryptographic shortcuts and why this is not production crypto.

Portfolio evidence

Publish the chain validator fixtures, three-node transcript, explorer screenshots or curl log, consensus comparison note, and security caveat list.


Source

This tutorial draws from the BYO-X catalog "Blockchain / Cryptocurrency" section. The Nakamoto whitepaper remains the canonical primary source.