Hash Functions (SHA-256)
If you learn only one cryptographic primitive deeply, make it this one. Hashing is the single most-used building block in Bitcoin: it links blocks, identifies transactions, builds Merkle trees, forms addresses, and is the “work” in Proof of Work. Get this page solid and the rest of the course gets dramatically easier.
What a hash function is
A cryptographic hash function takes an input of any size and produces an output of fixed size, called the digest (or just “the hash”).
    any data (1 byte or 1 terabyte) ──► H(...) ──► fixed-size fingerprint
Bitcoin uses SHA-256 (Secure Hash Algorithm, 256-bit), part of the SHA-2 family published by NIST in 2001. Its output is always 256 bits = 32 bytes = 64 hexadecimal characters, no matter the input.
Think of a hash as a digital fingerprint: a short, fixed-length identifier that stands in for a much larger piece of data.
The properties that make it useful (and why Bitcoin needs each)
A cryptographic hash isn’t just any function that shrinks data. It must have a specific set of properties. For each one, note the Bitcoin job it enables.
1. Deterministic
The same input always produces the same output. H("hello") is the same on every computer,
forever.
→ Why Bitcoin needs it: every node must independently compute the same block hash and the
same transaction ID, or they could never agree on one ledger.
2. Fixed-size output
Any input → exactly 256 bits.
→ Why: uniform, compact identifiers. A tiny transaction and a huge one both get a tidy 32-byte ID.
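The first two properties are easy to check for yourself. The following is a minimal sketch using Python's standard `hashlib` module; the helper name `sha256_hex` and the sample inputs are illustrative choices, not anything Bitcoin defines.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hex string."""
    return hashlib.sha256(data).hexdigest()

tiny = b"a"                 # 1 byte of input
large = b"a" * 1_000_000    # 1,000,000 bytes of input

# Deterministic: hashing the same bytes twice gives the same digest.
print(sha256_hex(tiny) == sha256_hex(tiny))            # True

# Fixed size: both digests are 64 hex characters (32 bytes, 256 bits).
print(len(sha256_hex(tiny)), len(sha256_hex(large)))   # 64 64
```

The input grew a million-fold, but the digest length did not move.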
3. Fast to compute
Computing H(x) is cheap.
→ Why: nodes verify enormous numbers of hashes when validating blocks and transactions; it has to
be efficient.
4. Preimage resistance (one-way)
Given an output h, it is computationally infeasible to find any input m with H(m) = h. You
cannot run the function backwards.
→ Why: lets Bitcoin commit to data and hide secrets. Addresses, for example, are hashes of
public keys — you can publish the hash without exposing what produced it.
5. Second-preimage resistance
Given a specific input m1, it’s infeasible to find a different input m2 ≠ m1 with the same
hash.
→ Why: you can’t take an existing transaction and craft a different one that shares its ID.
6. Collision resistance
It’s infeasible to find any two different inputs m1 ≠ m2 with H(m1) = H(m2).
→ Why: this is what makes the blockchain tamper-evident. If you change even one bit of a past
block, its hash changes, which breaks the link to the next block (Part 3). Nobody can secretly swap
data while keeping the same fingerprint.
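To see why the output size matters for collision resistance, it can help to weaken the hash on purpose. The sketch below is a toy, not an attack on SHA-256: it keeps only the first 2 bytes (16 bits) of each digest, so collisions turn up almost immediately. The function names and the `message-<i>` inputs are made up for illustration.

```python
import hashlib
from itertools import count

def truncated_sha256(data: bytes, n_bytes: int) -> bytes:
    """First n_bytes of SHA-256(data): a deliberately weakened toy hash."""
    return hashlib.sha256(data).digest()[:n_bytes]

def find_toy_collision(n_bytes: int = 2):
    """Return two different inputs whose truncated digests match."""
    seen = {}  # truncated digest -> input that produced it
    for i in count():
        msg = f"message-{i}".encode()
        tag = truncated_sha256(msg, n_bytes)
        if tag in seen:
            return seen[tag], msg
        seen[tag] = msg

m1, m2 = find_toy_collision()
print(m1, m2, truncated_sha256(m1, 2).hex())
```

With only 65,536 possible truncated values, a match typically appears after a few hundred tries. The full 256-bit digests of `m1` and `m2` still differ; it is the full output size that puts the same search out of reach.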
7. Avalanche effect
Changing the input even slightly, by as little as a single bit, flips roughly half the output bits, in a way that looks completely random and unpredictable.
→ Why: it makes tampering glaringly obvious, and it makes mining a fair lottery (Part 4): you can’t “nudge” the input toward a desired hash; you can only keep trying.
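The "roughly half the output bits" claim can be measured directly. Here is a small sketch, assuming Python's standard `hashlib`; the helper `bit_difference` is a name chosen for this example.

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions where the SHA-256 digests of a and b differ."""
    da = int.from_bytes(hashlib.sha256(a).digest(), "big")
    db = int.from_bytes(hashlib.sha256(b).digest(), "big")
    return bin(da ^ db).count("1")

# "hello" and "Hello" differ in a single input bit (0x68 vs 0x48).
diff = bit_difference(b"hello", b"Hello")
print(f"{diff} of 256 output bits differ")
```

Expect a count somewhere near 128. Try other one-character changes: the count moves around, but it stays close to half and shows no pattern you could exploit.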
How big is 2²⁵⁶, really?
The reason “infeasible” above is not hand-waving: a 256-bit output has 2²⁵⁶ possible values. That’s about 1.16 × 10⁷⁷, a number approaching common estimates of the count of atoms in the observable universe (roughly 10⁸⁰). To find a preimage by brute force, you’d expect to try on the order of 2²⁵⁶ inputs. Finding any collision is easier because of the birthday paradox, but it still takes on the order of 2¹²⁸ (about 3.4 × 10³⁸) attempts. No amount of current or foreseeable computing power gets close to either figure. As long as no shortcut better than brute force is known, which is the case for SHA-256 today, this astronomical size is why one-wayness and collision resistance hold in practice.
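The arithmetic is worth running once. In this sketch the attacker speed `HASHES_PER_SECOND` is a hypothetical figure picked only to make the division concrete, and the 2¹²⁸ line uses the standard birthday bound for collisions on a 256-bit hash.

```python
outputs = 2 ** 256
print(f"{outputs:.3e}")            # 1.158e+77 possible digests

# Generic birthday bound: about 2**128 tries to find some collision.
collision_work = 2 ** 128
print(f"{collision_work:.3e}")     # 3.403e+38

# Hypothetical attacker speed, chosen only for illustration.
HASHES_PER_SECOND = 10 ** 21
SECONDS_PER_YEAR = 365.25 * 24 * 3600

years = outputs / HASHES_PER_SECOND / SECONDS_PER_YEAR
print(f"{years:.2e} years to try every output")
```

Even at that generous made-up speed, the exhaustive search comes out on the order of 10⁴⁸ years.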
Hands-on: feel the avalanche effect
Your Mac has SHA-256 built in. Open a terminal and run:
    echo -n "hello" | shasum -a 256
You should get (followed by a trailing -, which just marks standard input):
    2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824
Now change just the capitalization of the first letter:
    echo -n "Hello" | shasum -a 256
The output is completely different: not “a little different,” but totally unrelated-looking. That’s the avalanche effect, seen with your own eyes.
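If you are not on a Mac, or would rather script it, the same experiment can be sketched in Python with the standard `hashlib` module. The variable names here are illustrative. On most Linux systems the `sha256sum` command plays the same role as `shasum -a 256`.

```python
import hashlib

# Same bytes as `echo -n "hello"`: no trailing newline.
lower = hashlib.sha256(b"hello").hexdigest()
print(lower)
# 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824

# One capital letter changes the whole digest.
upper = hashlib.sha256(b"Hello").hexdigest()
print(upper)
```

Note the `b"..."` prefix: hash functions operate on bytes, which is also why the terminal version uses `echo -n` to avoid hashing a trailing newline.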
Try it live
Type below and watch the digest recompute instantly. Change a single character and notice how many hex digits flip (shown in the caption, and highlighted where they changed). That’s the avalanche effect, live. Everything is computed locally in your browser.
Where Bitcoin uses hashing (a preview map)
You’ll meet all of these later; here’s the map so the pieces connect:
| Use | What gets hashed | Covered in |
|---|---|---|
| Transaction IDs (txid) | the whole transaction | Part 2 |
| Merkle root | all transactions in a block, tree-hashed | Part 1 (Merkle) / Part 3 |
| Block hash / Proof of Work | the block header | Parts 3–4 |
| Addresses | RIPEMD160(SHA256(public key)) | Parts 1 & 7 |
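As a preview of the "Block hash / Proof of Work" row, here is a toy sketch of the "keep trying" loop. It is not Bitcoin's real header format or target rule: the header bytes, the 8-byte nonce encoding, and the "digest starts with four zero hex digits" condition are all simplifying assumptions made for this example.

```python
import hashlib

def toy_mine(header: bytes, zero_hex_digits: int = 4):
    """Try nonces until double SHA-256 of header+nonce starts with enough zeros."""
    target_prefix = "0" * zero_hex_digits
    nonce = 0
    while True:
        candidate = header + nonce.to_bytes(8, "little")
        digest = hashlib.sha256(hashlib.sha256(candidate).digest()).hexdigest()
        if digest.startswith(target_prefix):
            return nonce, digest
        nonce += 1

nonce, digest = toy_mine(b"toy block header")
print(nonce, digest)
```

Each extra zero hex digit makes the search about 16 times longer on average, and there is no way to steer toward a winning nonce. Checking the result, by contrast, takes a single hash.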
Bonus: double SHA-256 (“SHA256d”)
Bitcoin often hashes things twice: SHA256(SHA256(x)). You’ll see this for block hashes,
txids, and Merkle nodes. The commonly cited motivation is defense against a theoretical weakness of the
SHA-2 construction (so-called length-extension attacks). For now just remember: when Bitcoin says
“the hash,” it usually means double SHA-256. We’ll point it out each time it matters.
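A minimal sketch of double SHA-256, assuming Python's standard `hashlib`. The function name `sha256d` follows the nickname above. The detail to get right is that the second pass hashes the raw 32 bytes of the first digest, not its hex text.

```python
import hashlib

def sha256d(data: bytes) -> bytes:
    """Double SHA-256: hash the raw 32-byte digest of the first pass."""
    first = hashlib.sha256(data).digest()    # raw bytes, not hex text
    return hashlib.sha256(first).digest()

print(sha256d(b"hello").hex())
```

The result is still 32 bytes, so everything said above about digest size applies unchanged.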
Check your understanding
- What does it mean that a hash is deterministic, and why is that essential for nodes to agree?
- Explain preimage resistance vs collision resistance in your own words. Which one makes the blockchain tamper-evident, and why?
- What is the avalanche effect, and what two Bitcoin behaviors does it enable?
- Roughly how many possible SHA-256 outputs are there, and why does that make brute-forcing a preimage infeasible?
- Using the live tool above, change one character in the input. In one sentence, describe what happened to the digest.