Transaction IDs

Every transaction needs a name — a unique handle other transactions can point at when they spend its outputs. That handle is the transaction id, or txid. It isn’t assigned by anyone; it is derived from the transaction’s own contents by hashing. This page shows exactly how, why that’s a beautiful property, and why the legacy version of it once had a subtle, expensive flaw.

A txid is a fingerprint of the transaction

To get a transaction’s id you serialize the transaction into its canonical byte form and run it through SHA-256 twice — Bitcoin’s ubiquitous double-SHA-256:

   serialized transaction bytes
            │
            ▼
        SHA-256
            │
            ▼
        SHA-256
            │
            ▼
     32-byte digest  =  the txid
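The pipeline above can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not a full implementation: the helper names are hypothetical, and the input bytes are a placeholder rather than a real serialized transaction. The second helper reflects the convention that explorers and RPC tools display txids in byte-reversed order relative to the raw digest.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    """Apply SHA-256 twice and return the raw 32-byte digest."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def txid_display_hex(serialized_tx: bytes) -> str:
    """Return the txid as hex in the byte-reversed order that tools usually display."""
    return double_sha256(serialized_tx)[::-1].hex()


# Assumption: placeholder bytes standing in for a serialized transaction.
serialized_tx = b"placeholder-serialized-transaction"

raw_digest = double_sha256(serialized_tx)
print(len(raw_digest))                   # 32 bytes
print(txid_display_hex(serialized_tx))   # 64 hex characters
```

Feeding in the bytes of a real transaction instead of the placeholder should give the id that explorers show for it.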

Because a cryptographic hash function such as SHA-256 is deterministic and collision-resistant, this has two consequences that matter enormously:

The id is self-certifying. Anyone with the transaction bytes can recompute the id and confirm it. The name is of the thing, not assigned to it — there’s no registry to trust.
The id is tamper-evident. Change a single satoshi in any output and the bytes change, so the double-SHA-256 changes, so the txid changes. Altering the hashed contents while keeping the same name would require breaking SHA-256, which is computationally infeasible.
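Both properties can be checked mechanically. The sketch below recomputes an id from the bytes (self-certifying) and then flips a single bit to show the id no longer matches (tamper-evident). The function names and the placeholder bytes are assumptions made for illustration; they are not a real transaction.

```python
import hashlib


def txid_of(serialized_tx: bytes) -> bytes:
    """Double-SHA-256 of the serialized transaction bytes."""
    return hashlib.sha256(hashlib.sha256(serialized_tx).digest()).digest()


def matches_claimed_id(serialized_tx: bytes, claimed_txid: bytes) -> bool:
    """Anyone holding the bytes can recompute the id and compare it."""
    return txid_of(serialized_tx) == claimed_txid


# Assumption: placeholder bytes standing in for a serialized transaction.
original = b"placeholder-serialized-transaction"
claimed = txid_of(original)

# Flip one bit of the first byte to simulate tampering.
tampered = bytearray(original)
tampered[0] ^= 0x01

print(matches_claimed_id(original, claimed))          # True
print(matches_claimed_id(bytes(tampered), claimed))   # False
```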

How coins are addressed: (txid, index)

A transaction can have several outputs, so the txid alone doesn’t name a coin — it names the transaction. To point at one specific output you also need its position in the output list, the index (also called vout), counting from 0. The pair is called an outpoint:

   outpoint = (txid, index)

   txid = f1a9...e3        index = 0   →  the first output of that tx
   txid = f1a9...e3        index = 1   →  the second output (e.g. the change)
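As a rough sketch of how such a pair can be represented, the snippet below models an outpoint as a small tuple type and serializes it the way an input carries it: the 32-byte txid followed by the index as a 4-byte little-endian integer, 36 bytes in total. The class name and the placeholder txid are illustrative assumptions.

```python
import struct
from typing import NamedTuple


class Outpoint(NamedTuple):
    txid: bytes   # 32 bytes, in raw digest order (not the reversed display order)
    index: int    # position in the output list, counting from 0

    def serialize(self) -> bytes:
        """Return txid followed by the index as a 4-byte little-endian unsigned int."""
        if len(self.txid) != 32:
            raise ValueError("txid must be exactly 32 bytes")
        return self.txid + struct.pack("<I", self.index)


# Assumption: a made-up 32-byte value, not the id of a real transaction.
placeholder_txid = bytes(range(32))

first_output = Outpoint(placeholder_txid, 0)
change_output = Outpoint(placeholder_txid, 1)

print(len(change_output.serialize()))       # 36
print(change_output.serialize()[-4:].hex()) # 01000000
```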

This is precisely the pointer an input carries when it spends a coin. The entire ledger is a web of these references: every input names a prior (txid, index), and the UTXO set is just the collection of outpoints that no input in a confirmed transaction has yet referenced. Spending a coin means including, in a new transaction, an input that points at its outpoint; once that transaction is accepted, the outpoint is removed from the unspent set.

   tx_A ─ out[1] ───────────────► spent by ─── tx_B in[0] points at (txid_A, 1)
          (a UTXO until now)                    (the coin is now consumed)
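The bookkeeping in the diagram can be modeled as a set of outpoints that shrinks when inputs reference them and grows when new outputs are created. This is a toy model under stated assumptions: the type and function names are hypothetical, the txids are readable labels rather than real hashes, and amounts, scripts, and signature checks are left out entirely.

```python
from typing import List, NamedTuple, Set


class Outpoint(NamedTuple):
    txid: str    # assumption: a readable label such as "txid_A", not a real hash
    index: int


class Tx(NamedTuple):
    txid: str
    inputs: List[Outpoint]   # outpoints this transaction spends
    n_outputs: int           # how many new outputs it creates


def apply_tx(utxos: Set[Outpoint], tx: Tx) -> None:
    """Consume the referenced outpoints, then add the transaction's own outputs."""
    if len(set(tx.inputs)) != len(tx.inputs):
        raise ValueError("transaction references the same outpoint twice")
    for outpoint in tx.inputs:
        if outpoint not in utxos:
            raise ValueError(f"{outpoint} is missing or already spent")
    for outpoint in tx.inputs:
        utxos.remove(outpoint)
    for i in range(tx.n_outputs):
        utxos.add(Outpoint(tx.txid, i))


# tx_A created two outputs; tx_B spends the second one and creates one output.
utxos = {Outpoint("txid_A", 0), Outpoint("txid_A", 1)}
tx_b = Tx("txid_B", [Outpoint("txid_A", 1)], 1)
apply_tx(utxos, tx_b)

print(sorted(utxos))
```

After the call, the set holds (txid_A, 0) and (txid_B, 0). Applying tx_b a second time raises an error, because (txid_A, 1) is no longer unspent.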

A subtle, costly flaw: signatures inside the legacy txid

Here is the catch that took years to fully resolve. In legacy (pre-SegWit) transactions, the unlocking data (the scriptSig, which contains the signature) was part of the bytes hashed into the txid. That sounds harmless until you realize that the same signature can be expressed in more than one valid form: for example, if (r, s) is a valid ECDSA signature, so is (r, n − s), where n is the order of the curve. Signatures do not cover the scriptSig itself, so changing it does not invalidate them. A third party with no key at all could take a broadcast-but-unconfirmed transaction, re-encode its signature into another valid form, and rebroadcast it. The transaction would still be valid and still spend the same coins to the same places, but its bytes changed, so its txid changed.

   original tx   ──►  txid = ABCD...
        │
   attacker tweaks the signature encoding (no key needed)
        │
        ▼
   same effect, different bytes  ──►  txid = WXYZ...   ← mutated!

This is transaction malleability, and it broke any system that referred to a transaction by an id it predicted before the transaction confirmed (early payment channels, exchanges that tracked withdrawals by txid). The fix, SegWit, moved signatures into a separate witness field that is excluded from the txid computation, so for SegWit spends the id depends only on data that a third party cannot alter without invalidating the signatures. The full story, and the additional wtxid, live in Transaction malleability.
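The difference between the two schemes can be illustrated with a toy model. Everything here is a simplifying assumption: the "transaction" is just two byte strings (the core data and the signature data), the two signature values are placeholders standing in for two valid encodings of one signature, and the function names are hypothetical. Real serialization is more involved; the point is only which bytes feed the hash.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def legacy_style_txid(core: bytes, script_sig: bytes) -> bytes:
    """Legacy model: the signature data is part of the hashed bytes."""
    return double_sha256(core + script_sig)


def segwit_style_txid(core: bytes, witness: bytes) -> bytes:
    """SegWit model: the witness is deliberately left out of the txid hash."""
    return double_sha256(core)


def wtxid_style(core: bytes, witness: bytes) -> bytes:
    """A second id that does commit to the witness as well."""
    return double_sha256(core + witness)


# Assumption: placeholder bytes, not real transaction or signature data.
core = b"inputs-and-outputs"
sig_encoding_a = b"signature-encoding-A"
sig_encoding_b = b"signature-encoding-B"   # stands in for a re-encoded signature

legacy_changed = legacy_style_txid(core, sig_encoding_a) != legacy_style_txid(core, sig_encoding_b)
segwit_changed = segwit_style_txid(core, sig_encoding_a) != segwit_style_txid(core, sig_encoding_b)
wtxid_changed = wtxid_style(core, sig_encoding_a) != wtxid_style(core, sig_encoding_b)

print(legacy_changed)   # True: re-encoding the signature mutates the legacy id
print(segwit_changed)   # False: the txid ignores the witness
print(wtxid_changed)    # True: the wtxid still tracks the witness bytes
```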

The thread

How do txids help untrusting strangers agree on one ledger? They give every transaction a name that anyone can independently verify and no one can forge. Because the name is the hash of the contents, two strangers who have never communicated will compute the same id for the same transaction and, for all practical purposes, different ids for transactions whose hashed contents differ. No coordination, no authority, and no trust are required. That shared, content-derived naming is what lets inputs unambiguously reference earlier outputs across the whole network: the ledger is stitched together entirely out of hashes that each node can check for itself.

Check your understanding

How is a txid computed, and why does that make it self-certifying and tamper-evident?
Why isn’t the txid alone enough to identify a coin — what else is needed, and what is the pair called?
Describe how the UTXO set relates to outpoints and to spending.
What is transaction malleability, and why was it possible specifically for legacy txids?
How did SegWit fix malleability, and what is the difference between a txid and a wtxid?