Chaos and Order

💡 왼쪽 원문을 읽으면서 오른쪽에 따라 써보세요. Tab 키로 힌트를 받을 수 있습니다.

Introduction — What a Transaction Promises
ACID — What the Four Letters Mean
Why Isolation Is Hard — Concurrency Is the Root Problem
Four Anomalies — What Can Go Wrong
The Four Standard Isolation Levels
MVCC — The Magic of Reading Without Locking
Locking versus Optimistic Concurrency
SELECT FOR UPDATE — Explicit Row Locking
PostgreSQL versus MySQL — The Defaults Differ
Practical Guidance
Wrapping Up
References

Introduction — What a Transaction Promises

Money moving through a database raises a scary question: "What happens if the server dies mid-transfer?" If a process crashes between the statement that subtracts 100 from account A and the one that adds 100 to account B, money could vanish or double. Transactions exist to make that fear go away.

The promise of a transaction is simple: treat these statements as one unit — either all of them take effect, or none of them do. Once committed, the unit is permanent; if rolled back, the whole thing never happened. Yet keeping this simple promise while many transactions run at once surfaces surprisingly subtle problems. This post confronts that subtlety head-on.

If you want to experiment with these ideas in real SQL, you can run queries in this site's SQL Playground, PostgreSQL Playground, and DuckDB Playground.

ACID — What the Four Letters Mean

The guarantees of a transaction are usually summarized by the acronym ACID. Let's look at each letter precisely.

Atomicity: A transaction is an indivisible unit. It all succeeds or all fails. You never get the "subtract" of a transfer applied without the "add."
Consistency: A transaction moves the database from one valid state to another. Constraints (foreign keys, uniqueness, checks) hold at the transaction boundary. This "consistency" means the invariants your application defines, which is a different concept from the "consistency" of distributed systems discussed later.
Isolation: Concurrently running transactions cannot see each other's intermediate state. Half of this post is about isolation. Because perfect isolation is expensive, real systems choose among several isolation levels.
Durability: A committed transaction survives even if the system later crashes. This is usually guaranteed by writing to a WAL (Write-Ahead Log) first.

Of these, atomicity and durability are mostly intuitive. The real difficulty lives in isolation, because "how much of another transaction's intermediate state do we let you see" has no single right answer — only a spectrum between performance and correctness.

Why Isolation Is Hard — Concurrency Is the Root Problem

If transactions ran one at a time in sequence, isolation would be a non-issue. The difficulty arises when multiple transactions overlap in time. The theoretically safest outcome is "serializable": even though the transactions really do run concurrently, the resulting state is identical to some order in which they ran one after another.

The problem is that enforcing perfect serializability is costly. Transactions wait on one another, concurrency drops, and throughput suffers. So databases offer a compromise. The tiers of "I'll tolerate up to this much anomaly in exchange for running faster" are exactly the isolation levels.

The key to understanding each isolation level is: "which anomalies does this level prevent, and which does it allow?" So let's define the anomalies precisely first.

Four Anomalies — What Can Go Wrong

Here are the classic read anomalies that appear when isolation is weak. Let's see each as a concrete scenario.

1. Dirty read — reading an uncommitted value. Transaction B reads a value that transaction A has not yet committed. If A later rolls back, B has read a phantom value that never truly existed.

  A: UPDATE balance = 200 (not yet committed)
  B:                          SELECT balance -> 200  (dirty!)
  A: ROLLBACK               (200 becomes a value that never was)
  B: acts on a 200 that never existed

2. Non-repeatable read — the same row read twice yields different values. Transaction B reads the same row twice, and in between another transaction A modifies and commits that row, so the two reads differ.

  B: SELECT balance -> 100
  A: UPDATE balance = 200; COMMIT
  B: SELECT balance -> 200   (same row, value changed)

3. Phantom read — the same query twice returns a different set of rows. B queries rows matching a condition (say, "balance > 50") twice, and in between A inserts (or deletes) a row matching that condition, so the number of rows in the result set changes. If a non-repeatable read is a change to the value of an existing row, a phantom is the appearance or disappearance of a row itself.

  B: SELECT count(*) WHERE balance > 50 -> 3 rows
  A: INSERT (balance = 500); COMMIT
  B: SELECT count(*) WHERE balance > 50 -> 4 rows  (a phantom row appears)

4. Write skew — each read is valid, but combined they break a rule. This one is subtler. Two transactions read disjoint rows and each performs an update; individually each honored the rule, but the combined result violates an invariant. The classic example is an "on-call doctor" rule. At least one doctor must always be on call, but if two doctors simultaneously decide "the other one is on call, so I can drop off" and each drops off, the result is zero doctors on call.

  Rule: on-call doctors must be >= 1
  Now: Alice (on call), Bob (on call) — 2

  A (Alice): SELECT count(on_call) -> 2, "Bob is there, I can drop" -> Alice off
  B (Bob):   SELECT count(on_call) -> 2, "Alice is there, I can drop" -> Bob off
  Both commit -> 0 on call. Rule violated!

Write skew is especially insidious because each transaction, viewed alone, looks perfectly legitimate. It can occur even under snapshot isolation (explained below), which is precisely why "serializable" is sometimes needed.

The Four Standard Isolation Levels

The SQL standard defines four isolation levels in terms of the anomalies above. Lower levels allow more anomalies and run faster; higher levels are safer and more expensive.

READ UNCOMMITTED: The weakest. It permits dirty reads — even uncommitted values can be seen. Rarely used in practice.
READ COMMITTED: Prevents dirty reads; you only read committed values. But non-repeatable reads and phantoms are still allowed. It is the effective default in many databases.
REPEATABLE READ: Also prevents non-repeatable reads. Rows a transaction saw at its start appear with the same value throughout. Per the standard, phantoms are allowed, though some implementations prevent them too.
SERIALIZABLE: The strongest. It prevents all anomalies, guaranteeing the result is identical to running the transactions one at a time in some order.

The standard's "level versus anomaly" relationship, in a table:

Isolation Level	Dirty Read	Non-Repeatable Read	Phantom Read
READ UNCOMMITTED	Allowed	Allowed	Allowed
READ COMMITTED	Prevented	Allowed	Allowed
REPEATABLE READ	Prevented	Prevented	Allowed (per standard)
SERIALIZABLE	Prevented	Prevented	Prevented

One important caveat: this table is only the SQL standard's definition, and real database behavior is often stronger or subtly different. For example, as we'll see, PostgreSQL's REPEATABLE READ actually prevents even the phantoms the standard permits. So "what does this level actually guarantee on my database" must always be checked against that database's own documentation.

MVCC — The Magic of Reading Without Locking

There are broadly two families of ways to implement isolation. One is locking; the other is MVCC (Multi-Version Concurrency Control). Today's major databases — PostgreSQL, MySQL (InnoDB), Oracle — all use MVCC.

The core idea of MVCC is this: don't overwrite a row; create a new version of it. When a row is updated, the old version remains and a new version is added. Each version carries information about which transaction created it. When a transaction reads data, it picks only the version that "should be visible" from its own point in time.

  Versions of row x (in time order):
    v1 (created by transaction 10)
    v2 (created by transaction 25)   <- latest

  When transaction 20 reads x:
    -> 20 cannot see the result of 25 (a future that didn't exist yet)
    -> it reads v1  (the version matching its snapshot)

The decisive advantage is that reads don't block writes and writes don't block reads. Readers simply see the old version matching their point in time; writers just create a new version. Neither has to wait, so concurrency rises dramatically. Isolation in which a transaction sees a consistent snapshot as of its start is called snapshot isolation.

There is a cost. Old versions accumulate and must be cleaned up periodically. In PostgreSQL, VACUUM does this cleanup, and if it falls behind, dead tuples pile up and performance degrades. And the write skew we saw earlier is not prevented by pure snapshot isolation. Each snapshot is internally consistent, but the conflict that arises when two snapshots' decisions combine is invisible to them.

Locking versus Optimistic Concurrency

Strategies for handling concurrent update conflicts split broadly into pessimistic and optimistic.

Pessimistic locking. Assume "a conflict will happen" and lock the data before touching it. Other transactions wait until the lock is released. It's safe when conflicts are frequent, but it risks waiting and deadlock. The SELECT ... FOR UPDATE we'll see next is the classic tool for this approach.

Optimistic concurrency control. Assume "conflicts are rare" and proceed without locking. Instead, just before committing, check "has this data changed since I read it?" and, if so, fail the transaction so it retries. A common implementation is a version column: put a version number on the row and update only when "the version I read equals the current version."

-- Optimistic locking: update only if the version is unchanged
UPDATE accounts
SET balance = 200, version = version + 1
WHERE id = 42 AND version = 7;
-- If zero rows are affected -> someone changed it in the meantime -> retry

The choice comes down to conflict frequency. When conflicts are frequent, pessimistic locking reduces wasted retries; when they're rare, the optimistic approach eliminates waiting and raises throughput. Many updates in web applications rarely conflict, so the optimistic approach fits well.

SELECT FOR UPDATE — Explicit Row Locking

Thanks to MVCC, ordinary reads don't lock, but sometimes you need "lock this row I just read so nobody else can touch it." The classic case is the read-modify-write pattern: read a balance, check it's sufficient, then debit it.

If you read with a plain SELECT here, another transaction could change the same row right after you read it, causing a lost update. SELECT ... FOR UPDATE places a write lock on the row at the moment you read it, preventing other transactions from modifying that row until yours ends.

BEGIN;
-- Read this row while locking it. Other transactions wait to modify this row.
SELECT balance FROM accounts WHERE id = 42 FOR UPDATE;
-- After the application checks the balance is sufficient
UPDATE accounts SET balance = balance - 100 WHERE id = 42;
COMMIT;  -- lock released at commit

It's worth knowing the variants of FOR UPDATE. FOR SHARE takes a shared lock, allowing other reads but blocking writes. FOR UPDATE SKIP LOCKED skips already-locked rows, which is useful for a work-queue pattern where multiple workers each grab a different job. FOR UPDATE NOWAIT errors out immediately instead of waiting if it can't acquire the lock.

One thing to watch is deadlock. If two transactions try to lock each other's rows in opposite order, they wait forever. The database detects this and forcibly fails one, so the application must be ready to catch that error and retry. A practical way to reduce deadlocks is "always lock in the same order."

PostgreSQL versus MySQL — The Defaults Differ

Here is a difference that directly affects real work: the two databases have different default isolation levels.

PostgreSQL defaults to READ COMMITTED. It sees the latest committed snapshot as of the start of each statement. So even within one transaction, different statements can see different snapshots, meaning non-repeatable reads are possible at the default.
MySQL (InnoDB) defaults to REPEATABLE READ. It takes a snapshot at the transaction's first read and keeps it throughout the transaction.

More interesting is how each engine's higher levels actually behave.

PostgreSQL's REPEATABLE READ implements true snapshot isolation and actually prevents even the phantom reads the standard permits. Write skew, however, is still possible.
PostgreSQL's SERIALIZABLE uses a technique called SSI (Serializable Snapshot Isolation), which sits atop snapshot isolation and detects dangerous dependencies, failing conflicting transactions. It thus prevents even write skew, but at the cost of possible serialization failures (errors of the 40001 kind) that the application must retry.
MySQL/InnoDB's REPEATABLE READ gives a consistent snapshot for ordinary reads, but for locking reads (FOR UPDATE, etc.) it uses gap locks to prevent phantom inserts. It's a subtle blend of snapshot isolation and locking, so even the same REPEATABLE READ behaves differently in detail from PostgreSQL.

Why does this difference matter in practice? Because the same application code can exhibit different concurrency bugs depending on which database it runs against. Code that was fine on PostgreSQL may behave differently on MySQL, or vice versa. So any logic involving concurrency must be validated with awareness of the target database's default isolation level, raising the level explicitly if needed.

Practical Guidance

Let's compress everything into practice.

First, know your default isolation level going in. PostgreSQL defaults to READ COMMITTED, MySQL to REPEATABLE READ. Not knowing this leads to confusion like "why did this value change mid-transaction?" — or the reverse.

Next, beware the read-modify-write pattern. Logic that "writes based on a value it just read" — debiting a balance, decrementing stock, reserving a seat — is a breeding ground for lost updates. Protect it explicitly with an optimistic version column or SELECT ... FOR UPDATE.

Remember write skew. Even when each transaction looks legitimate on its own, combined they can break an invariant. Pure snapshot isolation won't stop it, so for an invariant you truly need, raise to SERIALIZABLE or enforce it with explicit locks.

Add retries for serialization failures and deadlocks. SERIALIZABLE, optimistic approaches, and locking all assume "if it fails, try again." Without retry logic, these safety mechanisms become outages instead.

Finally, keep transactions short. Long transactions hold locks longer, obstruct MVCC's cleanup of old versions (delaying PostgreSQL's VACUUM), and raise the probability of conflict.

Wrapping Up

A transaction starts from the simple promise of "bundle these operations into one and process them all-or-nothing," but the moment several transactions run at once it opens onto the deep problem of isolation. There are anomalies — dirty reads, non-repeatable reads, phantoms, write skew — and how much of them you prevent splits into the four isolation levels.

Modern databases mostly provide snapshot isolation via MVCC, where "reads don't block writes," and on top of that handle conflicts with locking or optimistic versions when needed. And PostgreSQL and MySQL differ from their default isolation level down to the fine behavior of their higher levels, so knowing your actual target database's behavior matters most of all.

The heart of it is this: perfect isolation is expensive, and abandoning isolation entirely is dangerous. So we decide "which anomalies can this data tolerate," then pick the matching isolation level and concurrency strategy. Make that choice deliberately, and transactions become not an object of fear but a foundation you can trust.

References

PostgreSQL documentation, "Transaction Isolation": https://www.postgresql.org/docs/current/transaction-iso.html
MySQL documentation, "Transaction Isolation Levels": https://dev.mysql.com/doc/refman/8.0/en/innodb-transaction-isolation-levels.html
"A Critique of ANSI SQL Isolation Levels" (Berenson et al., 1995): https://www.microsoft.com/en-us/research/publication/a-critique-of-ansi-sql-isolation-levels/
Martin Kleppmann, "Designing Data-Intensive Applications" (Chapter 7, Transactions)
Jepsen, "Consistency models": https://jepsen.io/consistency