Skip to main content
Miraç AĞCABAY
Tüm yazılar
Distributed SystemsCRDTsReal-time

Building Conflict-Free Replicated Data Types

22 Mart 2025  ·  11 dk okuma

Collaborative editing tools — Figma, Linear, Notion — handle concurrent edits from multiple clients without a central lock. The mechanism behind this is often a CRDT: a Conflict-free Replicated Data Type.

CRDTs give you a mathematical guarantee: if two replicas independently apply the same set of operations in any order, they will converge to the same state. No consensus protocol required.

The Convergence Problem

Consider two clients editing a shared counter. Client A increments by 3. Client B increments by 5. They're offline. When they reconnect, the correct value is initial + 3 + 5. But if both clients sent their full state (8 and 10 starting from the same 5), last-write-wins would give you the wrong answer.

The problem isn't the network — it's the data model. A scalar value carries no information about who changed it or by how much.

G-Counter: Growth-Only

A G-Counter (grow-only counter) solves this by giving each node its own slot in a vector:

type NodeId = string;
type GCounter = Record<NodeId, number>;
 
function increment(counter: GCounter, nodeId: NodeId): GCounter {
  return {
    ...counter,
    [nodeId]: (counter[nodeId] ?? 0) + 1,
  };
}
 
function value(counter: GCounter): number {
  return Object.values(counter).reduce((sum, v) => sum + v, 0);
}
 
function merge(a: GCounter, b: GCounter): GCounter {
  const allKeys = new Set([...Object.keys(a), ...Object.keys(b)]);
  const result: GCounter = {};
  for (const key of allKeys) {
    result[key] = Math.max(a[key] ?? 0, b[key] ?? 0);
  }
  return result;
}

The merge function uses max per slot. This is idempotent (merging twice gives the same result), commutative (order doesn't matter), and associative (grouping doesn't matter). These three properties define a join-semilattice — the mathematical structure CRDTs are built on.

PN-Counter: Add and Subtract

A G-Counter only grows. To support decrement, use a PN-Counter: two G-Counters, one for increments (P) and one for decrements (N).

type PNCounter = { p: GCounter; n: GCounter };
 
function pnIncrement(c: PNCounter, nodeId: NodeId): PNCounter {
  return { ...c, p: increment(c.p, nodeId) };
}
 
function pnDecrement(c: PNCounter, nodeId: NodeId): PNCounter {
  return { ...c, n: increment(c.n, nodeId) };
}
 
function pnValue(c: PNCounter): number {
  return value(c.p) - value(c.n);
}
 
function pnMerge(a: PNCounter, b: PNCounter): PNCounter {
  return { p: merge(a.p, b.p), n: merge(a.n, b.n) };
}

The value can go negative, but the merge is still correct — each node's contribution is tracked independently.

LWW-Register: Last-Write-Wins with Causality

For mutable values (not counters), a Last-Write-Wins Register attaches a timestamp or Lamport clock to each write. Merge takes the entry with the higher timestamp.

type LWWRegister<T> = { value: T; timestamp: number; nodeId: NodeId };
 
function lwwWrite<T>(
  reg: LWWRegister<T>,
  value: T,
  nodeId: NodeId
): LWWRegister<T> {
  return { value, timestamp: Date.now(), nodeId };
}
 
function lwwMerge<T>(a: LWWRegister<T>, b: LWWRegister<T>): LWWRegister<T> {
  if (a.timestamp > b.timestamp) return a;
  if (b.timestamp > a.timestamp) return b;
  // Tie-break by node ID for determinism
  return a.nodeId > b.nodeId ? a : b;
}

Date.now() is dangerous in distributed systems — clock skew will cause you to drop legitimate writes. In production, use Hybrid Logical Clocks (HLC) which combine wall-clock time with a Lamport counter to guarantee monotonicity while staying close to real time.

Operation-Based vs State-Based

Everything above is state-based (CvRDT): nodes exchange their full state and merge it. This is simple but bandwidth-heavy for large structures.

Operation-based CRDTs (CmRDT) send deltas — just the operations. They require a reliable broadcast layer that guarantees every operation is delivered exactly once to every replica. The tradeoff: less bandwidth, more infrastructure complexity.

Most production systems (Yjs, Automerge) use hybrid approaches with delta-state encoding.

Where CRDTs Break Down

CRDTs guarantee convergence, not intention preservation. If two users concurrently move the same element to different positions, a CRDT will pick one deterministically — but neither user's full intention is preserved.

For rich text editing, pure CRDTs produce correct documents that look nothing like what the users intended. Operational Transform (OT) or intent-preserving CRDTs with richer metadata are needed for that class of problem.

CRDTs are the right tool when: you can tolerate eventual consistency, operations commute naturally, and you control both the data model and the network layer.