In the spring of 2006, a small team at Google was staring at a problem that seemed almost philosophical: what is a document, really, if two people can change it at the same time and both be right?
The answer they arrived at, and the engineering required to get there, explains why your Google Doc doesn’t explode when your colleague edits the same sentence you’re currently rewriting. It also explains a set of tradeoffs that every collaborative software system has been negotiating ever since.
The Setup
Before Google Docs, collaborative document editing worked the way most version control still works for code: you checked out a file, made changes, and checked it back in. If someone else had changed it in the meantime, you resolved the conflict manually. This is fine for code, where developers expect friction. It is catastrophic for a consumer product where the expectation is seamlessness.
The team building what would become Google Docs inherited a product called Writely, acquired in 2006. Writely used a naive model: the last save wins. Two people editing simultaneously would silently overwrite each other’s work. That was clearly untenable at Google’s scale and ambition.
The engineers turned to a concept called Operational Transformation, or OT, which had existed in academic literature since a 1989 paper by Ellis and Gibbs. The idea is elegant in theory: instead of sending the full document state back and forth, you send operations. “Insert ‘hello’ at position 5.” “Delete character at position 12.” The server receives these operations, figures out how they interact, and transforms them so they can be applied in any order and still produce the same result.
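To make the idea concrete, here is a minimal sketch of a transformation function for the two operations quoted above. It is an illustration under stated assumptions, not Google's implementation: the Insert and Delete classes, the single-character delete, and the site number used to break ties between inserts at the same position are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Insert:
    pos: int
    text: str
    site: int  # hypothetical tie-breaker for inserts at the same position


@dataclass(frozen=True)
class Delete:
    pos: int  # deletes exactly one character at pos


Op = Union[Insert, Delete]


def apply(doc: str, op: Optional[Op]) -> str:
    """Apply an operation to a document; None means 'nothing left to do'."""
    if op is None:
        return doc
    if isinstance(op, Insert):
        return doc[:op.pos] + op.text + doc[op.pos:]
    return doc[:op.pos] + doc[op.pos + 1:]


def transform(op: Op, other: Op) -> Optional[Op]:
    """Rewrite op so it can be applied after the concurrent op `other`."""
    if isinstance(other, Insert):
        if isinstance(op, Insert):
            # Shift right if the other insert landed before us (or ties and wins).
            if other.pos < op.pos or (other.pos == op.pos and other.site < op.site):
                return Insert(op.pos + len(other.text), op.text, op.site)
            return op
        # op is a Delete: the target character moved right if text was inserted at or before it.
        if other.pos <= op.pos:
            return Delete(op.pos + len(other.text))
        return op
    # other is a Delete
    if isinstance(op, Insert):
        return Insert(op.pos - 1, op.text, op.site) if other.pos < op.pos else op
    if other.pos < op.pos:
        return Delete(op.pos - 1)
    if other.pos == op.pos:
        return None  # both sides deleted the same character
    return op


doc = "abcdefghijklmnop"
a = Insert(pos=5, text="hello", site=1)  # "Insert 'hello' at position 5."
b = Delete(pos=12)                       # "Delete character at position 12."

# Two replicas apply the same pair of operations in opposite orders.
a_then_b = apply(apply(doc, a), transform(b, a))
b_then_a = apply(apply(doc, b), transform(a, b))
assert a_then_b == b_then_a
print(a_then_b)
```

Both orders end with the same text, which is the property the transformation exists to guarantee. Even at this size, four separate cases are needed for two operation types.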
The word “elegant” is doing a lot of work in that sentence. In practice, OT is notoriously difficult to implement correctly. Every pair of operation types needs its own transformation rule, so the number of cases to get right grows roughly quadratically as you add operation types, and each case carries its own subtle edge conditions. A system that handles inserts and deletes correctly can still break catastrophically when you introduce formatting, nested structures, or rich media.
What Actually Happened
Google’s implementation of OT in Google Docs held together well enough to ship and scale. The product launched publicly in 2006, and for years the real-time collaboration it offered felt like a genuine demonstration of what networked software could do.
But the seams showed. Anyone who has used Google Docs heavily has experienced the cursor jumping unexpectedly, text appearing in the wrong place, or the document briefly going into an inconsistent state before snapping back. These aren’t bugs in the traditional sense. They’re the visible residue of OT doing its job imperfectly under latency, or edge cases that the transformation functions didn’t anticipate.
The deeper problem was architectural. OT requires a central server to act as referee. Every client sends operations to the server, which applies them in a canonical order and broadcasts the transformed versions back out. This works, but it means the server is a bottleneck. It also means that building an offline mode, where the client operates without a connection and syncs later, is genuinely hard. The “current position” of every character in the document is a function of the entire operation history, which the server holds.
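The referee role can be sketched in a few lines. This is a simplified illustration, not the production design: it handles inserts only, assumes each client reports the revision it last saw (the hypothetical base_rev parameter), and settles position ties in favor of whichever operation reached the server first.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Insert:
    pos: int
    text: str


def transform(op: Insert, earlier: Insert) -> Insert:
    """Shift op right if an already-accepted insert landed at or before it."""
    if earlier.pos <= op.pos:
        return Insert(op.pos + len(earlier.text), op.text)
    return op


@dataclass
class Server:
    doc: str = ""
    log: List[Insert] = field(default_factory=list)  # canonical operation history

    def receive(self, op: Insert, base_rev: int) -> Tuple[Insert, int]:
        """Accept an op written against revision base_rev; return what to broadcast."""
        # Catch the op up on everything the client had not seen yet.
        for earlier in self.log[base_rev:]:
            op = transform(op, earlier)
        self.doc = self.doc[:op.pos] + op.text + self.doc[op.pos:]
        self.log.append(op)
        return op, len(self.log)


server = Server(doc="the cat sat")

# Alice and Bob both edit revision 0 at the same time.
alice_op, rev_after_alice = server.receive(Insert(4, "black "), base_rev=0)
bob_op, rev_after_bob = server.receive(Insert(11, "!"), base_rev=0)

print(server.doc)
```

Bob's insert was written against a document that did not yet contain Alice's text, so only the server, which holds the log, can work out where it now belongs. That dependence on the full history is what makes long offline sessions awkward.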
A competing approach, Conflict-free Replicated Data Types (CRDTs), emerged from distributed systems research as a potential answer to this problem. CRDTs are data structures designed so that any two copies, no matter how they diverge, can always be merged without conflicts. You don’t need a central referee. You don’t need transformation functions. You need the data structure itself to encode the rules of merging.
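A small state-based CRDT shows what it means for the data structure to carry its own merge rules. The sketch below is a two-phase set, one of the simplest textbook CRDTs, chosen here only for illustration. Text CRDTs used in real editors are considerably more involved, and the class and method names are assumptions of this example.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class TwoPhaseSet:
    added: Set[str] = field(default_factory=set)
    removed: Set[str] = field(default_factory=set)  # tombstones; a removal is permanent

    def add(self, item: str) -> None:
        self.added.add(item)

    def remove(self, item: str) -> None:
        if item in self.added:
            self.removed.add(item)

    def value(self) -> Set[str]:
        return self.added - self.removed

    def merge(self, other: "TwoPhaseSet") -> "TwoPhaseSet":
        # Set union is commutative, associative, and idempotent,
        # so replicas can merge in any order, any number of times.
        return TwoPhaseSet(self.added | other.added, self.removed | other.removed)


# Two replicas start from the same state, then diverge with no server in between.
base = TwoPhaseSet({"intro", "draft"})
replica_a = base.merge(TwoPhaseSet())
replica_b = base.merge(TwoPhaseSet())

replica_a.remove("draft")
replica_b.add("summary")

merged_ab = replica_a.merge(replica_b)
merged_ba = replica_b.merge(replica_a)
assert merged_ab == merged_ba
print(sorted(merged_ab.value()))
```

The cost is visible even here: removed items linger as tombstones, which hints at the memory overhead that larger CRDTs have to manage.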
Figma, building a collaborative design tool years after Google Docs shipped, made a different set of tradeoffs. Its architecture leans on a server-authoritative model closer to how multiplayer games work: the server holds the canonical state, clients send intentions, the server resolves them and broadcasts updates. This is less theoretically pure than CRDTs, but in practice it is very fast and predictable for design objects, which behave differently from flowing text.
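The general shape of a server-authoritative model can be sketched as follows. This is a hypothetical illustration of the pattern described above, not Figma's code: the CanvasServer class, its submit method, and the rule that the last change to reach the server wins for each property are all assumptions of this example.

```python
from typing import Any, Dict


class CanvasServer:
    """Holds the canonical state; clients only send intended property changes."""

    def __init__(self) -> None:
        self.objects: Dict[str, Dict[str, Any]] = {}  # object id -> properties
        self.version = 0

    def submit(self, client_id: str, object_id: str, prop: str, value: Any) -> Dict[str, Any]:
        # Arrival order at the server decides the outcome: for any one
        # property of any one object, the last change received wins.
        self.objects.setdefault(object_id, {})[prop] = value
        self.version += 1
        # The returned update is what would be broadcast to every client.
        return {
            "version": self.version,
            "object": object_id,
            "prop": prop,
            "value": value,
            "by": client_id,
        }


server = CanvasServer()
server.submit("alice", "rect1", "fill", "red")
server.submit("bob", "rect1", "x", 40)          # different property: no conflict
last_update = server.submit("bob", "rect1", "fill", "blue")  # same property: overwrites

print(server.objects["rect1"])
```

Edits to different properties never collide, and a collision on the same property resolves to a single value with no transformation step. That simplicity suits discrete design objects better than it would suit a paragraph of text.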
Notion, for its part, built on a CRDT foundation. Yjs, an open-source CRDT framework for building collaborative editors, has become widely used precisely because it sidesteps the complexity of OT while enabling genuinely peer-to-peer collaboration. Linear uses CRDTs for its real-time features. So does Figma’s newer infrastructure for certain features.
Why It Matters
The question of how to handle simultaneous edits is not a narrow engineering curiosity. It sits at the intersection of data integrity, user experience, and system architecture in ways that propagate outward into product decisions.
Consider what “correctness” even means here. If Alice types “the cat sat” and Bob simultaneously deletes “cat” and types “dog,” what should the document say? “The dog sat” is probably the right answer, but arriving there requires the system to understand intent, not just position. OT and CRDTs handle this differently, and neither handles it perfectly in every case. The choice of algorithm is also a choice about what kinds of failures you’re willing to accept.
There’s a parallel to the broader challenge of distributed systems. The CAP theorem, formulated by Eric Brewer, states that a distributed system can guarantee at most two of three properties: consistency, availability, and partition tolerance. Real-time collaborative editing is essentially a user-facing instance of this tradeoff. Google Docs prioritizes availability and partition tolerance (you can keep editing even under network issues) and accepts some consistency oddities. A system that prioritized strict consistency would freeze while it waited for conflicts to resolve, which is unacceptable in a consumer product.
The lesson generalizes. Every system where multiple actors touch shared state is navigating this same tradeoff, whether it’s a collaborative document, a database with concurrent transactions, or a distributed ledger. The technology you choose to resolve conflicts encodes your priorities.
What We Can Learn
First: the academic solution and the production solution are often different things. OT was theoretically sound in 1989. Implementing it at Google’s scale required years of engineering to get close to right, and it still shows cracks. CRDTs are theoretically cleaner in many ways, but they have their own performance and memory overhead tradeoffs that make them the wrong choice for some use cases.
Second: the user’s mental model matters as much as the technical model. Google Docs works not just because OT resolves conflicts, but because the interface makes the consequences of conflict resolution legible. You can see other cursors. You can see changes appearing in real time. The transparency makes the occasional glitch interpretable rather than alarming. Systems that resolve conflicts silently, without surfacing what happened, tend to erode trust over time even when they’re technically correct.
Third: the hardest problems in software are often the ones that look solved. Collaborative text editing looked like a solved problem after Google Docs shipped. But Figma’s success, and the rise of CRDT-based tooling, suggests that the solution space was much larger than one implementation. The software nobody rewrites is usually the most critical, and the approaches nobody revisits are often the ones most worth questioning.
The question of what happens when two users edit the same file at exactly the same moment has a clean answer only in textbooks. In production, it’s a negotiation between mathematical guarantees, latency budgets, user expectations, and the particular failure modes your users will tolerate. Google made one set of choices. Figma made another. Both shipped products that work. The choices are not equivalent, and the consequences have shaped what each product can and cannot do.
That’s the real lesson: the conflict resolution algorithm isn’t a detail you pick at the start and forget. It’s a commitment to a particular vision of what your product is.