When two services hold conflicting versions of the same fact, most teams treat it as a bug to fix. It’s actually a design decision you already made, whether you knew it or not. The conflict isn’t the problem. The problem is that you never declared who owns the truth.
Distributed systems don’t have a single clock
The root cause of most data disagreements between microservices is deceptively simple: two services wrote to their own data stores at slightly different times, and something failed between those writes. Maybe a network partition swallowed an event. Maybe a consumer was down when a message was published. Maybe two services both tried to update the same logical record within the same second and neither knew about the other.
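To make that failure concrete, here is a minimal sketch in Python of two writes with a lost event in between. Everything in it is a hypothetical in-memory stand-in (`order_db`, `inventory_db`, `publish`, and the `broker_up` flag are invented for illustration), not a real broker or database API.

```python
# Hypothetical in-memory stand-ins for two services' private data stores.
order_db = {}
inventory_db = {"sku-1": 10}


def publish(event, broker_up):
    """Deliver an event to the inventory service; drop it if the broker is down."""
    if not broker_up:
        return False  # the event is lost; nothing retries it in this sketch
    inventory_db[event["sku"]] -= event["qty"]
    return True


def place_order(order_id, sku, qty, broker_up=True):
    # Write 1: the order service commits to its own store.
    order_db[order_id] = {"sku": sku, "qty": qty, "status": "placed"}
    # Write 2: the inventory update travels separately and can fail on its own.
    publish({"sku": sku, "qty": qty}, broker_up)


place_order("o-1", "sku-1", 2)                   # both writes land
place_order("o-2", "sku-1", 3, broker_up=False)  # the second write is lost

ordered = sum(o["qty"] for o in order_db.values())
deducted = 10 - inventory_db["sku-1"]
print(ordered, deducted)  # 5 2
```

The order service believes five units were sold; the inventory service has deducted two. Neither store is corrupt, and neither service has a bug. They simply never heard the same story.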
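This is not a bug in your code. It’s a consequence of the CAP theorem (conjectured by Eric Brewer in 2000 and formally proved by Seth Gilbert and Nancy Lynch in 2002), which says that a distributed system can guarantee at most two of three properties: consistency, availability, and partition tolerance. Because network partitions will happen whether you plan for them or not, the practical choice is between consistency and availability. When you split a monolith into services, you implicitly traded strong consistency for availability and fault isolation. The disagreement you’re seeing is the invoice.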
The question is never “how do I prevent services from disagreeing” because you can’t, not without giving up the resilience that made microservices appealing. The real question is: when they disagree, which one wins, and does your system know the answer?
Most systems have an owner. Most teams haven’t written it down.
In practice, every shared piece of data has a single authoritative source. The order service knows the canonical state of an order. The inventory service knows what’s actually in stock. The billing service knows what was charged. These aren’t suggestions; they’re the logical consequence of which service is responsible for the lifecycle of that data.
The failure mode I see most often is services that read each other’s data and then cache it locally without a clear invalidation strategy. Service A stores a copy of a user’s subscription tier. Service B updates that tier. Service A’s cache is now wrong, and it will stay wrong until some TTL (time-to-live, a cache expiry duration) expires or someone flushes it manually. When a customer calls in angry, the support team sees two different subscription states depending on which service their tooling queries.
The fix isn’t technical. It’s a declaration: the subscription service owns subscription state. Everyone else either queries it directly or accepts that their cached copy might be stale and designs their logic accordingly. Write it in your architecture doc. Make it boring.
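One way that declaration might show up in code is sketched below in Python. The `OWNERS` registry and the `CachedCopy` class are hypothetical names invented for this example, and the injectable `clock` parameter exists only so the behavior can be demonstrated deterministically. Treat it as an illustration of the idea, not a recommended cache implementation.

```python
import time

# Hypothetical ownership registry: one authoritative service per kind of data.
OWNERS = {"subscription": "subscription-service", "order": "order-service"}


class CachedCopy:
    """A non-owner's local copy that tells callers whether it just asked the owner."""

    def __init__(self, fetch_from_owner, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch_from_owner  # callable that queries the owning service
        self._ttl = ttl_seconds
        self._clock = clock
        self._value = None
        self._fetched_at = None

    def get(self):
        """Return (value, from_owner); refetch once the TTL has elapsed."""
        now = self._clock()
        if self._fetched_at is None or now - self._fetched_at >= self._ttl:
            self._value = self._fetch()
            self._fetched_at = now
            return self._value, True
        return self._value, False  # possibly stale: the caller must tolerate that


# Usage with a fake owner and a fake clock.
tier = {"value": "basic"}
fake_now = [0.0]
cache = CachedCopy(lambda: tier["value"], ttl_seconds=60, clock=lambda: fake_now[0])

first = cache.get()    # ("basic", True): fetched from the owner
tier["value"] = "pro"  # the owner changes the truth
second = cache.get()   # ("basic", False): the local copy is now wrong
fake_now[0] = 61.0
third = cache.get()    # ("pro", True): TTL expired, so we asked the owner again
```

The second element of the tuple is the point. A caller that is about to make a billing decision can insist on a value that came from the owner, while a caller rendering a badge in a UI can accept the cached one.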
“Last write wins” is a policy, not a default
Some teams reach for Last Write Wins (LWW) as a conflict resolution strategy. The idea is that whichever write has the latest timestamp takes precedence. This sounds reasonable until you remember that clocks in distributed systems are never perfectly synchronized. Two servers can disagree on the current time by hundreds of milliseconds, which is more than enough to make the “newer” write the wrong one.
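Here is a small Python sketch of how clock skew can make LWW pick the wrong write. The `Write` class and its `true_time_ms` field are illustrative assumptions; a real system never has access to the true time, which is exactly the problem.

```python
from dataclasses import dataclass


@dataclass
class Write:
    value: str
    true_time_ms: int   # when the write really happened (unknowable in practice)
    clock_skew_ms: int  # how far ahead (+) or behind (-) the writer's clock runs

    @property
    def timestamp_ms(self):
        # The only time the system ever sees: the writer's local clock.
        return self.true_time_ms + self.clock_skew_ms


def last_write_wins(a, b):
    """Keep whichever write carries the larger local timestamp."""
    return a if a.timestamp_ms >= b.timestamp_ms else b


# Server A's clock runs 300 ms fast; server B's clock is accurate.
older = Write("tier=basic", true_time_ms=1_000, clock_skew_ms=300)
newer = Write("tier=pro", true_time_ms=1_200, clock_skew_ms=0)

winner = last_write_wins(older, newer)
print(winner.value)  # tier=basic
```

The upgrade to `pro` happened 200 ms after the `basic` write, but it loses because the other server's clock runs fast. Nothing raises an error and nothing is logged. The later write is simply gone.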
Vector clocks and CRDTs (Conflict-free Replicated Data Types, data structures specifically designed to merge concurrent edits without conflicts) exist precisely because timestamp ordering is unreliable. Amazon’s Dynamo paper, published in 2007, described this problem in detail and proposed vector clocks as a solution for their shopping cart data. The insight there was that some conflicts are genuinely unresolvable by the system and need to be surfaced to the application layer, or in that case, the user.
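A minimal sketch of the vector clock idea in Python follows. It is a generic textbook formulation, not Dynamo's actual implementation, and the node names and the `compare` and `merge` helpers are hypothetical. Each clock is a dict mapping a node ID to a counter.

```python
def compare(vc_a, vc_b):
    """Compare two vector clocks given as {node_id: counter} dicts.

    Returns "before", "after", "equal", or "concurrent".
    """
    nodes = set(vc_a) | set(vc_b)
    a_le_b = all(vc_a.get(n, 0) <= vc_b.get(n, 0) for n in nodes)
    b_le_a = all(vc_b.get(n, 0) <= vc_a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"


def merge(vc_a, vc_b):
    """Element-wise maximum, used once the application has reconciled both versions."""
    return {n: max(vc_a.get(n, 0), vc_b.get(n, 0)) for n in set(vc_a) | set(vc_b)}


base = {"a": 1}
edit_on_a = {"a": 2}          # node a updates the cart
edit_on_b = {"a": 1, "b": 1}  # node b updates the same cart without seeing a's edit

print(compare(base, edit_on_a))       # before
print(compare(edit_on_a, edit_on_b))  # concurrent
```

The `"concurrent"` result is the useful part. Instead of silently picking a winner, the system can tell that neither edit saw the other and hand both versions to the application to reconcile.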
If you’re using LWW today, you’re not resolving conflicts. You’re suppressing them and hoping the suppressed write was the one you didn’t need.
Event sourcing doesn’t save you, but it does give you better forensics
Event sourcing (the pattern of storing state as an append-only log of events rather than overwriting a current value) is sometimes proposed as the solution to data disagreements. The thinking goes: if every state change is recorded as an immutable event, you can always reconstruct what happened and why.
This is true, and it’s genuinely useful. But event sourcing shifts the problem rather than eliminating it. Now instead of two services having conflicting current states, you have two event logs that need to be merged in the right order to produce a consistent projection (the current state you compute by replaying events). If your event ordering is wrong, your projection is wrong.
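The ordering problem is easy to reproduce. The Python sketch below uses made-up event types (`status_set`, `credited`) and a hypothetical `project` function to show the same events producing different states depending on how two logs are interleaved.

```python
def project(events):
    """Replay events in the given order to compute the current state."""
    state = {"status": None, "balance": 0}
    for event in events:
        if event["type"] == "status_set":
            state["status"] = event["value"]     # overwrite: order-sensitive
        elif event["type"] == "credited":
            state["balance"] += event["amount"]  # addition: order-insensitive
    return state


# Hypothetical events from two services' logs for the same account.
from_service_a = [{"type": "status_set", "value": "active"},
                  {"type": "credited", "amount": 20}]
from_service_b = [{"type": "status_set", "value": "suspended"}]

a_then_b = project(from_service_a + from_service_b)
b_then_a = project(from_service_b + from_service_a)
print(a_then_b["status"], b_then_a["status"])  # suspended active
```

Both replays agree on the balance, because addition commutes. They disagree on the status, because overwrites do not. The log is complete and immutable in both cases; only the merge order differs.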
What event sourcing actually gives you is auditability and replayability. You can answer “what did Service A think the state was at 14:32:07?” without guessing. That’s not nothing. It makes debugging distributed disagreements dramatically faster. But it’s a better flashlight, not a map out of the cave.
The counterargument
Some architects argue that strong consistency (all reads returning the most recent write, always) is achievable in microservices if you use distributed transactions via the saga pattern or two-phase commit (2PC). The argument is that you shouldn’t accept eventual consistency as a given; you should design your way out of it.
They’re not wrong that these tools exist. They’re wrong that the tradeoff is free. 2PC requires a coordinator that can fail, and when it does, participating services can be left holding locks indefinitely. Sagas (a sequence of local transactions with compensating rollbacks for failure cases) avoid locking, but they give up isolation, so other services can observe intermediate states. They also require you to write and test every compensating transaction, which is substantial engineering work that most teams underestimate.
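To give a sense of what the saga bookkeeping involves, here is a deliberately small Python sketch of an in-process saga runner. The `run_saga` helper and the order flow are hypothetical. A production saga would also need durable state, retries, and idempotent steps, none of which are shown here.

```python
def run_saga(steps):
    """Run (action, compensation) pairs in order; undo completed steps on failure."""
    completed = []
    for action, compensation in steps:
        try:
            action()
        except Exception:
            # Roll back in reverse order. Each compensation is its own local
            # transaction that must itself be written and tested.
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensation)
    return True


log = []


def charge_card():
    raise RuntimeError("payment declined")  # simulated failure in the third step


# Hypothetical order flow: each step pairs a local transaction with its undo.
steps = [
    (lambda: log.append("order created"), lambda: log.append("order cancelled")),
    (lambda: log.append("stock reserved"), lambda: log.append("stock released")),
    (charge_card, lambda: log.append("charge refunded")),
]

ok = run_saga(steps)
print(ok, log)
# False ['order created', 'stock reserved', 'stock released', 'order cancelled']
```

Even in this toy version, every forward step needed a matching undo, and between "stock reserved" and "stock released" any other service reading inventory would have seen a reservation that never became a sale.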
Strong consistency across service boundaries is possible. It’s just expensive enough that most systems are better served by embracing eventual consistency, declaring data ownership clearly, and handling the edge cases explicitly rather than pretending they won’t occur.
Own the conflict or it will own you
Two microservices disagreeing about the same data isn’t a sign that your architecture is broken. It’s a sign that your architecture is distributed, which is what you chose. The systems that handle this gracefully aren’t the ones that prevented disagreement. They’re the ones where every engineer can answer, without looking anything up, which service owns which data and what happens when a dependent service has a stale copy.
That answer lives in your design documents, your on-call runbooks, and your code. If it’s not written down, it doesn’t exist, and the next conflict will feel like a surprise even though it was always inevitable.
Decide who owns the truth. Write it down. Build everything else around that decision.