A Decade of CAP

Mar 2016

Consistency, availability, and tolerance to network partitions: you can have at most two of these properties for any shared-data system.
Eric Brewer

Distributed systems folks think that everyone should read FLP, but engineers in the trenches know the truth: the CAP theorem is king. (Whether or not the many alternate formulations of the safety-liveness tradeoff are somehow better than CAP is beside the point. They lost this round of marketing, so they’re not widely-known enough to be a useful lingua franca.) Despite all the competition, Brewer’s conjecture is the most common framework for working programmers to think through the trade-offs inherent in distributed systems.

But even though we name-check CAP all the time, most of us haven’t fully grokked Gilbert and Lynch’s proof of Brewer’s conjecture. If you’re not sure exactly what consistency and availability mean, you’re not alone; I recently gave a talk called A Decade of CAP to a group of engineers at work. (The slides for my talk include a great illustration of linearizability that I stole from Martin Kleppman’s blog. Thanks, Martin!) Even in a group of experienced engineers, not everyone was familiar with the details of the paper. (If you’d prefer, you can also download a PDF version of the slides.)

It’s trendy to criticize CAP these days, but I find most of the criticism unconvincing (though interesting). CAP’s weakness is simple: its notions of consistency and availability are so stringent, and the system it proposes is so adversarial, that the final result doesn’t shed much light on our everyday work. Put differently, most useful systems are neither CP nor AP. That doesn’t make CAP any less true, it just means that reality is complicated. Until an alternative emerges that’s both simpler and more useful, I’ll continue to anchor my architecture discussions in CAP, and so should you.

Readable References

If you’re interested in distributed systems but don’t know where to start, you may find some of these resources useful. Most of them take great pains to avoid the dense jargon that makes formal distributed systems writing so hard to approach.