The CAP theorem says that consistency, availability, and partition-tolerance cannot all be guaranteed within the same system.
Consistency means that views of data should remain the same among distributed nodes, even when operations are done to the data. Availability means that data should be readily available. Partition-tolerance is the ability of they system to live through cutoffs in the network.
The theorem is basically proved as follows. Assume we have two nodes, N1 and N2. Assume that N1 does a write on object v, causing the value to change from v0 to v1. Assume, around the same time, N2 does a read on the object v. N2's copy of v needs to be synchronized with N1's change before N2 does the read. If we assume consistency (N2 should read v as v1), it is impossible to guarantee both availability and partition-tolerance because everything we can do to guarantee that N2's read happens after the synchronization (blocking, central management, etc.) effects either availability or partition-tolerance. So, all three properties cannot be guaranteed at the same time.
I think this actually should have great impact on how programmers write code for clusters and perhaps consider the BASE model more seriously.
No comments:
Post a Comment