The CAP theorem provides system designers with a choice between three guarantees: consistency, availability, and partition tolerance. While partitions are rare, there is an incredible range of flexibility for handling partitions and recovering from them. Thus, our goal is to allow combinations of consistency and availability and not worry about choosing one over the other. Such an approach refines the perceived limitations of CAP, paving the way for operating during a partitioning process and for recovery afterwards.

Take for example two data-grid instances replicating states between each other. Replication is broken when these two instances are on opposite sides of a network partition. Allowing at least one instance to accept updates to its state will cause the replica to become inconsistent. This can happen if an instance is still accessible to clients within the network, thus compromising consistency but preserving availability. Likewise, if the choice is to preserve consistency, the data-grid instances must act as if they are unavailable (to updates), thus compromising availability. In some sense, we can reason about creating choices that focus on availability first and consistency second. For example, allow ‘reads’ but deny ‘updates’ during a network partition.

Because network partitions are rare, CAP should allow perfect consistency and availability most of the time, but when partitions are present, a strategy that detects partitions and explicitly accounts for them is in order. This strategy should have three steps: detect partitions, enter an explicit partition mode that can limit some operations, and initiate a recovery process to restore consistency and compensate for mistakes made during a partition.

To recognize a network partition, we can use a quorum to assist in bookkeeping the discovery and availability of instances. When an instance is found in the minority, availability is limited to a subset of operations. When network partitioning is resolved, consistency is restored through in-memory state recovery mechanisms.

XAPs’ in-memory data-grid uses ZooKeeper as a centralized service for maintaining discovered instances, and allowing distributed synchronization to assist in quorum based decisions. This approach allows us to preserve consistency, while losing availability only to the minority side of the partition. Availability will be restored immediately once a majority has been regained. Not having to deal with levels of consistency frees us from application nuances.


Take XAP for a spin!


On the other hand, some applications require perfect availability, and be allowed to mitigate the consistent state upon network recovery. XAP uses a heuristic approach in the face of possible loss of consistent data. Thus during a network-partition, data-grid instances allow updates to state risking inconsistency between data-partition replicas. When network-partitioning is resolved, XAP reaches resolution through a variety of heuristic factors. Some of these factors include resolving the instance with the most consistent replication state, resolving instances with an active replication channel, resolving state from the oldest instance first, or resolving by unique instance IDs. This automatic recovery is best suited for applications where loss of state can be compensated for by other means (or discarded). Otherwise, manual intervention may be in-order. In this case XAP will allow data-grid instances to be available during a network-partition, but once they are resolved, updates to state are denied until a manual compensation is performed, preserving consistency through the hands of an operator.

With all these options available, system designers should choose the best configuration for their individual needs, thus ensuring that they achieve availability and consistency in the presence of network-partitions.

Availability and Consistency in the Presence of Partitions
Meron Avigdor on Linkedin
Meron Avigdor
System Architect @ GigaSpaces
At GigaSpaces we apply human-centered design approach to what we develop!
Tagged on: