USING EVENTUAL CONSISTENCY TO IMPROVE THE PERFORMANCE OF DISTRIBUTED GRAPH COMPUTATION IN KEY-VALUE STORES
Key-value stores have gained
increasing popularity due to their fast performance and simple data model. A
key-value store usually consists of multiple replicas located in different
geographical regions to provide higher availability and fault tolerance.
Consequently, a protocol is employed to ensure that data are consistent across
the replicas. The CAP theorem states the impossibility of simultaneously
achieving three desirable properties in a distributed system, namely
consistency, availability, and network partition tolerance. Since failures are
the norm in distributed systems, and the capability to maintain service at an
acceptable level in the presence of failures is a critical dependability and
business requirement of any system, partition tolerance is a
necessity. Consequently, the trade-off between consistency and availability
(performance) is inevitable. Strong consistency is attained at the cost of slow
performance, and fast performance is attained at the cost of weak consistency,
resulting in a spectrum of consistency models suited to different needs.
Among the consistency models, sequential consistency and eventual consistency
are two common ones. The former is easier to program with but suffers from poor
performance, whereas the latter provides higher performance at the risk of
data anomalies.
In this dissertation, we focus on
the problem of what a designer should do if he/she is asked to solve a problem
on a key-value store that provides eventual consistency. Specifically, we are
interested in the approaches that allow the designer to run his/her
applications on an eventually consistent key-value store and handle data
anomalies if they occur during the computation. To that end, we investigate two
options: (1) using the detect-rollback approach, and (2) using the stabilization
approach. In the first option, the designer identifies a correctness predicate,
say Φ, and continues to run the application as if it were running under sequential
consistency, while our system monitors Φ. If Φ is violated (because the underlying
key-value store provides only eventual consistency), the system rolls the
computation back to a state where Φ holds and resumes it from there. In the second
option, data anomalies are treated as state perturbations and are handled by
the convergence property of stabilizing algorithms.
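The detect-rollback approach can be sketched roughly as follows. This is a minimal, hypothetical illustration, not the dissertation's implementation: the predicate `phi` (a graph-coloring invariant), `run_with_rollback`, and the dictionary-based state are all assumed names, and the checkpointing is deliberately simplistic.

```python
# Hypothetical sketch of the detect-rollback approach. The predicate phi,
# the state layout, and run_with_rollback are illustrative names only;
# they are not Voldemort APIs or the dissertation's actual code.
import copy

def phi(state):
    """Correctness predicate: no two neighboring nodes share a color."""
    for node, color in state["colors"].items():
        for nbr in state["graph"][node]:
            if state["colors"][nbr] == color:
                return False
    return True

def run_with_rollback(state, steps):
    """Run computation steps, monitoring phi; roll back on violation.

    Assumes the initial state satisfies phi."""
    checkpoint = copy.deepcopy(state)  # last state where phi held
    for step in steps:
        step(state)                    # apply one computation step
        if phi(state):
            checkpoint = copy.deepcopy(state)
        else:
            # Violation (e.g., caused by an eventual-consistency anomaly):
            # restore the last good state and resume from there.
            state.clear()
            state.update(copy.deepcopy(checkpoint))
    return state
```

The key design point is that the application itself is unchanged; only the external monitor and the rollback machinery deal with anomalies.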
We choose LinkedIn’s
Voldemort key-value store as the example key-value store for our study. We run
experiments with several graph-based applications on the Amazon AWS platform to
evaluate the benefits of the two approaches. From the experimental results, we
observe that, overall, both approaches benefit the applications when
compared to running them under sequential consistency. However,
stabilization provides higher benefits, especially in the aggressive
stabilization mode, which trades more perturbations for the absence of locking overhead. The
results suggest that while there is some cost associated with making an
algorithm stabilizing, there may be a substantial benefit in revising an
existing algorithm for the problem at hand to make it stabilizing and reduce
the overall runtime under eventual consistency. There are several directions of
extension. For the detect-rollback approach, we are working to develop a more
general rollback mechanism for the applications and improve the efficiency and
accuracy of the monitors. For the stabilization approach, we are working to
develop an analytical model for the benefits of eventual consistency in
stabilizing programs. Our current work focuses on silent stabilization and we
plan to extend our approach to other variations of stabilization.
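As an illustration of the silent stabilization discussed above, consider the classic self-stabilizing BFS-distance computation. This is an assumed textbook example, not one of the dissertation's benchmarks: regardless of how the distance values are perturbed, repeatedly applying a local rule converges to the correct values, after which no node changes its value again (the "silent" property).

```python
# Hypothetical sketch of silent stabilization via self-stabilizing BFS
# distances. Function and variable names are illustrative; data anomalies
# are modeled as arbitrary (non-negative) perturbations of dist values.
def stabilize_distances(graph, root, dist):
    """Repeatedly apply the local rule until a fixpoint:
    the root holds 0; every other node holds 1 + min over its neighbors.
    Convergence repairs perturbations with no rollback needed."""
    changed = True
    while changed:
        changed = False
        for v in graph:
            target = 0 if v == root else 1 + min(dist[u] for u in graph[v])
            if dist[v] != target:
                dist[v] = target
                changed = True
    return dist  # silent: no further writes occur once correct
```

The contrast with detect-rollback is that anomalies need not be detected at all; the algorithm's convergence property absorbs them as part of normal execution.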