The Complexity of Adding Failsafe Fault-Tolerance
Sandeep S. Kulkarni and Ali Ebnenasir
Abstract
In this paper, we focus on the problem of automating the addition of failsafe fault-tolerance, where fault-tolerance is added
to an existing (fault-intolerant) program. A failsafe fault-tolerant program satisfies its specification (including safety and liveness) in
the absence of faults; in the presence of faults, it satisfies its safety specification. We present a somewhat
unexpected result: in general, the problem of adding failsafe fault-tolerance to distributed programs is
NP-hard. We show this by reducing the 3-SAT problem to the problem of adding failsafe fault-tolerance. We also identify a class of
specifications, monotonic specifications, and a class of programs, monotonic programs.
Given a (positive) monotonic specification and a (negative) monotonic program, we
show that failsafe fault-tolerance can be added in polynomial time. We note that the monotonicity restrictions are met
by commonly encountered problems such as Byzantine agreement, distributed consensus, and atomic commitment.
Finally, we argue that the restrictions on both the specifications and the programs are
necessary to add failsafe fault-tolerance in polynomial time: we prove that if only one of these conditions is satisfied,
the addition of failsafe fault-tolerance is still NP-hard.
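To make the notion of failsafe fault-tolerance concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes a simplified, non-distributed model in which the program can read and write the entire state, so none of the read/write restrictions that make the distributed problem hard apply. Programs, faults, and the safety specification are all modeled as sets of `(source, target)` state pairs; the function names and the toy instance are hypothetical. The sketch also ignores liveness in the absence of faults.

```python
def unsafe_states(fault, bad):
    """Return the states from which fault transitions alone can violate safety.

    fault: set of (source, target) fault transitions
    bad:   set of (source, target) transitions forbidden by the safety spec
    """
    # Start with states where a single fault step is itself a bad transition.
    ms = {s for (s, t) in fault if (s, t) in bad}
    # Backward fixed point: add any state with a fault step into an unsafe state.
    changed = True
    while changed:
        changed = False
        for (s, t) in fault:
            if t in ms and s not in ms:
                ms.add(s)
                changed = True
    return ms


def add_failsafe(program, fault, bad):
    """Drop program transitions that violate safety or reach an unsafe state."""
    ms = unsafe_states(fault, bad)
    return {(s, t) for (s, t) in program if (s, t) not in bad and t not in ms}


# Hypothetical toy instance: states are integers.
program = {(0, 1), (1, 2), (0, 3)}
fault = {(3, 4), (4, 5)}
bad = {(4, 5)}  # the safety specification forbids the step 4 -> 5

revised = add_failsafe(program, fault, bad)
print(sorted(revised))
```

In this toy instance, the program transition into state 3 is removed, because from state 3 faults alone can lead to the forbidden step. In the distributed setting studied in the paper, a process cannot remove a single transition in isolation when it cannot read all variables, which is where this simple pruning stops being sufficient.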
BibTeX Entry
@InProceedings{ke02,
author = {S.~S.~Kulkarni and A.~Ebnenasir},
title = {The Complexity of Adding Failsafe Fault-Tolerance},
booktitle = {International Conference on Distributed Computing Systems},
year = {2002}
}