Year of Publication
International Symposium on Stabilization, Safety, and Security of Distributed Systems (SSS)
In this paper, we focus on the problem of automated addition of fault-tolerance to an existing fault-intolerant real-time program. We consider three levels of fault-tolerance, namely nonmasking, failsafe, and masking, based on safety and liveness properties satisfied in the presence of faults. More specifically, a nonmasking (respectively, failsafe, masking) program satisfies liveness (respectively, safety, both safety and liveness) in the presence of faults. For failsafe and masking fault-tolerance, we consider two additional levels, soft and hard, based on satisfaction of timing constraints in the presence of faults. We present a polynomial time algorithm (in the size of the input program’s region graph) that adds bounded-time recovery from an arbitrary given set of states to another arbitrary set of states. Using this algorithm, we propose a sound and complete synthesis algorithm that transforms a fault-intolerant real-time program into a nonmasking fault-tolerant program. Furthermore, we introduce sound and complete algorithms for adding soft/hard-failsafe fault-tolerance. For reasons of space, our results on addition of soft/hard-masking fault-tolerance are presented in a technical report.