International Symposium on Stabilization, Safety, and Security of Distributed Systems (SSS)
This paper introduces a theory of fault recovery for component-based models. We specify a model in terms of a set of atomic components that are incrementally composed and synchronized by a set of glue operators. We define what it means for such a model to provide a recovery mechanism, so that the model converges to its normal behavior in the presence of faults (e.g., in self-stabilizing systems). We present a sufficient condition for incrementally composing components to obtain models that provide fault recovery. We identify corrector components, whose presence in a model is essential to guarantee recovery after the occurrence of faults. We further formalize component-based models that effectively separate recovery from functional concerns. Finally, we show that any model that provides fault recovery can be transformed into an equivalent model in which functional and recovery tasks are modularized in different components.
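To make the notion of recovery concrete, the following is a minimal sketch of a toy finite-state model. It is not the paper's formalism. It assumes the textbook reading of self-stabilization (the set of legitimate states is closed, and every execution from any state reaches it), and it models glue simply as the union of the transitions that each component offers. All names (`functional`, `corrector`, `compose`, `provides_recovery`) and the two-counter state space are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical toy model: the global state is (x, y), two counters meant to stay equal.
STATES = set(product(range(3), repeat=2))
LEGIT = {(x, y) for (x, y) in STATES if x == y}  # "normal behavior"


def functional(state):
    """Functional component: advances both counters in lockstep when they agree."""
    x, y = state
    return {((x + 1) % 3, (y + 1) % 3)} if x == y else set()


def corrector(state):
    """Corrector component: enabled only outside LEGIT; copies x into y."""
    x, y = state
    return {(x, x)} if x != y else set()


def compose(*components):
    """Assumed glue: the composite offers the union of its components' transitions."""
    def step(state):
        successors = set()
        for component in components:
            successors |= component(state)
        return successors
    return step


def is_closed(step, legit):
    """Closure: no transition leaves the legitimate states."""
    return all(step(s) <= legit for s in legit)


def converges(step, states, legit):
    """Convergence: from every state, all executions reach legit
    (no deadlock and no cycle outside legit)."""
    good = set(legit)
    changed = True
    while changed:
        changed = False
        for s in states - good:
            successors = step(s)
            if successors and successors <= good:
                good.add(s)
                changed = True
    return good == states


def provides_recovery(step, states, legit):
    return is_closed(step, legit) and converges(step, states, legit)


with_corrector = compose(functional, corrector)
without_corrector = compose(functional)
print(provides_recovery(with_corrector, STATES, LEGIT))
print(provides_recovery(without_corrector, STATES, LEGIT))
```

In this toy setting a fault is any perturbation that moves the state outside `LEGIT`. Without the corrector, such a state has no outgoing transition, so the model never returns to normal behavior; adding the corrector as a separate component restores convergence while leaving the functional component unchanged.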