Dr. Sandeep S. Kulkarni - Software Page
Softwares for:
Composing Distributed Fault-tolerance Components1
Authors: Sandeep S. Kulkarni, Karun N. Biyani, and Umamaheswaran
Arumugam.
Email: {sandeep, biyanika, arumugam}@cse.msu.edu;
Web Page: http://www.cse.msu.edu/~{sandeep,
biyanika, arumugam}
This
paper focuses on the dynamic composition of distributed fault-tolerance components.
In order to adapt to the changing environment conditions, these components need
to be dynamically added, removed or replaced. The need for adaptation arises
due to the fact that there are often several fault-tolerance components that
are suitable for adding fault-tolerance to a fault, and the choice of the
component depends on the environment. Since a distributed fault-tolerance
component has a fraction installed at every process, replacing such a component
requires that the replacement be done consistently. More specifically, the
component fraction should not be removed while other component fractions or the
underlying application depends on it. The addition, removal and replacement of
distributed components should satisfy three properties, namely, (1) atomicity,
i.e., if a distributed component is added or removed, then all or none of its
component fractions are composed with or removed from respective processes, (2)
minimal blocking, i.e., during the addition and removal of a distributed
component, the blocking of the application should be minimal, and (3) synchronization,
i.e., while the component replacement is being performed, processes that are
using the old component should not interact with those that are using the new
component.
The dynamic addition, removal and replacement of distributed components is based on the concept of distributed reset protocol.
Application of the Framework for Message Communication
This simple application demonstrates most of the features including the availability of multiple distributed fault-tolerance components, the need for dynamic composition and dependency among component fractions. Two fault-tolerance components are developed for dealing with message losses.
1. Proactive component - Forward error correction (FEC) based component. The component fractions of this component exhibit acyclic dependency for removal.
2. Reactive component - Acknowledgement based component. The component fractions of this component exhibit acyclic dependency with blocking.
The code for the message communication application is available here.
Application of the Framework for Routing in Siesta
Siesta is a Simple NEST (Network Embedded Software Technology) Application Simulator. Siesta is developed by the Institute of Software Integrated Systems, Vanderbilt University. We implemented our framework in this simulator. AlternateRoute component was developed for dealing with node/router failures. This component routes a message in alternate links if the current next hop of the message is failed.
The code for the routing application is available here.
Automatic Addition of Fault-Tolerance to Distributed Programs: A Framework1
Authors: Sandeep S. Kulkarni and Ali Ebnenasir
Email: {sandeep, ebnenasi}@cse.msu.edu; Web Page: http://www.cse.msu.edu/~{sandeep,ebnenasi}
The identification of all the faults that a program is subject to is often difficult at the design time. Hence, as we recognize new classes of faults, we need to incrementally add fault-tolerance to programs. Since verifying the correctness of a fault-tolerant program after the addition of fault-tolerance can be expensive, automatic addition of fault-tolerance can help the designers in adding fault-tolerance to intolerant programs (as an automatically synthesis program will be correct by construction). To automatically synthesis a fault-tolerant program, we begin by a fault-intolerant program, its specification, its invariant, and a class of faults. Then, we systematically add fault-tolerance to the fault-intolerant program for the given faults.
It is shown that the automatic addition of fault-tolerance to distributed programs is NP-hard. An approach for dealing with the complexity involved in the automatic addition of fault-tolerance is to develop heuristics. As we develop new heuristics, (i) we need to integrate all these heuristics in a framework that helps developers to synthesis a fault-tolerant distributed program, and also (ii) we need to evaluate the applicability of our heuristics in the synthesis of a fault-tolerant program. In order to achieve these goals, we have designed and implemented a framework for automatic addition of fault-tolerance to distributed programs.
As the program
model in our framework, we use Dijkstra's guarded command language. Thus, the
user should specify each process of the fault-intolerant program as a set of
guarded command actions. Also, the faults, that a program is subject to, should
be represented by guarded commands. The output fault-tolerant program will also
be presented in guarded commands.
In our model, each program consists of a finite set of processes
and variables. The domain of each variable is also finite. We represent each
state of the program by a value assignment to all the variables. All possible
value assignments to the program variables determine the state space of a
program. Since the domain of each variable is finite, the state space of the
program will be finite. Also, some read/write restrictions determine the
restrictions of every process with respect to reading or writing on each
program variable.
For the implementation, we have chosen Java environment. Thus, our
framework is platform independent. Our framework is an extension and
modification to Chippada’s work (He implemented some data structures for the
synthesis of a canonical version of Byzantine agreement program.). For example,
in his implementation, the output fault-tolerant program is represented as a
list of the states and transitions, which is difficult to understand. We have
implemented the output part of the user interface so that we can represent the
output fault-tolerant program by guarded command language. Also, we have added
new heuristics for dealing with deadlock states during the synthesis.
Acknowledgement:
Thanks to Arun K. Chippada who designed and implemented the initial data structures required for the framework.
In this application, we have synthesized an agreement program that consists of three non-general processes and one general process, and the program tolerates Byzantine faults. To reach the code of our framework that has been used in this example, please click here.
TDMA Service for Sensor Networks1
Authors: Sandeep S. Kulkarni and Mahesh (Umamaheswaran)
Arumugam.
Email: {sandeep, arumugam}@cse.msu.edu;
Web Page: http://www.cse.msu.edu/~{sandeep,
arumugam}
This
paper focuses on design, development and performance evaluation of our TDMA
based MAC protocol for sensor networks. We have implemented the algorithm as a
middleware service on MICA motes and also, simulated it using Prowler.
TDMA Implementation on MICA [HSW+00, CHB+01]
TDMA is implemented on the MICA 2 motes developed by University of California, Berkeley.
The middleware code is available here.
TDMA Simulation on Prowler [SVM+03]
The simulation results presented in this paper are performed on a MATLAB based simulation environment called prowler, a probabilistic wireless network simulator, written by the Institute of Software Integrated Systems at Vanderbilt University. The simulator is available at http://www.isis.vanderbilt.edu/projects/nest/prowler/.
The simulation code is available here.
1 This work was partially
supported by NSF CAREER CCR-0092724, DARPA grant OSURS01-C-1901, ONR grant
N00014-01-1-0744, and a grant from Michigan State University.
Any opinions, findings, and
conclusions or recommendations expressed in this material are those of the
author(s) and do not necessarily reflect the views of the funding agencies. Use
the software available in this page at your own risk. The authors are not
responsible for any problems and damages.
Last Updated: September 29, 2003