Dr. Sandeep S. Kulkarni - Software Page

Softwares for:


Composing Distributed Fault-tolerance Components1     

Authors: Sandeep S. Kulkarni, Karun N. Biyani, and Umamaheswaran Arumugam.

Email: {sandeep, biyanika, arumugam}@cse.msu.edu;        Web Page: http://www.cse.msu.edu/~{sandeep, biyanika, arumugam}

This paper focuses on the dynamic composition of distributed fault-tolerance components. In order to adapt to the changing environment conditions, these components need to be dynamically added, removed or replaced. The need for adaptation arises due to the fact that there are often several fault-tolerance components that are suitable for adding fault-tolerance to a fault, and the choice of the component depends on the environment. Since a distributed fault-tolerance component has a fraction installed at every process, replacing such a component requires that the replacement be done consistently. More specifically, the component fraction should not be removed while other component fractions or the underlying application depends on it. The addition, removal and replacement of distributed components should satisfy three properties, namely, (1) atomicity, i.e., if a distributed component is added or removed, then all or none of its component fractions are composed with or removed from respective processes, (2) minimal blocking, i.e., during the addition and removal of a distributed component, the blocking of the application should be minimal, and (3) synchronization, i.e., while the component replacement is being performed, processes that are using the old component should not interact with those that are using the new component.

The dynamic addition, removal and replacement of distributed components is based on the concept of distributed reset protocol.

Application of the Framework for Message Communication

This simple application demonstrates most of the features including the availability of multiple distributed fault-tolerance components, the need for dynamic composition and dependency among component fractions. Two fault-tolerance components are developed for dealing with message losses.

1. Proactive component - Forward error correction (FEC) based component. The component fractions of this component exhibit acyclic dependency for removal.

2. Reactive component - Acknowledgement based component. The component fractions of this component exhibit acyclic dependency with blocking.

The code for the message communication application is available here.

Application of the Framework for Routing in Siesta

Siesta is a Simple NEST (Network Embedded Software Technology) Application Simulator. Siesta is developed by the Institute of Software Integrated Systems, Vanderbilt University. We implemented our framework in this simulator. AlternateRoute component was developed for dealing with node/router failures. This component routes a message in alternate links if the current next hop of the message is failed.

The code for the routing application is available here.


Automatic Addition of Fault-Tolerance to Distributed Programs: A Framework1

Authors: Sandeep S. Kulkarni and Ali Ebnenasir

Email: {sandeep, ebnenasi}@cse.msu.edu;        Web Page: http://www.cse.msu.edu/~{sandeep,ebnenasi}

The identification of all the faults that a program is subject to is often difficult at the design time. Hence, as we recognize new classes of faults, we need to incrementally add fault-tolerance to programs. Since verifying the correctness of a fault-tolerant program after the addition of fault-tolerance can be expensive, automatic addition of fault-tolerance can help the designers in adding fault-tolerance to intolerant programs (as an automatically synthesis program will be correct by construction).   To automatically synthesis a fault-tolerant program, we begin by a fault-intolerant program, its specification, its invariant, and a class of faults. Then, we systematically add fault-tolerance to the fault-intolerant program for the given faults.

It is shown that the automatic addition of fault-tolerance to distributed programs is NP-hard. An approach for dealing with the complexity involved in the automatic addition of fault-tolerance is to develop heuristics. As we develop new heuristics, (i) we need to integrate all these heuristics in a framework that helps developers to synthesis a fault-tolerant distributed program, and also (ii) we need to evaluate the applicability of our heuristics in the synthesis of a fault-tolerant program. In order to achieve these goals, we have designed and implemented a framework for automatic addition of fault-tolerance to distributed programs.

As the program model in our framework, we use Dijkstra's guarded command language. Thus, the user should specify each process of the fault-intolerant program as a set of guarded command actions. Also, the faults, that a program is subject to, should be represented by guarded commands. The output fault-tolerant program will also be presented in guarded commands. 

In our model, each program consists of a finite set of processes and variables. The domain of each variable is also finite. We represent each state of the program by a value assignment to all the variables. All possible value assignments to the program variables determine the state space of a program. Since the domain of each variable is finite, the state space of the program will be finite. Also, some read/write restrictions determine the restrictions of every process with respect to reading or writing on each program variable.

For the implementation, we have chosen Java environment. Thus, our framework is platform independent. Our framework is an extension and modification to Chippada’s work (He implemented some data structures for the synthesis of a canonical version of Byzantine agreement program.). For example, in his implementation, the output fault-tolerant program is represented as a list of the states and transitions, which is difficult to understand. We have implemented the output part of the user interface so that we can represent the output fault-tolerant program by guarded command language. Also, we have added new heuristics for dealing with deadlock states during the synthesis.


Thanks to Arun K. Chippada who designed and implemented the initial data structures required for the framework.

 Example:  Automatic Synthesis of Byzantine Agreement Program

In this application, we have synthesized an agreement program that consists of three non-general processes and one general process, and the program tolerates Byzantine faults.  To reach the code of our framework that has been used in this example, please click here.



TDMA Service for Sensor Networks1     

Authors: Sandeep S. Kulkarni and Mahesh (Umamaheswaran) Arumugam.

Email: {sandeep, arumugam}@cse.msu.edu;        Web Page: http://www.cse.msu.edu/~{sandeep, arumugam}

This paper focuses on design, development and performance evaluation of our TDMA based MAC protocol for sensor networks. We have implemented the algorithm as a middleware service on MICA motes and also, simulated it using Prowler.

TDMA Implementation on MICA [HSW+00, CHB+01]

TDMA is implemented on the MICA 2 motes developed by University of California, Berkeley.

The middleware code is available here.

TDMA Simulation on Prowler [SVM+03]

The simulation results presented in this paper are performed on a MATLAB based simulation environment called prowler, a probabilistic wireless network simulator, written by the Institute of Software Integrated Systems at Vanderbilt University. The simulator is available at http://www.isis.vanderbilt.edu/projects/nest/prowler/.

The simulation code is available here.



1 This work was partially supported by NSF CAREER CCR-0092724, DARPA grant OSURS01-C-1901, ONR grant N00014-01-1-0744, and a grant from Michigan State University.
   Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the funding agencies. Use the software available in this page at your own risk. The authors are not responsible for any problems and damages.

Last Updated: September 29, 2003