Reinforcement Learning Repository at MSU
Demos and Implementation (Domains)
This section contains programs that demonstrate reinforcement learning
in action, illustrating the concepts and common algorithms. They may
provide a useful starting point for applying reinforcement learning to
real problems and for advancing research in this area. Source code is
included wherever possible.
Please note that use of this software is restricted; you must
read this license agreement and agree to its terms
before downloading any
software from this site. Downloading the software is considered consent
to the terms.
If you would like to contribute source code or suggest improvements to
what is included here, send them to Natalia Hernandez or Sridhar Mahadevan.
Cart-Pole Problem
Simulation of the cart and pole dynamic system and a procedure for
learning to balance the pole. Both are described in Barto, Sutton, and
Anderson, "Neuronlike Adaptive Elements That Can Solve Difficult
Learning Control Problems," IEEE Trans. Syst., Man, Cybern.,
Vol. SMC-13, pp. 834-846, Sept.-Oct. 1983. Written by Rich Sutton.
Source code: cpole.tar
(16 K, requires C compiler)
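As a rough illustration of the kind of simulation described above, the following Python sketch advances a cart-pole state by one Euler step. It is not taken from cpole.tar: the frictionless form of the equations, the parameter values (a 1.0 kg cart, a 0.1 kg pole of half-length 0.5 m, a 10 N push, a 0.02 s time step), the failure limits, and the names cart_pole_step and has_failed are all assumptions chosen for illustration.

```python
import math

# Assumed parameter values, commonly quoted for this benchmark.
GRAVITY = 9.8
MASS_CART = 1.0
MASS_POLE = 0.1
TOTAL_MASS = MASS_CART + MASS_POLE
HALF_LENGTH = 0.5                      # half the pole length, in metres
POLE_MASS_LENGTH = MASS_POLE * HALF_LENGTH
FORCE_MAG = 10.0                       # magnitude of the push, in newtons
TAU = 0.02                             # seconds between state updates


def cart_pole_step(state, push_right):
    """Advance (x, x_dot, theta, theta_dot) by one Euler step, ignoring friction."""
    x, x_dot, theta, theta_dot = state
    force = FORCE_MAG if push_right else -FORCE_MAG
    cos_t, sin_t = math.cos(theta), math.sin(theta)

    temp = (force + POLE_MASS_LENGTH * theta_dot ** 2 * sin_t) / TOTAL_MASS
    theta_acc = (GRAVITY * sin_t - cos_t * temp) / (
        HALF_LENGTH * (4.0 / 3.0 - MASS_POLE * cos_t ** 2 / TOTAL_MASS)
    )
    x_acc = temp - POLE_MASS_LENGTH * theta_acc * cos_t / TOTAL_MASS

    return (
        x + TAU * x_dot,
        x_dot + TAU * x_acc,
        theta + TAU * theta_dot,
        theta_dot + TAU * theta_acc,
    )


def has_failed(state):
    """Assumed failure test: cart off the track or pole more than 12 degrees from vertical."""
    x, _, theta, _ = state
    return abs(x) > 2.4 or abs(theta) > math.radians(12)
```

A learning procedure would call cart_pole_step repeatedly, choosing the push direction at each step and treating has_failed as the only failure signal.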
Cell Phone
Interactive Java demonstration illustrating the improvement gained by
applying RL to the problem of Dynamic Channel Allocation in Cellular
Telephones, by Satinder Singh at the University of Colorado.
Elevator
Fortran simulation of an elevator, written by James Lewis and provided
by Christos Cassandras of the UMass ECE Dept. The reinforcement learning
addition to the elevator simulation was implemented by Bob Crites
(UMass CS Dept.) and John McNulty, and is described in the paper
Improving Elevator Performance Using Reinforcement Learning.
Source code: elevator.tar.gz (284 K) or elevator.tar (814 K).
Both require a C compiler and the f2c library (to convert Fortran to C),
because the simulation incorporates C random number handling routines.
Grid World
This program is a simulation of learning the goal of moving to a
user-defined square of a grid. It uses Q-learning, and was written by
Sridhar Mahadevan.
Source code: grid.tar
(72 K; requires C compiler and X11 libraries)
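The idea behind this demo can be sketched as tabular Q-learning on a small grid. The Python below is not the code in grid.tar; the 4x4 grid size, the reward of -1 per move, the epsilon-greedy exploration, the learning parameters, and the function names are all assumptions made for illustration.

```python
import random

ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action, size, goal):
    """Move one square, staying in place when the move would leave the grid."""
    row, col = state
    d_row, d_col = ACTIONS[action]
    next_state = (
        min(max(row + d_row, 0), size - 1),
        min(max(col + d_col, 0), size - 1),
    )
    return next_state, -1.0, next_state == goal  # assumed reward: -1 per move


def train(size=4, goal=(3, 3), episodes=2000, alpha=0.5, gamma=0.95,
          epsilon=0.1, seed=0):
    """Learn a Q-table for reaching `goal` from the top-left square."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * 4 for r in range(size) for c in range(size)}
    for _ in range(episodes):
        state = (0, 0)
        for _ in range(1000):  # cap on episode length
            if rng.random() < epsilon:
                action = rng.randrange(4)
            else:
                action = max(range(4), key=lambda a: q[state][a])
            next_state, reward, done = step(state, action, size, goal)
            # Q-learning target: reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_path(q, size=4, goal=(3, 3), start=(0, 0), max_steps=50):
    """Follow the highest-valued action from `start` until the goal or the step cap."""
    path = [start]
    state = start
    while state != goal and len(path) <= max_steps:
        action = max(range(4), key=lambda a: q[state][a])
        state, _, _ = step(state, action, size, goal)
        path.append(state)
    return path
```

Calling greedy_path(train()) returns the sequence of squares the learned policy visits; changing the goal argument corresponds to the user-defined target square in the description.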
Machine Maintenance
CSIM simulation of a production system which integrates SMART, a
model-free average-reward algorithm, to determine the optimal machine
maintenance policy. It was written by Nicholas Marchalleck and Abhijit
Gosavi, and is described in Self-Improving Factory Simulation using
Continuous-Time Average-Reward Reinforcement Learning by Mahadevan et al.
Source code: maint.tar
(268 K; requires CSIM v.17 and C++
compiler)
MDP Q-learning
Implements Q-learning on a given MDP, using semi-uniform exploration.
Source code: mdp-q.tar
(64 K, requires GNU C compiler)
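For readers unfamiliar with the term, semi-uniform exploration is usually taken to mean: choose the currently best-valued action with some fixed probability, and otherwise choose uniformly at random among all actions. The Python sketch below follows that reading; it is not the code in mdp-q.tar, and the names semi_uniform_action, q_update, and p_best are hypothetical.

```python
import random


def semi_uniform_action(q_values, p_best, rng):
    """Pick the greedy action with probability p_best, else a uniformly random action."""
    if rng.random() < p_best:
        return max(range(len(q_values)), key=lambda a: q_values[a])
    return rng.randrange(len(q_values))


def q_update(q, state, action, reward, next_state, alpha, gamma):
    """One-step Q-learning update on a table q[state][action]."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
```

With p_best set to 1.0 the agent acts purely greedily, and with p_best set to 0.0 it acts uniformly at random; values in between trade exploitation against exploration.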
Mountain-Car Problem
Simulation of a car learning the proper acceleration to get up a mountain.
It uses Q-learning with CMAC as a function approximator. It is described
in (among other papers) Generalization in Reinforcement Learning:
Successful Examples Using Sparse Coarse Coding by Rich Sutton, and was
developed by Sridhar Mahadevan.
Source code: mcar.tar
(157 K; requires X11 libraries and C++ compiler)
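To give a feel for how a CMAC (also called tile coding) can approximate Q-values over the car's continuous position and velocity, here is a minimal Python sketch. It is not the code in mcar.tar. The dynamics constants and state bounds follow the commonly cited formulation of the task, and the tiling layout (8 tilings of 8 tiles per dimension, shifted diagonally), the weight layout, and all function names are assumptions.

```python
import math

X_MIN, X_MAX = -1.2, 0.5      # assumed position bounds
V_MIN, V_MAX = -0.07, 0.07    # assumed velocity bounds


def mountain_car_step(x, v, action):
    """One step of the assumed dynamics; action is -1, 0, or +1 (throttle)."""
    v = v + 0.001 * action - 0.0025 * math.cos(3 * x)
    v = min(max(v, V_MIN), V_MAX)
    x = x + v
    if x < X_MIN:             # inelastic collision with the left wall
        x, v = X_MIN, 0.0
    return x, v, x >= X_MAX


def active_tiles(x, v, num_tilings=8, tiles_per_dim=8):
    """Return one active tile index per tiling for a state within the bounds."""
    sx = (x - X_MIN) / (X_MAX - X_MIN) * tiles_per_dim
    sv = (v - V_MIN) / (V_MAX - V_MIN) * tiles_per_dim
    width = tiles_per_dim + 1  # one extra row and column to absorb the offset
    tiles = []
    for t in range(num_tilings):
        offset = t / num_tilings   # each tiling is shifted by a fraction of a tile
        ix = int(math.floor(sx + offset))
        iv = int(math.floor(sv + offset))
        tiles.append((t * width + ix) * width + iv)
    return tiles


def q_value(weights, tiles, action):
    """Linear estimate: the sum of the weights of the active tiles."""
    return sum(weights[action][i] for i in tiles)


def q_learning_update(weights, tiles, action, target, alpha=0.1):
    """Move the estimate for (tiles, action) toward target, spread over the tiles."""
    error = target - q_value(weights, tiles, action)
    for i in tiles:
        weights[action][i] += alpha / len(tiles) * error
```

With the default settings each action needs a weight vector of 8 * 9 * 9 = 648 entries. Nearby states activate many of the same tiles, which is what lets an update at one state generalize to its neighbours.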
Network Routing
Demonstrates an RL network-routing algorithm written by Justin Boyan and
Michael Littman. Described in Packet Routing in Dynamically Changing
Networks: A Reinforcement Learning Approach, Advances in Neural
Information Processing Systems (Postscript - 155KB).
Source code: network-router.tar
(222 K; requires C compiler and the wish windowing shell, part of Tcl/Tk)
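The update rule usually associated with this approach (often called Q-routing) can be sketched as follows: each node keeps, for every destination and neighbour, an estimate of the delivery time via that neighbour, and revises it using the neighbour's own best estimate. The Python below is not the code in network-router.tar; the nested-dictionary layout q[node][dest][neighbour], the parameter names, and the step size eta are assumptions.

```python
def q_routing_update(q, node, neighbour, dest, queue_delay, link_delay, eta=0.5):
    """Revise node's delivery-time estimate for sending to dest via neighbour.

    q[node][dest][neighbour] is the estimated time to deliver a packet
    bound for dest when node forwards it to neighbour (assumed layout).
    """
    # The neighbour reports its own best remaining time (zero if it is the destination).
    remaining = 0.0 if neighbour == dest else min(q[neighbour][dest].values())
    target = queue_delay + link_delay + remaining
    q[node][dest][neighbour] += eta * (target - q[node][dest][neighbour])


def choose_next_hop(q, node, dest):
    """Forward to the neighbour with the lowest estimated delivery time."""
    estimates = q[node][dest]
    return min(estimates, key=estimates.get)
```

Because estimates are revised every time a packet is forwarded, a congested neighbour's growing queue delay raises its estimate and traffic shifts to other routes.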
Proposed Standard for Reinforcement Learning Software
This standard,
developed by Rich Sutton and Juan Carlos Santamaria, is intended to
facilitate RL research and development, and is available for C++ and
Common Lisp.