Gradient Descent Approaches to Neural-Net-Based Solutions of
the Hamilton-Jacobi-Bellman Equation.
Munos, Remi, Leemon Baird, Andrew Moore
IJCNN'99
Abstract: In this paper we investigate new approaches to
dynamic-programming-based optimal control of continuous
time-and-space systems. We use neural networks to approximate
the solution to the Hamilton-Jacobi-Bellman (HJB) equation, which is,
in the deterministic case studied here, a first-order, non-linear,
partial differential equation. We derive the gradient descent rule
for integrating this equation inside the domain, given the conditions
on the boundary. We apply this approach to the "Car-on-the-hill"
problem, a two-dimensional, highly non-linear control problem.
We discuss the results obtained and point out the low quality of the
approximation of the value function and of the derived control.
We attribute this poor approximation to the fact that the HJB equation
has many generalized solutions (i.e., functions that are differentiable
almost everywhere and satisfy the equation wherever they are
differentiable) other than the value function, and our gradient descent
method converges to one of these functions, thus possibly failing
to find the correct value function. We illustrate this limitation
on a simple one-dimensional control problem.
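
The non-uniqueness described in the abstract can be checked numerically on a toy problem. The sketch below is an illustration only and is not taken from the paper: it assumes an undiscounted, minimum-time-style problem on [0, 1] with dynamics dx/dt = u, u in {-1, +1}, unit running cost, and zero cost on exiting at either boundary, for which the HJB equation reduces to |V'(x)| = 1 with V(0) = V(1) = 0. The function names (value_function, spurious_solution, mean_squared_hjb_residual) are hypothetical. Both candidate functions satisfy the boundary conditions and have zero HJB residual almost everywhere, so a squared-residual loss of the kind a gradient descent method would minimize cannot tell them apart.

```python
import numpy as np


def value_function(x):
    # True value function of the assumed toy problem: distance to the nearest boundary.
    return np.minimum(x, 1.0 - x)


def spurious_solution(x):
    # Another generalized solution: slopes are +1 or -1 almost everywhere and the
    # boundary values are zero, but it is not the value function (it is 0 at x = 0.5).
    return np.minimum(np.minimum(x, np.abs(x - 0.5)), 1.0 - x)


def mean_squared_hjb_residual(f, n_cells=1000):
    # Residual of the assumed HJB equation, 1 - |f'(x)|, estimated cell by cell.
    # With n_cells a multiple of 4, the kinks at 0.25, 0.5 and 0.75 fall on cell
    # edges, so each finite difference is taken inside one linear piece.
    edges = np.linspace(0.0, 1.0, n_cells + 1)
    slopes = np.diff(f(edges)) / np.diff(edges)
    residual = 1.0 - np.abs(slopes)
    return float(np.mean(residual ** 2))


if __name__ == "__main__":
    grid = np.linspace(0.0, 1.0, 1001)
    for name, f in [("value function", value_function),
                    ("spurious solution", spurious_solution)]:
        print(name,
              "| boundary values:", float(f(0.0)), float(f(1.0)),
              "| mean squared residual:", mean_squared_hjb_residual(f))
    gap = np.max(np.abs(value_function(grid) - spurious_solution(grid)))
    print("largest gap between the two functions:", float(gap))
```

Because the loss is (numerically) zero for both functions, a method that only drives the residual to zero inside the domain has no reason to prefer the true value function over the sawtooth alternative, which is the limitation the abstract points out.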