Interdisciplinary Machine Learning in Science and Engineering
- Suman Chakravorty, Department of Aerospace Engineering
- A Decoupling Principle in Stochastic Optimal Control and Its Implications
The problem of Stochastic Optimal Control is ubiquitous in Robotics and Control since it is the fundamental formulation for decision-making under uncertainty. Its solution can be computed by solving an associated Dynamic Programming (DP) problem. Unfortunately, the DP paradigm is also synonymous with the infamous "Curse of Dimensionality" (COD), a phrase coined nearly 60 years ago by Richard Bellman, the discoverer of the Dynamic Programming paradigm, to capture the fact that the computational complexity of solving a DP problem grows exponentially with the dimension of the problem's state space.
In this talk, we will introduce a newly discovered paradigm in stochastic optimal control, called "Decoupling," that allows us to separate the open-loop design from the closed-loop (feedback) design in a stochastic optimal control problem with a continuous control space. This Decoupled solution allows us to break the COD inherent in DP problems while remaining near-optimal, to third order, with respect to the true stochastic optimal control. We will examine the implications of the Decoupled design in the context of Model Predictive Control (MPC) and Reinforcement Learning (RL), and introduce two algorithms, the Trajectory Optimized Perturbation Feedback Control (T-PFC) and the Decoupled Data based Control (D2C), for the MPC and RL problems, respectively. We will also examine the consequences of the decoupling principle in partially observed/belief space planning problems and present the Trajectory Optimized Linear Quadratic Gaussian (T-LQG) algorithm.
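As a rough illustration of the two-stage idea described above, the Python sketch below first optimizes an open-loop (nominal) control sequence for the noise-free problem and then, separately, designs linear feedback gains around that nominal trajectory. It is not T-PFC, D2C, or T-LQG: the scalar dynamics, the cost weights, the gradient-descent optimizer, and the use of a plain time-varying LQR on the linearization are all assumptions made only for illustration, and every name in the code is hypothetical.

```python
import math
import random

# Hypothetical scalar system, chosen only for illustration (not from the talk):
#   x[t+1] = x[t] + DT * (sin(x[t]) + u[t]) + noise
DT, T = 0.1, 20            # step size and horizon length (assumed values)
Q, R, QF = 1.0, 0.1, 10.0  # state, control, and terminal cost weights (assumed)
X0 = 1.0                   # initial state; the cost penalizes distance from x = 0


def step(x, u, w=0.0):
    """One step of the assumed dynamics with additive noise w."""
    return x + DT * (math.sin(x) + u) + w


def rollout(us):
    """Noise-free state trajectory under the open-loop controls us."""
    xs = [X0]
    for u in us:
        xs.append(step(xs[-1], u))
    return xs


def cost(xs, us):
    """Quadratic running cost plus terminal cost of a trajectory."""
    running = sum(Q * x * x + R * u * u for x, u in zip(xs, us))
    return running + QF * xs[-1] ** 2


def open_loop_design(iters=200):
    """Stage 1: optimize the nominal controls for the noise-free problem,
    using gradient descent with backtracking (gradient from an adjoint pass)."""
    us = [0.0] * T
    for _ in range(iters):
        xs = rollout(us)
        lam = 2.0 * QF * xs[-1]          # derivative of the cost w.r.t. x[T]
        grad = [0.0] * T
        for t in reversed(range(T)):
            grad[t] = 2.0 * R * us[t] + lam * DT
            lam = 2.0 * Q * xs[t] + lam * (1.0 + DT * math.cos(xs[t]))
        base, alpha = cost(xs, us), 1.0
        while alpha > 1e-8:              # halve the step until the cost drops
            trial = [u - alpha * g for u, g in zip(us, grad)]
            if cost(rollout(trial), trial) < base:
                us = trial
                break
            alpha *= 0.5
    return us


def feedback_design(xs):
    """Stage 2: time-varying LQR gains for the linearization about the nominal."""
    P, gains = QF, [0.0] * T
    for t in reversed(range(T)):
        A, B = 1.0 + DT * math.cos(xs[t]), DT   # Jacobians of the assumed dynamics
        K = A * B * P / (R + B * B * P)
        P = Q + A * P * (A - B * K)             # scalar Riccati recursion
        gains[t] = K
    return gains


def simulate(us, xs, gains, sigma, rng):
    """Cost of one noisy run with u = nominal control - gain * state deviation."""
    x, total = X0, 0.0
    for t in range(T):
        u = us[t] - gains[t] * (x - xs[t])
        total += Q * x * x + R * u * u
        x = step(x, u, sigma * rng.gauss(0.0, 1.0))
    return total + QF * x * x


nominal_us = open_loop_design()
nominal_xs = rollout(nominal_us)
gains = feedback_design(nominal_xs)

rng = random.Random(0)
runs = 200
with_fb = sum(simulate(nominal_us, nominal_xs, gains, 0.02, rng) for _ in range(runs)) / runs
without_fb = sum(simulate(nominal_us, nominal_xs, [0.0] * T, 0.02, rng) for _ in range(runs)) / runs
print("mean cost with feedback:   ", with_fb)
print("mean cost without feedback:", without_fb)
```

Neither stage in this sketch grids the state space: the first works on a single trajectory and the second on its linearization, so the work grows with the horizon rather than exponentially with the state dimension. This is only meant to convey the flavor of separating the two designs under the stated assumptions.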