
dc.contributor.advisor: Chakravorty, Suman
dc.contributor.advisor: Kalathil, Dileep
dc.creator: Parunandi, Karthikeya Sharma
dc.date.accessioned: 2020-08-26T19:24:10Z
dc.date.available: 2020-08-26T19:24:10Z
dc.date.created: 2019-12
dc.date.issued: 2019-10-18
dc.date.submitted: December 2019
dc.identifier.uri: https://hdl.handle.net/1969.1/188787
dc.description.abstract: Decision making under uncertainty is an important problem in engineering that is traditionally approached differently in each of the stochastic optimal control, reinforcement learning, and motion planning disciplines. One prominent challenge common to all of them is the 'curse of dimensionality', i.e., the complexity of the problem scales exponentially with the state dimension. As a consequence, traditional stochastic optimal control methods that attempt to obtain an optimal feedback policy for nonlinear systems are computationally intractable. This thesis explores the application of a near-optimal decoupling principle to obtain tractable solutions to both model-based and model-free problems in robotics. The thesis begins with the derivation of a near-optimal decoupling principle between the open-loop plan and the closed-loop linear feedback gains, based on an analysis of the second-order expansion of the cost-to-go function. This leads to a deterministic perturbation-feedback-control-based solution to fully observable stochastic optimal control problems. Building on this idea of near-optimal decoupling, a model-based trajectory optimization algorithm called the 'Trajectory-optimized Perturbation Feedback Controller' (T-PFC) is proposed. Rather than solving for the general optimal policy, this algorithm first solves for an open-loop trajectory, then applies the feedback that the algorithm automatically derives from the open-loop plan. Its performance is compared against a set of baselines in several difficult robotic planning and control examples, showing near-identical performance to nonlinear model predictive control (NMPC) while requiring much less computational effort. Next, we turn to the model-free version of the problem, where a policy is learned from data without incorporating a theoretical model of the system.
We present a novel decoupled data-based control (D2C) algorithm that addresses this problem using a decoupled 'open loop - closed loop' approach. First, an open-loop deterministic trajectory optimization problem is solved using a black-box simulation model of the dynamical system. Then, a closed-loop controller is designed around this open-loop trajectory by linearizing the dynamics about the nominal trajectory. By virtue of the linearization, a linear-quadratic-regulator-based algorithm can be used for the closed-loop control. Simulation results suggest a significant reduction in training time compared to other state-of-the-art reinforcement learning algorithms. Finally, an alternative method for solving the open-loop trajectory in D2C is presented (called 'D2C-2.0'). Stemming from the idea of model-based 'Differential Dynamic Programming' (DDP), it possesses a second-order convergence property (under certain assumptions) and hence computes the solution significantly faster than the original D2C algorithm. An efficient way of sampling from the environment to make the method model-free is presented, along with suitable line-search and regularization schemes. Comparisons are made with the original version of D2C and a state-of-the-art reinforcement learning algorithm on a variety of examples in the MuJoCo simulator. In conclusion, the limitations of each of the above methods are discussed, and some possible directions for future work are provided.
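The decoupled 'open loop - closed loop' structure described in the abstract can be illustrated with a minimal sketch: time-varying LQR feedback gains computed by a backward Riccati recursion along a linearized nominal trajectory, then applied to regulate deviations from that nominal. The dynamics (a double integrator), cost weights, and horizon below are illustrative assumptions, not taken from the thesis.

```python
# Minimal sketch of decoupled open-loop / closed-loop control: given the
# linearization (A_t, B_t) of the dynamics about a nominal trajectory,
# compute time-varying LQR gains and feed back on the deviation from nominal.
# All parameters here are illustrative, not from the thesis.
import numpy as np

def tv_lqr_gains(A_seq, B_seq, Q, R, Qf):
    """Backward Riccati recursion for finite-horizon time-varying LQR."""
    P = Qf
    gains = []
    for A, B in zip(reversed(A_seq), reversed(B_seq)):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
        gains.append(K)
    return gains[::-1]  # K_0 ... K_{T-1}

# Example: double integrator linearized about a nominal trajectory
# (linearization is constant here for simplicity).
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])
T = 50
A_seq, B_seq = [A] * T, [B] * T
Q, R, Qf = np.eye(2), 1e-2 * np.eye(1), 10.0 * np.eye(2)
Ks = tv_lqr_gains(A_seq, B_seq, Q, R, Qf)

# Closed loop: u_t = u_nominal_t - K_t (x_t - x_nominal_t).
# Here x holds the deviation from the nominal trajectory.
x = np.array([1.0, 0.0])
for K in Ks:
    u = -K @ x
    x = A @ x + B @ u
# The feedback regulates the deviation toward zero over the horizon.
```

In D2C the linearization itself would come from the black-box simulation model rather than an analytical `A`, `B`, but the feedback-synthesis step around the nominal trajectory has this LQR form.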
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.subject: Reinforcement Learning
dc.subject: Stochastic Optimal Control
dc.subject: Motion Planning
dc.subject: Trajectory Optimization
dc.subject: Robotics
dc.title: Perturbation Feedback Approaches in Stochastic Optimal Control: Applications to Model-Based and Model-Free Problems in Robotics
dc.type: Thesis
thesis.degree.department: Aerospace Engineering
thesis.degree.discipline: Aerospace Engineering
thesis.degree.grantor: Texas A&M University
thesis.degree.name: Master of Science
thesis.degree.level: Masters
dc.contributor.committeeMember: Shell, Dylan
dc.contributor.committeeMember: Majji, Manoranjan
dc.type.material: text
dc.date.updated: 2020-08-26T19:24:11Z
local.etdauthor.orcid: 0000-0003-2733-3385

