Reinforcement Learning Control with Approximation of Time-Dependent Agent Dynamics

Kirkpatrick, Kenton

dc.contributor.advisor	Valasek, John
dc.creator	Kirkpatrick, Kenton
dc.date.accessioned	2013-10-03T15:01:39Z
dc.date.available	2013-10-03T15:01:39Z
dc.date.created	2013-05
dc.date.issued	2013-04-30
dc.date.submitted	May 2013
dc.identifier.uri	https://hdl.handle.net/1969.1/149493
dc.description.abstract	Reinforcement Learning has received a lot of attention over the years for systems ranging from static game playing to dynamic system control. Using Reinforcement Learning for control of dynamical systems provides the benefit of learning a control policy without needing a model of the dynamics. This opens the possibility of controlling systems for which the dynamics are unknown, but Reinforcement Learning methods like Q-learning do not explicitly account for time. In dynamical systems, time-dependent characteristics can have a significant effect on the control of the system, so it is necessary to account for system time dynamics while not having to rely on a predetermined model for the system. In this dissertation, algorithms are investigated for expanding the Q-learning algorithm to account for the learning of sampling rates and dynamics approximations. For determining a proper sampling rate, it is desired to find the largest sample time that still allows the learning agent to control the system to goal achievement. An algorithm called Sampled-Data Q-learning is introduced for determining both this sample time and the control policy associated with that sampling rate. Results show that the algorithm is capable of achieving a desired sampling rate that allows for system control while not sampling “as fast as possible”. Determining an approximation of an agent’s dynamics can be beneficial for the control of hierarchical multiagent systems by allowing a high-level supervisor to use the dynamics approximations for task allocation decisions. To this end, algorithms are investigated for learning first- and second-order dynamics approximations. These algorithms are respectively called First-Order Dynamics Learning and Second-Order Dynamics Learning. The dynamics learning algorithms are evaluated on several examples that show their capability to learn accurate approximations of state dynamics. 
All of these algorithms are then evaluated on hierarchical multiagent systems for determining task allocation. The results show that the algorithms successfully determine appropriate sample times and accurate dynamics approximations for the agents investigated.	en
dc.format.mimetype	application/pdf
dc.language.iso	en
dc.subject	Reinforcement Learning	en
dc.subject	Q-learning	en
dc.subject	Control	en
dc.subject	Dynamics	en
dc.subject	Multiagent	en
dc.subject	Machine Learning	en
dc.subject	Artificial Intelligence	en
dc.subject	Sampling	en
dc.subject	Sampled-Data Systems	en
dc.subject	System Identification	en
dc.title	Reinforcement Learning Control with Approximation of Time-Dependent Agent Dynamics	en
dc.type	Thesis	en
thesis.degree.department	Aerospace Engineering	en
thesis.degree.discipline	Aerospace Engineering	en
thesis.degree.grantor	Texas A&M University	en
thesis.degree.name	Doctor of Philosophy	en
thesis.degree.level	Doctoral	en
dc.contributor.committeeMember	Bhattacharya, Raktim
dc.contributor.committeeMember	Chakravorty, Suman
dc.contributor.committeeMember	Ioerger, Thomas
dc.type.material	text	en
dc.date.updated	2013-10-03T15:01:39Z

Files in this item

Name: KIRKPATRICK-DISSERTATION-2013.pdf
Size: 2.897Mb
Format: PDF

This item appears in the following Collection(s)

Electronic Theses, Dissertations, and Records of Study (2002– )
Texas A&M University Theses, Dissertations, and Records of Study (2002– )
