Time Oriented Anytime Motion Planning with Temporal Difference Learning

Grogan, Matthew David

dc.contributor.advisor	Kalafatis, Stavros
dc.contributor.advisor	Shell, Dylan
dc.creator	Grogan, Matthew David
dc.date.accessioned	2019-10-16T21:09:40Z
dc.date.available	2019-10-16T21:09:40Z
dc.date.created	2019-05
dc.date.issued	2019-04-08
dc.date.submitted	May 2019
dc.identifier.uri	https://hdl.handle.net/1969.1/185077
dc.description.abstract	Anytime algorithms are a class of algorithms that are interruptible and whose solution quality improves with time, tending towards an optimal solution. In other words, there is a non-decreasing relationship between time invested in computation and solution quality. Algorithms of this nature are clearly relevant to the problem of robotic motion planning. When the state space is large or high-dimensional, as is the case for many real applications, an optimal trajectory may take prohibitively long to compute. Using an anytime approach allows a sub-optimal yet feasible solution to be returned in a reasonable amount of time, after which further time can be spent on improvement. When the nature of this improvement is defined by something like path length or mechanical work, the trade-off between time and solution quality must be engineered for a specific context. However, if solution quality is determined by the length of time required to perform a given motion, then there is a well-defined relationship between time committed to computation and time spent on navigation. When the objective is to have the robot arrive at its goal in the shortest amount of time possible, there will be a threshold after which time invested in computation is not sufficiently rewarded in terms of path improvement. This optimal computation duration varies greatly with the environment and the start/goal configuration. Additionally, the planner need not decide on a computation duration up front; the state of the planner and the quality of the solution can be observed throughout the process, yielding sequential and episodic data. These two facts suggest that deciding when to end a computation phase and begin a navigation phase can be posed as a reinforcement learning problem. In this work, we present a motion planner that can be trained to minimize the overall time spent on both computation and navigation.
To this end, we utilize anytime motion planning techniques as well as reinforcement learning algorithms. The performance of this planner will be evaluated for a variety of simulated environments.	en
dc.format.mimetype	application/pdf
dc.language.iso	en
dc.subject	Motion planning	en
dc.subject	reinforcement learning	en
dc.title	Time Oriented Anytime Motion Planning with Temporal Difference Learning	en
dc.type	Thesis	en
thesis.degree.department	Educational Psychology	en
thesis.degree.discipline	Computer Engineering	en
thesis.degree.grantor	Texas A & M University	en
thesis.degree.name	Master of Science	en
thesis.degree.level	Masters	en
dc.contributor.committeeMember	Hou, I-Hong
dc.type.material	text	en
dc.date.updated	2019-10-16T21:09:40Z
local.etdauthor.orcid	0000-0002-2753-2926

Files in this item

Name: GROGAN-THESIS-2019.pdf
Size: 1007 KB
Format: PDF

This item appears in the following Collection(s)

Electronic Theses, Dissertations, and Records of Study (2002– )
Texas A&M University Theses, Dissertations, and Records of Study (2002– )
