dc.creator: Walton, Zachary
dc.date.accessioned: 2011-08-08T22:48:45Z
dc.date.accessioned: 2011-08-09T01:32:41Z
dc.date.available: 2011-08-08T22:48:45Z
dc.date.available: 2011-08-09T01:32:41Z
dc.date.created: 2011-05
dc.date.issued: 2011-08-08
dc.date.submitted: May 2011
dc.identifier.uri: https://hdl.handle.net/1969.1/ETD-TAMU-2011-05-9520
dc.description.abstract: Unmanned Aerial Vehicles (UAVs) are used increasingly in surveillance for both civilian and military applications. One such application involves a UAV patrolling a perimeter on which certain stations receive alerts at random intervals. Once the UAV arrives at an alert site it can take one of two actions: (1) loiter and gain information about the site, or (2) move on around the perimeter. The information gained is transmitted to an operator, who uses it to classify the alert; it is a function of the UAV's dwell time at the alert site and of the maximum delay. The objective of the optimization is to classify the alert so as to maximize the expected discounted information gained by the UAV's actions at a station. This optimization problem can be readily solved using Dynamic Programming, and that approach generates feasible solutions, but there are reasons to explore alternatives: when the perimeter patrol problem is expanded with additional stations, nodes, or UAVs, the number of states grows rapidly, which greatly increases the computation time and makes determining the solution intractable. This thesis attempts to alleviate that problem by applying a Reinforcement Learning technique, specifically Q-Learning, to approximate the optimal solution. Reinforcement Learning is a simulation-based counterpart of Dynamic Programming that requires less information to compute sub-optimal solutions. The effectiveness of the policies generated using Reinforcement Learning for the perimeter patrol problem is corroborated numerically in this thesis. [en]
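As a rough illustration of the idea the abstract describes, the sketch below shows how tabular Q-Learning replaces a full Dynamic Programming sweep of the state space with simulated transitions. Everything in it is an assumption made for illustration: the state encoding (station, dwell time, alert flag), the saturating information-gain curve, the alert arrival rate, and all constants are hypothetical and are not the model or code used in the thesis.

import random

# Toy perimeter-patrol model (assumed, not the thesis formulation):
# a UAV moves around a ring of N_STATIONS stations; at an alerted
# station it may loiter (gaining information with dwell time) or move.
N_STATIONS = 4
MAX_DWELL = 5
ACTIONS = (0, 1)      # 0 = loiter at current station, 1 = move on
GAMMA = 0.95          # discount factor
ALPHA = 0.1           # learning rate
EPSILON = 0.1         # exploration rate

def info_gain(dwell):
    """Assumed saturating information-gain curve in dwell time."""
    return 1.0 - 0.5 ** dwell

def step(state, action):
    """One simulated transition; alert arrivals are random (assumed)."""
    station, dwell, alert = state
    if alert and action == 0 and dwell < MAX_DWELL:
        dwell += 1
        # Reward is the marginal information gained by one more dwell step.
        reward = info_gain(dwell) - info_gain(dwell - 1)
    else:
        station = (station + 1) % N_STATIONS
        dwell = 0
        reward = 0.0
        alert = 1 if random.random() < 0.3 else 0  # assumed alert rate
    return (station, dwell, alert), reward

# Tabular Q-Learning: bootstrap Q-values from simulated transitions
# instead of sweeping the entire state space as Dynamic Programming does.
Q = {}
state = (0, 0, 1)
for _ in range(200_000):
    if random.random() < EPSILON:
        action = random.choice(ACTIONS)
    else:
        action = max(ACTIONS, key=lambda a: Q.get((state, a), 0.0))
    next_state, reward = step(state, action)
    best_next = max(Q.get((next_state, a), 0.0) for a in ACTIONS)
    old = Q.get((state, action), 0.0)
    Q[(state, action)] = old + ALPHA * (reward + GAMMA * best_next - old)
    state = next_state

# Greedy policy extracted from the learned Q-values.
policy = {s: max(ACTIONS, key=lambda a: Q.get((s, a), 0.0))
          for (s, _a) in Q}
print(f"learned Q-values for {len(Q)} state-action pairs")

Because the learner only needs sampled transitions rather than an explicit transition model, the same loop scales more gracefully when stations or UAVs are added, which is the motivation the abstract gives for moving from Dynamic Programming to Reinforcement Learning.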
dc.format.mimetype: application/pdf
dc.language.iso: en_US
dc.subject: Unmanned Aerial Vehicles [en]
dc.subject: Dynamic Programming [en]
dc.subject: Reinforcement Learning [en]
dc.title: Optimal Control of Perimeter Patrol Using Reinforcement Learning [en]
dc.type: Thesis [en]
thesis.degree.department: Mechanical Engineering [en]
thesis.degree.discipline: Mechanical Engineering [en]
thesis.degree.grantor: Texas A&M University [en]
thesis.degree.name: Master of Science [en]
thesis.degree.level: Masters [en]
dc.contributor.committeeMember: Swaroop, Darbha
dc.contributor.committeeMember: Rathinam, Sivakumar
dc.contributor.committeeMember: Chakravorty, Suman
dc.type.genre: thesis [en]
dc.type.material: text [en]

