dc.creator: Walton, Zachary
dc.date.accessioned: 2011-08-08T22:48:45Z
dc.date.accessioned: 2011-08-09T01:32:41Z
dc.date.available: 2011-08-08T22:48:45Z
dc.date.available: 2011-08-09T01:32:41Z
dc.date.created: 2011-05
dc.date.issued: 2011-08-08
dc.date.submitted: May 2011
dc.identifier.uri: https://hdl.handle.net/1969.1/ETD-TAMU-2011-05-9520
dc.description.abstract: Unmanned Aerial Vehicles (UAVs) are used increasingly in surveillance for both civilian and military applications. One such application involves a UAV patrolling a perimeter on which certain stations receive alerts at random intervals. Once the UAV arrives at an alert site it can take one of two actions: (1) loiter and gain information about the site, or (2) move on around the perimeter. The information gained is transmitted to an operator, who uses it to classify the alert; it is a function of the UAV's dwell time at the alert site and of the maximum delay. The objective of the optimization is to classify the alert so as to maximize the expected discounted information gained by the UAV's actions at a station. This optimization problem can be readily solved using Dynamic Programming, and that approach generates feasible solutions, but there are reasons to explore alternatives: when the perimeter patrol problem is expanded with additional stations, nodes, or UAVs, the number of states grows rapidly, which greatly increases the computation time and makes determining the solution intractable. This thesis attempts to alleviate that problem by applying a Reinforcement Learning technique, specifically Q-Learning, to approximate the optimal solution. Reinforcement Learning is a simulation-based counterpart of Dynamic Programming that requires less information to compute sub-optimal solutions. The effectiveness of the policies generated using Reinforcement Learning for the perimeter patrol problem is corroborated numerically in this thesis. [en]
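As a rough illustration of the idea the abstract describes, the sketch below shows how tabular Q-Learning replaces a full Dynamic Programming sweep of the state space with simulated transitions. Everything in it is an assumption made for illustration: the state encoding (station, dwell time, alert flag), the saturating information-gain curve, the alert arrival rate, and all constants are hypothetical and are not the model or code used in the thesis.

import random

# Toy perimeter-patrol model (assumed, not the thesis formulation):
# a UAV moves around a ring of N_STATIONS stations; at an alerted
# station it may loiter (gaining information with dwell time) or move.
N_STATIONS = 4
MAX_DWELL = 5
ACTIONS = (0, 1)      # 0 = loiter at current station, 1 = move on
GAMMA = 0.95          # discount factor
ALPHA = 0.1           # learning rate
EPSILON = 0.1         # exploration rate

def info_gain(dwell):
    """Assumed saturating information-gain curve in dwell time."""
    return 1.0 - 0.5 ** dwell

def step(state, action):
    """One simulated transition; alert arrivals are random (assumed)."""
    station, dwell, alert = state
    if alert and action == 0 and dwell < MAX_DWELL:
        dwell += 1
        # Reward is the marginal information gained by one more dwell step.
        reward = info_gain(dwell) - info_gain(dwell - 1)
    else:
        station = (station + 1) % N_STATIONS
        dwell = 0
        reward = 0.0
        alert = 1 if random.random() < 0.3 else 0  # assumed alert rate
    return (station, dwell, alert), reward

# Tabular Q-Learning: bootstrap Q-values from simulated transitions
# instead of sweeping the entire state space as Dynamic Programming does.
Q = {}
state = (0, 0, 1)
for _ in range(200_000):
    if random.random() < EPSILON:
        action = random.choice(ACTIONS)
    else:
        action = max(ACTIONS, key=lambda a: Q.get((state, a), 0.0))
    next_state, reward = step(state, action)
    best_next = max(Q.get((next_state, a), 0.0) for a in ACTIONS)
    old = Q.get((state, action), 0.0)
    Q[(state, action)] = old + ALPHA * (reward + GAMMA * best_next - old)
    state = next_state

# Greedy policy extracted from the learned Q-values.
policy = {s: max(ACTIONS, key=lambda a: Q.get((s, a), 0.0))
          for (s, _a) in Q}
print(f"learned Q-values for {len(Q)} state-action pairs")

Because the learner only needs sampled transitions rather than an explicit transition model, the same loop scales more gracefully when stations or UAVs are added, which is the motivation the abstract gives for moving from Dynamic Programming to Reinforcement Learning.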
dc.format.mimetype: application/pdf
dc.language.iso: en_US
dc.subject: Unmanned Aerial Vehicles [en]
dc.subject: Dynamic Programming [en]
dc.subject: Reinforcement Learning [en]
dc.title: Optimal Control of Perimeter Patrol Using Reinforcement Learning [en]
dc.type: Thesis [en]
thesis.degree.department: Mechanical Engineering [en]
thesis.degree.discipline: Mechanical Engineering [en]
thesis.degree.grantor: Texas A&M University [en]
thesis.degree.name: Master of Science [en]
thesis.degree.level: Masters [en]
dc.contributor.committeeMember: Swaroop, Darbha
dc.contributor.committeeMember: Rathinam, Sivakumar
dc.contributor.committeeMember: Chakravorty, Suman
dc.type.genre: thesis [en]
dc.type.material: text [en]

