
dc.contributor.advisor: Valasek, John
dc.creator: Bera, Ritwik
dc.date.accessioned: 2023-02-07T16:03:32Z
dc.date.available: 2024-05-01T06:06:43Z
dc.date.created: 2022-05
dc.date.issued: 2022-01-11
dc.date.submitted: May 2022
dc.identifier.uri: https://hdl.handle.net/1969.1/197116
dc.description.abstract: Recent successes in the field of robot learning have combined reinforcement learning algorithms and deep neural networks. Yet reinforcement learning has not been widely applied to robotics and real-world scenarios. This can be attributed to the fact that current state-of-the-art, end-to-end reinforcement learning approaches usually require an impractically large number of data samples to converge to a satisfactory policy. They are also often subject to catastrophic failures during training, since they need to be able to ‘explore’ their state space. Because each environmental interaction takes a finite amount of time, and because the exploration stage in RL is risky, imitation learning is a much more feasible candidate for learning on hardware platforms. However, imitation learning’s simplest form, behavior cloning, also requires a fairly large number of demonstration trajectories to learn an effective policy. Humans are able to accomplish the same real-world tasks with far fewer samples, which may be provided in the form of demonstrations. Just a few partial demonstrations in the form of corrective action are often enough to achieve reliable performance. This suggests that humans implicitly utilize other modalities of information, such as eye-gaze, apart from just motor actuation data. This thesis investigates both the incorporation of such modalities in training data and when and how this training data is collected, for example through corrective action or partial demonstrations. Performing multiple experiments across both hardware and simulation-based platforms, with different observation and action-space configurations, calls for a software library that is modular, extensible, and universal in design. One of the primary contributions of this thesis is the development of a unique software library that enables single-operator, fault-tolerant operation of data collection experiments on vehicle platforms. It permits a safe rollout of machine learning models as controllers while keeping a human in the loop. This thesis demonstrates how the use of certain fundamental principles of program design and software architectural patterns results in a high level of common code execution across experimental configurations. This is demonstrated by measuring configuration-agnostic code coverage across different experimental runs, which is shown to exceed 50%. This simplifies the setup phase for any experiment involving machine learning and autonomous systems. The same software library is used to perform the experiments investigating the effects of additional human sensory modalities in the training data, as well as sampling schemes that collect data through interventions. Improvements in key metrics such as task completion rate are observed when eye-gaze is incorporated as a training signal: task completion rate is shown to increase by 13.7 percentage points. Shifting the data sampling to intervention-guided data collection also results in higher task completion rates (65% as opposed to 35% earlier). This improvement in task completion rate is accompanied by a lower expert data requirement. When compared to an intervention-free baseline, the intervention-guided data collection routine needs 27.1% fewer expert samples while yielding a more proficient policy.
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.subject: machine learning
dc.subject: robotics
dc.subject: gaze
dc.subject: autonomy
dc.title: A Modular Framework for Training Autonomous Systems via Human Interaction
dc.type: Thesis
thesis.degree.department: Aerospace Engineering
thesis.degree.discipline: Aerospace Engineering
thesis.degree.grantor: Texas A&M University
thesis.degree.name: Master of Science
thesis.degree.level: Masters
dc.contributor.committeeMember: Chamitoff, Gregory E
dc.contributor.committeeMember: Kalathil, Dileep
dc.contributor.committeeMember: Selva, Daniel
dc.type.material: text
dc.date.updated: 2023-02-07T16:03:32Z
local.embargo.terms: 2024-05-01
local.etdauthor.orcid: 0000-0002-8601-909X

