
dc.creator	Potter, Vincent T
dc.date.accessioned	2022-08-09T17:04:04Z
dc.date.available	2022-08-09T17:04:04Z
dc.date.created	2022-05
dc.date.submitted	May 2022
dc.identifier.uri	https://hdl.handle.net/1969.1/196571
dc.description.abstract	This paper presents research into efficient autonomous navigation algorithms powered by deep reinforcement learning. These algorithms enable a mobile robot to perform waypoint tracking in an indoor environment. The robot has neither a map of the environment nor knowledge of the waypoint locations. A reward function encourages behaviors that bring the robot closer to the goal. This is an active area of research, spurred by recent advances in neural networks applied to sequential decision making. The reinforcement learning algorithms use LiDAR and IMU sensors to navigate the unknown environment, estimating the robot's current state and selecting its next action. At each step, the action most likely to yield the maximum reward is sent to the robot so that it follows the sequential targets along the path to the final goal location. I use a low-fidelity custom simulator based on Dubins paths, along with a high-fidelity 3D simulator, Gazebo, to train various policies. The Dubins simulator is written in Python and runs very quickly, while Gazebo requires more resources but is far more detailed. After training is complete, ROS is used to deploy the RL policy onto the physical robot and to convert the action commands into linear and angular velocities that the robot's hardware/motors can execute. The TurtleBot3 Burger is the robot used for real-world evaluation. Often, performance drops sharply between the simulator and the real world, so this sim-to-real gap is also monitored and factored into the evaluation. Dense and sparse reward functions are explored to mimic real-world scenarios where the reward is not known at every step. Finally, Deep Q-Learning, Trust Region Policy Optimization, and a new RL algorithm called Learning Online with Guidance Offline are implemented and tested over the course of the research.
dc.format.mimetype	application/pdf
dc.subject	Deep Reinforcement Learning
dc.subject	Robot Operating System
dc.title	Exploring Deep Reinforcement Learning Techniques for Autonomous Navigation
dc.type	Thesis
thesis.degree.department	Electrical & Computer Engineering
thesis.degree.discipline	Computer Engineering, Electrical Engineering Track
thesis.degree.grantor	Undergraduate Research Scholars Program
thesis.degree.name	B.S.
thesis.degree.level	Undergraduate
dc.contributor.committeeMember	Kalathil, Dileep
dc.type.material	text
dc.date.updated	2022-08-09T17:04:04Z

