Show simple item record

dc.contributor.advisor: Li, Peng
dc.creator: Thulasiraman, Kumaran
dc.date.accessioned: 2015-10-29T19:41:46Z
dc.date.available: 2017-08-01T05:37:39Z
dc.date.created: 2015-08
dc.date.issued: 2015-08-09
dc.date.submitted: August 2015
dc.identifier.uri: https://hdl.handle.net/1969.1/155519
dc.description.abstract: This thesis presents a new reinforcement learning mechanism suitable for artificial spiking neural networks of leaky integrate-and-fire (LIF) or Izhikevich neurons. The proposed mechanism builds closely upon the learning algorithm introduced by Florian, in which local synaptic plasticity is based on the relative spike timing of the pre- and post-synaptic neurons (STDP) and is modulated by a global reinforcement signal. This work identifies and addresses several challenges in existing reinforcement learning schemes, including the distal reward problem, the spatial credit assignment problem, and the response numbness problem. A number of improvements, inspired either by biological elements or by similar implementations in non-spiking neural networks, are proposed to handle these challenges and are validated through biologically inspired experiments. The notion and implementation of attentional feedback, which handles the spatial credit assignment problem during synaptic reinforcement, are introduced. The effects of attenuated rewards, which gate network learning after satisfactory reinforcement is achieved, are also demonstrated; this aids the agent's exploration in discovering other rewardable behaviors during learning. A spike-rate-based input encoding scheme termed balanced-pair binary state (BPBS) encoding, together with a corresponding response-selection methodology, is also introduced to improve network stability and confidence in response selection. The proposed techniques are validated on multiple biologically inspired single-agent and multi-agent game-theoretic tasks. The single-agent tasks include exclusive-OR (XOR) function reproduction and a bot walking task. The multi-agent interactive and cooperative tasks include the general-sum iterated prisoners' dilemma (IPD) game and the distributed SensorNetwork problem from the NIPS '05 reinforcement learning benchmarks. The results and findings discussed in this work show that the proposed improvements to existing reinforcement learning implementations can lead to more brain-like learning and behavior in artificial agents.
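The abstract's core mechanism (Florian-style reward-modulated STDP, where a decaying eligibility trace bridges the gap between a spike pairing and a later reward) can be sketched as follows. This is a minimal illustrative example, not the thesis's implementation; all parameter values (A_PLUS, TAU_STDP, TAU_ELIG, LR) are arbitrary assumptions chosen for clarity.

```python
import math

# Illustrative sketch of reward-modulated STDP with an eligibility trace,
# in the spirit of the Florian-style learning the abstract describes.
# All constants below are assumed values, not taken from the thesis.
A_PLUS, A_MINUS = 0.1, 0.12   # STDP amplitudes (assumed)
TAU_STDP = 20.0               # STDP pairing time constant, ms (assumed)
TAU_ELIG = 500.0              # eligibility-trace time constant, ms (assumed)
LR = 0.5                      # learning rate (assumed)

def stdp(delta_t):
    """Pairwise STDP kernel: potentiate if pre precedes post (delta_t > 0)."""
    if delta_t > 0:
        return A_PLUS * math.exp(-delta_t / TAU_STDP)    # LTP
    return -A_MINUS * math.exp(delta_t / TAU_STDP)       # LTD

def run(pre_spike, post_spike, reward_time, reward, dt=1.0, t_end=1000.0):
    """Accumulate STDP into a decaying eligibility trace; the weight is
    changed only when the (possibly distal) reward signal arrives."""
    w, trace, t = 0.5, 0.0, 0.0
    while t < t_end:
        trace *= math.exp(-dt / TAU_ELIG)        # trace decays over time
        if abs(t - post_spike) < dt / 2:         # spike pairing occurs
            trace += stdp(post_spike - pre_spike)
        if abs(t - reward_time) < dt / 2:        # delayed reward gates learning
            w += LR * reward * trace
        t += dt
    return w

# A causal pairing (pre 10 ms before post) followed by a reward 300 ms later
# still potentiates the synapse, because the eligibility trace has not
# fully decayed — this is how the distal reward problem is bridged.
w_rewarded = run(pre_spike=100.0, post_spike=110.0, reward_time=410.0, reward=1.0)
print(w_rewarded > 0.5)
```

With no reward (`reward=0.0`), the trace decays away and the weight stays at its initial value, which is the gating behavior the abstract attributes to attenuated rewards.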
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.subject: reinforcement learning
dc.subject: spiking neural networks
dc.subject: dopamine-modulated
dc.subject: STDP
dc.title: Enhanced Reinforcement Learning with Attentional Feedback and Temporally Attenuated Distal Rewards
dc.type: Thesis
thesis.degree.department: Electrical and Computer Engineering
thesis.degree.discipline: Computer Engineering
thesis.degree.grantor: Texas A&M University
thesis.degree.name: Master of Science
thesis.degree.level: Masters
dc.contributor.committeeMember: Sprintson, Alex
dc.contributor.committeeMember: Choe, Yoonsuck
dc.type.material: text
dc.date.updated: 2015-10-29T19:41:48Z
local.embargo.terms: 2017-08-01
local.etdauthor.orcid: 0000-0002-8870-9756

