Enhanced Reinforcement Learning with Attentional Feedback and Temporally Attenuated Distal Rewards
This thesis presents a new reinforcement learning mechanism for artificial spiking neural networks of leaky integrate-and-fire (LIF) or Izhikevich neurons. The proposed mechanism extends, and builds closely upon, the learning algorithm introduced by Florian, in which local synaptic plasticity is based on the relative spike timing of the pre- and post-synaptic neurons (spike-timing-dependent plasticity, STDP) and is modulated by a global reinforcement signal. This work addresses several challenges identified in existing reinforcement learning schemes, including the distal reward problem, the spatial credit assignment problem, and the response numbness problem. A number of improvements, inspired either by biological elements or by similar implementations in non-spiking neural networks, are suggested to handle these challenges and are validated through biologically inspired experiments. The notion and implementation of attentional feedback, which handles the spatial credit assignment problem during synaptic reinforcement, are introduced. The effects of attenuated rewards, which gate network learning after satisfactory reinforcement is achieved, are also demonstrated; this attenuation helps the agent explore and discover other rewardable behaviors during learning. A spike-rate-based input encoding scheme termed balanced-pair binary state (BPBS) encoding, together with a corresponding methodology for response selection, is also introduced to improve network stability and confidence in response selection. The proposed techniques are validated using multiple biologically inspired single-agent and multi-agent game-theoretic experimental tasks. The single-agent tasks include exclusive OR (XOR) function reproduction and a bot walking task. The multi-agent interactive and cooperative tasks include the general-sum iterated prisoner's dilemma (IPD) game and the distributed SensorNetwork problem from the NIPS '05 reinforcement learning benchmarks. The results and findings discussed in this work support the claim that the proposed improvements to existing implementations of reinforcement learning can lead to more brain-like learning and behavior in artificial agents.
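As a rough illustration of the baseline idea the abstract builds on (STDP modulated by a global reinforcement signal), the following minimal sketch updates a single synapse using an eligibility trace, so that a reward arriving after the spike pairing can still change the weight. It is a generic sketch in the spirit of reward-modulated STDP, not the thesis's algorithm: the function name `simulate_rstdp`, all parameter names, and all default values are assumptions chosen for illustration, and attentional feedback, reward attenuation, and BPBS encoding are not modeled.

```python
import math


def simulate_rstdp(pre_spikes, post_spikes, rewards, dt=1.0,
                   tau_plus=20.0, tau_minus=20.0, tau_z=25.0,
                   a_plus=1.0, a_minus=-1.0, lr=0.01, w0=0.5):
    """Reward-modulated STDP with an eligibility trace for one synapse.

    All names and constants here are illustrative assumptions.
    pre_spikes, post_spikes: sequences of 0/1 per time step.
    rewards: sequence of scalar reinforcement values per time step.
    Returns the final synaptic weight.
    """
    decay_plus = math.exp(-dt / tau_plus)
    decay_minus = math.exp(-dt / tau_minus)
    decay_z = math.exp(-dt / tau_z)

    p_plus = 0.0   # decaying trace of recent presynaptic spikes
    p_minus = 0.0  # decaying (negative) trace of recent postsynaptic spikes
    z = 0.0        # eligibility trace: remembers recent STDP events
    w = w0

    for pre, post, r in zip(pre_spikes, post_spikes, rewards):
        p_plus = p_plus * decay_plus + a_plus * pre
        p_minus = p_minus * decay_minus + a_minus * post
        # Positive when post fires shortly after pre (causal pairing),
        # negative when pre fires shortly after post (anti-causal pairing).
        stdp = p_plus * post + p_minus * pre
        z = z * decay_z + stdp
        # The global reward gates the weight change; no reward, no change.
        w += lr * r * z
    return w


def spike_train(length, spike_times):
    """Hypothetical helper: build a 0/1 spike train of the given length."""
    return [1 if t in spike_times else 0 for t in range(length)]


steps = 20
delayed_reward = [1.0 if t == 10 else 0.0 for t in range(steps)]

# Pre fires at t=0, post at t=5, reward arrives later at t=10.
w_causal = simulate_rstdp(spike_train(steps, {0}), spike_train(steps, {5}),
                          delayed_reward)
# Same pairing with the order reversed.
w_anticausal = simulate_rstdp(spike_train(steps, {5}), spike_train(steps, {0}),
                              delayed_reward)
```

In this sketch a causal pairing followed by a delayed positive reward raises the weight above its initial value, the reversed pairing lowers it, and a zero reward leaves it unchanged. The decaying trace `z` is one common way to bridge the gap between an action and its delayed reward, which is the essence of the distal reward problem named in the abstract.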
Thulasiraman, Kumaran (2015). Enhanced Reinforcement Learning with Attentional Feedback and Temporally Attenuated Distal Rewards. Master's thesis, Texas A & M University. Available electronically from