Efficient and Highly Scalable Techniques to Accelerate Bayesian Markov Decision Process Computation
Abstract
Markov Decision Process (MDP) is a kernel model for solving sequential decision making problems, when the system behaviour is stochastic. Bayesian Markov Decision Process (BMDP) can be applied in the special case in which the system model is uncertain. Both MDP and BMDP must repeatedly conduct exhaustive search for a non-stationary policy, and thus entail exponential computational complexity. This has hindered their practical application to date.
In this thesis, we develop several computation techniques to overcome the obstacle of the exponential runtime complexity as well as the exponential memory requirement for both the MDP and BMDP problems. To reduce the runtime complexity, we investigate acceleration techniques, using the Graphic Processing Unit (GPU) platform, which allows massive parallelism. Our GPU-based acceleration techniques are applied with two different MDP approaches: the Optimal Bayesian Robust (OBR) policy and the Forward Search Sparse Sampling (FSSS) method. However, since the GPU utilizes a Single Instruction Multiple Data (SMID) computation paradigm and (B)MDP has an inherent issue of “curse of dimensionality”, the GPU-based solution has an exponential memory complexity issue. To overcome the memory storage impediment, we first exploit the fact that in many practical problems the system model is likely to be sparse. Exploiting this, we develop a novel Duplex Sparse Storage (DSS) scheme in this thesis. Another approach we develop to reduce the memory is a highly memory-efficient representation for the (B)MDP system model using Binary Decision Diagram (BDD) based sampling. For both DSS and BDD-based sampling approaches, we develop corresponding (B)MDP solvers on a heterogeneous CPU-GPU platform.
To further improve the efficiency of BMDP, we develop a new Scaled Population (SP) based arithmetic computation approach that achieves considerable improvements over existing Stochastic Computing (SC) techniques. Note that the SP arithmetic approach can be used in applications other than BMDP as well. SP arithmetic introduces scaling operations that significantly reduce the numerical errors as compared to SC. Besides, the SP arithmetic erases the inherent serialization associated with stochastic computing, thereby significantly improving the computational delays.
Our experiments show that the GPU-based parallel computation techniques reduce the runtime of (BMDP) by two orders of magnitude over sequential implementations. The DSS and BDD-based sampling approaches reduce the memory utilization by 4.1× and two orders of magnitude compared with the use of floating point numbers, respectively. The SP arithmetic achieves a 31.89% improvement over SC in terms of the accuracy for BMDP, and also improves the accuracy and the utilization of hardware resources for other applications such as single multiplication/addition, matrix inner production and MNIST image classification.
Subject
Bayesian Markov Decision ProcessData compression
Graphic Processing Units
Parallel Computing
DSS
Binary Decision Diagram
Scaled Population Arithmetic
Approximate Computing
Citation
Zhou, He (2019). Efficient and Highly Scalable Techniques to Accelerate Bayesian Markov Decision Process Computation. Doctoral dissertation, Texas A&M University. Available electronically from https : / /hdl .handle .net /1969 .1 /189248.