Using Write Buffers in Systolic Array Architectures to Mitigate the Number of Memory Access Produced by Row Stationary Dataflows

Peralta Velazquez, Daniel U

Abstract

New applications of Deep Neural Networks (DNNs) are being designed, such as fraud detection, short-term precipitation forecasting, and cancer prognosis prediction. However, the corresponding models are becoming more complex, with an increasing number of layers. These models require millions of computations, which take a significant amount of time on conventional CPU and GPU architectures. The computation pattern of these models is well known; they mostly consist of dot product operations between inputs and filters. Applications such as self-driving cars require fast response times and accurate predictions. Current research introduces accelerator architectures based on 2D systolic arrays, as they perform multiply-and-accumulate operations with high efficiency. Computational and power costs define performance, and memory accesses account for the highest cost in current architecture models. To enhance the performance of DNN accelerators, parallelism is extracted by breaking convolutions into partial computations, at the expense of fragmenting output memory accesses. This thesis explores the implementation of an accumulator microarchitecture component based on column pipelined adder trees, with the purpose of collecting and aggregating computed output values based on destination address. The results of this work show 3.3x and 2.15x speedups for the Tiny-YOLO and AlexNet CNNs, respectively, using a 32x64 systolic array. Through the reduction of computed values, developers will be able to explore novel data mappings to extract parallelism based on data locality.
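The idea of aggregating partial outputs by destination address can be illustrated with a small software model. This is only a sketch under stated assumptions, not the thesis's adder-tree microarchitecture: the function names, the FIFO eviction policy, and the buffer capacity are all hypothetical and chosen for illustration. It compares the number of output-memory accesses when every partial sum is written directly against the number when partial sums that share an address are first merged in a small write buffer.

```python
from collections import OrderedDict


def accumulate_direct(partials):
    """Baseline: every partial sum triggers one access to output memory."""
    memory = {}
    accesses = 0
    for addr, value in partials:
        memory[addr] = memory.get(addr, 0) + value
        accesses += 1
    return memory, accesses


def accumulate_buffered(partials, capacity):
    """Merge partial sums by destination address in a small write buffer.

    Assumption: entries are evicted in insertion (FIFO) order when the
    buffer is full; a real design could use a different policy.
    """
    memory = {}
    accesses = 0
    buffer = OrderedDict()  # addr -> running partial sum
    for addr, value in partials:
        if addr in buffer:
            # Same destination address: aggregate without touching memory.
            buffer[addr] += value
            continue
        if len(buffer) == capacity:
            # Buffer full: flush the oldest entry to output memory.
            old_addr, old_sum = buffer.popitem(last=False)
            memory[old_addr] = memory.get(old_addr, 0) + old_sum
            accesses += 1
        buffer[addr] = value
    # Drain the remaining entries at the end of the computation.
    for addr, total in buffer.items():
        memory[addr] = memory.get(addr, 0) + total
        accesses += 1
    return memory, accesses


# Hypothetical stream of (destination address, partial sum) pairs.
partials = [(0, 1), (1, 2), (0, 3), (1, 4), (2, 5), (0, 6)]
direct_memory, direct_accesses = accumulate_direct(partials)
buffered_memory, buffered_accesses = accumulate_buffered(partials, capacity=2)
```

Both functions produce the same final output values; only the number of memory accesses differs, and the saving depends on how often consecutive partial sums target an address that is still resident in the buffer.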

URI

https://hdl.handle.net/1969.1/194328

Subject

Domain Specific Architectures
Edge Computing
Parallel Processing
Near-data Processing
CNN Accelerator
Accumulator
Buffer
Cycle Based Simulator
Systolic Array
Compilers
Concurrency

Collections

Undergraduate Research Scholars Capstone (2006–present)

Citation

Peralta Velazquez, Daniel U (2021). Using Write Buffers in Systolic Array Architectures to Mitigate the Number of Memory Access Produced by Row Stationary Dataflows. Undergraduate Research Scholars Program. Available electronically from https://hdl.handle.net/1969.1/194328.