A Parallel FPGA SVD Accelerator with Optimized On-chip Data Reuse

Wang, Yu

dc.contributor.advisor	Li, Peng
dc.contributor.advisor	Rajendran, Jeyavijayan
dc.creator	Wang, Yu
dc.date.accessioned	2020-09-11T18:51:20Z
dc.date.available	2021-12-01T08:42:58Z
dc.date.created	2019-12
dc.date.issued	2019-11-26
dc.date.submitted	December 2019
dc.identifier.uri	https://hdl.handle.net/1969.1/189219
dc.description.abstract	In recent years, the emergence of new machine learning algorithms and the rapid development of available hardware have allowed people to perform complex recognition and generation tasks on images, natural language, and speech. Although general-purpose CPUs are continuously being optimized, training, or even recognition, still takes a great deal of time as datasets grow. The pressing need for fast, energy- and area-efficient hardware platforms has led the research community and industry to focus on developing dedicated hardware accelerators for computationally intensive data processing algorithms. Singular value decomposition (SVD) is a fundamental computation kernel that has been widely applied in pattern recognition, matrix compression, and signal processing. However, because of its high computational complexity, it is usually the most time-consuming part of many data pre-processing schemes and machine learning algorithms. While many existing works have proposed parallel computing architectures for SVD, they are limited either by strict requirements on matrix size or by the flexibility and scalability of their architectures. Moreover, data movement, a critical bottleneck in parallel computation architectures for SVD, has rarely been discussed. In this thesis, we propose a new Maximum Data Sharing ordering (MDS ordering) and a corresponding Field Programmable Gate Array (FPGA) architecture that maximize on-chip data reuse and significantly reduce the bandwidth requirement. The proposed reconfigurable SVD engine can decompose matrices of arbitrary size, much larger than those handled by existing solutions, and reaches a speedup of 80X to 300X over the Eigen package running on a high-performance CPU.	en
dc.format.mimetype	application/pdf
dc.language.iso	en
dc.subject	FPGA	en
dc.subject	SVD	en
dc.title	A Parallel FPGA SVD Accelerator with Optimized On-chip Data Reuse	en
dc.type	Thesis	en
thesis.degree.department	Electrical and Computer Engineering	en
thesis.degree.discipline	Computer Engineering	en
thesis.degree.grantor	Texas A&M University	en
thesis.degree.name	Master of Science	en
thesis.degree.level	Masters	en
dc.contributor.committeeMember	Kim, Eun Jung
dc.type.material	text	en
dc.date.updated	2020-09-11T18:51:20Z
local.embargo.terms	2021-12-01
local.etdauthor.orcid	0000-0002-9345-4911

Files in this item

Name: WANG-THESIS-2019.pdf
Size: 1.251Mb
Format: PDF

This item appears in the following Collection(s)

Electronic Theses, Dissertations, and Records of Study (2002– )
Texas A&M University Theses, Dissertations, and Records of Study (2002– )