Show simple item record

dc.contributor.advisor: Li, Peng
dc.contributor.advisor: Rajendran, Jeyavijayan
dc.creator: Wang, Yu
dc.date.accessioned: 2020-09-11T18:51:20Z
dc.date.available: 2021-12-01T08:42:58Z
dc.date.created: 2019-12
dc.date.issued: 2019-11-26
dc.date.submitted: December 2019
dc.identifier.uri: https://hdl.handle.net/1969.1/189219
dc.description.abstract: In recent years, the emergence of new machine learning algorithms and the rapid development of hardware have made it possible to perform complex recognition and generation tasks on images, natural language, and speech. Although general-purpose CPUs are continuously being optimized, training, or even recognition, still takes a very long time as datasets grow. The pressing need for fast, energy-efficient, and area-efficient hardware platforms has led the research community and industry to focus on developing dedicated hardware accelerators for computationally intensive data processing algorithms. Singular value decomposition (SVD) is a fundamental computation kernel that has been widely applied in pattern recognition, matrix compression, and signal processing. However, because of its high computational complexity, it is usually the most time-consuming part of many data pre-processing schemes and machine learning algorithms. While many existing works have proposed parallel computing architectures for SVD, they are limited either by strict requirements on matrix size or by the flexibility and scalability of their architectures. Moreover, data movement, a critical bottleneck in parallel computation architectures for SVD, has rarely been discussed. In this thesis, we propose a new Maximum Data Sharing ordering (MDS ordering) and a corresponding Field Programmable Gate Array (FPGA) architecture that maximize on-chip data reuse and can significantly reduce the bandwidth requirement. The proposed reconfigurable SVD engine can decompose matrices of arbitrary size, much larger than those handled by existing solutions, and achieves a speedup of 80X to 300X over the Eigen package running on a high-performance CPU. [en]
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.subject: FPGA [en]
dc.subject: SVD [en]
dc.title: A Parallel FPGA SVD Accelerator with Optimized On-chip Data Reuse [en]
dc.type: Thesis [en]
thesis.degree.department: Electrical and Computer Engineering [en]
thesis.degree.discipline: Computer Engineering [en]
thesis.degree.grantor: Texas A&M University [en]
thesis.degree.name: Master of Science [en]
thesis.degree.level: Masters [en]
dc.contributor.committeeMember: Kim, Eun Jung
dc.type.material: text [en]
dc.date.updated: 2020-09-11T18:51:20Z
local.embargo.terms: 2021-12-01
local.etdauthor.orcid: 0000-0002-9345-4911

