
dc.contributor.advisor: Kim, Eun Jung
dc.creator: Huang, Jiayi
dc.date.accessioned: 2021-02-02T21:42:11Z
dc.date.available: 2022-08-01T06:53:13Z
dc.date.created: 2020-08
dc.date.issued: 2020-07-22
dc.date.submitted: August 2020
dc.identifier.uri: https://hdl.handle.net/1969.1/192321
dc.description.abstract: The onset of big data and deep learning applications, mixed with conventional general-purpose programs, has driven computer architecture to embrace heterogeneity with specialization. With ever more interconnected chip components, future architectures must operate under a stricter power budget while processing emerging big data applications efficiently. The interconnection network, as the communication backbone, therefore faces the grand challenges of a limited power envelope, costly data movement, and performance scaling. This dissertation provides interconnect solutions that are specialized to application requirements for power-/energy-efficient and high-performance computing on heterogeneous architectures. It first examines the challenges of network-on-chip router power-gating for general-purpose workloads to save static power. A voting approach is proposed as an adaptive power-gating policy that considers both local and global traffic status through router voting (see the illustrative sketch after this record). In addition, low-latency routing algorithms are designed to guarantee performance in irregular power-gated networks. This holistic solution not only saves power but also avoids performance overhead. The research also brings emerging computation paradigms into the interconnect to mitigate the pressure of data movement in big data applications. An approximate network-on-chip is proposed to achieve high-throughput communication by means of lossy compression, and near-data processing is then combined with in-network computing to further improve performance while reducing data movement. Both schemes are general and can serve as plug-ins for different network topologies and routing algorithms. To tackle the demanding computational requirements of deep learning workloads, the dissertation investigates the compelling opportunities of communication algorithm-architecture co-design to accelerate distributed deep learning. The MultiTree allreduce algorithm is proposed to bind message scheduling to the network topology and achieve faster, contention-free communication. In addition, the interconnect hardware and flow control are specialized to exploit deep learning communication characteristics and fulfill the algorithm's needs, thereby effectively improving performance and scalability. By considering application and algorithm characteristics, this research shows that the interconnection network can be tailored to improve power/energy efficiency and performance and to satisfy heterogeneous computation and communication requirements. [en]
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.subject: Interconnection network [en]
dc.subject: computer architecture [en]
dc.subject: heterogeneous architecture [en]
dc.subject: communication acceleration [en]
dc.subject: interconnect specialization [en]
dc.title: Efficient Interconnection Network Design for Heterogeneous Architectures [en]
dc.type: Thesis [en]
thesis.degree.department: Computer Science and Engineering [en]
thesis.degree.discipline: Computer Engineering [en]
thesis.degree.grantor: Texas A&M University [en]
thesis.degree.name: Doctor of Philosophy [en]
thesis.degree.level: Doctoral [en]
dc.contributor.committeeMember: Jimenez, Daniel
dc.contributor.committeeMember: Da Silva, Dilma
dc.contributor.committeeMember: Hu, Jiang
dc.type.material: text [en]
dc.date.updated: 2021-02-02T21:42:12Z
local.embargo.terms: 2022-08-01
local.etdauthor.orcid: 0000-0003-4011-6668
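
The abstract above describes a voting-based adaptive power-gating policy in which a router combines its own (local) traffic status with its neighbors' (global) view before switching off. The record itself gives no implementation details, so the following Python sketch is purely illustrative: the Router class, the vote_for_gating rule, and the thresholds are assumptions chosen to show the general decision structure, not the dissertation's actual design.

# Hypothetical sketch of a voting-based power-gating decision for a NoC router.
# All names and thresholds are illustrative assumptions.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Router:
    router_id: int
    buffered_flits: int = 0                    # local traffic status (buffer occupancy)
    neighbors: List["Router"] = field(default_factory=list)
    powered_on: bool = True

    LOCAL_IDLE_THRESHOLD = 0                   # assumed: idle when no flits are buffered
    VOTE_THRESHOLD = 0.75                      # assumed: fraction of gate-off votes required

    def vote_for_gating(self, candidate: "Router") -> bool:
        # A neighbor votes to gate `candidate` off only when it is itself
        # nearly idle, i.e. it does not expect to forward traffic through
        # the candidate (a crude stand-in for "global" traffic status).
        return self.buffered_flits <= self.LOCAL_IDLE_THRESHOLD

    def decide_power_state(self) -> bool:
        # Gate off only when the router is locally idle AND enough
        # neighbors agree; otherwise stay (or turn back) on.
        locally_idle = self.buffered_flits <= self.LOCAL_IDLE_THRESHOLD
        if not locally_idle or not self.neighbors:
            self.powered_on = True
            return self.powered_on
        votes = sum(n.vote_for_gating(self) for n in self.neighbors)
        self.powered_on = votes / len(self.neighbors) < self.VOTE_THRESHOLD
        return self.powered_on

# Usage: three routers in a row; the idle middle router stays powered on
# because its busy right-hand neighbor withholds its gate-off vote.
r0, r1, r2 = Router(0), Router(1), Router(2, buffered_flits=3)
r1.neighbors = [r0, r2]
print(r1.decide_power_state())   # True -> r1 remains powered on

In a real design the votes would presumably be driven by routing tables and measured link utilization rather than raw buffer occupancy; the point here is only the combined local-plus-global decision the abstract alludes to.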

