Communication and Computation Optimizations in Modular Systems-On-Chip

Majumder, Pritam

dc.contributor.advisor	Kim, Eun Jung
dc.contributor.advisor	Jimenez, Daniel
dc.creator	Majumder, Pritam
dc.date.accessioned	2023-02-07T16:06:35Z
dc.date.available	2024-05-01T06:05:38Z
dc.date.created	2022-05
dc.date.issued	2022-03-04
dc.date.submitted	May 2022
dc.identifier.uri	https://hdl.handle.net/1969.1/197161
dc.description.abstract	In our modern world, where everyone is always connected through the internet, terabytes of data are generated every moment through online activities such as social media communication, online banking, online shopping, browsing, streaming, telemedicine, information on global activities, weather forecasting, and astronomy. Today's e-commerce and scientific communities depend heavily on this data, creating huge demand for fast data processing through machine learning methods, both for timely decision making and for advancing our knowledge of science and the universe. To meet this demand, we currently rely on scale-out systems such as cloud servers, usually equipped with GPUs as general-purpose accelerators and often realized as SoCs. For large-scale systems, design modularity offers cheaper SoCs but brings inherent integration complexity, such as network deadlock involving multiple modules even though each individual module is deadlock-free. Our first observation is that deadlock in a modular SoC arises from a circular channel dependency involving two or more modules in the SoC. Further, traffic originating from different modules may mix and block one another in a circular fashion, which results in deadlock. We propose a deadlock avoidance technique for any modular SoC that makes the integration deadlock-free with minimum overhead. We evaluate our theory of deadlock freedom using full-system simulation of an SoC composed of independently designed CPUs and GPUs connected through an interposer network. The routing used in the CPU, the GPU, and the interposer is completely independent of the others, and the system experiences deadlock when exposed to high workload. Our technique successfully avoids the deadlock with much lower performance and energy overhead than the state-of-the-art turn-restriction-based SoC deadlock solution.
In addition, excessive data movement in these large systems is tackled by near-memory processing (NMP), but there is further scope for improvement in data and computation mapping, as reducing data movement alone does not ensure optimized operation cost. We propose a reinforcement learning (RL) based solution that reduces the cost of operations by improving the data and computation mapping in the memory network for NMP. The solution consists of two main components: (1) the formulation of the mapping optimization as an RL problem, which involves selecting enough information from the system to form states, deciding the actions based on the desired mapping outcome, and properly calibrating the feedback toward the goal of performance improvement; and (2) the realization of the information collection system and an efficient implementation of data and computation remapping. We integrate the RL framework with a system simulation framework and show that our technique, added as a plug-in module in the system, can significantly improve the performance of NMP techniques. Finally, we describe in detail the simulation framework, integrated with reinforcement learning, which we developed from scratch to evaluate our proposed solution.
dc.format.mimetype	application/pdf
dc.language.iso	en
dc.subject	data movement
dc.subject	deadlock
dc.subject	SoC
dc.subject	HMC
dc.subject	HBM
dc.subject	Operation cost
dc.subject	memory network
dc.subject	Processing in memory
dc.subject	Reinforcement Learning
dc.subject	Remote Control (RC)
dc.subject	AIMM
dc.title	Communication and Computation Optimizations in Modular Systems-On-Chip
dc.type	Thesis
thesis.degree.department	Computer Science and Engineering
thesis.degree.discipline	Computer Engineering
thesis.degree.grantor	Texas A&M University
thesis.degree.name	Doctor of Philosophy
thesis.degree.level	Doctoral
dc.contributor.committeeMember	Walker, Duncan Henry M.
dc.contributor.committeeMember	Muzahid, Abdullah
dc.contributor.committeeMember	Gratz, Paul
dc.type.material	text
dc.date.updated	2023-02-07T16:06:36Z
local.embargo.terms	2024-05-01
local.etdauthor.orcid	0000-0002-0313-7526

Files in this item

Name: MAJUMDER-DISSERTATION-2022.pdf
Size: 2.835Mb
Format: PDF

This item appears in the following Collection(s)

Electronic Theses, Dissertations, and Records of Study (2002– )
Texas A&M University Theses, Dissertations, and Records of Study (2002– )
