dc.contributor.advisor: Kim, Eun Jung
dc.contributor.advisor: Jimenez, Daniel
dc.creator: Majumder, Pritam
dc.date.accessioned: 2023-02-07T16:06:35Z
dc.date.available: 2024-05-01T06:05:38Z
dc.date.created: 2022-05
dc.date.issued: 2022-03-04
dc.date.submitted: May 2022
dc.identifier.uri: https://hdl.handle.net/1969.1/197161
dc.description.abstract: In a modern world where everyone is constantly connected through the internet, terabytes of data are generated every moment through online activities such as social media communication, online banking, online shopping, browsing, streaming, telemedicine, information on global activities, weather forecasting, and astronomy. The e-commerce and science communities depend heavily on this data, creating a huge demand for fast data processing through machine learning methods, both for timely decision making and for advancing our knowledge of science and the universe. To meet this demand, we currently rely on scale-out systems such as cloud servers, usually equipped with GPUs as general-purpose accelerators and often realized as SoCs. In large-scale system design, modularity makes SoCs cheaper but brings inherent integration complexity, such as network deadlock involving multiple modules even though each individual module is deadlock-free. Our first observation is that deadlock in a modular SoC arises from a circular channel dependency involving two or more modules of the SoC. Further, traffic originating from different modules may mix and block each other in a circular fashion, resulting in deadlock. We propose a deadlock avoidance technique that makes the integration of any modular SoC deadlock-free with minimal overhead. We evaluate our theory of deadlock freedom using full-system simulation of an SoC composed of independently designed CPUs and GPUs connected through an interposer network. The routing algorithms used in the CPU, the GPU, and the interposer are completely independent of each other, and the system experiences deadlock when exposed to a high workload. Our technique successfully avoids deadlock with much lower performance and energy overhead than the state-of-the-art turn-restriction-based SoC deadlock solution.
In addition, the excessive data movement in these large systems, which near-memory processing (NMP) is used to tackle, leaves further scope for improvement in data and computation mapping, since reducing data movement alone does not ensure an optimized operation cost. We propose a reinforcement learning (RL) based solution that lowers the cost of operations by improving the data and computation mapping in the memory network for NMP. The solution consists of two main components: (1) the formulation of the mapping optimization as an RL problem, which involves selecting enough information from the system to form states, deciding the actions based on the desired mapping outcome, and properly calibrating the feedback toward the goal of performance improvement; and (2) the realization of the information collection system and an efficient implementation of data and computation remapping. We integrate the RL framework with a system simulation framework and show that our technique, added as a plug-in module to the system, can significantly improve the performance of NMP techniques. Finally, we describe in detail the simulation framework, integrated with reinforcement learning, which we developed from scratch to evaluate our proposed solution.
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.subject: data movement
dc.subject: deadlock
dc.subject: SoC
dc.subject: HMC
dc.subject: HBM
dc.subject: Operation cost
dc.subject: memory network
dc.subject: Processing in memory
dc.subject: Reinforcement Learning
dc.subject: Remote Control (RC)
dc.subject: AIMM
dc.title: Communication and Computation Optimizations in Modular Systems-On-Chip
dc.type: Thesis
thesis.degree.department: Computer Science and Engineering
thesis.degree.discipline: Computer Engineering
thesis.degree.grantor: Texas A&M University
thesis.degree.name: Doctor of Philosophy
thesis.degree.level: Doctoral
dc.contributor.committeeMember: Walker, Duncan Henry M.
dc.contributor.committeeMember: Muzahid, Abdullah
dc.contributor.committeeMember: Gratz, Paul
dc.type.material: text
dc.date.updated: 2023-02-07T16:06:36Z
local.embargo.terms: 2024-05-01
local.etdauthor.orcid: 0000-0002-0313-7526

