Show simple item record

dc.contributor.advisor: Gratz, Paul V
dc.creator: Albarakat, Laith Mohammad
dc.date.accessioned: 2018-02-05T21:10:56Z
dc.date.available: 2018-02-05T21:10:56Z
dc.date.created: 2017-08
dc.date.issued: 2017-07-28
dc.date.submitted: August 2017
dc.identifier.uri: https://hdl.handle.net/1969.1/165781
dc.description.abstract: To take advantage of the processing power of Chip Multiprocessors, applications must be divided into semi-independent processes that can run concurrently on multiple cores within a system. Programmers must therefore insert thread synchronization semantics (i.e., locks, barriers, and condition variables) to synchronize data access between processes. In practice, threads spend a long time waiting to acquire the lock of a critical section. In addition, a processor has to stall execution while waiting for load accesses to complete. Furthermore, there are often independent instructions, including load instructions beyond the synchronization semantics, that could be executed in parallel while a thread waits on those semantics. The convenience of cache memories comes with extra cost in Chip Multiprocessors: cache coherence mechanisms address the memory consistency problem, but they add considerable overhead to memory accesses. An aggressive prefetcher on each core of a Chip Multiprocessor can significantly degrade system performance when running multi-threaded applications. This degradation results from prefetch-demand interference: when a prefetcher in one core pulls shared data from a producing core before that data has been written, the cache block transitions back and forth between the cores, producing useless prefetches that saturate memory bandwidth and substantially increase the latency to critical shared data. We present a hardware prefetcher that enables large performance improvements from prefetching in Chip Multiprocessors by significantly reducing prefetch-demand interference. Furthermore, it utilizes the time that a thread spends waiting on synchronization semantics to run ahead of the critical section, speculating and prefetching the data of independent load instructions beyond the synchronization semantics.
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.subject: prefetcher
dc.title: Multithreading Aware Hardware Prefetching for Chip Multiprocessors
dc.type: Thesis
thesis.degree.department: Electrical and Computer Engineering
thesis.degree.discipline: Computer Engineering
thesis.degree.grantor: Texas A&M University
thesis.degree.name: Master of Science
thesis.degree.level: Masters
dc.contributor.committeeMember: Hou, I-Hong
dc.contributor.committeeMember: Jimenez, Daniel A
dc.type.material: text
dc.date.updated: 2018-02-05T21:10:57Z
local.etdauthor.orcid: 0000-0001-9644-6565


