Show simple item record

dc.contributor.advisor: Reddy, A. L. Narasimha
dc.creator: Chou, Chih-Chieh
dc.date.accessioned: 2023-12-20T19:50:08Z
dc.date.available: 2023-12-20T19:50:08Z
dc.date.created: 2020-05
dc.date.issued: 2020-04-10
dc.date.submitted: May 2020
dc.identifier.uri: https://hdl.handle.net/1969.1/200777
dc.description.abstract: Emerging non-volatile memories (NVM), such as phase-change memory (PCM), NVDIMM, and 3D XPoint, combine byte-addressability and low latency, close to that of main memory, with the non-volatility of storage devices. A system can treat NVM as memory, as a storage device, or as both. These emerging technologies have great potential to improve system and application performance and to provide scalability and reliability for applications and services. Much prior work has focused on new system designs using NVM and has already shown promising performance improvements. In this work, we employ NVM in several different directions and approaches. First, we construct a user space library to virtualize and share NVM among multiple processes and applications. We build the library in user space for performance, which improves significantly when the context switch overheads of system calls are avoided. Because NVM has DRAM-like latency, accessing it through kernel space as before would incur so much context switching overhead that it would squander the low latency NVM provides. Our library accesses NVM mostly in user space and enters kernel space only when necessary. We show that our novel user space library incurs less than 10% overhead while providing virtualization and sharing of NVM. In addition, recently emerging interconnect fabrics, such as Gen-Z, provide high bandwidth together with exceptionally low latency. These concurrently emerging technologies make possible new system architectures in data centers, including systems with Fabric-Attached Memories (FAMs). FAMs can serve to create scalable, high-bandwidth, distributed, shared, byte-addressable, and non-volatile memory pools at rack scale. We propose a FAM-aware, checkpoint-based, post-copy live migration mechanism to improve the performance of application migration.
We have implemented a prototype of this mechanism in CRIU (Checkpoint/Restore In Userspace), an open source Linux checkpoint tool. According to our evaluation results, compared with the existing CRIU approach, our FAM-aware post-copy migration improves total migration time by at least 15%, busy time by at least 33%, and down time by at least 23%, and lets the migrated application perform at least 12% better during migration. In addition, our batching checkpoint improves average checkpoint time by 15%. Finally, we look at the problem of integrating low-latency NVM tightly into the memory hierarchy, and in particular at reducing the cost of page faults when moving data between DRAM and NVM. Because NVM provides such low latency, the overheads of context switches from page faults, like those from system calls, become too high. We approach this problem through hardware/software co-design, extending the CPU hardware page walker and enhancing the Linux kernel. We develop a page pre-allocation mechanism that pre-allocates pages before a page fault happens; the hardware page walker executes some of the required operations during the page fault, and a kernel background thread is triggered periodically to handle the post-page-fault operations. We show that, with our new hardware and kernel, the critical path latency of a page fault is reduced to 2.1% for writes and 12% for reads of that of the existing kernel page fault exception handler. We also improve the execution time of some benchmarks by about 5.6%.
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.subject: non-volatile memory
dc.subject: fabric-attached memory
dc.subject: background thread
dc.subject: live migration
dc.subject: user space library
dc.title: Optimizing Emerging Memory Systems for Performance
dc.type: Thesis
thesis.degree.department: Electrical and Computer Engineering
thesis.degree.discipline: Computer Engineering
thesis.degree.grantor: Texas A&M University
thesis.degree.name: Doctor of Philosophy
thesis.degree.level: Doctoral
dc.contributor.committeeMember: Gratz, Paul V.
dc.contributor.committeeMember: Narayanan, Krishna
dc.contributor.committeeMember: Kim, Eun Jung
dc.type.material: text
dc.date.updated: 2023-12-20T19:50:09Z
local.etdauthor.orcid: 0000-0002-3094-6951

