Principled Approaches to Last-Level Cache Management
Date
2017-07-20
Abstract
Memory is a critical component of all computing systems and represents a fundamental
performance and energy bottleneck. Ideally, memory characteristics such as energy cost,
performance, and the cost of implementing management techniques would scale together
with the size of computing systems; unfortunately, this is not the case. With upcoming
trends in applications, new memory technologies, and other developments, scaling becomes
an even bigger problem, aggravating the performance bottleneck that memory represents.
The memory hierarchy was proposed to alleviate this problem. Compared to the level above
it, each level in the hierarchy tends to have a lower cost per bit, a larger capacity, and a
higher access time. Ideally, all data would be stored in the fastest level of memory;
unfortunately, faster memory technologies tend to have a higher manufacturing cost, which
limits their capacity. The design challenge is to determine which data is frequently used
and to store it in the faster levels of memory.
A cache is a small, fast, on-chip memory. Any data stored in main memory can be stored
in the cache. A typical behavior of many programs is to access data that has been accessed
previously. Taking advantage of this behavior, a copy of frequently accessed data is kept
in the cache in order to provide a faster access time the next time it is requested. Due to
capacity constraints, it is likely that not all of the frequently reused data can fit in the
cache; cache management policies therefore decide which data is kept in the cache and
which is kept in other levels of the memory hierarchy. Under an efficient cache management
policy, a large fraction of memory requests will be serviced from the fast on-chip cache.
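
To make the role of a management policy concrete, the following is a minimal sketch of
least-recently-used (LRU) replacement for one set of a set-associative cache. LRU is a
common baseline rather than a policy proposed in this dissertation, and the associativity,
tag width, and C++ framing are illustrative assumptions.

    #include <array>
    #include <cstdint>
    #include <cstdio>

    // Sketch of LRU replacement for one set of a set-associative cache.
    // The associativity (WAYS) and 64-bit tags are illustrative assumptions.
    struct LRUSet {
        static constexpr int WAYS = 16;
        std::array<std::uint64_t, WAYS> tag{};   // block tag stored in each way
        std::array<bool, WAYS> valid{};          // whether the way holds a block
        std::array<int, WAYS> age{};             // 0 = most recently used

        LRUSet() { for (int w = 0; w < WAYS; ++w) age[w] = w; }

        // Make way `w` the most recently used; every younger block ages by one.
        void promote(int w) {
            for (int i = 0; i < WAYS; ++i)
                if (age[i] < age[w]) ++age[i];
            age[w] = 0;
        }

        // Returns true on a hit. On a miss, the least recently used way
        // (the one with the largest age) is evicted and the new block
        // installed in its place.
        bool access(std::uint64_t t) {
            for (int w = 0; w < WAYS; ++w)
                if (valid[w] && tag[w] == t) { promote(w); return true; }
            int victim = 0;
            for (int w = 1; w < WAYS; ++w)
                if (age[w] > age[victim]) victim = w;
            tag[victim] = t;
            valid[victim] = true;
            promote(victim);
            return false;
        }
    };

    int main() {
        LRUSet set;
        std::printf("%d\n", set.access(0xA));  // miss: prints 0
        std::printf("%d\n", set.access(0xA));  // hit:  prints 1
    }

Note that even this simple baseline updates its recency state (the age counters) on every
access; the cost of such updates motivates the space-efficient placement and promotion
policy introduced later in this work.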
The disparity in access latency between the last-level cache and main memory motivates
the search for efficient cache management policies. A great deal of recently proposed work
strives to utilize cache capacity in the way most favorable to performance. Related work
focuses on optimizing cache performance through different possible solutions, e.g., reducing
the miss rate, consuming less power, reducing storage overhead, and reducing access latency.
Our work focuses on improving the performance of last-level caches by designing policies
based on principles adapted from other areas of interest. In this dissertation, we address
several aspects of cache management policies. We first introduce a space-efficient
placement and promotion policy whose goal is to minimize the updates to replacement
policy state on each cache access. We then introduce a mechanism that predicts whether
a block in the cache will be reused; it feeds different features of a block to the predictor
in order to increase the correlation between a previous access and a future access. Finally,
we introduce a technique that tweaks traditional cache indexing, providing fast access to
the vast majority of requests in the presence of a slow-access memory technology such as DRAM.
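
The abstract does not specify how the reuse predictor combines block features. As a rough
illustration only, the sketch below organizes a feature-based reuse predictor in the style
of a hashed perceptron: each feature indexes a table of small saturating weights, and the
sign of their sum gives the prediction. The particular features (PC and tag bits), table
size, counter range, and threshold are all assumptions made for this example, not the
design evaluated in the dissertation.

    #include <array>
    #include <cstdint>
    #include <cstdio>

    // Generic sketch of a feature-based reuse predictor. Each feature of a
    // block hashes into its own table of saturating weights; the summed
    // weights are compared against a threshold to predict reuse. All sizes
    // and features here are illustrative assumptions.
    struct ReusePredictor {
        static constexpr int NUM_FEATURES = 3;
        static constexpr int TABLE_SIZE = 256;   // entries per feature table
        static constexpr int WEIGHT_MAX = 31;    // 6-bit saturating weights
        static constexpr int WEIGHT_MIN = -32;
        static constexpr int THRESHOLD = 0;      // predict "reused" if sum >= 0

        std::array<std::array<int, TABLE_SIZE>, NUM_FEATURES> weights{};

        // Hypothetical features: low PC bits, shifted PC bits, low tag bits.
        static std::array<unsigned, NUM_FEATURES>
        features(std::uint64_t pc, std::uint64_t tag) {
            return { static_cast<unsigned>(pc) % TABLE_SIZE,
                     static_cast<unsigned>(pc >> 8) % TABLE_SIZE,
                     static_cast<unsigned>(tag) % TABLE_SIZE };
        }

        bool predict_reused(std::uint64_t pc, std::uint64_t tag) const {
            auto f = features(pc, tag);
            int sum = 0;
            for (int i = 0; i < NUM_FEATURES; ++i)
                sum += weights[i][f[i]];
            return sum >= THRESHOLD;
        }

        // When the block's outcome is known (reused before eviction or not),
        // nudge each indexed weight toward the outcome, saturating at the
        // counter limits.
        void train(std::uint64_t pc, std::uint64_t tag, bool was_reused) {
            auto f = features(pc, tag);
            for (int i = 0; i < NUM_FEATURES; ++i) {
                int &w = weights[i][f[i]];
                if (was_reused && w < WEIGHT_MAX) ++w;
                if (!was_reused && w > WEIGHT_MIN) --w;
            }
        }
    };

    int main() {
        ReusePredictor p;
        p.train(0x400b3c, 0x1f2a, true);                          // observed reuse
        std::printf("%d\n", p.predict_reused(0x400b3c, 0x1f2a));  // prints 1
    }

Because training touches only the weights indexed by a block's features, a feature that
correlates with reuse accumulates a strong weight while uncorrelated features stay near
zero, which is one way to increase the correlation between past behavior and a future access.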
Keywords
LLC, replacement