Abstract
This study designs and optimizes a multithreaded data cache for a hyperscalar processor. The data cache can support two simultaneous requests from a single thread in each cycle. The multithreaded processor using the data cache is assumed to generate at most two requests from a single thread in each cycle, after which it switches to another thread and repeats the operation. The data cache can handle separate requests from different threads at each cycle. The cache is lockup-free (non-blocking), which allows it to serve requests from one thread while servicing misses from another. The miss penalty is reduced by a data forwarding technique that forwards the missed data to the CPU as soon as the cache fetches it from memory. The cache supports one outstanding request per thread, so a thread does not generate a new request until its previous request has been satisfied. A simulation model of the data cache is developed in the Verilog hardware description language, and trace-driven simulation is carried out to optimize the cache for this high-performance processor.
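The behavior described above (a non-blocking cache, one outstanding request per thread, and forwarding of missed data on arrival) can be illustrated with a small behavioral sketch. This is not the thesis's Verilog model: the class name `ToyNonBlockingCache`, the constant `MISS_LATENCY`, and the method names are hypothetical, the memory latency is an arbitrary assumed value, and cache capacity, replacement, and the two-requests-per-cycle port structure are left out.

```python
from dataclasses import dataclass, field

MISS_LATENCY = 4  # assumed memory latency in cycles (hypothetical value)


@dataclass
class ToyNonBlockingCache:
    """Illustrative model only; names and structure are assumptions."""
    lines: set = field(default_factory=set)      # addresses currently cached
    pending: dict = field(default_factory=dict)  # thread -> (address, ready cycle)

    def can_issue(self, thread):
        # One outstanding request per thread: a thread with a pending
        # miss may not generate a new request.
        return thread not in self.pending

    def access(self, thread, address, cycle):
        """Return 'hit', 'miss', or 'stalled' for this thread's request."""
        if not self.can_issue(thread):
            return "stalled"
        if address in self.lines:
            return "hit"
        # Record the miss; other threads can still be served meanwhile.
        self.pending[thread] = (address, cycle + MISS_LATENCY)
        return "miss"

    def tick(self, cycle):
        """Complete fills due by this cycle; return {thread: address} forwarded."""
        forwarded = {}
        for thread, (address, ready) in list(self.pending.items()):
            if ready <= cycle:
                self.lines.add(address)      # fill the cache line
                forwarded[thread] = address  # forward to the CPU in the same step
                del self.pending[thread]
        return forwarded


cache = ToyNonBlockingCache(lines={0x200})
print(cache.access(0, 0x100, cycle=0))  # thread 0 misses
print(cache.access(1, 0x200, cycle=1))  # thread 1 is served during the miss
print(cache.access(0, 0x300, cycle=1))  # thread 0 must wait for its miss
print(cache.tick(cycle=4))              # fill arrives and is forwarded
```

In this sketch, thread 1 hits while thread 0's miss is still pending, which is the lockup-free property, and thread 0's second request is refused until `tick` completes the fill and forwards the data.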
Shahnaz, Munira (1995). Design of a multithreaded data cache for a hyperscalar processor. Master's thesis, Texas A&M University. Available electronically from
https://hdl.handle.net/1969.1/ETD-TAMU-1995-THESIS-S53.