Abstract
Memory architecture is an important component of a distributed shared-memory parallel computer. This thesis studies three shared-memory architectures: Non-Uniform Memory Access (NUMA) with full-mapped directories, Cache-Only Memory Architecture (COMA) with full-mapped directories, and COMA with directories based on a new design using binomial trees. The three architectures were implemented in the Proteus execution-driven simulator, which simulated the execution of three applications taken from the SPLASH-2 suite of benchmark parallel programs. Six sets of simulations were run, providing performance data over a range of values for six important design parameters: page size, block size, number of processors, memory controller speed, cache size, and interconnection network topology. These simulations have two major benefits. First, they aid in choosing the best values for key design parameters. Second, they allow direct comparison of COMA with NUMA, as well as of the two directory designs. The simulations show that both COMA architectures generally perform better than NUMA. COMA also proved to be less sensitive to suboptimal choices of primary cache size and block size. In most cases, COMA with full-mapped directories performed slightly better than COMA with binomial-tree directories. However, the binomial-tree directories require significantly less hardware (eleven versus sixty-four bits per block for the machines simulated in this thesis).
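The hardware comparison at the end of the abstract can be made concrete with a little arithmetic. The sketch below is illustrative only and is not taken from the thesis. The two per-block directory sizes (64 and 11 bits) come from the abstract; the block sizes in the loop and the function names are hypothetical, chosen only to show how directory overhead scales with block size. A full-mapped directory conventionally keeps one presence bit per processor, so 64 bits per block would be consistent with a 64-processor machine, but that is an inference and is not stated in the abstract.

```python
# Illustrative sketch: per-block directory storage overhead.
# Only the two constants below come from the abstract; everything else
# (function names, block sizes) is an assumption made for illustration.

FULL_MAP_BITS = 64   # full-mapped directory, bits per block (from the abstract)
BINOMIAL_BITS = 11   # binomial-tree directory, bits per block (from the abstract)


def directory_overhead(dir_bits: int, block_bytes: int) -> float:
    """Return directory bits as a fraction of the data bits in one block."""
    return dir_bits / (block_bytes * 8)


def savings_ratio() -> float:
    """Return the fraction of directory bits saved by the binomial-tree design."""
    return 1 - BINOMIAL_BITS / FULL_MAP_BITS


if __name__ == "__main__":
    # Hypothetical block sizes; the abstract does not list the values studied.
    for block_bytes in (16, 32, 64, 128):
        full = directory_overhead(FULL_MAP_BITS, block_bytes)
        tree = directory_overhead(BINOMIAL_BITS, block_bytes)
        print(f"{block_bytes:4d}-byte block: full-map {full:.1%}, binomial tree {tree:.1%}")
    print(f"Directory bits saved per block: {savings_ratio():.1%}")
```

Under these assumptions, the relative saving in directory bits is the same for every block size, while the absolute overhead of either design shrinks as blocks grow.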
Holzrichter, Michael Warren (1995). Performance evaluation of NUMA and COMA distributed shared-memory multiprocessors. Master's thesis, Texas A&M University. Available electronically from
https://hdl.handle.net/1969.1/ETD-TAMU-1995-THESIS-H654.