Virtual Memory Streaming and Sorting in MapReduce Applications
Abstract
In the age of fast-growing technology, massive storage, and cluster computing, efficient big-data processing algorithms are in high demand. MapReduce is one of the programming models that enable massive-scale cluster technology around the world. Despite significant public efforts, the open-source implementation of MapReduce – Apache Hadoop – is cumbersome, complex, and inefficient. The purpose of this research is to improve the performance of Hadoop, specifically its sorting component, by developing a single-pass, stream-based, multithreaded bucket sort. Our new set of algorithms has the potential to influence the future of data-centric computing.
Citation
Yao, Yuan (2019). Virtual Memory Streaming and Sorting in MapReduce Applications. Undergraduate Research Scholars Program. Available electronically from https://hdl.handle.net/1969.1/166469.