Virtual Memory Streaming and Sorting in MapReduce Applications

dc.creator	Yao, Yuan
dc.date.accessioned	2018-05-23T15:32:53Z
dc.date.available	2018-05-23T15:32:53Z
dc.date.created	2019-05
dc.date.submitted	May 2019
dc.identifier.uri	https://hdl.handle.net/1969.1/166469
dc.description.abstract	In an age of fast-growing technology, massive storage, and cluster computing, efficient big-data processing algorithms are in high demand. MapReduce is one of the programming models that enable massive-scale cluster technology around the world. Despite significant public efforts, the open-source implementation of MapReduce – Apache Hadoop – is cumbersome, complex, and inefficient. The purpose of this research is to improve the performance of Hadoop, specifically its sorting component, by developing a single-pass, stream-based, multithreaded bucket sort. Our new set of algorithms has the potential to influence the future of data-centric computing.	en
dc.format.mimetype	application/pdf
dc.subject	storage	en
dc.subject	cluster computing	en
dc.subject	MapReduce	en
dc.subject	algorithms	en
dc.subject	single-pass	en
dc.subject	stream-based multithreaded sort	en
dc.title	Virtual Memory Streaming and Sorting in MapReduce Applications	en
dc.type	Thesis	en
thesis.degree.department	Computer Science & Engineering	en
thesis.degree.discipline	Computer Science	en
thesis.degree.grantor	Undergraduate Research Scholars Program	en
thesis.degree.name	BS	en
thesis.degree.level	Undergraduate	en
dc.contributor.committeeMember	Loguinov, Dmitri
dc.type.material	text	en
dc.date.updated	2018-05-23T15:32:54Z