dc.creator | Yao, Yuan | |
dc.date.accessioned | 2018-05-23T15:32:53Z | |
dc.date.available | 2018-05-23T15:32:53Z | |
dc.date.created | 2019-05 | |
dc.date.submitted | May 2019 | |
dc.identifier.uri | https://hdl.handle.net/1969.1/166469 | |
dc.description.abstract | In the age of fast-growing technology, massive storage, and cluster computing, efficient big-data processing algorithms are in high demand. MapReduce is one of the programming models that enable massive-scale cluster computing around the world. Despite significant public efforts, the open-source implementation of MapReduce – Apache Hadoop – is cumbersome, complex, and inefficient. The purpose of this research is to improve the performance of Hadoop, specifically its sorting component, by developing a single-pass, stream-based, multithreaded bucket sort. Our new set of algorithms has the potential to influence the future of data-centric computing. | en |
dc.format.mimetype | application/pdf | |
dc.subject | storage | en |
dc.subject | cluster computing | en |
dc.subject | MapReduce | en |
dc.subject | algorithms | en |
dc.subject | single-pass | en |
dc.subject | stream-based multithreaded sort | en |
dc.title | Virtual Memory Streaming and Sorting in MapReduce Applications | en |
dc.type | Thesis | en |
thesis.degree.department | Computer Science & Engineering | en |
thesis.degree.discipline | Computer Science | en |
thesis.degree.grantor | Undergraduate Research Scholars Program | en |
thesis.degree.name | BS | en |
thesis.degree.level | Undergraduate | en |
dc.contributor.committeeMember | Loguinov, Dmitri | |
dc.type.material | text | en |
dc.date.updated | 2018-05-23T15:32:54Z | |