Scalable Parallel Algorithms for Massive Scale-free Graphs

Pearce, Roger Allan

dc.contributor.advisor	Amato, Nancy M
dc.creator	Pearce, Roger Allan
dc.date.accessioned	2014-05-13T17:30:27Z
dc.date.available	2015-12-01T06:31:20Z
dc.date.created	2013-12
dc.date.issued	2013-12-05
dc.date.submitted	December 2013
dc.identifier.uri	https://hdl.handle.net/1969.1/151937
dc.description.abstract	Efficiently storing and processing massive graph data sets is a challenging problem as researchers seek to leverage “Big Data” to answer next-generation scientific questions. New techniques are required to process large scale-free graphs in shared, distributed, and external memory. This dissertation develops new techniques to parallelize the storage, computation, and communication for scale-free graphs with high-degree vertices. Our work facilitates the processing of large real-world graph datasets through the development of parallel algorithms and tools that scale to large computational and memory resources, overcoming challenges not addressed by existing techniques. Our aim is to scale to trillions of edges, and our research is targeted at leadership class supercomputers, clusters with local non-volatile memory, and shared memory systems. We present three novel techniques to address scaling challenges in processing large scale-free graphs. We apply an asynchronous graph traversal technique using prioritized visitor queues that is capable of tolerating data latencies to the external graph storage media and message passing communication. To accommodate large high-degree vertices, we present an edge list partitioning technique that evenly partitions graphs containing high-degree vertices. Finally, we propose a technique we call distributed delegates that distributes and parallelizes the storage, computation, and communication when processing high-degree vertices. The edges of high-degree vertices are distributed, providing additional opportunities for parallelism not present in existing methods. We apply our techniques to multiple graph algorithms: Breadth-First Search, Single Source Shortest Path, Connected Components, K-Core decomposition, Triangle Counting, and PageRank. Our experimental study of these algorithms demonstrates excellent scalability on supercomputers, clusters with non-volatile memory, and shared memory systems.
Our study includes multiple synthetic scale-free graph models, the largest of which has a trillion edges, and real-world input graphs. On a supercomputer, we demonstrate scalability up to 131K processors, and improve the best known Graph500 results for IBM BG/P Intrepid by 15%.	en
dc.format.mimetype	application/pdf
dc.language.iso	en
dc.subject	parallel algorithms	en
dc.subject	graph algorithms	en
dc.subject	scale-free graphs	en
dc.subject	graph partitioning	en
dc.title	Scalable Parallel Algorithms for Massive Scale-free Graphs	en
dc.type	Thesis	en
thesis.degree.department	Computer Science and Engineering	en
thesis.degree.discipline	Computer Science	en
thesis.degree.grantor	Texas A & M University	en
thesis.degree.name	Doctor of Philosophy	en
thesis.degree.level	Doctoral	en
dc.contributor.committeeMember	Choe, Yoonsuck
dc.contributor.committeeMember	Rauchwerger, Lawrence
dc.contributor.committeeMember	Adams, Marvin L
dc.contributor.committeeMember	Gokhale, Maya
dc.type.material	text	en
dc.date.updated	2014-05-13T17:30:27Z
local.embargo.terms	2015-12-01

Files in this item

Name: PEARCE-DISSERTATION-2013.pdf
Size: 896.4Kb
Format: PDF

This item appears in the following Collection(s)

Electronic Theses, Dissertations, and Records of Study (2002– )
Texas A&M University Theses, Dissertations, and Records of Study (2002– )