dc.contributor.advisor: Amato, Nancy M
dc.creator: Pearce, Roger Allan
dc.date.accessioned: 2014-05-13T17:30:27Z
dc.date.available: 2015-12-01T06:31:20Z
dc.date.created: 2013-12
dc.date.issued: 2013-12-05
dc.date.submitted: December 2013
dc.identifier.uri: http://hdl.handle.net/1969.1/151937
dc.description.abstract: Efficiently storing and processing massive graph data sets is a challenging problem as researchers seek to leverage “Big Data” to answer next-generation scientific questions. New techniques are required to process large scale-free graphs in shared, distributed, and external memory. This dissertation develops new techniques to parallelize the storage, computation, and communication for scale-free graphs with high-degree vertices. Our work facilitates the processing of large real-world graph datasets through the development of parallel algorithms and tools that scale to large computational and memory resources, overcoming challenges not addressed by existing techniques. Our aim is to scale to trillions of edges, and our research is targeted at leadership-class supercomputers, clusters with local non-volatile memory, and shared memory systems. We present three novel techniques to address scaling challenges in processing large scale-free graphs. First, we apply an asynchronous graph traversal technique using prioritized visitor queues that tolerates data latencies to external graph storage media and in message-passing communication. Second, to accommodate large high-degree vertices, we present an edge list partitioning technique that evenly partitions graphs containing high-degree vertices. Finally, we propose a technique we call distributed delegates that distributes and parallelizes the storage, computation, and communication when processing high-degree vertices. The edges of high-degree vertices are distributed, providing additional opportunities for parallelism not present in existing methods. We apply our techniques to multiple graph algorithms: Breadth-First Search, Single Source Shortest Path, Connected Components, K-Core decomposition, Triangle Counting, and PageRank. Our experimental study of these algorithms demonstrates excellent scalability on supercomputers, clusters with non-volatile memory, and shared memory systems. Our study includes multiple synthetic scale-free graph models, the largest of which has a trillion edges, and real-world input graphs. On a supercomputer, we demonstrate scalability up to 131K processors, and improve the best known Graph500 results for IBM BG/P Intrepid by 15%.
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.subject: parallel algorithms
dc.subject: graph algorithms
dc.subject: scale-free graphs
dc.subject: graph partitioning
dc.title: Scalable Parallel Algorithms for Massive Scale-free Graphs
dc.type: Thesis
thesis.degree.department: Computer Science and Engineering
thesis.degree.discipline: Computer Science
thesis.degree.grantor: Texas A & M University
thesis.degree.name: Doctor of Philosophy
thesis.degree.level: Doctoral
dc.contributor.committeeMember: Choe, Yoonsuck
dc.contributor.committeeMember: Rauchwerger, Lawrence
dc.contributor.committeeMember: Adams, Marvin L
dc.contributor.committeeMember: Gokhale, Maya
dc.type.material: text
dc.date.updated: 2014-05-13T17:30:27Z
local.embargo.terms: 2015-12-01

