Show simple item record

dc.contributor.advisorDavis, Tim
dc.creatorMahmoudi Aznaveh, Mohsen
dc.date.accessioned2023-09-18T16:38:34Z
dc.date.available2023-09-18T16:38:34Z
dc.date.created2022-12
dc.date.issued2022-12-05
dc.date.submittedDecember 2022
dc.identifier.urihttps://hdl.handle.net/1969.1/198633
dc.description.abstractWe are introducing a new sparse direct solver with a parallel multifrontal algorithm. The algorithm is designed by borrowing the symbolic analysis phase and some ideas from UMFPACK. UMFPACK is a part of the SuiteSparse package and appears as a built-in routine in MATLAB. Although the general idea of the algorithm is the same, we designed a new algorithm from scratch that is more amenable to parallelism. The new algorithm is right-looking multifrontal with rectangular fronts and uses the sparsest pivot like UMFPACK; however, its data structures are different, and the factorization is more coarse-grained. One of the most significant contributions of this work is the algorithm that cuts various dependencies of the data structure. The only dependency comes from the matrix’s pattern from the elimination tree. Independent tasks can start working in different computing cores using OpenMP tasking. Each task can call BLAS kernels, specifically matrix multiplication and triangular solve. Therefore, there is nested parallelism in this algorithm. To better manage hardware resources, we can exploit parallel BLAS only if we have more computing cores than tasks. In practice, the performance depends on the BLAS library and the input matrix. It is better to have a mixed strategy to have both parallel BLAS with parallel fronts. Therefore, our algorithm is less affected by parallel BLAS and shows good performance compared to UMFPACK. Data structure and memory management are different in ParU, so it needs less memory to solve a system. The current algorithm is implemented for a shared memory environment. However, the algorithm can be an excellent candidate to be implemented in a distributed environment. Moreover, while there are parallel BLAS calls (typically larger than what UMFPACK has), it is also an excel-lent candidate for using hardware accelerators like GPUs.
dc.format.mimetypeapplication/pdf
dc.language.isoen
dc.subjectSparse linear algebra
dc.subjectMatrix computation
dc.titleParU: A Task Based Parallel Multifrontal and Unsymmetric Sparse LU Factorization
dc.typeThesis
thesis.degree.departmentComputer Science and Engineering
thesis.degree.disciplineComputer Engineering
thesis.degree.grantorTexas A&M University
thesis.degree.nameDoctor of Philosophy
thesis.degree.levelDoctoral
dc.contributor.committeeMemberSarin, Vivek
dc.contributor.committeeMemberSong, Dezhen
dc.contributor.committeeMemberFoucart, Simon
dc.type.materialtext
dc.date.updated2023-09-18T16:38:35Z
local.etdauthor.orcid0000-0003-4860-4762


Files in this item

Thumbnail

This item appears in the following Collection(s)

Show simple item record