Show simple item record

dc.contributor.advisor: Yoon, Byung-Jun
dc.creator: Sahraeian, Sayed 1983-
dc.date.accessioned: 2013-10-02T21:26:35Z
dc.date.available: 2015-05-01T05:57:08Z
dc.date.created: 2013-05
dc.date.issued: 2013-01-07
dc.date.submitted: May 2013
dc.identifier.uri: http://hdl.handle.net/1969.1/149225
dc.description.abstract: Comparative analysis of genomic data investigates the relationship between genome structure and function across different biological species to shed light on their similarities and differences. In this dissertation, we study two important problems in comparative genomics, namely comparative sequence analysis and comparative network analysis. In comparative sequence analysis, we study the multiple sequence alignment of protein and DNA sequences as well as the structural alignment of multiple RNA sequences. For closely related sequences, multiple sequence alignment can be efficiently performed through progressive techniques. However, for divergent sequences it is very challenging to predict an accurate alignment. Here, we introduce PicXAA, an efficient non-progressive technique for multiple protein and DNA sequence alignment. We further extend PicXAA to PicXAA-R for structural alignment of RNA sequences. PicXAA and PicXAA-R greedily build up the alignment from sequence regions with high local similarity, thereby yielding an accurate global alignment that effectively captures local similarities among sequences. As another important research area in comparative genomics, we also investigate the comparative network analysis problem. Translating the increasing number of large-scale biological networks into meaningful biological insights requires efficient computational techniques. One such example is network querying, which aims to identify subnetwork regions in a large target network that are similar to a given query network. Here, we introduce an efficient algorithm for querying large-scale biological networks, called RESQUE. RESQUE adopts a semi-Markov random walk model to probabilistically estimate the correspondence scores between nodes that belong to different networks. The target network is iteratively reduced based on the estimated correspondence scores until the best matching subnetwork emerges. The proposed network querying scheme is computationally efficient, can handle any network query with an arbitrary topology, and yields accurate querying results. We also extend the idea used in RESQUE to develop an efficient algorithm for the alignment of multiple large-scale biological networks, called SMETANA. SMETANA outperforms state-of-the-art network alignment techniques in terms of both computational efficiency and alignment accuracy. These studies have enabled us to provide a coherent framework for a probabilistic approach to the comparative analysis of biological sequences and networks. Such a probabilistic framework helps us employ rigorous mathematical schemes to find accurate and efficient solutions to these problems.
dc.format.mimetype: application/pdf
dc.subject: Network alignment
dc.subject: Network querying
dc.subject: Sequence alignment
dc.subject: Comparative genomics
dc.title: Probabilistic Approaches in Comparative Analysis of Biological Networks and Sequences
dc.type: Thesis
thesis.degree.department: Electrical and Computer Engineering
thesis.degree.discipline: Electrical Engineering
thesis.degree.grantor: Texas A&M University
thesis.degree.name: Doctor of Philosophy
thesis.degree.level: Doctoral
dc.contributor.committeeMember: Dougherty, Edward
dc.contributor.committeeMember: Chamberland, Jean-Francois
dc.contributor.committeeMember: Sze, Sing-Hoi
dc.contributor.committeeMember: Yan, Catherine
dc.type.material: text
dc.date.updated: 2013-10-02T21:26:35Z
local.embargo.terms: 2015-05-01

