Probabilistic Approaches in Comparative Analysis of Biological Networks and Sequences

Sahraeian, Sayed 1983-

The full text of this item is not available at this time because the student has placed this item under an embargo for a period of time. The Libraries are not authorized to provide a copy of this work during the embargo period, even for Texas A&M users with NetID.

dc.contributor.advisor	Yoon, Byung-Jun
dc.creator	Sahraeian, Sayed 1983-
dc.date.accessioned	2013-10-02T21:26:35Z
dc.date.available	2015-05-01T05:57:08Z
dc.date.created	2013-05
dc.date.issued	2013-01-07
dc.date.submitted	May 2013
dc.identifier.uri	https://hdl.handle.net/1969.1/149225
dc.description.abstract	Comparative analysis of genomic data investigates the relationship of genome structure and function across different biological species to shed light on their similarities and differences. In this dissertation, we study two important problems in comparative genomics, namely comparative sequence analysis and comparative network analysis. In comparative sequence analysis, we study the multiple sequence alignment of protein and DNA sequences as well as the structural alignment of multiple RNA sequences. For closely related sequences, multiple sequence alignment can be efficiently performed through progressive techniques. However, for divergent sequences it is very challenging to predict an accurate alignment. Here, we introduce PicXAA, an efficient non-progressive technique for multiple protein and DNA sequence alignment. We further extend PicXAA to PicXAA-R for structural alignment of RNA sequences. PicXAA and PicXAA-R greedily build up the alignment from sequence regions with high local similarity, thereby yielding an accurate global alignment that effectively captures local similarities among sequences. As another important research area in comparative genomics, we also investigate the comparative network analysis problem. Translating the increasing number of large-scale biological networks into meaningful biological insights requires efficient computational techniques. One such example is network querying, which aims to identify subnetwork regions in a large target network that are similar to a given query network. Here, we introduce an efficient algorithm for querying large-scale biological networks, called RESQUE. RESQUE adopts a semi-Markov random walk model to probabilistically estimate the correspondence scores between nodes that belong to different networks. The target network is iteratively reduced based on the estimated correspondence scores until the best-matching subnetwork emerges.
The proposed network querying scheme is computationally efficient, can handle any network query with an arbitrary topology, and yields accurate querying results. We also extend the idea used in RESQUE to develop an efficient algorithm for alignment of multiple large-scale biological networks, called SMETANA. SMETANA outperforms state-of-the-art network alignment techniques in terms of both computational efficiency and alignment accuracy. These studies have enabled us to provide a coherent framework for a probabilistic approach to comparative analysis of biological sequences and networks. Such a probabilistic framework helps us employ rigorous mathematical schemes to find accurate and efficient solutions to these problems.	en
dc.format.mimetype	application/pdf
dc.subject	Network alignment	en
dc.subject	Network querying	en
dc.subject	Sequence alignment	en
dc.subject	Comparative genomics	en
dc.title	Probabilistic Approaches in Comparative Analysis of Biological Networks and Sequences	en
dc.type	Thesis	en
thesis.degree.department	Electrical and Computer Engineering	en
thesis.degree.discipline	Electrical Engineering	en
thesis.degree.grantor	Texas A&M University	en
thesis.degree.name	Doctor of Philosophy	en
thesis.degree.level	Doctoral	en
dc.contributor.committeeMember	Dougherty, Edward
dc.contributor.committeeMember	Chamberland, Jean-Francois
dc.contributor.committeeMember	Sze, Sing-Hoi
dc.contributor.committeeMember	Yan, Catherine
dc.type.material	text	en
dc.date.updated	2013-10-02T21:26:35Z
local.embargo.terms	2015-05-01

Files in this item

Name: SAHRAEIAN-DISSERTATION-2013.pdf
Size: 3.186Mb
Format: PDF

This item appears in the following Collection(s)

Electronic Theses, Dissertations, and Records of Study (2002– )
Texas A&M University Theses, Dissertations, and Records of Study (2002– )