Abstract
The work reported here describes methods for automating the structure determination of proteins from electron density maps. To begin, a summary of crystallography, including its history and the methods of collecting diffraction data, is presented. This is followed by an explanation of how X-ray diffraction data are collected and used to estimate the phases needed to calculate an electron density map. Several previously presented and tested methods for interpreting electron density maps are then discussed in relation to the method behind the program presented here, TEXTAL. TEXTAL is a modeling program that uses electron density pattern recognition to extract amino acids from a database; these are used to construct a model of a protein molecule. The method requires knowledge of the α-carbon (Cα) coordinates, so the first step is to locate the Cα coordinates with high confidence. To do this, a program is presented that uses the machine learning technique of decision tree learning to predict the locations of the Cα coordinates. A description of the construction of the decision tree and of the methods behind decision tree learning is included. The program was tested on several maps with different decision trees; the results were encouraging, but the approach also suffered from fundamental problems. This is followed by a description of the post-processing required to complete the models constructed by TEXTAL. An implementation of a post-processing step is described and results are given. The conclusion describes the entire automated process presented here and the impact it may have on the field of crystallography.
Holton, Thomas Raymond (2000). Using pattern recognition in electron density map interpretation. Master's thesis, Texas A&M University. Available electronically from
https://hdl.handle.net/1969.1/ETD-TAMU-2000-THESIS-H64.