Texas A&M University Libraries
    OAKTrust

    Analytic Study of Performance of Error Estimators for Linear Discriminant Analysis with Applications in Genomics

    ZOLLANVARI-DISSERTATION.pdf (1.714Mb)
    Date
    2012-02-14
    Author
    Zollanvari, Amin
    Abstract
    Error estimation is needed to assess the accuracy of a designed classifier, an issue that is critical in biomarker discovery for disease diagnosis and prognosis in genomics and proteomics. This dissertation is concerned with the analytical formulation of the joint distribution of the true misclassification error and two of its commonly used estimators, resubstitution and leave-one-out, as well as their marginal and mixed moments, in the context of the Linear Discriminant Analysis (LDA) classification rule. In the first part of this dissertation, we obtain the joint sampling distribution of the actual and estimated errors under a general parametric Gaussian assumption. Exact results are provided in the univariate case and an accurate approximation is obtained in the multivariate case. We show how these results can be applied in the computation of conditional bounds and the regression of the actual error, given the observed error estimate. In practice, the parameters of the Gaussian distributions that appear in these expressions are unknown and need to be estimated. Plugging the usual maximum-likelihood estimates of these parameters into the theoretical exact expressions provides a sample-based approximation to the joint distribution, as well as sample-based methods to estimate upper conditional bounds. In the second part of this dissertation, exact analytical expressions for the bias, variance, and Root Mean Square (RMS) of the resubstitution and leave-one-out error estimators in the univariate Gaussian model are derived. All probabilistic characteristics of an error estimator are determined by its joint distribution with the true error. Partial information is contained in their mixed moments, in particular their second mixed moment. Marginal information regarding an error estimator is contained in its marginal moments, in particular its mean and variance. Since we are interested in estimator accuracy and wish to use the RMS to measure that accuracy, we need the second-order moments, both marginal and mixed with the true error. In the multivariate case, using a double asymptotic approach and assuming that the common covariance matrix of the Gaussian model is known, analytical expressions for the first moments, second moments, and mixed moment with the actual error are derived for the resubstitution and leave-one-out error estimators. Numerical comparisons demonstrate that these results provide accurate small-sample approximations. Application of the results is discussed in the context of genomics.
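
    The quantities named in the abstract can be illustrated with a small Monte Carlo sketch. This is not the dissertation's analytical method: it only simulates the univariate Gaussian model to show what the true error, the resubstitution and leave-one-out estimates, and their bias, variance, and RMS refer to. The equal priors, equal class sample sizes, parameter values, and all function names are assumptions made for illustration. The RMS is computed from the marginal and mixed second moments, RMS^2 = E[est^2] - 2 E[est * true] + E[true^2], which is why the mixed moment with the true error is needed.

```python
import math
import random


def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def classify(x, m0, m1):
    """Univariate LDA, equal priors assumed: label 0 if W(x) > 0, else 1."""
    w = (x - 0.5 * (m0 + m1)) * (m0 - m1)
    return 0 if w > 0 else 1


def true_error(m0, m1, mu0, mu1, sigma):
    """Exact error of the designed classifier under the known Gaussian model."""
    c = 0.5 * (m0 + m1)  # decision threshold
    if m0 > m1:          # label 0 is assigned when x > c
        e0, e1 = phi((c - mu0) / sigma), 1.0 - phi((c - mu1) / sigma)
    elif m0 < m1:        # label 0 is assigned when x < c
        e0, e1 = 1.0 - phi((c - mu0) / sigma), phi((c - mu1) / sigma)
    else:                # degenerate tie: every point gets label 1
        e0, e1 = 1.0, 0.0
    return 0.5 * (e0 + e1)


def resubstitution(x0, x1):
    """Fraction of training points misclassified by the classifier they designed."""
    m0, m1 = sum(x0) / len(x0), sum(x1) / len(x1)
    errors = sum(classify(x, m0, m1) != 0 for x in x0)
    errors += sum(classify(x, m0, m1) != 1 for x in x1)
    return errors / (len(x0) + len(x1))


def leave_one_out(x0, x1):
    """Hold out each point, redesign on the rest, and test on the held-out point."""
    n0, n1 = len(x0), len(x1)
    s0, s1 = sum(x0), sum(x1)
    errors = 0
    for x in x0:
        errors += classify(x, (s0 - x) / (n0 - 1), s1 / n1) != 0
    for x in x1:
        errors += classify(x, s0 / n0, (s1 - x) / (n1 - 1)) != 1
    return errors / (n0 + n1)


def simulate(n=10, mu0=0.0, mu1=1.0, sigma=1.0, reps=2000, seed=0):
    """Return (true error, resubstitution, leave-one-out) for each simulated sample."""
    rng = random.Random(seed)
    rows = []
    for _ in range(reps):
        x0 = [rng.gauss(mu0, sigma) for _ in range(n)]
        x1 = [rng.gauss(mu1, sigma) for _ in range(n)]
        eps = true_error(sum(x0) / n, sum(x1) / n, mu0, mu1, sigma)
        rows.append((eps, resubstitution(x0, x1), leave_one_out(x0, x1)))
    return rows


def summarize(true, est):
    """Empirical bias, estimator variance, and RMS from first and second moments."""
    r = len(true)
    e_t, e_e = sum(true) / r, sum(est) / r
    e_tt = sum(t * t for t in true) / r
    e_ee = sum(e * e for e in est) / r
    e_te = sum(t * e for t, e in zip(true, est)) / r  # mixed second moment
    bias = e_e - e_t
    variance = e_ee - e_e ** 2
    rms = math.sqrt(max(0.0, e_ee - 2.0 * e_te + e_tt))
    return bias, variance, rms


if __name__ == "__main__":
    rows = simulate()
    true = [row[0] for row in rows]
    for name, col in (("resubstitution", 1), ("leave-one-out", 2)):
        bias, variance, rms = summarize(true, [row[col] for row in rows])
        print(f"{name}: bias={bias:.4f} variance={variance:.4f} rms={rms:.4f}")
```

    In a run of this sketch one would expect resubstitution to show a negative (optimistic) bias and leave-one-out to be closer to unbiased but more variable; the exact figures depend on the assumed parameters and the random seed.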
    URI
    https://hdl.handle.net/1969.1/ETD-TAMU-2010-12-8685
    Subject
    Error Estimation
    Linear Discriminant Analysis
    Genomics
    Joint Distribution
    Double Asymptotic Analysis
    Cross-moments
    Resubstitution
    Leave-one-out
    Collections
    • Electronic Theses, Dissertations, and Records of Study (2002– )
    Citation
    Zollanvari, Amin (2010). Analytic Study of Performance of Error Estimators for Linear Discriminant Analysis with Applications in Genomics. Doctoral dissertation, Texas A&M University. Available electronically from https://hdl.handle.net/1969.1/ETD-TAMU-2010-12-8685.

    DSpace software copyright © 2002-2016  DuraSpace
    Contact Us | Send Feedback
    Theme by 
    Atmire NV
     

     
