dc.contributor.advisor: Hart, Jeffrey D.
dc.creator: Lee, Hyuneui
dc.date.accessioned: 2018-02-05T16:50:43Z
dc.date.available: 2019-08-01T06:51:27Z
dc.date.created: 2017-08
dc.date.issued: 2017-05-30
dc.date.submitted: August 2017
dc.identifier.uri: https://hdl.handle.net/1969.1/165749
dc.description.abstract: A goodness-of-fit (gof) problem, i.e., testing whether observed data come from a specific distribution, is one of the important problems in statistics, and various tests for checking distributional assumptions have been suggested. Most tests are designed for a single data set with a sufficiently large sample size. However, this research focuses on the gof problem when there are a large number of small data sets. In other words, we assume that the number of data sets p increases to infinity while the sample size n of each small data set is finite. Throughout this dissertation, p and n denote the number of data sets and the sample size of each data set, respectively. Since the primary interest of this dissertation is testing whether every small data set comes from a known parametric family of distributions with different parameters, it is important to choose a gof test that is invariant to the unknown parameters of the distribution. Hence, as a basic approach, we suggest applying empirical distribution function (edf) based gof tests to every small data set and then combining the P-values to obtain a single test. Two P-value combining methods, moment based tests and smoothing based tests, are suggested, and their pros and cons are discussed. In particular, the two moment based tests, Edgington's method and Fisher's method, are compared with respect to Pitman efficiency and asymptotic power. We also find conditions that guarantee that the asymptotic null distribution of moment based tests based on empirical P-values is the same as that based on exact P-values. When the null is a location and scale family, there is no difficulty in applying the suggested test procedures. However, when the null is not a location and scale family, edf-based tests may depend on unknown parameters. To handle this problem, we suggest using unconditional P-values, which requires an additional step of estimating the distribution of the unknown parameters. Several issues related to estimating the distribution of unknown parameters and obtaining unconditional P-values are also discussed. The performance of the suggested test procedures is investigated via simulations, and these procedures are applied to microarray data. [en]
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.subject: Goodness-of-fit test [en]
dc.subject: Microarray data [en]
dc.subject: Fisher's method [en]
dc.subject: Edgington's method [en]
dc.subject: Smoothing based tests [en]
dc.title: Goodness-of-Fit Test for Large Number of Small Data Sets [en]
dc.type: Thesis [en]
thesis.degree.department: Statistics [en]
thesis.degree.discipline: Statistics [en]
thesis.degree.grantor: Texas A & M University [en]
thesis.degree.name: Doctor of Philosophy [en]
thesis.degree.level: Doctoral [en]
dc.contributor.committeeMember: Mueller-Harknett, Uschi
dc.contributor.committeeMember: Sang, Huiyan
dc.contributor.committeeMember: Wu, Ximing
dc.type.material: text [en]
dc.date.updated: 2018-02-05T16:50:44Z
local.embargo.terms: 2019-08-01
local.etdauthor.orcid: 0000-0001-8539-1650
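
The abstract describes combining per-data-set P-values with Fisher's method and Edgington's method as the number of data sets p grows. The following is a minimal Python sketch of that combining step only, not the dissertation's implementation. It assumes the p P-values are independent and Uniform(0, 1) under the null, and it uses a large-p normal approximation for both statistics. The function names (fisher_z, edgington_z, combined_pvalue) are hypothetical and chosen for illustration.

```python
import math
from statistics import NormalDist


def fisher_z(pvalues):
    """Standardized Fisher statistic (assumed large-p normal approximation).

    Under the null, -2*log(P_i) has mean 2 and variance 4, so the sum over
    p data sets has mean 2p and variance 4p.
    """
    p = len(pvalues)
    if any(v <= 0.0 or v > 1.0 for v in pvalues):
        # Empirical P-values can be exactly 0; log(0) is undefined.
        raise ValueError("P-values must lie in (0, 1]")
    t = -2.0 * sum(math.log(v) for v in pvalues)
    return (t - 2.0 * p) / math.sqrt(4.0 * p)


def edgington_z(pvalues):
    """Standardized Edgington statistic (assumed large-p normal approximation).

    Under the null, each P_i has mean 1/2 and variance 1/12. The sign is
    flipped so that large values indicate small P-values.
    """
    p = len(pvalues)
    s = sum(pvalues)
    return (p / 2.0 - s) / math.sqrt(p / 12.0)


def combined_pvalue(z):
    """One-sided P-value: upper tail of the standard normal distribution."""
    return 1.0 - NormalDist().cdf(z)


if __name__ == "__main__":
    # Hypothetical P-values from 12 small data sets, for illustration only.
    example = [0.01] * 12
    print(combined_pvalue(fisher_z(example)))
    print(combined_pvalue(edgington_z(example)))
```

Both statistics are standardized so that large values count as evidence against the null, which lets the same one-sided normal tail serve for either method. For small p, the exact null distributions would be more appropriate than the normal approximation assumed here.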

