
dc.contributor.advisor: Gutierrez-Osuna, Ricardo
dc.creator: Felps, Daniel
dc.date.accessioned: 2012-10-19T15:28:36Z
dc.date.accessioned: 2012-10-22T18:05:57Z
dc.date.available: 2012-10-19T15:28:36Z
dc.date.available: 2012-10-22T18:05:57Z
dc.date.created: 2011-08
dc.date.issued: 2012-10-19
dc.date.submitted: August 2011
dc.identifier.uri: https://hdl.handle.net/1969.1/ETD-TAMU-2011-08-9760
dc.description.abstract: The objective of this dissertation is to develop speech processing methods that enable accent conversion for non-native speakers without altering their identity. We envision accent conversion primarily as a tool for pronunciation training, allowing non-native speakers to hear their native-accented selves. With this application in mind, we present two methods of accent conversion. The first assumes that the voice quality/identity of speech resides in the glottal excitation, while the linguistic content is contained in the vocal tract transfer function. Accent conversion is achieved by convolving the glottal excitation of a non-native speaker with the vocal tract transfer function of a native speaker. The result is perceived as 60 percent less accented, but it is no longer identified as the same individual. The second method of accent conversion selects segments of speech from a corpus of non-native speech based on their acoustic or articulatory similarity to segments from a native speaker. We predict that articulatory features provide a more speaker-independent representation of speech and are therefore better gauges of linguistic similarity across speakers. To test this hypothesis, we collected a custom database containing simultaneous recordings of speech and the positions of important articulators (e.g., lips, jaw, tongue) for a native and a non-native speaker. Resequencing speech from a non-native speaker based on articulatory similarity with a native speaker achieved a 20 percent reduction in accent. The approach is particularly appealing for applications in pronunciation training because it modifies speech in a way that produces realistically achievable changes in accent (i.e., the technique uses sounds already produced by the non-native speaker). A second contribution of this dissertation is the development of subjective and objective measures to assess the performance of accent conversion systems. This is a difficult problem because, in most cases, no ground truth exists. Subjective evaluation is further complicated by the interdependence of accent and identity, but modifications of the stimuli (i.e., reverse speech and voice disguises) allow the two components to be separated. Algorithms that objectively measure accent, quality, and identity are shown to correlate well with their subjective counterparts. [en]
dc.format.mimetype: application/pdf
dc.language.iso: en_US
dc.subject: speech processing [en]
dc.subject: voice conversion [en]
dc.subject: accent conversion [en]
dc.title: Articulatory-based Speech Processing Methods for Foreign Accent Conversion [en]
dc.type: Thesis [en]
thesis.degree.department: Computer Science and Engineering [en]
thesis.degree.discipline: Computer Engineering [en]
thesis.degree.grantor: Texas A&M University [en]
thesis.degree.name: Doctor of Philosophy [en]
thesis.degree.level: Doctoral [en]
dc.contributor.committeeMember: Hammond, Tracy
dc.contributor.committeeMember: Ji, Jim
dc.contributor.committeeMember: Mitchell, J. Lawrence
dc.type.genre: thesis [en]
dc.type.material: text [en]

