Articulatory-based Speech Processing Methods for Foreign Accent Conversion

Felps, Daniel

dc.contributor.advisor	Gutierrez-Osuna, Ricardo
dc.creator	Felps, Daniel
dc.date.accessioned	2012-10-19T15:28:36Z
dc.date.accessioned	2012-10-22T18:05:57Z
dc.date.available	2012-10-19T15:28:36Z
dc.date.available	2012-10-22T18:05:57Z
dc.date.created	2011-08
dc.date.issued	2012-10-19
dc.date.submitted	August 2011
dc.identifier.uri	https://hdl.handle.net/1969.1/ETD-TAMU-2011-08-9760
dc.description.abstract	The objective of this dissertation is to develop speech processing methods that enable accent conversion for non-native speakers without altering their identity. We envision accent conversion primarily as a tool for pronunciation training, allowing non-native speakers to hear their native-accented selves. With this application in mind, we present two methods of accent conversion. The first assumes that the voice quality/identity of speech resides in the glottal excitation, while the linguistic content is contained in the vocal tract transfer function. Accent conversion is achieved by convolving the glottal excitation of a non-native speaker with the vocal tract transfer function of a native speaker. The result is perceived as 60 percent less accented, but the speaker is no longer identified as the same individual. The second method of accent conversion selects segments of speech from a corpus of non-native speech based on their acoustic or articulatory similarity to segments from a native speaker. We predict that articulatory features provide a more speaker-independent representation of speech and are therefore better gauges of linguistic similarity across speakers. To test this hypothesis, we collected a custom database containing simultaneous recordings of speech and the positions of important articulators (e.g., lips, jaw, tongue) for a native and a non-native speaker. Resequencing speech from a non-native speaker based on articulatory similarity with a native speaker achieved a 20 percent reduction in accent. The approach is particularly appealing for applications in pronunciation training because it modifies speech in a way that produces realistically achievable changes in accent (i.e., the technique uses only sounds already produced by the non-native speaker). A second contribution of this dissertation is the development of subjective and objective measures to assess the performance of accent conversion systems. This is a difficult problem because, in most cases, no ground truth exists.
Subjective evaluation is further complicated by the interconnected relationship between accent and identity, but modifications of the stimuli (i.e., reverse speech and voice disguises) allow the two components to be separated. Algorithms that objectively measure accent, quality, and identity are shown to correlate well with their subjective counterparts.	en
dc.format.mimetype	application/pdf
dc.language.iso	en_US
dc.subject	speech processing	en
dc.subject	voice conversion	en
dc.subject	accent conversion	en
dc.title	Articulatory-based Speech Processing Methods for Foreign Accent Conversion	en
dc.type	Thesis	en
thesis.degree.department	Computer Science and Engineering	en
thesis.degree.discipline	Computer Engineering	en
thesis.degree.grantor	Texas A&M University	en
thesis.degree.name	Doctor of Philosophy	en
thesis.degree.level	Doctoral	en
dc.contributor.committeeMember	Hammond, Tracy
dc.contributor.committeeMember	Ji, Jim
dc.contributor.committeeMember	Mitchell, J. Lawrence
dc.type.genre	thesis	en
dc.type.material	text	en

Files in this item

Name: FELPS-DISSERTATION.pdf
Size: 3.787Mb
Format: PDF


This item appears in the following Collection(s)

Electronic Theses, Dissertations, and Records of Study (2002– )
Texas A&M University Theses, Dissertations, and Records of Study (2002– )
