The Role of Spectral Information in Foreign-Accented Speech Perception
Abstract
Source signals, vocal tract resonances, and articulatory movements encode talker-specific spectral information that allows a listener’s perceptual system to adjust to the acoustic characteristics of a particular talker. This implicit learning of talker-specific properties is known as talker normalization. To guide perception, talker normalization requires prior experience as well as structured knowledge of how pronunciation varies across talkers who share the same native accent. This process becomes difficult when the talker has an accent that is perceived as foreign. Although research suggests that listeners can adapt to foreign accents, the time course and specificity of adaptation remain unclear, especially when listeners attend to speech produced by multiple alternating foreign-accented talkers. This dissertation focuses on the role of spectral cues in the perception of foreign-accented speech. While many factors contribute to the perception of foreign-accented speech, spectral cues are of particular interest because they play an important role in talker-specific phonetic recalibration in native speech, where they help listeners accommodate variation in vocal tract size across talkers. Through a series of experiments, we tested the hypothesis that listeners rely on talker-specific spectral cues when adapting to foreign-accented speech. We assessed the contribution of spectral resolution to the intelligibility of foreign-accented speech by varying the number of spectral channels in a tone vocoder. We also tested listeners’ abilities to discriminate between native- and foreign-accented speech to determine the effect of reduced spectral resolution on accent detection. Results showed that reducing spectral resolution decreased intelligibility more for foreign-accented speech than for native-accented speech. Listeners also found it harder to detect a foreign accent in spectrally reduced speech.
We extended these findings by investigating the effects of changing the talker from trial to trial, a manipulation that reduces intelligibility relative to holding the talker constant within each block of trials. We hypothesized that limiting spectral resolution when listeners were exposed to multiple foreign-accented talkers would cause a further decrease in intelligibility. This prediction was confirmed, supporting the idea that detailed spectral resolution helps to maintain the intelligibility of foreign-accented speech when listeners are exposed to multiple interleaved talkers. Listeners were able to adapt with increased exposure if they heard a single foreign-accented talker, though not to the extent observed with unprocessed natural speech. Performance was higher for native-accented speech, with no difference between single- and multiple-talker conditions. Finally, we investigated how spectral shifting of foreign-accented speech would affect intelligibility by scaling the fundamental frequency and spectral envelope to simulate multiple talkers. Consistent with results for spectrally reduced speech, intelligibility was lower in the multiple-talker foreign-accented condition than in the single-talker condition. Introducing frequency shifts reduced intelligibility to the levels observed in the multiple-talker condition. Results indicate that listeners depend on spectral cues when perceiving foreign-accented speech, and that spectral information is especially important when listening to speech spoken by different foreign-accented talkers. The results support a model of foreign-accented speech perception that relies on spectral cues to adjust to the deviations between foreign-accented and native speech.
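As a rough illustration of the channel-count manipulation mentioned in the abstract, the sketch below shows how the number of channels in a tone vocoder sets its spectral resolution: fewer channels mean wider analysis bands and fewer sine carriers. The abstract does not state the filter-bank design, so the logarithmic band spacing, the 200 to 7000 Hz range, and the function names `channel_edges` and `carrier_frequencies` are all assumptions made for this example, not the dissertation's actual parameters.

```python
import math


def channel_edges(n_channels, f_lo=200.0, f_hi=7000.0):
    """Return n_channels + 1 band edges in Hz.

    Assumption: edges are spaced logarithmically between f_lo and f_hi.
    The frequency range is illustrative only.
    """
    if n_channels < 1:
        raise ValueError("n_channels must be at least 1")
    ratio = f_hi / f_lo
    return [f_lo * ratio ** (i / n_channels) for i in range(n_channels + 1)]


def carrier_frequencies(edges):
    """Return the geometric-mean centre of each band.

    In a tone vocoder, each band's amplitude envelope modulates a sine
    carrier; placing the carrier at the band centre is an assumption here.
    """
    return [math.sqrt(lo * hi) for lo, hi in zip(edges, edges[1:])]


if __name__ == "__main__":
    # Compare a few hypothetical channel counts.
    for n in (4, 8, 16):
        carriers = carrier_frequencies(channel_edges(n))
        print(n, "channels:", [round(f) for f in carriers])
```

Each halving of the channel count doubles the width of every band on a logarithmic scale, which is one way to picture the loss of spectral detail that the reduced-resolution conditions impose.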