Audiovisual perception of lexical stress: Beat gestures and articulatory cues

Abstract

Human communication is inherently multimodal. Auditory speech, but also visual cues can be used to understand another talker. Most studies of audiovisual speech perception have focused on the perception of speech segments (i.e., speech sounds). However, less is known about the influence of visual information on the perception of suprasegmental aspects of speech like lexical stress. In two experiments, we investigated the influence of different visual cues (e.g., facial articulatory cues and beat gestures) on the audiovisual perception of lexical stress. We presented auditory lexical stress continua of disyllabic Dutch stress pairs together with videos of a speaker producing stress on the first or second syllable (e.g., articulating VOORnaam or voorNAAM). Moreover, we combined and fully crossed the face of the speaker producing lexical stress on either syllable with a gesturing body producing a beat gesture on either the first or second syllable. Results showed that people successfully used visual articulatory cues to stress in muted videos. However, in audiovisual conditions, we were not able to find an effect of visual articulatory cues. In contrast, we found that the temporal alignment of beat gestures with speech robustly influenced participants’ perception of lexical stress. These results highlight the importance of considering suprasegmental aspects of language in multimodal contexts.

Type
Publication
Language and Speech, doi:10.1177/00238309241258162
Ronny Bujok
Ronny Bujok
PhD student

My research interests include audiovisual speech perception, gestures, and lexical stress.

Hans Rutger Bosker
Hans Rutger Bosker
Assistant Professor

My research interests include speech perception, audiovisual integration, and prosody.