A gradient effect of hand beat timing on spoken word recognition

Abstract

Visual cues play a key role in speech perception. Beat gestures (i.e., simple up-and-down hand movements) usually co-occur with prominence in speech. Previous studies found that hand beat timing can indicate word stress. The present study further examines whether hand beat timing influences spoken word recognition in a gradient fashion. Watching videos of a native speaker of Dutch producing a hand beat while uttering the disyllabic word voornaam, 40 participants decided whether they heard the word with initial stress (VOORnaam, “first name”) or final stress (voorNAAM, “respectable”). Crucially, nine beat apex timings were equally distributed between the pitch peaks of the two syllables. The results showed a gradient effect of hand beat timing on stress perception, and this effect did not appear to be susceptible to brief pretest feedback implying that visual cues should be ignored. Our findings provide novel evidence for audiovisual interaction and can inform gesture generation in conversational agents.
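As an illustration of the stimulus design, the nine equally spaced beat apex timings can be sketched as a simple linear continuum. This is only a minimal sketch: the function name, the pitch peak times (0.20 s and 0.60 s), and the choice to place the first and last steps exactly on the two pitch peaks are assumptions made for the example, not values or details reported in the abstract.

```python
def beat_apex_continuum(first_peak_s, second_peak_s, n_steps=9):
    """Return n_steps equally spaced time points (in seconds).

    Assumption: the continuum includes both endpoints, i.e. step 1 falls on
    the first-syllable pitch peak and the last step on the second-syllable
    pitch peak.
    """
    if n_steps < 2:
        raise ValueError("n_steps must be at least 2")
    step = (second_peak_s - first_peak_s) / (n_steps - 1)
    return [first_peak_s + i * step for i in range(n_steps)]


# Hypothetical pitch peak times in seconds (not taken from the study).
timings = beat_apex_continuum(0.20, 0.60)
for i, t in enumerate(timings, start=1):
    print(f"step {i}: beat apex at {t:.3f} s")
```

Under these assumptions, adjacent steps differ by one eighth of the interval between the two pitch peaks, so the middle step (step 5) falls halfway between them.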

Publication
In Proceedings of Interspeech 2025, 3793-3797, doi:10.21437/Interspeech.2025-116
Chengjia Ye
PhD student

My research interests mainly include speech perception, speech production, prosody and audiovisual integration.

Hans Rutger Bosker
Assistant Professor

My research interests include speech perception, audiovisual integration, and prosody.