Multimodal Strategies for Robot-to-Human Communication
Massimo Donini;Cristina Gena;Alessandro Mazzei
2024-01-01
Abstract
In this paper we introduce the opportunities offered by the multimodal coordination and integration of multimedia elements with a robot’s speech, showing examples of their use in the context of robot-to-human communication. In particular, we focus on the Pepper robot, a humanoid robot equipped with a tablet on its chest. The goal of this research was to formalise, implement, and experimentally evaluate various multimodal integration and coordination strategies, such as the coordination of the images to be displayed on the tablet’s screen within a spoken sentence, the modification of the pronunciation of the spoken sentence depending on the multimedia elements to be displayed, and the number and size of these elements. Our main goal is to use multimodal communication to make the robot’s message more effective and comprehensible and to extend its communicative possibilities by combining voice, written text, and related images and animations. The approach was tested in an online evaluation with 41 users, in which we simulated robot-to-human communication using prerecorded videos. This preliminary experiment gave some significant results regarding strategies for the coordination between the robot’s speech and the appearance of multimedia elements, the pronunciation of a word in relation to the display of its related image, and the display of images depending on modifiers.

| File | Size | Format |
|---|---|---|
| 3610978.3640686.pdf (open access; file type: postprint, author’s final version) | 1.95 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.



