25th ACM International Conference on Multimodal Interaction
(9-13 October 2023)




The Third International Workshop on Automated Assessment of Pain (AAP)

Workshop Summary

Pain is typically measured by patient self-report, but self-reported pain is difficult to interpret and may be impaired or, in some circumstances, impossible to obtain, for instance in patients with restricted verbal abilities such as neonates and young children, and in patients with certain neurological or psychiatric impairments (e.g., dementia). Additionally, subjectively experienced pain may be partly or even completely unrelated to the somatic pathology of tissue damage and other disorders. Therefore, the standard self-assessment of pain does not always allow for an objective and reliable assessment of the quality and intensity of pain. Given individual differences among patients, their families, and healthcare providers, pain is often poorly assessed, underestimated, and inadequately treated. Improving pain assessment requires objective, valid, and efficient measurement of the onset, intensity, and pattern of occurrence of pain. To address these needs, the machine learning and computer vision communities have made several efforts toward automatic and objective assessment of pain from video as a powerful alternative to self-report.

The workshop aims to bring together interdisciplinary researchers working in the field of automatic multimodal assessment of pain (using video and physiological signals). A key focus of the workshop is the translation of laboratory work into clinical practice.

Workshop Page


Workshop Organizers
  • Zakia Hammal, The Robotics Institute, Carnegie Mellon University, USA.
  • Steffen Walter, University Hospital Ulm, Germany.
  • Nadia Berthouze, University College London, UK
GENEA: Generation and Evaluation of Non-verbal Behaviour for Embodied Agents
Workshop Summary

Embodied Social Artificial Intelligence, in the form of conversational virtual humans and social robots, is becoming a key aspect of human-machine interaction. For several decades, researchers from fields such as human-computer interaction and robotics have been proposing methods and models to generate non-verbal behaviour for conversational agents in the form of facial expressions, gestures, and gaze. This workshop aims to bring these researchers together. Its goals are to stimulate discussion on how to improve both generation methods and the evaluation of their results, to spark an exchange of ideas, and to encourage possible collaborations. The workshop is planned to be held in person.

Workshop page


Workshop Organizers
  • Youngwoo Yoon, ETRI, South Korea
  • Taras Kucherenko, SEED – Electronic Arts, Sweden
  • Pieter Wolfert, IDLab Ghent University – imec, Belgium
  • Rajmund Nagy, KTH Royal Institute of Technology, Sweden
  • Jieyeon Woo, Sorbonne University, France
  • Gustav Eje Henter, KTH Royal Institute of Technology, Sweden
4th ICMI Workshop on Bridging Social Sciences and AI for Understanding Child Behaviour
Workshop Summary

Child behaviour is a topic of wide scientific interest across many disciplines, including the social and behavioural sciences and artificial intelligence (AI). Yet knowledge from these disciplines is not integrated to its full potential, owing to, among other things, the dissemination of knowledge in different outlets (journals, conferences) and differing practices. In this workshop, we aim to connect these fields and fill the gaps between science and technology capabilities to address topics such as: using AI (e.g. audio, visual, and textual signal processing and machine learning) to better understand and model child behavioural and developmental processes; challenges and opportunities in large-scale child behaviour analysis; and implementing explainable ML/AI on sensitive child data. We also welcome contributions on new multimodal corpora related to child behaviour and preliminary experiments on them.

Workshop Page


Workshop Organizers
  • Heysem Kaya, Utrecht University, the Netherlands
  • Anouk Neerincx, Utrecht University, the Netherlands
  • Maryam Najafian, MIT, United States
  • Saeid Safavi, University of Surrey, United Kingdom
4th Workshop on Social Affective Multimodal Interaction for Health
Workshop Summary

This workshop seeks work describing how multimodal technology can be used in healthcare for measuring and training social-affective interactions. Sensing technology analyzes users’ behaviors and physiological signals (heart rate, EEG, etc.). Various signal processing and machine learning methods can be used for such prediction tasks. Beyond sensing, it is also important to analyze human behaviors and to model and implement training methods (e.g., using virtual agents, designing relevant scenarios, and providing appropriate and personalized feedback about social skills performance). Such social signal processing and tools can be applied to measure and reduce social stress in everyday situations, including public speaking at schools and workplaces. Target populations include people with depression, schizophrenia, or autism spectrum disorder, as well as those affected by a much larger group of social pathological phenomena.

Workshop page


Workshop Organizers
  • Hiroki Tanaka (Nara Institute of Science and Technology, Japan)
  • Satoshi Nakamura (Nara Institute of Science and Technology, Japan)
  • Jean-Claude Martin (LISN-CNRS, Université Paris-Saclay, France)
  • Catherine Pelachaud (ISIR, CNRS, Sorbonne University, France)
4th International Workshop on Multimodal Affect and Aesthetic Experience – MAAE 2023
Workshop Summary

The term “aesthetic experience” corresponds to the inner state of a person exposed to the form and content of artistic objects. Quantifying and interpreting the aesthetic experience of people in different situations can contribute towards (a) creating a context and (b) better understanding people’s affective reactions to different aesthetic stimuli. Focusing on different types of artistic content, such as movies, music, literature, urban art, ancient artwork, and modern interactive technology, the goal of this workshop is to enhance the interdisciplinary collaboration among researchers coming from the following domains: affective computing, affective neuroscience, aesthetics, human-robot/computer interaction, digital urban environment, digital archaeology, digital art, art, culture, multimedia, and addictive games.

Workshop Page



Workshop Organizers
  • Michal Muszynski (IBM Research Europe, Switzerland)
  • Theodoros Kostoulas (University of the Aegean, Greece)
  • Leimin Tian (Monash University, Australia)
  • Edgar Roman-Rangel (Instituto Tecnologico Autonomo de México, Mexico)
  • Theodora Chaspari (Texas A&M University, USA)
  • Panos Amelidis (Bournemouth University, UK)
The 5th Workshop on Modeling Socio-Emotional and Cognitive Processes from Multimodal Data in the Wild (MSECP-Wild)

Workshop Summary

The ability to automatically estimate human users’ thoughts and feelings during interactions is crucial for adaptive intelligent technology (e.g., social robots or tutoring systems). Not only can it improve user understanding, but it also holds the potential for novel scientific insights. However, creating robust models for predictions and adaptation in real-world applications remains an open problem. The MSECP-Wild workshop series discusses this challenge in a multidisciplinary forum. This workshop iteration will put a thematic focus on ethical considerations when developing technology for inferring and responding to internal states in the wild (e.g., privacy, consent, or bias). As such, apart from contributions to overcoming technical and conceptual challenges for this type of multimodal analysis in general, we particularly encourage the submission of work that facilitates understanding and addressing ethical challenges in the wild. Overall, we aim for a program providing important impulses for discussions of the state-of-the-art and opportunities for future research.

Workshop Page


Workshop Organizers
  • Bernd Dudzik (Delft University of Technology)
  • Tiffany Matej Hrkalovic (Free University Amsterdam)
  • Dennis Küster (University of Bremen)
  • David St-Onge (École de Technologie Supérieure)
  • Felix Putze (University of Bremen)
  • Laurence Devillers (LISN-CNRS/Sorbonne University)
Multimodal, interactive interfaces for education
Workshop Summary

From tablets to augmented reality, from virtual avatars to social robots, multimodal, interactive interfaces have emerged as interesting tools to facilitate learning in educational scenarios by improving the engagement and motivation of students towards the learning activities. Research has also shown the potential of such technologies as convenient instruments for personalizing teaching strategies to the learning style of each student, proposing appropriate feedback and adapted levels of difficulty in a wider set of learning activities. Such tailoring might become a new, particularly helpful resource to meet the demands of students with special needs. The development and adoption of such interfaces in educational scenarios rely on new pedagogies and new didactics co-designed in an interdisciplinary effort involving engineers, psychologists, cognitive scientists, educators, students, and families. The aim of this workshop is to provide a venue for all the involved stakeholders to present scientific advances in the design, development, and adoption of multimodal, interactive interfaces for education, fostering discussions, ideas, and interdisciplinary collaborations between researchers in the domain as well as industrial partners.

Workshop Page


Workshop Organizers
  • Daniel C. Tozadore (Swiss Federal Institute of Technology Lausanne – EPFL, Switzerland)
  • Lise Aubin (Hôpital de la Pitié-Salpêtrière, France)
  • Soizic Gauthier (Forward College, France)
  • Barbara Bruno (Swiss Federal Institute of Technology Lausanne – EPFL, Switzerland)
  • Salvatore M. Anzalone (Université Paris 8, France)
ACE: how Artificial Character Embodiment shapes user behaviour in multi-modal interactions
Workshop Summary

The body shapes the mind: bodily representations structure the way humans perceive the world and the way they perceive other people. The cognitive sciences and the social sciences alike have stressed the importance of embodiment in social interaction, highlighting how interacting with others influences how we behave, perceive and think. As the sense of embodiment can be defined as the ensemble of sensations that arise in conjunction with being inside, having, and controlling a body, it influences not only self-perceptions and actions regarding one’s own avatar, but also our social behaviours with embodied intelligent agents such as virtual humans and robots.

The topic is multidisciplinary by nature: embodiment can affect both human-human and human-agent (either virtual or robotic) interactions, and this influence can arise through different sensory modalities. For instance, in virtual environments, users may experience what is known as the Proteus effect, a well-known phenomenon where the appearance of users’ avatars influences their behaviour, but whose underlying cognitive processes are still not clear. In human-robot and human-agent interactions, the level of anthropomorphism can impact human reactions and behaviours during the interaction (e.g., uncanny valley of visual appearance or motions that disturb responses and sense of presence in virtual reality). These phenomena are not only of interest for the design of artificial characters, either virtual or robotic, but could also help to shed light on social behaviour and cognition, providing new tools and experimental perspectives.

The ACE workshop aims to bring together researchers, practitioners and experts on the topic of embodiment, to analyse and foster discussion on its effects on user behaviour in multi-modal interaction contexts. Objectives are to stimulate multidisciplinary discussions on the topic, to share recent progress, and to provide participants with a forum to debate current and future challenges. Contributions from computational, neuroscientific and psychological perspectives, as well as technical applications, will be welcome.

Workshop Page


Workshop Organizers
  • Beatrice Biancardi, CESI LINEACT (bbiancardi@cesi.fr)
  • Thomas Janssoone, Enchanted Tools
  • Geoffrey Gorisse, Arts et Métiers Institute of Technology
  • Pierre Raimbaud, ENISE, Ecole Centrale de Lyon
  • Eleonora Ceccaldi, CasaPaganini InfoMus, University of Genoa
  • Sara Falcone, University of Twente
  • Anna Martin, CESI LINEACT
  • Silvia Ferrando, CasaPaganini InfoMus, University of Genoa
Multimodal Conversational Agents for People with Neurodevelopmental Disorders
Workshop Summary

Neurodevelopmental Disorders (NDD) are a group of conditions with onset in the developmental period characterized by cognitive, social, and communication deficits. NDD includes intellectual developmental disorder, global developmental delay, communication disorders, autism spectrum disorder, attention-deficit/hyperactivity disorder (ADHD), neurodevelopmental motor disorders, and specific learning disorders.

In recent years, there has been a growing interest in utilizing conversational agents – i.e., software enabling access to information and services through voice – to provide therapeutic interventions for individuals with NDD, and multimodality is one of the key factors driving the success of these tools. However, numerous questions and challenges related to the design, development, and evaluation of conversational agents for people with NDD still need to be answered.

The MCAPND workshop (we know, the name is a bit of a tongue-twister!) aims to bring together researchers and practitioners in multimodal conversational agents for individuals with NDD. We aim to promote a multidisciplinary discourse on the latest research, technologies, and applications and provide a stage for participants to share their findings, exchange ideas, and identify future research directions.

Workshop Page


Workshop Organizers
  • Dr. Fabio Catania, MIT and Politecnico di Milano
  • Prof. Franca Garzotto, Politecnico di Milano
  • Prof. Satrajit Ghosh, MIT and Harvard University
  • Prof. Thomas F. Quatieri, MIT and Harvard University
  • Prof. Benjamin R. Cowan, University College Dublin
  • Tanya Talkar, MIT and Harvard University