24th ACM International Conference on Multimodal Interaction
(7-11 Nov 2022)



Workshop on Multimodal Affect and Aesthetic Experience

The term “aesthetic experience” refers to the inner state of a person exposed to the form and content of artistic objects. Quantifying and interpreting the aesthetic experience of people in different contexts can contribute towards (a) creating context and (b) better understanding people’s affective reactions to different aesthetic stimuli. Focusing on different types of artistic content, such as movies, music, literature, urban art, ancient artwork, and modern interactive technology, this workshop aims to strengthen interdisciplinary collaboration among researchers from the following domains: affective computing, aesthetics, human-robot/computer interaction, digital archaeology and art, culture, and addictive games.



Workshop Organizers
  • Theodoros Kostoulas (University of the Aegean, Greece)
  • Michal Muszynski (IBM Research Europe, Switzerland)
  • Leimin Tian (Monash University, Australia)
  • Edgar Roman-Rangel (Instituto Tecnológico Autónomo de México, Mexico)
  • Theodora Chaspari (Texas A&M University, USA)
  • Panos Amelidis (Bournemouth University, UK)
The GENEA Workshop 2022: The 3rd Workshop on Generation and Evaluation of Non-verbal Behaviour for Embodied Agents
Workshop summary

Embodied social artificial intelligence, in the form of conversational virtual humans and social robots, is becoming a key aspect of human-machine interaction. For several decades, researchers from fields such as human-computer interaction and robotics have been proposing methods and models to generate non-verbal behaviour for conversational agents in the form of facial expressions, gestures, and gaze. This workshop aims to bring these researchers together: to stimulate discussion on how to improve both generation methods and the evaluation of their results, to spark an exchange of ideas, and to seed possible collaborations.

Workshop page


Workshop Organizers
  • Pieter Wolfert, MSc, IDLab, Ghent University – imec, Ghent, Belgium
  • Dr. Taras Kucherenko, SEED – Electronic Arts, Stockholm Sweden
  • Dr. Youngwoo Yoon, ETRI, South Korea
  • Dr. Zerrin Yumak, Utrecht University, Utrecht, The Netherlands
  • Dr. Gustav Eje Henter, KTH Royal Institute of Technology, Stockholm, Sweden
  • Carla Viegas, CMU, USA
2nd International Workshop on Deep Video Understanding
Workshop Summary

Deep video understanding is a difficult task that requires systems to develop a deep analysis and understanding of the relationships between different entities in video, to use known information to reason about other, more hidden information, and to populate a knowledge graph (KG) with all acquired information. To work on this task, a system should take into consideration all available modalities (speech, image/video, and in some cases text). The aim of this workshop is to push the limits of multimodal extraction, fusion, and analysis techniques applied to the problem of analysing long-duration videos holistically and extracting useful knowledge for answering different types of queries. The target knowledge includes both visual and non-visual elements. As videos and multimedia data become increasingly prevalent across domains, the research approaches and techniques pursued in this workshop will only grow in relevance in the coming years.

Workshop Page


Workshop Organizers
  • Keith Curtis, National Institute of Standards and Technology, USA
  • George Awad, National Institute of Standards and Technology, USA
  • Shahzad Rajput, Georgetown University & National Institute of Standards and Technology, USA
MSECP-Wild: The 4th Workshop on Modeling Socio-Emotional and Cognitive Processes from Multimodal Data In-the-Wild
Workshop Summary

The ability to automatically infer relevant aspects of human users’ thoughts and feelings is crucial for technologies that must adapt their behavior intelligently in complex interactions (e.g., user models in social robots or tutoring systems). Research on multimodal analysis of behavioral and physiological data has demonstrated the potential to estimate a broad range of internal states and processes, such as a person’s mood or attentional engagement. However, despite this progress, building models robust enough for deployment in real-world applications remains an open problem. The MSECP-Wild workshop serves as a multidisciplinary forum for presenting and discussing research that addresses this challenge, and as a concerted effort to stimulate joint research projects, exchange methods, and critically discuss current and future investigations. We welcome evaluation studies, theoretical contributions, data corpora, and novel modeling approaches. This iteration focuses in particular on variation in contextual conditions as a challenge for accurate prediction of internal states and processes, notably in social settings (e.g., conversations in a group). Submissions relating to this topic will be given priority for presentation, and we encourage all submissions to reflect on context-related challenges and limitations in their work.

Workshop page


Workshop Organizers
  • Bernd Dudzik (Delft University of Technology)
  • Dennis Küster (University of Bremen)
  • David St-Onge (École de Technologie Supérieure)
  • Felix Putze (University of Bremen)
3rd Workshop on Social Affective Multimodal Interaction for Health (SAMIH)
Workshop Summary
This workshop invites work describing how interactive, multimodal technology such as virtual agents can be used in social skills training, both for measuring and for training social-affective interactions. Sensing technology now enables the analysis of users’ behaviors and physiological signals (heart rate, EEG, etc.), and various signal processing and machine learning methods can be applied to such prediction tasks. Beyond sensing, it is also important to analyze human behavior and to model and implement training methods (e.g., via virtual agents, social robots, relevant scenarios, and appropriate, personalized feedback on social skills performance). Such social signal processing and tools can be applied to measure and reduce social stress in everyday situations, including public speaking at school and in the workplace. Target populations include people with depression, Social Anxiety Disorder (SAD), schizophrenia, and Autism Spectrum Disorder (ASD), as well as a much larger group affected by other socially pathological phenomena.
Workshop Page


Workshop Organizers
  • Hiroki Tanaka (Nara Institute of Science and Technology, Japan)
  • Satoshi Nakamura (Nara Institute of Science and Technology, Japan)
  • Kazuhiro Shidara (Nara Institute of Science and Technology, Japan)
  • Jean-Claude Martin (CNRS-LISN, Université Paris Saclay, France)
  • Catherine Pelachaud (CNRS-ISIR, Sorbonne University, France)
3rd Workshop on Bridging Social Sciences and AI for Understanding Child Behavior
Workshop Summary
Child behaviour is a topic of wide scientific interest across many disciplines, including the social and behavioural sciences and artificial intelligence (AI). Yet knowledge from these disciplines is not integrated to its full potential, owing, among other factors, to the dissemination of knowledge in different outlets (journals, conferences) and to differing research practices. In this workshop, we aim to connect these fields and bridge the gap between scientific questions and technological capabilities, addressing topics such as: using AI (e.g., audio, visual, and textual signal processing and machine learning) to better understand and model child behavioural and developmental processes; challenges and opportunities in large-scale child behaviour analysis; and implementing explainable ML/AI on sensitive child data. We also welcome contributions on new child-behaviour-related multimodal corpora and preliminary experiments on them.
Workshop Page


Workshop Organizers
  • Heysem Kaya, Utrecht University, the Netherlands
  • Anika van der Klis, Utrecht University, the Netherlands
  • Maryam Najafian, MIT, United States
  • Saeid Safavi, University of Surrey, United Kingdom