The ICMI workshop program aims to provide researchers with a more informal and discussion oriented forum to discuss emerging topics in multimodal interaction or revisit established research areas from a new angle. This year we selected the following seven workshops to be held on the last day of the conference:

  1. 1st Workshop on Embodied Interaction with Smart Environments (EISE) [Proceedings]
  2. 2nd International Workshop on Advancements in Social Signal Processing for Multimodal Interaction (ASSP4MI@ICMI2016) [Proceedings]
  3. 2nd workshop on Emotion Representations and Modelling for Companion Systems (ERM4CT 2016) [Proceedings]
  4. Multimodal Virtual and Augmented Reality (MVAR 2016) [Proceedings]
  5. Social learning and multimodal interaction for designing artificial agents (SLMIDAA) [Proceedings]
  6. 1st Workshop on Multi-Sensorial Approaches to Human-Food Interaction (MHFI) [Proceedings]
  7. Multimodal Analyses enabling Artificial Agents in Human-Machine Interaction (MA3HMI) [Proceedings]

ICMI 2016 Workshop Chairs

Julien Epps (The University of New South Wales, Australia)
Gabriel Skantze (KTH, Sweden)

1st Workshop on Embodied Interaction with Smart Environments

Organizers: Patrick Holthaus, Thomas Hermann, Sebastian Wrede, Sven Wachsmuth, Britta Wrede

Our homes are becoming increasingly smart through modular hardware and software apps that control home automation functions such as setting the room temperature or starting the washing machine. Mobile robots are also entering our homes as vacuum cleaners, mobile cell phone platforms, or toys. Each of these comes with its own interface, resulting in a multitude of different interaction devices with differing interaction philosophies. In addition, the increasingly embodied capabilities of smart devices, ranging from ambient actions (light, sound, ...) to moving objects (robots, furniture, ...), yield an overwhelming amount of information and control that needs to be mastered within such a convoluted environment. Yet, despite large research efforts, the main modality of interaction with smart home devices is often still a challenging graphical interface.

Such a complex situation opens up a range of new research questions pertaining to interaction with smart environments. How can it be made more intuitive and adaptive? And how should we deal with agency, or the explicit lack of it, i.e. whom to address when specifying a command or a goal situation? In this workshop we want to address the question of how the various installations inside a smart environment can be used as intuitive means of interaction.

This entails, on the one hand, questions regarding the human partner: to what extent do users profit from embodied interaction partners (e.g. virtual agents or robots) as opposed to non-embodied devices? Do the different devices and agents have to provide a coherent interaction? On the other hand, it entails questions of situation awareness with respect to the interaction partner. How can the environment be attentive to the interaction partner's intentions without having to overhear all conversations and interactions that are not addressed to it?

Please visit our website for more information:

2nd International Workshop on Advancements in Social Signal Processing for Multimodal Interaction (ASSP4MI@ICMI2016)

Organizers: Khiet Truong, Dirk Heylen, Toyoaki Nishida, Mohamed Chetouani

In the last decade, the need for affective and socially intelligent technology has grown, driven in part by emerging interactive technology that enhances our daily lives in our homes and at work. This has led to a significant increase of research in Social Signal Processing (SSP), which aims to model, analyse, and synthesize social signals (including affective signals) and to develop socially intelligent machines. This body of work is inherently multimodal (e.g., eye gaze, touch, vocal and facial expressions) and multidisciplinary (e.g., psychology, linguistics, computer science). Major research foci include the automatic understanding and generation of emotional and social behaviour in specific situations. Applications are plentiful: social robots, intelligent virtual agents, and smart environments are some of the application areas that will benefit from SSP research.

SSP research involves studying human-human interactions as well as human-machine interactions. Large corpora of spontaneous human-human interactions offer SSP researchers the opportunity to analyse and understand multimodal human behaviours and to develop detectors and data mining algorithms. Mining large amounts of human-human interaction data can unravel relations between modalities that were initially hidden from the naked eye. Human-machine interactions, on the other hand, can be studied in order to understand how the socially intelligent technology being developed affects how humans interact with machines.

Although many SSP-related applications already exist, the puzzle is far from solved. Major challenges include robustness of the applications and algorithms, the role of situational and user context in SSP, data collection and annotation, and unknown relations among multiple modalities. SSP is a continuously developing and lively multidisciplinary research domain, bringing along new challenges, methods, application areas and emerging fields of research.

We invite contributions, both research and position papers, addressing recent developments, challenges, and research results in SSP.

Website:

2nd workshop on Emotion Representations and Modelling for Companion Systems (ERM4CT 2016)

Organizers: Kim Hartmann, Ingo Siegert, Ali Albert Salah, Khiet Truong, Angelina Thiers, Michael Tornow

The major goal of human-computer interaction (HCI) research and applications is to improve the interaction between humans and computers. Because interaction is often highly specific to an individual and generally multi-modal in nature, a trend towards multi-modal, user-adaptable HCI systems has emerged in recent years. These systems are designed as companions capable of assisting their users based on the users' needs, preferences, personality, and affective state. Companion systems depend on reliable emotion recognition methods in order to provide natural, user-centred interactions.

In order to study natural, user-centred interactions, to develop user-centred emotion representations, and to model adequate affective system behaviour, appropriate multi-modal data comprising more than just audio and video material must be available. Building on its predecessor, the ERM4HCI workshop series, the ERM4CT workshop focuses on emotion representations and the signal characteristics used to describe and identify emotions, as well as their influence on the personality and user-state models to be incorporated in companion systems. As a further highlight, this year's workshop offers a "hands-on" session: a dataset comprising 10 different modalities will be made available to participants prior to the workshop. Participants are encouraged to analyse the dataset in terms of emotion recognition, interaction studies, conversational analyses, etc.

Please visit our website for more information:

Multimodal Virtual and Augmented Reality - MVAR 2016

Organizers: Wolfgang Huerst, Daisuke Iwai, Prabhakaran Balakrishnan

Virtual reality (VR) and augmented reality (AR) are currently two of the "hottest" topics in the IT industry. Many consider them to be the next wave in computing, with an impact similar to the shift from desktop systems to mobiles and wearables. Yet we are still far from the ultimate goal of creating new virtual environments, or augmentations of existing ones, that feel and react similarly to their real counterparts. Many challenges and open research questions remain, especially in the areas of multimodality and interaction.

The aim of this workshop is therefore to investigate any aspects about multimodality and multimodal interaction in relation to VR and AR. What are the most pressing research questions? What are the difficult challenges? What opportunities do other modalities than vision offer for VR and AR? What are new and better ways for interaction with virtual objects and for an improved experience of VR and AR worlds?

We invite researchers and visionaries to submit their latest results on any aspect relevant to multimodality and interaction in VR and AR. Contributions of a more fundamental nature (e.g., psychophysical studies and empirical research on multimodality) are welcome, as are technical contributions (including use cases, best-practice demonstrations, prototype systems, etc.). Position papers and reviews of the state of the art and ongoing research are also invited. Submissions do not necessarily have to address multiple modalities: work focusing on a single modality that goes beyond the state of the art of "purely visual" systems (e.g., papers about smell, taste, and haptics) is suitable as well.

Website:

Social learning and multimodal interaction for designing artificial agents

Organizers: Mohamed Chetouani, Salvatore Maria Anzalone, Giovanna Varni, Isabelle Hupont, Ginevra Castellano, Angelica Lim, Gentiane Venture

To go beyond scripted and artificial interaction, social agents should be able to learn with or from humans. Such complex skills emerge from a complete understanding of the inner mechanisms of social interaction: in particular, awareness of the user's actions, behaviours, and mental and emotional states, and the coherent production of multimodal, verbal, and non-verbal communication in a human-like manner.

In recent years, advances in this field have contributed to the development of several kinds of agents able to face a broad range of social situations: human-aware robot partners in industry, companion agents for children or for elderly people, social robots in public or personal spaces, virtual avatars as educational tools at school, and so on. These experiences have shown that this domain cannot be approached from a purely engineering perspective: the human, social, and developmental sciences play a primary role in the development and enhancement of social interaction skills for artificial agents.

The results achieved by researchers are particularly important for allowing naïve users to interact with naturally communicative agents in their everyday lives. This is impacting markets, opening new social and economic opportunities for industry.

The scope of this workshop is to present rigorous scientific and philosophical advances on social interaction and multimodal learning for social agents. We welcome contributions on both theoretical aspects and practical applications, fostering interdisciplinary collaboration between researchers in the domain as well as with industrial partners.

Website:

1st Workshop on Multi-Sensorial Approaches to Human-Food Interaction

Organizers: Anton Nijholt, Carlos Velasco, Gijs Huisman, Kasun Karunanayaka

In this workshop we are calling for investigations and applications of systems that create new eating and drinking experiences, or enhance existing ones, in the context of Human-Food Interaction. Moreover, we are interested in work based on the principles that govern the systematic connections between the senses. Human-Food Interaction also involves experiencing food interactions digitally in remote locations: sensing taste, smell, and flavour information in one place, transferring it over the internet digitally, and effectively regenerating it at the destination. Therefore, we are also interested in sensing and actuation interfaces, new communication media, and technologies for persisting and retrieving human-food interactions. Enhancing social interactions to augment the eating experience is another issue we would like to see addressed in this workshop.

Topics include:

  - defining, recording, and transferring flavour experiences;
  - defining methods of associating extended sensory data (smell, taste, touch) with traditional (audiovisual, text) data;
  - understanding flavour perception and cross-cultural food-eating environments;
  - creating multisensory flavour experiences in virtual reality systems by including auditory, haptic, smell, and/or taste stimulation devices;
  - using multisensory digital devices to manipulate eating and drinking atmospheres (e.g. colour, music in the room) and factors such as food presentation (e.g. size and/or shape of the plate, smell and/or colour of the food);
  - collecting users' responses to flavour experiences through digital devices, such as tracking behavioural aspects (e.g. movements, eating speed, and facial expressions) and/or using psychophysiological measurements;
  - utilizing multisensory experience design, technology, and playful interactions to influence food habits and choices;
  - novel applications of food and technology in different contexts, for example during airplane flights or space travel;
  - exploring the role of technology in enhancing or otherwise influencing social aspects of eating behaviour.

Please visit our website for more information:

Multimodal Analyses enabling Artificial Agents in Human-Machine Interaction (MA3HMI)

Organizers: Ronald Böck, Francesca Bonin, Nick Campbell, Ronald Poppe

One of the aims in building multimodal user interfaces and combining them with technical devices is to make the interaction between user and system as natural as possible. The most natural form of interaction may be the way we interact with other humans. Current technology is still far from human-like, and systems can reflect a wide range of technical solutions.

Transferring insights from the analysis of human-human communication to human-machine interaction remains challenging. It requires that the multimodal inputs from the user (e.g., speech, gaze, facial expressions) be recorded and interpreted. This interpretation has to occur at both the semantic and affective levels, including aspects such as the personality, mood, or intentions of the user. These processes have to be performed in real time so that the system can respond without delays, ensuring a smooth interaction.

The MA3HMI workshop aims to bring together researchers working on the analysis of multimodal data as a means to develop technical devices that can interact with humans. Artificial agents are regarded in their broadest sense, including virtual chat agents, empathic speech interfaces, and lifestyle coaches on a smartphone. More generally, multimodal analyses support any technical system in the research area of human-machine interaction. We focus on the real-time aspects of human-machine interaction and address the development and evaluation of multimodal, real-time systems.

We solicit papers that concern the different phases of the development of such interfaces. Tools and systems that address real-time conversations with artificial agents and technical systems are also within the scope of the workshop.

Website:

ICMI 2016 ACM International Conference on Multimodal Interaction. Copyright © 2015-2024