Sessions
Conference Program Overview
Download the detailed program here: Detailed Program 2025
Proceedings: Table of Contents
Adjunct Events Day 1 – Monday, 13 October 2025
08:00-17:00 Registration
09:00 – 17:30 Doctoral Consortium
09:00 – 09:10 Opening and Welcome
Session 1: LLMs for multimodal interactions
09:10 – 09:35
Cognitive Effort Analysis in Digital Learning Environments
Shayla Sharmin
09:35 – 10:00
Enhancing Accessibility in Animation: A Context-Aware Audio Description System for Visually Impaired Children
Md Fahad Bin Zamal
10:00 – 10:25
Designing Multimodal Nonverbal Communication Cues for Multirobot Supervision Through Event Detection and Policy Mapping
Richard Attfield
10:25 – 10:45 Short Break 1
Session 2: Cognitive and emotion state modelling
10:45 – 11:05
Towards Intelligent Adaption in Cognitive Assistance Systems through Physiological Computing
Jordan Schneider
11:05 – 11:30
Towards Context-sensitive Emotion Recognition
Sayak Mukherjee
11:30 – 11:55
Differentiating Frustration from Cognitive Workload in a Dual-task System
Heting Wang
11:55 – 12:20
Multimodal Analysis of Caregiving Interactions in Simulation-Based Training
Behdokht Kiafar
12:30 – 13:30 Lunch Break
Session 3: Social interaction & behaviours
13:30 – 13:55
Decoding Social Interaction to Understand Traumatic Behaviours in Social Dynamics
Pritesh Nalinbhai Contractor
13:55 – 14:20
Multimodal Conversational Events Estimation in Complex Social Scenes
Litian Li
14:20 – 14:45
Modeling Social Dynamics from Multimodal Cues in Natural Conversations
Kevin Hyekang Joo
14:45 – 15:00 Short Break 2
Session 4: Virtual Reality and interaction
15:00 – 15:25
Designing and Evaluating Gen-AI for Cultural Resilience
Ka Hei Carrie Lau
15:25 – 15:50
Towards Seamless Interaction: Neuroadaptive Virtual Reality Interfaces for Target Selection
Jalynn Blu Nicoly
15:50 – 16:15
Developing Virtual Reality (VR) Simulations with Embedded User Analytics for Cognitive Rehabilitation in PTSD Veterans
Ravi Varman Selvakumaran
16:15 – 16:25 Short Break 3
16:25 – 17:25 Panel session
17:25 – 17:30 Closing
08:00 – 16:00 AAP Workshop
09:00 Opening: Zakia Hammal, Steffen Walter, and Nadia Berthouze
09:00-10:00 Invited talk (TBD)
10:00-10:30 Coffee Break
10:30-12:00 Paper presentations
10:30-10:45 Canonical Time Series Features for Pain Classification by Sai Revanth Reddy Boda et al.
10:45-11:00 When Features Matter More than Sequence: A Case for Tabular In Context Learning in Pain Classification by Richard A. A. Jonker et al.
11:00-11:15 Feel the Pain: An Interpretable Multimodal Approach for Physiological Signal-Based Pain Detection by Tahia Tazin et al.
11:15-11:30 Tiny-BioMoE: a Lightweight Embedding Model for Biosignal Analysis by Stefanos Gkikas et al.
11:30-11:45 The AI4Pain Grand Challenge 2025: Advancing Pain Assessment with Multimodal Physiological Signals by Raul Fernandez Rojas et al.
11:45-12:00 PainXtract: A Multimodal System for Multiclass Pain Classification Using Physiological Signals by Anup Kumar Gupta et al.
12:00-13:30 Lunch break
13:30-16:00 Paper presentations
13:30-13:45 A Multimodal Deep Learning Exploration for Pain Intensity Classification by Javier Orlando Pinzon-Arenas et al.
13:45-14:00 Explaining Pain by Combining Deep Learning Models and Physiology-Driven Ensembles using PPG, EDA, and Respiration by Miguel Javierre et al.
14:00-14:15 EnsembleIQ-Pain: Intelligent Cluster Calibration for Personalized Pain Detection by Rupal Agarwal et al.
14:15-14:30 Painthenticate: Feature Engineering on Multimodal Physiological Signals by Sajeeb Datta et al.
14:30-15:00 Coffee Break
15:00-15:15 Investigation into Unimodal Versus Multimodal Pain Recognition from Physiological Signals by Anis Elebiary et al.
15:15-15:30 Efficient Pain Recognition via Respiration Signals: A Single Cross-Attention Transformer Multi-Window Fusion Pipeline by Stefanos Gkikas et al.
15:30-15:45 Multi-Representation Diagrams for Pain Recognition: Integrating Various Electrodermal Activity Signals into a Single Image by Stefanos Gkikas et al.
15:45-16:00 Closing
09:00 – 12:30 Deepfake Tutorial
Abhinav Dhall, Zhixi Cai, Shreya Ghosh
13:30 – 18:00 CCMI Workshop
13:15 Opening and Welcome
Session 1
Session Chair: Koji Inoue
13:20
Benchmarking Visual Generative Models through Cultural Lens: A Case Study with Singapore-Centric Multi-Cultural Context (Long, 20 min)
Ali Koksal, Loke Mei Hwan, Hui Li Tan, Nancy F. Chen
13:40
Culture-Aware Multimodal Personality Prediction using Audio, Pose, and Cultural Embeddings (Short, 15 min)
Islam J A M Samiul, Khalid Zaman, Marius Funk, Masashi Unoki, Yukiko Nakano, Shogo Okada
13:55
Invited Talk 1: Multimodal Deepfake Detection Across Cultures and Languages
Abhinav Dhall
14:30 Break
Session 2
Session Chair: Shogo Okada
15:00
Multimodal grounding in HRI using two types of nods in Japanese and Finnish (Long, 20 min)
Taiga Mori, Kristiina Jokinen, Leo Huovinen, Biju Thankachan
15:20
Analyzing Multimodal Multifunctional Interactions in Multiparty Conversations via Functional Spectrum Factorization (Long, 20 min)
Momoka Tajima, Issa Tamura, Kazuhiro Otsuka
15:40
MultiGen: Child-Friendly Multilingual Speech Generator with LLMs (Short, 15 min)
Xiaoxue Gao, Huayun Zhang, Nancy F. Chen
15:55
Contextualized Visual Storytelling for Conversational Chatbot in Education (Short, 15 min)
Hui Li Tan, Gu Ying, Liyuan Li, Mei Chee Leong, Nancy F. Chen
16:10 Break
16:20
Invited Talk 2: Cross-cultural studies on human-human and human-agent interaction
Yukiko Nakano
16:55
Panel discussion
Yukiko Nakano, Abhinav Dhall, Liu Zhengyuan, Shogo Okada
17:25 Closing
Main conference Day 1 – Tuesday, 14 October 2025
08:30 – 09:00 Welcome
ICMI 2025 General Chairs and Program Chairs
09:00 – 10:00 Keynote
Session Chair: Yukiko Nakano
Multimodal Task Analysis in Wearable Contexts
Julien Epps
10:00 – 10:30 Break
10:30 – 12:00 Oral Session 1: Affect & Behaviour Understanding
Session Chair: Shogo Okada
10:30 – 10:48
Multimodal Behavioral Characterization of Dyadic Alliance in Support Groups
Kevin Hyekang Joo, Zongjian Li, Yunwen Wang, Yuanfeixue Nan, Mina Kian, Shriya Upadhyay, Maja Mataric, Lynn Carol Miller, Mohammad Soleymani
10:48 – 11:06
What makes you say yes? An investigation of mental state and personality in persuasion during a dyadic conversation
Siyuan Chen
11:06 – 11:24 (Best paper award nominee)
Decoding Affective States without Labels: Bimodal Image-brain Supervision
Vadym Gryshchuk, Maria Maistro, Christina Lioma, Tuukka Ruotsalo
11:24 – 11:42
Can Adaptive Interviewer Robots Based on Social Signals Make a Better Impression on Interviewees and Encourage Self-Disclosure?
Fuminori Nagasawa, Shogo Okada
11:42 – 12:00
Foundation Feature-Guided Hierarchical Fusion of EEG-Physiological for Emotion Estimation
Haifeng Zhang, Von Ralph Dane Marquez Herbuela, Yukie Nagai
12:00-13:30 Lunch
13:30-14:30 Keynote (Sustained Accomplishment Award Talk)
Session Chair: Ramanathan Subramanian
From audio, through haptics to augmented reality: travels in multimodal interaction
Stephen Brewster
14:30-15:30 Blue Sky Papers
Session Chair: Alessandro Vinciarelli
14:30 – 15:00
Human Authenticity and Flourishing in an AI-Driven World: Edmund’s Journey and the Call for Mindfulness
Sebastian Zepf, Mark Colley
15:00 – 15:30
MUSE: A Multimodal, Generative, and Symbolic Framework for Human Experience Modeling
Mohammad Rashedul Hasan
15:30-16:00 Break
16:00-18:00 Poster Session 1 (including DC posters)
Session Chair: Madhawa Perera
Emotion and Affect
Privileged Contrastive Pretraining for Multimodal Affect Modelling
Kosmas Pinitas, Konstantinos Makantasis, Georgios Yannakakis
A Multifaceted Multi-Agent Framework for Zero-Shot Emotion Analysis and Recognition of Symbolic Music
Jiahao Zhao, Yunjia Li, Kazuyoshi Yoshii
Disentangling Cross-Modal Interactions for Enhanced Multimodal Emotion Recognition in Conversation
Jian Ding, Bo Zhang, Dailin Li, Jian Wang, Hongfei Lin
Write! Draw! Move!: Investigating the Effects of Positive and Negative Self-Reflection on Emotion through Self-Expression Modalities
Golnaz Moharrer, Kavya Rajendran, Rowena Pinto, Andrea Kleinsmith
Gesture and Behavior Generation
Motion Diffusion Autoencoders: Enabling Attribute Manipulation in Human Motion Demonstrated on Karate Techniques
Anthony Richardson, Felix Putze
DiffusionCleft: Facial Anomaly Synthesis Guided by Text
Karen Rosero, Lucas M Harrison, Alex A Kane, Rami R. Hallac, Carlos Busso
Gesture and Behavior Recognition
WatchHAR: Real-time On-device Human Activity Recognition System for Smartwatches
Taeyoung Yeon, Vasco Xu, Henry Hoffman, Karan Ahuja
Disentangling Perceptual Ambiguity in Multifunctional Nonverbal Behaviors in Conversations via Tensor Spectrum Decomposition
Issa Tamura, Momoka Tajima, Shiro Kumano, Kazuhiro Otsuka
Predicting End-of-turn and Backchannel Based on Multimodal Voice Activity Prediction Model
Ryo Ishii, Shin-ichiro Eitoku, Ryota Yokoyama, Junichi Sawase
PI-STGCNN: A Spatio-Temporal Graph Convolutional Neural Network with Partial Interaction Optimization for Human Trajectory Prediction
Zhuangzhuang Chen
Time-channel Adaptive Fusion and Hierarchical Attention Mechanism for Dynamic Hand Gesture Recognition
Longjie Huang, Jianhai Liu, Yong Gu, Kai Jiang, Haibo Li
Leveraging Pre-Trained Transformers and Facial Embeddings for Multimodal Hirability Prediction in Job Interviews
Eric Fithian, Theodora Chaspari
Health & Wellbeing
Punctual or Continuous? Analyzing Depression Traces in Language and Paralanguage with Multiple Instance Learning
Rawan Alsarrani, Anna Esposito, Alessandro Vinciarelli
VitaStress: A Multimodal Dataset for Stress Detection
Paul Schreiber, Simon Burbach, Beyza Cinar, Lennart Mackert, Maria Maleshkova
BiFuseNet: A Multimodal Network for Estimating Blood Alcohol Concentration via Bidirectional Hierarchical Fusion
Abdullah Tariq, Martin Masek, Zulqarnain Gilani, Arooba Maqsood
Investigating differences in Paramedic trainees’ multimodal interaction during low and high physiological synchrony
Vasundhara Joshi, Surely Akiri, Sanaz Taherzadeh, Gary B Williams, Andrea Kleinsmith
A multimodal Framework for exploring behavioural cues for automatic Stress Detection
Rebecca Valerio, Marwa Mahmoud
16:30-18:00 Doctoral Consortium Poster Session
Modeling Social Dynamics from Multimodal Cues in Natural Conversations
Kevin Hyekang Joo
Developing Virtual Reality (VR) Simulations with Embedded User Analytics for Cognitive Rehabilitation in PTSD Veterans
Ravi Varman Selvakumaran
Enhancing Accessibility in Animation: A Context-Aware Audio Description System for Visually Impaired Children
Md Fahad Bin Zamal
Designing Multimodal Nonverbal Communication Cues for Multirobot Supervision Through Event Detection and Policy Mapping
Richard John Attfield
Towards Context-sensitive Emotion Recognition
Sayak Mukherjee
Towards Intelligent Adaption in Cognitive Assistance Systems through Physiological Computing
Jordan Schneider
Multimodal Conversational Events Estimation in Complex Social Scenes
Litian Li
Decoding Social Interaction to Understand Traumatic Behaviors in Social Dynamics
Pritesh Nalinbhai Contractor
Towards Seamless Interaction: Neuroadaptive Virtual Reality Interfaces for Target Selection
Jalynn Blu Nicoly
Cognitive Effort Analysis in Digital Learning Environments
Shayla Sharmin
Multimodal Analysis of Caregiving Interactions in Simulation-Based Training
Behdokht Kiafar
Designing and Evaluating Gen-AI for Cultural Resilience
Ka Hei Carrie Lau
Differentiating Frustration from Cognitive Workload in a Dual-task System
Heting Wang
18:00 – 19:00 Welcome Reception
Main conference Day 2 – Wednesday, 15 October 2025
09:00-10:00 Keynote
Session Chair: Gelareh Mohammadi
Multimodal AI for Transforming Industries and Empowering Social Interaction
Fang Chen
10:00-10:30 Break
10:30-12:00 Oral Session 2: Health & Wellbeing
Session Chair: Maria Maleshkova
10:30 – 10:48
Evaluating the Efficacy of Pulse Transit Time between Palm and Forehead in Blood Pressure Estimation
Chu Chu Qiu, Jing Wei Chin, Tsz Tai Chan, Kwan Long Wong, Richard Hau Yue So
10:48 – 11:06
From Lab to Wrist: Bridging Metabolic Monitoring and Consumer Wearables for Heart Rate and Oxygen Consumption Modeling
Barak Gahtan, Sanketh Vedula, Gil Samuelly Leichtag, Einat Kodesh, Alex Bronstein
11:06 – 11:24 (Best paper award nominee)
SpikEy: Preventing Drink Spiking using Wearables
Zhigang Yin, Ngoc Thi Nguyen, Agustin Zuniga, Mohan Liyanage, Petteri Nurmi, Huber Flores
11:24 – 11:42
From Speech and PPG to EDA: Stress Detection Based on Cross-Modal Fine-Tuning of Foundation Models
Alia Ahmed Al Dossary, Mathieu Chollet, Alessandro Vinciarelli
11:42 – 12:00
Psychological and Neurophysiological Indicators of Stress and Relaxation in Immersive Virtual Reality Environments: A Multimodal Approach
Ankit Arvind Prasad, Shashank Laxmikant Bidwai, Ashutosh Jitendra Zawar, Diven Ashwani Ahuja, Apostolos Kalatzis, Vishnunarayan Girishan Prabhu
12:00-13:30 Lunch
13:30-15:00 Oral Session 3: Interaction Design
Session Chair: Silvia Rossi
13:30 – 13:48 (Best paper award nominee)
Exploring the effects of force feedback on VR Keyboards with varying visual designs
Zhenxing Li, Jari Kangas, Ahmed Farooq, Roope Raisamo
13:48 – 14:06 (Best paper award nominee)
Functional Near-Infrared Spectroscopy (fNIRS) Analysis of Interaction Techniques in Touchscreen-Based Educational Gaming
Shayla Sharmin, Elham Bakhshipour, Mohammad Fahim Abrar, Behdokht Kiafar, Pinar Kullu, Nancy Getchell, Roghayeh Leila Barmaki
14:06 – 14:24
AirSpartOne: One-Handed Distal Pointing for Large Displays on Mobile Devices and in Midair
Martin Birlouez, Yosra Rekik, Laurent Grisoni
14:24 – 14:42
StoryDiffusion: How to Support UX Storyboarding With Generative-AI
Zhaohui Liang, Xiaoyu Zhang, Kevin Ma, Zhao Liu, Xipei Ren, Kosa Goucher-Lambert, Can Liu
14:42 – 15:00
A Scenario-Based Design Pack for Exploring Multimodal Human–GenAI Relations
Josh Andres, Chris Danta, Andrea Bianchi, Sahar Farzanfar, Gloria Milena Fernandez-Nieto, Alexa Becker, Tara Capel, Frances Liddell, Shelby Hagemann, Ned Cooper, Sungyeon Hong, Li Lin, Eduardo Benitez Sandoval, Anna Brynskov, Hubert Dariusz Zając, Zhuying Li, Tianyi Zhang, Arngeir Berge
15:00-15:30 Break
15:30-16:30 Grand Challenge Session
- 15:30 (5 min): Welcome and intro
- 15:35 (25 min): Keynote speaker: Olympia Yarger
- 16:00 (15 min): Matthew Vestal: Introduction to the Grand Challenge and summary paper presentation
- 16:15 (10 min): Elane Peng: mIoG: An Evaluation Metric for Multispectral Instance Segmentation in Robotics
- 16:25 (5 min): Concluding remarks
16:00-18:00 Poster Session 2 and Demos
Session Chair: Roland Goecke
Topic: Interaction Design
1. A Systematic Review of Fusion Methods for the User-Centered Design of Multimodal Interfaces
Ronja Heinrich, Chris Zimmerer, Martin Fischbach, Marc Erich Latoschik
2. Exploring Sound-to-Sound Personalization for Accessible Digital Media
Dhruv Jain, Jason Miller
3. When Words Fall Short: The Case for Conversational Interfaces that Don’t Listen
James Simpson, Hamish Stening, Gaurav Patil, Patrick Nalepka, Mark Dras, Rachel W. Kallen, Simon Hosking, Michael J Richardson, Deborah Richards
4. Pinching Visuo-haptic Display: Investigating Cross-Modal Effects of Visual Textures on Electrostatic Cloth Tactile Sensations
Takekazu Kitagishi, Chun Wei Ooi, Yuichi Hiroi, Jun Rekimoto
Topic: LLMs for interactions
1. Using a Secondary Channel to Display the Internal Empathic Resonance of LLM-Driven Agents for Mental Health Support
Matthias Schmidmaier, Jonathan Rupp, Sven Mayer
2. Few-shot Fine-grained Image Classification with Interpretable Prompt Learning through Distribution Alignment
Dongliang Guo, Handong Zhao, Ryan Rossi, Sungchul Kim, Nedim Lipka, Tong Yu, Sheng Li
3. Multimodal Synthetic Data Finetuning and Model Collapse: Insights from VLMs and Diffusion Models
Zizhao Hu, Mohammad Rostami, Jesse Thomason
4. Multimodal Behavioral Patterns Analysis with Eye-Tracking and LLM-Based Reasoning
Dongyang Guo, Yasmeen Abdrabou, Enkeleda Thaqi, Enkelejda Kasneci
5. Talking-to-Build: How LLM-Assisted Interface Shapes Player Performance and Experience in Minecraft
Lei Wang, Xin Sun, Y Li, Jie Li, Massimo Poesio, Julian Frommel, Koen Hindriks, Jiahuan Pei
6. Large Language Models For Multimodal User Interaction in Virtual Environments
Ahmed Sayed, Kevin Pfeil
7. Understanding and Supporting Multimodal AI Chat Interactions of DHH College Students: an Empirical Study
Nan Zhuang, Yanni Ma, Xin Zhao, Wang Ying, Shaolong Chai, Shitong Weng, Mengru Xue, Yuxi Mao, Cheng Yao
Topic: Interacting with Social Robots
1. When Robots Listen: Predicting Empathy Valence from Multimodal Storytelling Data
Jiayu Wang, Himadri Shekhar Mondal, Tom Gedeon, Md Zakir Hossain
2. USER-VLM 360: Personalized Vision Language Models with User-aware Tuning for Social Human-Robot Interactions
Hamed Rahimi, Adil Bahaj, Mouad Abrini, Mahdi Khoramshahi, Mounir Ghogho, Mohamed Chetouani
3. Demographic User Modeling for Social Robotics with Multimodal Pre-trained Models
Hamed Rahimi, Mouad Abrini, Jeanne Malecot, Ying Lai, Adrien Jacquet Crétides, Mahdi Khoramshahi, Mohamed Chetouani
16:00-18:00 Demo Session (Poster session)
Session Chair: Roland Goecke
Simulated Insight, Real-World Impact: Enhancing Driving Safety with CARLA-Simulated Personalized Lessons and Eye-Tracking Risk Coaching
Wenbin Gan, Minh-son Dao, Koji Zettsu
Affective and Physiological Responses to Immersive Intangible Cultural Heritage Experiences in Extended Reality
Fasih Haider, Sofia de la Fuente Garcia, Alicia Núñez García, Saturnino Luz
SocialWise: LLM-Agentic Conversation Therapy for Individuals with Autism Spectrum Disorder to Enhance Communication Skills
Albert Tang
The Human Record Needle: A Novel Interface for Embodied Music Interaction
Brandon Waylan Ables
PoseDoc: An Interactive Tool for Efficient Keypoint Annotation in Human Pose Estimation
Chengyu Fan, Tahiya Chowdhury
Realtime Multimodal Emotion Estimation using Behavioral and Neurophysiological Data
Von Ralph Dane Marquez Herbuela, Yukie Nagai
A Multilingual Telegram Chatbot for Mental Health Data Collection
Danila Mamontov, Alexey Karpov, Wolfgang Minker
Improving Deepfake Understanding through Simplified Explanations
Abhijeet Narang, Parul Gupta, Liuyijia Su, Abhinav Dhall
The Crock of Shh: A Whispering Water Interface for Reshaping Reality
Brandon Waylan Ables
19:00 – 22:00 Banquet Dinner
Main conference Day 3 – Thursday, 16 October 2025
09:00-10:00 Keynote
Session Chair: Yukiko Nakano
Designing for Meaningful Oversight: Human and Organisational Agency in Multimodal AI Systems
Liming Zhu
10:00-10:30 Break
10:30-12:00 Oral Session 4: Safe & Inclusive Interactions
Session Chair: Carlos Busso
10:30 – 10:48
Lightweight Transformers for Isolated Sign Language Recognition
Cristina Luna-Jiménez, Lennart Eing, Annalena Bea Aicher, Fabrizio Nunnari, Elisabeth André
10:48 – 11:06
All of That in 15 Minutes? Exploring Privacy Perceptions Across Cognitive Abilities via Ad-hoc LLM-Generated Profiles Inferred from Social Media Use
Kirill Kronhardt, Sebastian Hoffmann, Fabian Adelt, Max Pascher, Jens Gerken
11:06 – 11:24
SignFlow: End-to-End Sign Language Generation for One-to-Many Modeling using Conditional Flow Matching
Nabeela Khan, Bowen Wu, Sihan Tan, Carlos Toshinori Ishi
11:24 – 11:42
MENA: A Multimodal Framework for Analyzing Caregiver Emotions and Competencies in AR Geriatric Simulations
Behdokht Kiafar, Pavan Uttej Ravva, Salam Daher, Asif Ahmmed, Roghayeh Leila Barmaki
(virtual talk)
11:42 – 12:00
Multimodal LLM using Federated Visual Instruction Tuning for Visually Impaired
Ankith Bala, Alina Vereshchaka
(virtual talk)
12:00-13:30 Lunch
13:30-15:00 Oral Session 5: Conversational Dynamics
Session Chair: Md Zakir Hossain
13:30 – 13:48
Enhancing Gaze Prediction in Multi-Party Conversations via Speaker-Aware Multimodal Adaptation
Meng-Chen Lee, Zhigang Deng
13:48 – 14:06 (Best paper award nominee)
Real-time Generation of Various Types of Nodding for Avatar Attentive Listening System
Kazushi Kato, Koji Inoue, Divesh Lala, Keiko Ochi, Tatsuya Kawahara
14:06 – 14:24
Valerie K. Chen, Claire Liang, Julie Shah, Sean Andrist
14:24 – 14:42
Multimodal Analysis of Disagreement in Dyadic Conversations: An Approach Based on Emotion Recognition
Areej Buker, Emily Smith, Olga Perepelkina, Alessandro Vinciarelli
14:42 – 15:00 (Best paper award nominee)
Speech-to-Joy: Self-Supervised Features for Enjoyment Prediction in Human–Robot Conversation
Ricardo Santana, Bahar Irfan, Erik Lagerstedt, Gabriel Skantze, Andre Pereira
15:00-15:30 Break
15:30-17:30 Poster Session 3 (including LBR papers)
Session Chair: Laxminarayen NV
Topic: Multiparty Interactions
1. Multimodal Quantitative Measures for Multiparty Behavior Evaluation
Ojas Shirekar, Wim Pouw, Chenxu Hao, Vrushank Phadnis, Thabo Beeler, Chirag Raman
2. Beyond Utterance: Understanding Group Problem Solving through Discussion Sequences
Zhuoxu Duan, Zhengye Yang, Brooke Foucault Welles, Richard J. Radke
3. Learning Multimodal Motion Cues for Online End-of-Turn Prediction in Multi-Party Dialogue
Meng-Chen Lee, Zhigang Deng
4. Team Dynamics in Human-AI Collaboration: Effects on Confidence, Satisfaction, and Accountability
Mamehgol Yousefi, Ahmad Shahi, Mos Sharifi, Alvaro J Jorge Romera, Simon Hoermann, Thammathip Piumsomboon
5. A Multimodal Classroom Video Question-Answering Framework for Automated Understanding of Collaborative Learning
Nithin Sivakumaran, Chia-Yu Yang, Abhay Zala, Shoubin Yu, Daeun Hong, Xiaotian Zou, Elias Stengel-Eskin, Dan Carpenter, Wookhee Min, Cindy Hmelo-Silver, Jonathan Rowe, James Lester, Mohit Bansal
Topic: Safe & Inclusive Interactions
1. Causal Explanation of the Quality of Parent-Child Interactions with Multimodal Behavioral Features
Katherine Guerrerio, Lujie Karen Chen, Lisa Berlin, Brenda Jones Harden
2. Seeing, Hearing, Feeling: Designing Multimodal Alerts for Critical Drone Scenarios
Nina Knieriemen, Anke Hirsch, Muhammad Moiz Sakha, Florian Daiber, Hannah Kolb, Simone M. Hüning, Frederik Wiehr, Antonio Krüger
3. Unobtrusive Universal Acoustic Adversarial Attacks on Speech Foundation Models in the Wild
Jayden Fassett, Anjila Budathoki, Jack Morris, Qin Hu, Yi Ding
4. A Multilingual, Multimodal Dataset for Disinformation and Out-of-Context Analysis with Rich Supportive Information
Shuhan Cui, Hanrui Wang, Ching-Chun Chang, Huy H. Nguyen, Isao Echizen
5. MERD: A Multimodal Emotional Response Dataset from 360° VR Videos Across Different Age Groups
Qiang Chen, Shikun Zhou, Yuming Fang, Dan Luo, Tingsong Lu
6. Knowledge Graphs and Fine-Grained Visual Features: A Potent Duo Against Cheapfakes
Tuan-Vinh La, Minh-Hieu Nguyen, Minh-son Dao
7. Analyzing Character Representation in Media Content using Multimodal Foundation Model: Effectiveness and Trust
Evdoxia Taka, Debadyuti Bhattacharya, Joanne Garde-Hansen, Sanjay Sharma, Tanaya Guha
8. A Block-Level Fine-Graining Framework for Multimodal Fusion in Federated Learning
Guozhi Zhang, Mengying Jia, Shuyan Feng, Zixuan Liu
Topic: XR
1. Adaptive Gen-AI Guidance in Virtual Reality: A Multimodal Exploration of Engagement in Neapolitan Pizza-Making
Ka Hei Carrie Lau, Sema Sen, Philipp Stark, Efe Bozkir, Enkelejda Kasneci
2. Please Let Me Think: The Influence of Conversational Fillers on Transparency and Perception of Waiting Time when Interacting with a Conversational AI in Virtual Reality
David Obremski, Paula Friedrich, Carolin Wienrich
3. Exploring the Impact of Distance on XR Selection Techniques
Becky Spittle, Maite Frutos-Pascual, Chris Creed, Ian Williams
15:30-17:30 Late Breaking Results (Poster Session)
Multimodal Analysis of Listener’s Active Listening Behaviors in Speed Dating Dialogues
Asahi Ogushi, Naoki Azuma, Daichi Shikama, Ryo Ishii, Toshiki Onishi, Akihiro Miyata
TGN-PL: Learning to Socialize Using Privileged Information and Temporal Graph Networks
Jouh Yeong Chew, Joanne Taery Kim, Sehoon Ha
You Like This Robot? I Don’t! How Individual Differences Influence Perceptions of Robot Teammates in Virtual Reality
Karla Bransky, Penny Sweetser
SBM: Social Behavior Model for Human-Like Action Generation
Jouh Yeong Chew, Zhi-Yi Lin, Xucong Zhang
A New LLM-Powered Communication Metric: Information Sharing as a Predictor of Team Performance
Xinyun Hu, Penny Sweetser
A Platform for Experimenting with Non-Verbal Communication: Inserting facial displays of misunderstanding into live conversations.
Ella Cullen
Bridging Video and Symbols: A Hybrid AI for Edge Traffic-Risk Reasoning
Minh-Son Dao, Thi-Mai-Phuong Nguyen, Swe Nwe Nwe Htun, Koji Zettsu
Identifying Participant Roles in Online Group Discussions
Kazuki Kodaira, Kazuki Nakaya, Jie Zeng, Hiyori Toda, Fumio Nihei, Ryo Ishii, Yukiko Nakano
Most DAIC-WoZ Depression Classifiers Are Invalid, They Don’t Learn Task-Specific Features: Preliminary Findings From a Large-Scale Reproducibility Study
Santosh Varma Patapati, Ishan Pendyala, Murari Ambati, Pranav Kunadharaju, Pranav Kokati, Amit Adiraju Narasimha, Trisanth Srinivasan
LociVR: Design of a Virtual Reality Prototype for Memory Training
Cancan Jin, Yanze Gao, Zirui Yu, Ningning Xu
When Pose Estimation Fails: Measuring Occlusion for Reliable Multimodal Interaction
Chengyu Fan, Tahiya Chowdhury
From Behavior to Interaction: Understanding User Intents in Metaverse Experience
Ningning Xu, Lingyun Yu, Qinglin Mao, Kaiwen Li, Yifei Chen, Haibo Zhou, Xu Sun
Large Language Models as Perceivers of Dynamic Full-Body Expressions of Emotion
Huakun Liu, Miao Cheng, Xin Wei, Felix Dollack, Victor Schneider, Hideaki Uchiyama, Yoshifumi Kitamura, Kiyoshi Kiyokawa, Monica Perusquia-Hernandez
17:30-18:00 Closing Ceremony
Adjunct Events Day 2 – Friday, 17 October 2025
09:00 – 12:00 Human-AI Interaction Tutorial
Madhawa Perera, Md Zakir Hossain, Alexander Krumpholz, Tom Gedeon


