Keynote Speakers

From hands to brains: How does human body talk, think and interact in face-to-face language use?

Asli Ozyurek

Professor, Donders Institute for Brain, Cognition and Behavior, Radboud University
Research Associate, Max Planck Institute for Psycholinguistics
Director, Multimodal Language and Cognition lab

Bio: Prof. Dr. Asli Ozyurek completed her BA in Psychology at Bogazici University, Turkey, and earned a double Ph.D. degree in Psychology and Linguistics from the University of Chicago. She is currently a Principal Investigator at the Donders Institute for Brain, Cognition and Behavior at Radboud University in the Netherlands, a Research Associate at the Max Planck Institute for Psycholinguistics, and the Director of the Multimodal Language and Cognition lab.
In general, she investigates the role our language ability plays in human cognition and communication. To do so, she uses cross-linguistic, cross-cultural and multimodal approaches to understand how language is used in embodied, situated and social contexts through our bodily actions, as in the use of gestures and sign languages. She is also interested in how our use of multimodal language interacts with other domains of cognition (spatial cognition, action, memory, social, communicative intention), that is, in its neural, cognitive and social foundations.
She is an elected member of Academia Europaea and has received the ASPASIA award from the Dutch Science Foundation and the Young Scientist award from the Turkish Science Foundation. She has obtained many career grants, such as an ERC Starting Grant and Dutch Science Foundation (NWO) VIDI and VICI grants, and has hosted many Marie Curie Individual Fellowships. Her publications have appeared in Science, PNAS, Cognition, Psychological Science, Journal of Cognitive Neuroscience and NeuroImage, among others.
More information and CV can be found at and
Twitter: @GestureSignlab and @ozyurek_a

Deep Learning for Joint Vision and Language Understanding

Kate Saenko

Associate Professor, Boston University
Consulting Professor, MIT-IBM Watson AI Lab

Bio: Kate is an Associate Professor of Computer Science at Boston University and a consulting professor for the MIT-IBM Watson AI Lab. She leads the Computer Vision and Learning Group at BU, is the founder and co-director of the Artificial Intelligence Research (AIR) initiative, and is a member of the Image and Video Computing research group. Kate received a PhD from MIT and did her postdoctoral training at UC Berkeley and Harvard. Her research interests are in the broad area of Artificial Intelligence, with a focus on dataset bias, adaptive machine learning, learning for image and language understanding, and deep learning.
More information can be found at

Sonic Interaction: From gesture to immersion

Atau Tanaka

Professor of Media Computing, Goldsmiths, University of London

Bio: Atau Tanaka conducts research in embodied musical and human-computer interaction. He has a BA from Harvard and composition/audio engineering degrees from Peabody Conservatory, and obtained his doctorate from Stanford University's CCRMA. He uses muscle sensing via the electromyogram (EMG) in conjunction with machine learning in concert performance and interaction research, where the human body can be said to become a musical instrument. Atau has carried out research at IRCAM Centre Pompidou, was Artistic Ambassador for Apple France and was a researcher at Sony Computer Science Laboratory (CSL) Paris. His artistic work has been presented at Ars Electronica, San Francisco Museum of Modern Art (SFMOMA), Eyebeam NYC, Southbank London, NTT-ICC Tokyo, and ZKM Karlsruhe. His scientific research is published in the NIME, CHI, and SIGGRAPH communities, and has been supported by the European Research Council (ERC) and UK research and arts councils. He has been a mentor at the UK's National Endowment for Science, Technology & Art (NESTA) and was Artistic Co-Director of STEIM in Amsterdam and Edgard Varèse guest professor at TU Berlin. He is Professor of Media Computing at Goldsmiths, University of London.

Human-centered Multimodal Machine Intelligence

Shrikanth (Shri) Narayanan

Recipient of the ICMI 2020 Sustained Accomplishment Award

Abstract: Multimodal machine intelligence offers exciting possibilities for helping us understand the human condition and human functioning, and for supporting and enhancing human experiences. What makes these approaches and systems exciting is the promise they hold for adaptation and personalization in the presence of the rich and vast inherent heterogeneity, variety and diversity within and across people. Multimodal engineering approaches can help analyze human traits (e.g., age), states (e.g., emotion), and behavior dynamics objectively and at scale. Machine intelligence could also help detect and analyze deviations in patterns from what is deemed typical. These techniques in turn can assist, facilitate or enhance decision making by humans and by autonomous systems. Realizing such a promise requires addressing two major, often intertwined, lines of challenges: creating inclusive technologies that work for everyone, while enabling tools that can illuminate the source of variability or difference of interest.

This talk will highlight some of these possibilities and opportunities through examples drawn from two specific domains. The first relates to advancing health informatics in behavioral and mental health. With over 10% of the world's population affected, and with clinical research and practice heavily dependent on (relatively scarce) human expertise in diagnosing, managing and treating these conditions, the opportunities for engineering to offer access at scale and tools to support care are immense. For example, to determine whether a child is on the autism spectrum, a clinician would engage the child in a series of interactive activities targeting relevant cognitive, communicative and socio-emotional aspects, observe the resulting behavior cues, and codify specific patterns of interest, e.g., typicality of vocal intonation, facial expressions, and joint attention behavior. Machine-intelligence-driven processing of speech, language, visual and physiological data, combined with other forms of clinical data, enables novel and objective ways of supporting and scaling up these diagnostics. Likewise, multimodal systems can automate the analysis of psychotherapy sessions, including computing quality-assurance measures, e.g., rating a therapist's expressed empathy. These technology possibilities can go beyond the traditional realm of clinics, directly to patients in their natural settings. For example, remote multimodal sensing of biobehavioral cues can enable new ways of screening and tracking behaviors (e.g., in the workplace) and progress in treatment (e.g., for depression), and can offer just-in-time support.

The second example is drawn from the world of media. Media are created by humans and for humans to tell stories. They cover an amazing range of domains, from the arts and entertainment to news, education and commerce, and in staggering volume. Machine intelligence tools can help analyze media and measure their impact on individuals and society. This includes offering objective insights into diversity and inclusion in media representations by robustly characterizing media portrayals from an intersectional perspective along relevant dimensions of inclusion (gender, race, age, ability and other attributes), and creating tools to support change. Again, this underscores the twin technology requirements: to perform equally well in characterizing individuals regardless of the dimensions of variability, and to use those inclusive technologies to shine a light on and create tools to support diversity and inclusion.

Bio: Shrikanth (Shri) Narayanan is University Professor and Niki & C. L. Max Nikias Chair in Engineering at the University of Southern California, where he is Professor of Electrical & Computer Engineering, Computer Science, Linguistics, Psychology, Neuroscience, Otolaryngology and Pediatrics, Director of the Ming Hsieh Institute, and Research Director of the Information Sciences Institute. Prior to USC he was with AT&T Bell Labs and AT&T Research. His research focuses on human-centered information processing and communication technologies. He is a Fellow of the National Academy of Inventors, the Acoustical Society of America, IEEE, ISCA, the American Association for the Advancement of Science (AAAS), the Association for Psychological Science, and the American Institute for Medical and Biological Engineering (AIMBE). He is a recipient of several honors, including the 2015 Engineers Council's Distinguished Educator Award, a Mellon award for mentoring excellence, the 2005 and 2009 Best Journal Paper awards from the IEEE Signal Processing Society, and a 2018 ISCA CSL Best Journal Paper award. He served as an IEEE Signal Processing Society Distinguished Lecturer for 2010-11, an ISCA Distinguished Lecturer for 2015-16, and the Willard R. Zemlin Memorial Lecturer for ASHA in 2017. He has published over 900 papers and has been granted seventeen U.S. patents. His research and inventions have led to technology commercialization, including through startups he co-founded: Behavioral Signals Technologies, focused on the telecommunication services and AI-based conversational assistance industry, and Lyssn, focused on mental health care delivery, treatment and quality assurance.
More information can be found at

ICMI 2020 ACM International Conference on Multimodal Interaction. Copyright © 2019-2020