Audio and acoustics

Podcast

What’s Your Story: Ivan Tashev

February 1, 2024

Partner Software Architect Ivan Tashev talks about applying his expertise in audio signal processing to the design and study of audio components for Microsoft products such as Kinect and shares how a focus on what…

Microsoft Research Blog

Research at Microsoft 2023: A year of groundbreaking AI advances and discoveries

December 22, 2023

AI saw unparalleled growth in 2023, reaching millions daily. This progress owes much to the extensive work of Microsoft researchers and collaborators. In this review, learn about the advances in 2023, which set the stage…

Publication

EEG and Eye-Tracking Error-Related Responses During Predictive Text Interactions: A BCI Case Study

Sophia K. Mehdizadeh, Ed Cutrell, R. Michael Winters, Nemanja Djuric, Yang Cheng, Ivan Tashev, Yu Te Wang

International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC ’23) | December 2023

Video

Synchronized Audio-Visual Generation with a Joint Generative Diffusion Model and Contrastive Loss

November 6, 2023

The rapid development of deep learning techniques has led to significant advancements in the fields of multimedia generation and synthesis. However, generating coherent and temporally aligned audio and video content remains a challenging task due…

46:11

Video

Binaural spatial audio positioning in video calls

October 4, 2023

Spatially separating voices plays a crucial role in speech intelligibility, speaker identification and cognitive load in conversations. Voices are naturally separated in in-person conversations, but in most video conferencing software voices are mixed down to…

01:03:57

Publication

Modality-Independent Teachers Meet Weakly-Supervised Audio-Visual Event Parser

Yung-Hsuan Lai, Yen-Chun Chen, Yu-Chiang Frank Wang

NeurIPS 2023 | October 2023

Publication

Imitator: Personalized Speech-driven 3D Facial Animation

Balamurugan Thambiraja, Ikhsanul Habibie, Sadegh Aliakbarian, Darren Cosker, Christian Theobalt, Justus Thies

International Conference on Computer Vision (ICCV), 2023 | October 2023

Publication

Spatio-Temporal Windowing for Encoding Perceptually Salient Early Reflections in Parametric Spatial Audio Rendering

Tobias Jüterbock, Fabian Brinkmann, Hannes Gamper, Nikunj Raghuvanshi, Stefan Weinzierl

Journal of the Audio Engineering Society | October 2023, Vol 71(10)

Video

Final intern talk: Improving Frechet Audio Distance for Generative Music Evaluation

September 22, 2023

As generative music models become more powerful and popular, there is a growing need for robust objective metrics of music quality that correlate with human perception. The Frechet Audio Distance (FAD) is a commonly used…

41:10

Publication

ICASSP 2023 Acoustic Echo Cancellation Challenge

Ross Cutler, Ando Saabas, Tanel Parnamaa, Marju Purin, Evgenii Indenbom, Nicolae Catalin Ristea, Jegor Guzvin, Hannes Gamper, Sebastian Braun, Robert Aichner

IEEE Open Journal of Signal Processing | September 2023, Vol 5: pp. 675-685