What’s Your Story: Ivan Tashev
Partner Software Architect Ivan Tashev talks about applying his expertise in audio signal processing to the design and study of audio components for Microsoft products such as Kinect and shares how a focus on what…
Research at Microsoft 2023: A year of groundbreaking AI advances and discoveries
AI saw unparalleled growth in 2023, reaching millions daily. This progress owes much to the extensive work of Microsoft researchers and collaborators. In this review, learn about the advances in 2023, which set the stage…
Synchronized Audio-Visual Generation with a Joint Generative Diffusion Model and Contrastive Loss
The rapid development of deep learning techniques has led to significant advancements in the fields of multimedia generation and synthesis. However, generating coherent and temporally aligned audio and video content remains a challenging task due…
Binaural spatial audio positioning in video calls
Spatially separating voices plays a crucial role in speech intelligibility, speaker identification and cognitive load in conversations. Voices are naturally separated in in-person conversations, but in most video conferencing software voices are mixed down to…
Final intern talk: Improving Frechet Audio Distance for Generative Music Evaluation
As generative music models become more powerful and popular, there is a growing need for robust objective metrics of music quality that correlate with human perception. The Frechet Audio Distance (FAD) is a commonly used…