Math Solver: Simplifying Online Math Learning for K-12
When students are stuck on a math problem, searching online for help can be challenging. Typing complex math problems also poses a significant barrier. We look to reimagine input modalities through the camera to make search…
Anticipate, absorb, and adapt—introducing the societal resilience research agenda
The genetic sequence for COVID-19 was first published in January 2020. Before the end of the year, new vaccines—which typically take five to ten years to develop—were approved for emergency use in multiple nations. This…
Avbert
This repository contains the code and models for our ICLR 2021 paper, "Parameter Efficient Multimodal Transformers for Video Representation Learning".
Microsoft and NVIDIA introduce parameter-efficient multimodal transformers for video representation learning
Understanding video is one of the most challenging problems in AI, and an important underlying requirement is learning multimodal representations that capture information about objects, actions, sounds, and their long-range statistical dependencies from audio-visual signals. Recently,…
Vision Longformer for Object Detection
This project provides the source code for the object detection part of the Vision Longformer paper.
Hands-on research and prototyping for haptics
While many of us think of human-computer interaction as a job for the eyes, ears and mind, we don’t think as often about the importance and complexity of our tactile interactions with computers. Haptics –…