AI advances in image captioning: Describing images as well as people do
Image captioning is an interesting problem at the intersection of computer vision and natural language processing, and it has attracted great attention from both research communities. Recent image captioning models have achieved impressive results…
AI advances in image captioning: Describing images as well as people do webinar
This webinar will focus on some of the recent vision-language pretraining (VLP) approaches for image captioning. We will cover our latest approaches, including object-semantics aligned pretraining (OSCAR) and visual-vocabulary pretraining (VIVO). We will also discuss…
Microsoft Research Conversations in STEM: Future Horizons of Science
If you agree with Arthur C. Clarke that the only way to discover the limits of the possible is to venture a little way past them into the impossible, you’ll be…
Microsoft Research Conversations in STEM: Medical and Health Technology
Innovations in medical and healthcare technology are among the most exciting and promising in technology research. Join us for a fascinating – and timely – live discussion about what’s new in…
Microsoft Vision Model ResNet-50 combines web-scale data and multi-task learning to achieve state of the art
Pretrained vision models accelerate deep learning research and bring down the cost of performing computer vision tasks in production. By pretraining one large vision model to learn general visual representations of images, then transferring the…
Microsoft Vision
Microsoft Vision Model ResNet-50 is a state-of-the-art ResNet-50 model pretrained with web-scale data, multi-task training, and web-supervision.