Discover below an index of datasets, SDKs, APIs, and open-source tools developed by Microsoft researchers and shared with the global academic community. These experimental technologies, available through Azure AI Foundry Labs, offer a glimpse into the future of AI innovation.
SafeAgents
A unified framework for building and evaluating safe multi-agent systems. SafeAgents provides a simple, framework-agnostic API for creating multi-agent systems with built-in safety evaluation, attack detection, and support for multiple agentic frameworks (Autogen, LangGraph, OpenAI…
BusyBox
BusyBox is a physical 3D-printable device for benchmarking affordance generalization in robot foundation models. It features … Please check out our website for more details. For fully building an instrumented BusyBox capable…
Chartifact
Declarative, interactive data documents. Chartifact is a low-code document format for creating interactive, data-driven pages such as reports, dashboards, and presentations. It travels like a document and works like a mini app. Designed for use…
TestExplora
This repository is the official implementation of the paper “TestExplora: Benchmarking LLMs for Proactive Bug Discovery via Repository-Level Test Generation”. It can be used for baseline evaluation using the prompts mentioned in the paper. TestExplora…
SABER: Scaling-Aware Best-of-N Estimation of Risk
A Python package for predicting large-scale adversarial risk in Large Language Models under Best-of-N sampling. (Paper: https://arxiv.org/pdf/2601.22636) Standard LLM safety evaluations use single-shot (ASR@1) metrics,…
SigmaCollab
SigmaCollab is a dataset that enables research on human-AI physically situated collaboration. The dataset consists of a set of 85 sessions in which untrained participants were guided by a mixed-reality assistive AI agent in performing…