Data commons
Open data initiatives for all builders of AI
We are supporting the Institutional Data Initiative and CORE to expand access to high-quality data for AI innovation.
Emerging funding and business models
The Open Data Policy Lab examines emerging funding and business models for data commons that support equitable, sustainable reuse of data for public-interest AI.
New Commons Challenge awards
The New Commons Challenge recognized two projects that demonstrate innovation in data commons – foundational initiatives that are using AI to understand data and create a bridge to humanitarian and community support.
Language and cultural diversity in the AI era
Open Science for AI-driven discovery
AI is transforming research, but its impact depends on access to high‑quality, open, machine‑readable data. Microsoft advances open science by sharing research, building platforms that enable open science, partnering globally, and advocating for policies that expand access to scientific data and improve its usability.
Open Government Data
Additional resources
Open Data for Social Impact Framework
A tool leaders can use to put data to work to solve important societal issues.
The Open Data Opportunity
An explanation of why data sharing matters.
AI for Good Lab Open Source Database
Making open datasets, research code, and tools freely available for a global community of problem solvers.
Capabilities
Microsoft Discovery
An AI‑powered research platform that enables researchers to integrate and reuse data, code, and models from across the open scientific ecosystem.
Azure Data Factory
A fully managed, serverless data integration service to ingest, transform, and orchestrate data from a wide range of sources.
Microsoft Foundry
Microsoft's unified AI platform for building, deploying, and governing AI agents at enterprise scale.
Researcher tools
Explore a collection of datasets, code, and models from Microsoft Research for the broader academic community to advance state-of-the-art research across all disciplines.
Legal frameworks
CDLA Permissive 2.0
The Community Data License Agreement (CDLA) Permissive 2.0 is an open data agreement designed to make it easier to share and collaborate with open data.
C-UDA 1.0
The Computational Use of Data Agreement (C-UDA) 1.0 is intended for datasets that may include material not owned by the data provider but that may have been assembled lawfully from publicly accessible sources.
DUA-OAI
The Data Use Agreement for Open AI Model Development (DUA-OAI) provides terms for one organization to share data with another so that the second organization can use the data to train an AI model, where the trained model is open sourced.
DUA-DC
The Data Use Agreement for Data Commons (DUA-DC) can be used by multiple parties who want to share data through a common database that is accessible through an application programming interface (API).