PlugMem: Transforming raw agent interactions into reusable knowledge
It seems counterintuitive: giving AI agents more memory can make them less effective. As interaction logs accumulate, they grow large, fill with irrelevant content, and become increasingly difficult to use. More memory means that agents must…
How people use Copilot for Health
Efficient Distributed Orthonormal Optimizers for Large-Scale Training
Kwangjun delivered a 50-minute technical talk on recent advances in orthonormal update methods for large-scale AI model training. This topic has been rapidly gaining attention in the community, emerging as a strong successor to AdamW…
Research Intern – AI Safety and Security
Protecting large language models (LLMs) from malicious inputs is critical. LLMs can also be used to protect users from malicious attacks. The Deep Learning Team in Microsoft Research – Redmond is seeking Research Interns interested…
Phi-4-reasoning-vision and the lessons of training a multimodal reasoning model
We are pleased to announce Phi-4-reasoning-vision-15B, a 15 billion parameter open‑weight multimodal reasoning model, available through Microsoft Foundry, HuggingFace and GitHub. Phi-4-reasoning-vision-15B is…
Dion2: A new, simple method to shrink matrices in Muon
Dion2 reduces the cost of Muon’s orthonormalization step by orthonormalizing only a small, selected submatrix at each iteration. This lightweight approach preserves Muon’s strong performance while significantly improving the optimizer’s scalability.
ARO: A new lens on matrix optimization for LLMs
We present Adaptively Rotated Optimization (ARO), a matrix optimizer that speeds up LLM training by applying updates in a rotated, geometry-aware coordinate system. Guided by new insights into the global structure of LLM loss landscapes, ARO…
Lessons from deploying HealthBots with experts-in-the-loop
The Tyger framework enables faster, more accessible medical imaging by streaming raw data to the cloud for accelerated reconstruction—reducing patient wait times and discomfort—while empowering researchers to rapidly test and deploy new algorithms.
Teaching small language models to think like optimization experts with OptiMind
OptiMind is a specialized language model that translates natural-language problem descriptions directly into solver-ready mathematical optimization formulations. This removes one of the most expertise-intensive bottlenecks in optimization workflows and makes advanced optimization more accessible.
Agent Lightning: One learning system that makes all agents evolve
Agent Lightning is an agent optimization framework that enables agents to learn from their experiences through reinforcement learning and other methods. By treating agents as first-class citizens, it makes optimization automatic for any agent with minimal…