Akshay Nambi

Principal Researcher

Fara-7B: An Efficient Agentic Model for Computer Use

Pushing the frontiers of computer-use agents with an open-weight, ultra-compact model optimized for real-world web tasks

Building AI for population-scale systems with Akshay Nambi

Akshay Nambi is a principal researcher at Microsoft Research. His work lies at the intersection of systems, AI, and machine learning with a focus on designing, deploying, and…

Microsoft Research and Physics Wallah team up to enhance AI-based tutoring

Researchers from Microsoft Research are developing new algorithms and techniques to enhance the accuracy and reasoning capabilities of AI models. They are now collaborating with Physics Wallah to…

Teachers in India help Microsoft Research design AI tool for creating great classroom content

About

I am a Principal Researcher at Microsoft Research India, working at the intersection of AI, machine learning, and systems. My research focuses on building trustworthy agentic AI systems that can reason, plan, use tools, and operate reliably in real-world environments.

Recent advances in large language models have enabled a shift from passive generation to autonomous agents capable of multi-step task execution. However, these systems still struggle with reliable reasoning, long-horizon planning, tool use, and safety, especially in non-verifiable real-world settings. My work addresses these challenges through reinforcement fine-tuning, structured reasoning, and scalable tool integration, with a focus on improving the reliability and efficiency of agentic systems. I also work closely with product and impact partners to build population-scale AI agents, with applications in domains such as education, agriculture and more.

Research Areas

1. Reliable Reasoning with Tool Use

AI systems are increasingly expected to solve multi-step, long-horizon tasks in real-world, non-verifiable environments. My work focuses on enabling reliable reasoning and execution with tools, including programmatic tool use and structured execution, effective context and state management for multi-step tasks, efficient interaction with large and dynamic toolspaces, and robust tool discovery, selection, and recovery under uncertainty.

2. Reinforcement Fine-Tuning for Agentic Reasoning

I develop reinforcement fine-tuning (RFT) methods to train agents beyond supervised learning. This includes advancing RL for multi-step reasoning and tool use, building scalable post-training pipelines using synthetic environments, enabling self-distillation and iterative improvement, and designing rubric-based rewards and evaluation frameworks for non-verifiable tasks.

3. Trustworthy and Safe Agentic Systems

As agents move from generation to action, ensuring safety, reliability, and alignment becomes critical. My work focuses on building agents that can reason about uncertainty, verify outcomes, and decide when to act or refuse, especially in non-verifiable settings. This includes developing methods for safe tool use, failure detection, and robustness in long-horizon execution, as well as evaluation frameworks that capture real-world risks beyond standard benchmarks.

4. Population-Scale AI Systems and Copilots

A key focus of my work is translating research into real-world AI systems that operate at scale. I build and deploy agentic copilots for product teams, such as Researcher Agents for deep research and complex workflows, as well as for societal applications in domains like education and agriculture.

Please visit my projects and publications page for more details. I have developed and scaled solutions that are actively used by thousands of users across diverse sectors, including education, agriculture, transportation, healthcare, and energy.

Internship opportunities (3-6 months): I’m always on the lookout for bright students and researchers who have strong hands-on experience in large language models, agentic AI, reinforcement learning, reasoning systems, and scalable ML systems. I particularly value individuals who can move fast and build end-to-end systems. If you are interested in internships or collaborations, please email me your CV and research interests.

Featured content

Minister of School Education and Literacy, Government of Karnataka, launches Shiksha copilot for 1000+ teachers.

Satya talks about HAMS in his keynote during his visit to India

HAMS is being used to automate driver license testing in India

Contact Akshay Nambi

AI Frontiers

Microsoft Research Lab – India