Akshay Nambi

Principal Researcher

About

I am a Principal Researcher at Microsoft Research India, working at the intersection of AI, machine learning, and systems. My research focuses on building trustworthy agentic AI systems that can reason, plan, use tools, and operate reliably in real-world environments.

Recent advances in large language models have enabled a shift from passive generation to autonomous agents capable of multi-step task execution. However, these systems still struggle with reliable reasoning, long-horizon planning, tool use, and safety, especially in non-verifiable real-world settings. My work addresses these challenges through reinforcement fine-tuning, structured reasoning, and scalable tool integration, with a focus on improving the reliability and efficiency of agentic systems. I also work closely with product and impact partners to build population-scale AI agents, with applications in domains such as education and agriculture.

Research Areas

1. Reliable Reasoning with Tool Use

AI systems are increasingly expected to solve multi-step, long-horizon tasks in real-world, non-verifiable environments. My work focuses on enabling reliable reasoning and execution with tools, including programmatic tool use and structured execution, effective context and state management for multi-step tasks, efficient interaction with large and dynamic toolspaces, and robust tool discovery, selection, and recovery under uncertainty.

2. Reinforcement Fine-Tuning for Agentic Reasoning

I develop reinforcement fine-tuning (RFT) methods to train agents beyond supervised learning. This includes advancing RL for multi-step reasoning and tool use, building scalable post-training pipelines using synthetic environments, enabling self-distillation and iterative improvement, and designing rubric-based rewards and evaluation frameworks for non-verifiable tasks.

3. Trustworthy and Safe Agentic Systems

As agents move from generation to action, ensuring safety, reliability, and alignment becomes critical. My work focuses on building agents that can reason about uncertainty, verify outcomes, and decide when to act or refuse, especially in non-verifiable settings. This includes developing methods for safe tool use, failure detection, and robustness in long-horizon execution, as well as evaluation frameworks that capture real-world risks beyond standard benchmarks.

4. Population-Scale AI Systems and Copilots

A key focus of my work is translating research into real-world AI systems that operate at scale. I build and deploy agentic copilots for product teams, such as Researcher Agents for deep research and complex workflows, as well as for societal applications in domains like education and agriculture.

Please visit my projects and publications pages for more details. I have developed and scaled impactful solutions that are actively used by several thousand users across diverse sectors, including education, agriculture, transportation, healthcare, and energy.

Internship opportunities (3-6 months): I’m always on the lookout for bright students and researchers who have strong hands-on experience in large language models, agentic AI, reinforcement learning, reasoning systems, and scalable ML systems. I particularly value individuals who can move fast and build end-to-end systems. If you are interested in internships or collaborations, please email me your CV and research interests.