Senior Applied Scientists and Principal Applied Scientists (Multiple Positions) – Copilot Tuning
We are seeking Senior Applied Scientists and Principal Applied Scientists (Multiple Positions) with strong research skills and the desire to pursue the cutting edge in model development that pushes technological boundaries. We are looking for…
The Illusion of Inclusion: How LLMs Misrepresent African Languages and Cultural Contexts
Multilingual LLMs have become remarkably powerful, yet African languages remain misrepresented. Despite progress in African NLP driven by community-led initiatives, LLMs continue to misrepresent African languages and the meanings they convey. In this talk,…
CalcLM: Agent Grid
A prototype for experimenting with agents in a grid UI: a simple, open experiment that illustrates how agent workflows might live in a grid-like surface.
Magentic Marketplace
Magentic Marketplace is an open-source simulation environment for exploring the numerous possibilities of agentic markets and their societal implications at scale. It provides a foundation for studying these markets and guiding them toward outcomes that benefit everyone.
MMCTAgent
MMCTAgent (Multi-modal Critical Thinking Agent) is a state-of-the-art multi-modal AI framework that brings human-like critical thinking to visual reasoning tasks. It combines advanced planning, self-critique, and tool-based reasoning to deliver superior performance in complex image…
MMCTAgent: Enabling multimodal reasoning over large video and image collections
MMCTAgent enables dynamic multimodal reasoning with iterative planning and reflection. Built on Microsoft’s AutoGen framework, it integrates language, vision, and temporal understanding for complex tasks like long video and image analysis.
BlueCodeAgent: A blue teaming agent enabled by automated red teaming for CodeGen AI
BlueCodeAgent is an end-to-end blue-teaming framework built to boost code security using automated red-teaming processes, data, and safety rules to guide LLMs’ defensive decisions. Dynamic testing reduces false positives in vulnerability detection.