Publication | DeepRefine: Agent-Compiled Knowledge Refinement via Reinforcement Learning | Haoyu Huang, Jiaxin Bai, Shujie Liu, Yang Wei, Hong Ting Tsang, Yisen Gao, Zhongwei Xie, Yufei Li, Yangqiu Song | May 2026
Publication | Sample-Mean Anchored Thompson Sampling for Offline-to-Online Learning with Distribution Shift | Bochao Li, Yao Fu, Wei Chen, Fang-yuan Kong | May 2026
Publication | Reinforce Adjoint Matching: Scaling RL Post-Training of Diffusion and Flow-Matching Models | Andreas Bergmeister, Stefanie Jegelka, Nikolas Nusken, Carles Domingo-Enrich, Jakiw Pidstrigach | May 2026
Publication | EmbodiSkill: Skill-Aware Reflection for Self-Evolving Embodied Agents | Ruofei Ju, Xinrui Wang, Xin Ding, Yifan Yang, Haotian Wu, Shiqi Jiang, Qianxi Zhang, Hao Wen, Xiangyu Li, Weijun Wang, Kunchang Li, Yunxin Liu, Haipeng Dai, Wei Wang, Ting Cao | May 2026
Publication | AI in the Enterprise: How People Use M365 Copilot Chat | Scott Counts, Yan Chen, Jing Dong, Himanshu Sharma, Andrey Zaikin, Rui Hu, Alperen Kok, Gorkem Ozer Yilmaz, Siddharth Suri, Kiran Tomlinson, Sonia Jaffe, Will Wang | May 2026
Publication | Rebellious Student: Reversing Teacher Signals for Reasoning Exploration with Self-Distilled RLVR | Jeonghye Kim, Jiwon Jeon, Dongsheng Li, Yuqing Yang | May 2026
Publication | ReVision: Scaling Computer-Use Agents via Temporal Visual Redundancy Reduction | Amirhossein Abaskohi, Yuhang He, Peter West, Giuseppe Carenini, Pranit Chawla, Vibhav Vineet | May 2026
Publication | CodeClinic: Evaluating Automation of Coding Skills for Clinical Reasoning Agents | Timothy Ossowski, Xinchi Liu, Danyal Maqbool, Vaibhav Dhanuka, Sheng Zhang, Hoifung Poon, Majid Afshar, Tyler J. Bradshaw, Junjie Hu | May 2026
Publication | Oracle Poisoning: Corrupting Knowledge Graphs to Weaponise AI Agent Reasoning | Ben Kereopa-Yorke, Guillermo Díaz, Holly Wright, Reagan Johnston, Ron F. Del Rosario, Timothy Lynar | May 2026