Publication Tracing the Traces: Latent Temporal Signals for Efficient and Accurate Reasoning Martina G. Vilas, Safoora Yousefi, Besmira Nushi, Eric Horvitz, Vidhisha Balachandran ICLR 2026 | October 2025
Publication Chain-of-Retrieval Augmented Generation Liang Wang, Haonan Chen, Nan Yang, Xiaolong Huang, Zhicheng Dou, Furu Wei NeurIPS 2025 | October 2025
Publication Sample-Efficient Online Learning in LM Agents via Hindsight Trajectory Rewriting Michael Y. Hu, Ben Van Durme, Jacob Andreas, Harsh Jhamtani October 2025
Publication Dyna-Mind: Learning to Simulate from Experience for Better AI Agents Xiao Yu, Baolin Peng, Michel Galley, Hao Cheng, Qianhui Wu, Janardhan Kulkarni, Suman Nath, Zhou Yu, Jianfeng Gao ICLR 2026 | October 2025 (under review as a conference paper at ICLR 2026)
Publication RepDL: Bit-level Reproducible Deep Learning Training and Inference Peichen Xie, Xian Zhang, Shuo Chen October 2025
Publication Saving SWE-Bench: A Benchmark Mutation Approach for Realistic Agent Evaluation Spandan Garg, Benjamin Steenhoek, Yufan Huang arXiv | October 2025, arXiv:2510.08996
Publication All claims are equal, but some claims are more equal than others: Importance-sensitive factuality evaluation of LLM generations Miriam Wanner, Leif Azzopardi, Paul Thomas, Soham Dan, Ben Van Durme, Nick Craswell October 2025 | preprint arXiv:2510.07083
Publication ConDABench: Interactive Evaluation of Language Models for Advanced Data Analysis Avik Dutta, Priyanshu Gupta, Hosein Hasanbeig, Rahul Pratap Singh, Harshit Nigam, Sumit Gulwani, Arjun Radhakrishna, Gustavo Soares, Ashish Tiwari Multi Turn Interaction Workshop @ NeurIPS'25 | October 2025
Publication The Markovian Thinker Milad Aghajohari, Kamran Chitsaz, Amirhossein Kazemnejad, Sarath Chandar, Alessandro Sordoni, Aaron C. Courville, Siva Reddy ICLR 2026 | October 2025
Publication Flipping the Dialogue: Training and Evaluating User Language Models Tarek Naous, Philippe Laban, Wei Xu, Jennifer Neville ICLR 2026 | October 2025