Calibration without Ground Truth

Yuqing Kong; Mingyu Song; Yizhou Wang; Yifan Wu

arXiv | January 2026, Vol abs/2601.19862

Villalobos et al. [2024] predict that publicly available human text will be exhausted within the next decade. Thus, improving models without access to ground-truth labels becomes increasingly important. We propose a label-free post-processing framework that improves a strong but miscalibrated model using a weaker yet better-calibrated reference. Our framework guarantees a strict performance improvement under any proper loss. Our approach is based on a characterization of when strict improvement is possible: when the strong and reference models are not mutually calibrated. We formalize this condition, connect it to arbitrage and no-trade results from economics, and develop an efficient Bregman projection algorithm that guarantees worst-case loss reduction without labels. Experiments on representative LLMs across varying scales demonstrate that our label-free method significantly reduces proper losses and calibration errors, achieving performance competitive with supervised baselines.
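
As background for the terms "proper loss" and "Bregman projection" used above, the following minimal sketch checks a standard fact for binary outcomes: the excess expected loss of reporting p when the true probability is q is a Bregman divergence, namely the squared distance for the Brier score and the KL divergence for log loss. This is not the paper's algorithm, and all function names here are hypothetical, chosen only for illustration.

```python
import math

# Background illustration only; these helpers are hypothetical and are not
# taken from the paper. Outcomes are binary, y ~ Bernoulli(q).

def expected_brier(p, q):
    """Expected Brier score of forecast p when the true probability is q."""
    return q * (1.0 - p) ** 2 + (1.0 - q) * p ** 2

def expected_log_loss(p, q):
    """Expected log loss of forecast p when the true probability is q."""
    return -(q * math.log(p) + (1.0 - q) * math.log(1.0 - p))

def brier_regret(p, q):
    """Excess expected Brier score of reporting p instead of q."""
    return expected_brier(p, q) - expected_brier(q, q)

def log_regret(p, q):
    """Excess expected log loss of reporting p instead of q."""
    return expected_log_loss(p, q) - expected_log_loss(q, q)

def kl_bernoulli(q, p):
    """KL divergence between Bernoulli(q) and Bernoulli(p)."""
    return q * math.log(q / p) + (1.0 - q) * math.log((1.0 - q) / (1.0 - p))

if __name__ == "__main__":
    p, q = 0.9, 0.7
    # Brier regret matches the squared distance (p - q)^2.
    print(brier_regret(p, q), (p - q) ** 2)
    # Log-loss regret matches KL(Bernoulli(q) || Bernoulli(p)).
    print(log_regret(p, q), kl_bernoulli(q, p))
```

Both regrets are zero only at p = q, which is what makes these losses strictly proper: moving a forecast closer to the truth in the matching divergence lowers its expected loss.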