Improving the Post-hoc Calibration of Modern Neural Networks with Probe Scaling
We present "probe scaling": a post-hoc recipe for calibrating the predictions of modern neural networks. Our recipe is inspired by several lines of work demonstrating that early layers of a neural network learn general rules whereas later layers specialize. We show how such observations can be exploited post hoc to calibrate the predictions of trained neural networks by injecting linear probes on the network's intermediate representations. Like temperature scaling, probe scaling requires neither retraining the network nor significantly more parameters. Unlike temperature scaling, however, it utilizes the network's intermediate layers. We demonstrate that probe scaling improves on temperature scaling on benchmark datasets across all five metrics: expected calibration error (ECE), negative log-likelihood, Brier score, classification accuracy, and area under the ROC curve.