From Flat Facts to Sharp Hallucinations: Detecting Stubborn Errors via Gradient Sensitivity
arXiv:2605.00939v1 Announce Type: new
Abstract: Traditional hallucination detection fails on "Stubborn Hallucinations" -- errors where LLMs are confidently wrong. We propose a geometric solution: Embedding-Perturbed Gradient Sensitivity (EPGS). We hypothesize that while robust facts reside in flat minima, stubborn hallucinations sit in sharp minima, supported by brittle memorization. EPGS detects this sharpness by perturbing input embeddings with Gaussian noise and measuring the resulting spike ...
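The flat-versus-sharp intuition can be illustrated with a toy sketch. The abstract is cut off before it names the quantity whose spike is measured, so the code below assumes it is the gradient norm; that choice, the quadratic stand-in losses, and every name here (`gradient_sensitivity`, `quadratic_grad`, `sigma`, `n_samples`) are illustrative assumptions, not the paper's method or API.

```python
import math
import random


def l2_norm(vec):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(x * x for x in vec))


def gradient_sensitivity(grad_fn, embedding, sigma=0.01, n_samples=32, seed=0):
    """Average rise in gradient norm under Gaussian noise on the embedding.

    Assumption: the score is the mean perturbed gradient norm minus the
    unperturbed gradient norm. The source does not spell out its metric.
    """
    rng = random.Random(seed)
    base_norm = l2_norm(grad_fn(embedding))
    total = 0.0
    for _ in range(n_samples):
        # Perturb every embedding coordinate with N(0, sigma^2) noise.
        noisy = [x + rng.gauss(0.0, sigma) for x in embedding]
        total += l2_norm(grad_fn(noisy))
    return total / n_samples - base_norm


def quadratic_grad(center, curvature):
    """Gradient of the toy loss 0.5 * curvature * ||e - center||^2."""
    def grad(e):
        return [curvature * (x - c) for x, c in zip(e, center)]
    return grad


# Both toy losses have their minimum at `center`; only the curvature differs.
center = [0.0] * 16
flat_score = gradient_sensitivity(quadratic_grad(center, 1.0), center)
sharp_score = gradient_sensitivity(quadratic_grad(center, 100.0), center)
print(flat_score, sharp_score)
```

In this toy setting the high-curvature ("sharp") minimum produces a much larger score than the low-curvature ("flat") one for the same noise, which is the contrast the hypothesis relies on. A real detector would replace `quadratic_grad` with gradients taken through an actual language model with respect to its input embeddings.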