A Survey on Hallucination in Large Language Models
Authors: Lei Huang, Weijiang Yu, Weitao Ma, Weihong Zhong, Zhangyin Feng, Haotian Wang, Qianglong Chen, Weihua Peng, Xiaocheng Feng, Bing Qin, Ting Liu (Harbin Institute of Technology)
Publisher: ACM TOIS
Canonical URL: https://arxiv.org/abs/2311.05232
Summary
The current canonical survey of hallucination in LLMs. Introduces a taxonomy that distinguishes LLM-era hallucination from prior task-specific NLG hallucination, surveys causes, detection methods, and mitigation strategies, including the limitations of retrieval-augmented generation (RAG) as a fix.
Key Contributions
- Updated taxonomy. Splits hallucination into factuality (the claim is wrong about the world) and faithfulness (the claim is unsupported by provided source material). The two are often conflated, so the distinction matters.
- Causes are multi-layered. Pre-training data quality, training objective, RLHF artifacts, decoding randomness, and prompting all contribute.
- Detection methods. Token-level uncertainty, consistency checks, retrieval-based verification, model-as-judge — each with trade-offs.
- RAG is not a cure. Even with retrieval, LLMs hallucinate when the retrieved context is irrelevant, contradictory, or absent — and they often fail to abstain when retrieval fails.
- Knowledge boundaries. Models do not reliably know what they do not know (compare with Kadavath et al. 2022).
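As a rough illustration of the consistency-check family of detection methods listed above, the sketch below samples several answers to the same prompt and measures how much they agree. It is a minimal stand-in, not the survey's method: the function names, the token-level Jaccard measure, and the 0.5 threshold are all assumptions chosen for brevity, and the samples are assumed to come from some stochastic decoding step outside the snippet.

```python
from itertools import combinations


def _tokens(text: str) -> set[str]:
    """Lowercased whitespace tokens; a deliberately crude normalization."""
    return set(text.lower().split())


def jaccard(a: str, b: str) -> float:
    """Token-set overlap between two answers, in [0, 1]."""
    ta, tb = _tokens(a), _tokens(b)
    if not ta and not tb:
        return 1.0  # two empty answers agree trivially
    return len(ta & tb) / len(ta | tb)


def consistency_score(samples: list[str]) -> float:
    """Mean pairwise Jaccard overlap across sampled answers."""
    pairs = list(combinations(samples, 2))
    if not pairs:
        raise ValueError("need at least two samples")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def flag_inconsistent(samples: list[str], threshold: float = 0.5) -> bool:
    """Flag a prompt as a possible hallucination when samples disagree.

    The threshold is a hypothetical default, not a tuned value.
    """
    return consistency_score(samples) < threshold
```

Lexical overlap is the cheapest possible agreement measure and penalizes paraphrases that mean the same thing; swapping `jaccard` for an entailment or embedding-based comparison would keep the same structure. High agreement also does not imply correctness, since a model can be consistently wrong.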
Relevance to This Wiki
- Primary reference for the Hallucination concept page.
- Faithfulness/factuality split is important for H7 design: epistemic labeling can plausibly improve faithfulness (the model knows what it was given) but cannot fully solve factuality (the model doesn’t always know what’s true).
- RAG-is-not-a-cure finding is critical for any agentic system relying on retrieval as a hallucination guard — SATs may still be needed even when sources are provided.
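One way to picture the abstention point for retrieval-backed systems is a guard that refuses to answer when no retrieved passage looks relevant, instead of passing weak context to the model. The sketch below is an assumption-laden illustration: `answer_or_abstain`, the word-overlap relevance measure, the 0.3 cutoff, and the `generate` callable are all hypothetical, and the guard covers only the absent and irrelevant cases, not contradictory context.

```python
from typing import Callable

ABSTAIN_MESSAGE = "I don't have enough information to answer."


def overlap(question: str, passage: str) -> float:
    """Fraction of question tokens that also appear in the passage."""
    q = set(question.lower().split())
    p = set(passage.lower().split())
    return len(q & p) / len(q) if q else 0.0


def select_context(
    question: str, passages: list[str], min_overlap: float = 0.3
) -> list[str]:
    """Keep only passages that clear the (hypothetical) relevance cutoff."""
    return [p for p in passages if overlap(question, p) >= min_overlap]


def answer_or_abstain(
    question: str,
    passages: list[str],
    generate: Callable[[str, list[str]], str],
    min_overlap: float = 0.3,
) -> str:
    """Call the generator only when some relevant context survives filtering.

    `generate` stands in for whatever model call the system uses.
    """
    context = select_context(question, passages, min_overlap)
    if not context:
        return ABSTAIN_MESSAGE  # retrieval failed: abstain rather than guess
    return generate(question, context)
```

For example, `answer_or_abstain("who wrote hamlet", [], my_model)` returns the abstention message without calling the hypothetical `my_model` at all. A production guard would likely use a learned relevance score, but the control flow stays the same.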
See Also
- Kadavath et al. — Models Mostly Know What They Know (2022) — counterpoint on model self-knowledge
- Hallucination — concept page
- Ji et al. (2023) Survey of Hallucination in NLG (ACM Computing Surveys, arXiv:2202.03629) — predecessor survey