OpenAI’s o3 and o4-mini: Hallucination Rates Surge 2–3x

In April 2025, OpenAI’s release of its latest reasoning models, o3 and o4-mini, was met with both excitement and concern. While the models promised state-of-the-art performance on reasoning, coding, and multimodal tasks, internal and third-party evaluations revealed a startling issue: hallucination rates were two to three times those of previous models. OpenAI has openly acknowledged that it cannot fully explain this regression, raising critical questions about the reliability, safety, and future direction of advanced AI reasoning systems.
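
To make the "2–3x" comparison concrete, hallucination rates in QA-style evaluations are typically reported as the fraction of answers a grader flags as containing a fabricated claim. The sketch below illustrates that calculation; the data format, field names, and toy numbers are assumptions for illustration and do not reflect OpenAI's actual evaluation pipeline.

```python
# Minimal sketch of a hallucination-rate calculation for a QA-style eval.
# The record format and grading signal are assumed for illustration only.
from dataclasses import dataclass


@dataclass
class GradedAnswer:
    question: str
    model_answer: str
    contains_fabrication: bool  # judged by a human or automated grader


def hallucination_rate(graded: list[GradedAnswer]) -> float:
    """Fraction of graded answers flagged as containing a fabricated claim."""
    if not graded:
        return 0.0
    return sum(a.contains_fabrication for a in graded) / len(graded)


# Hypothetical comparison of two models on the same question set:
older_model = [GradedAnswer("q1", "...", False), GradedAnswer("q2", "...", True)]
newer_model = [GradedAnswer("q1", "...", True), GradedAnswer("q2", "...", True)]
print(f"older: {hallucination_rate(older_model):.0%}, newer: {hallucination_rate(newer_model):.0%}")
```

A doubling or tripling of this metric between model generations, on the same question set, is the kind of regression the evaluations described above reported.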