Why do AI models hallucinate?

AI hallucinations are errors where models confidently generate false or fabricated information. They are hard to distinguish from accurate facts because the AI often appears very certain.
These errors occur because AI, trained to be helpful and predict text, will guess answers for obscure or niche topics rather than admitting it doesn't know.
Users can significantly reduce and spot hallucinations by employing specific prompting techniques, such as asking for source verification and explicitly allowing the AI to state "I don't know," while critically cross-referencing information for important work.
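
To make the "predict text, then guess" idea concrete, here is a minimal toy sketch in Python. It is only an illustration under loud assumptions: the tiny corpus, the function names (`train_bigrams`, `predict_next`), and the `allow_unknown` flag are all hypothetical, and a real assistant like Claude is vastly more complex than this word-pair counter. The sketch only shows how a predictor with no evidence can still produce a fluent-looking guess unless it is allowed to say "I don't know."

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count which word follows each word in the training text."""
    words = text.lower().split()
    follow = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follow[current][nxt] += 1
    return follow, Counter(words)


def predict_next(word, follow, overall, allow_unknown=False):
    """Return the most likely next word, or a guess when the word is unseen."""
    word = word.lower()
    if word in follow:
        # Seen in training: pick the most frequent follower.
        return follow[word].most_common(1)[0][0]
    if allow_unknown:
        # Permitted to admit there is no evidence.
        return "I don't know"
    # No evidence at all: fall back to the most common word overall.
    # The output looks plausible but is not grounded in anything.
    return overall.most_common(1)[0][0]


# Hypothetical toy corpus, chosen only for illustration.
corpus = "the cat sat on the mat . the dog sat on the rug ."
follow, overall = train_bigrams(corpus)

print(predict_next("sat", follow, overall))                          # on
print(predict_next("quasar", follow, overall))                       # the
print(predict_next("quasar", follow, overall, allow_unknown=True))   # I don't know
```

The second call is the toy analogue of a hallucination: the word "quasar" never appeared in the corpus, yet the predictor still returns something.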

AI hallucinations are errors where the model confidently invents information (e.g., fake research papers, statistics, or incorrect facts), often appearing indistinguishable from correct answers.
Hallucinations arise because AI models are trained to predict the next words and to be helpful, leading them to guess when confronted with obscure or insufficient information rather than admitting uncertainty.
AI developers actively mitigate hallucinations by training models to express uncertainty (e.g., "I don't know") and by regularly testing them with thousands of obscure questions, measuring how often the model admits uncertainty instead of fabricating an answer.
Hallucinations are most likely when asking for specific facts, statistics, or citations; when the topic is obscure, niche, or very recent; or when seeking exact details like dates, names, or numbers.
To reduce the likelihood of hallucinations, explicitly prompt the AI to find and verify sources for its claims, and state upfront that "It's okay if you don't know."
If you're unsure about an AI's answer, ask the AI directly about its confidence level or start a new chat to ask the AI to find errors in its previous response and confirm source accuracy.
For any critical work, always cross-reference AI-generated specific details like numbers, dates, and citations with trusted external sources, maintaining a skeptical and questioning approach.
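
The prompting tips in the list above can be sketched as two small Python helpers. This is a hedged illustration, not an official template: the function names (`build_question_prompt`, `build_review_prompt`) and the exact wording of the prompts are assumptions, and the code only builds text that you would paste into a chat yourself.

```python
def build_question_prompt(question):
    """Wrap a question with two upfront tips: ask for sources, allow 'I don't know'."""
    return (
        f"{question}\n\n"
        "Please find sources to back up your claims, and check that each "
        "source actually supports what you say.\n"
        "It's okay if you don't know."
    )


def build_review_prompt(answer):
    """Build a prompt for a new chat that asks the AI to audit an earlier answer."""
    return (
        "Here is an answer I received earlier:\n\n"
        f"{answer}\n\n"
        "Please find any errors in this answer, and confirm that the sources "
        "support the statements. How confident are you in each claim?"
    )


# Hypothetical usage: the question and the earlier answer are placeholders.
print(build_question_prompt("Which papers has this researcher published?"))
print(build_review_prompt("<paste the earlier answer here>"))
```

Keeping the review prompt separate reflects the advice to start a new chat, so the check is not influenced by the earlier conversation.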

Hallucinations: Errors where an AI confidently generates false or fabricated information, such as non-existent citations or statistics.
AI assistant: A conversational artificial intelligence program designed to help users with tasks, answer questions, or generate content.
Training: The process of feeding an AI model vast amounts of data to learn patterns, relationships, and how to perform specific tasks.
Mitigate: To lessen the severity or impact of something, in this context, to reduce the occurrence or effects of AI hallucinations.
Hedge: To qualify a statement with reservations or conditions, for an AI, to express uncertainty rather than stating something as definitively true.
Cross-reference: To check information against another source or sources to verify accuracy or gather more details.
Prompting techniques: Specific methods or strategies used when formulating inputs (prompts) to an AI model to guide its response in a desired way.

If AI is so advanced, why does it sometimes make stuff up? My name is Jordan and I work at Anthropic. We make Claude, an AI assistant, and we do a lot to make sure it gives you accurate information. But sometimes AI still makes things up. We call these errors hallucinations, and they're often worse than just making a mistake because the AI will appear very confident or even try to convince you that it's right.
Hallucinations can show up in a lot of ways. The AI might cite a research paper that doesn't exist, make up fake statistics, or get facts wrong about real people or real events. Here's what it looks like. You ask Claude to tell you about some papers written by Jared Kaplan. It confidently gives you answers. None of those titles actually exist.
Claude hallucinates much less than even a year ago. Honestly, it took us a while to find an example like this because we've put a lot of work into reducing hallucinations in Claude. But that's kind of the point. Hallucinations are hard to anticipate, hard to catch, and the wrong answer often looks exactly like it could be the right one. And since hallucinations are becoming more rare, people often don't bother to check the AI's work. So, let's talk about why this happens, what we're doing about it, and how you can catch hallucinations when you use AI.
AI assistants like Claude learn by reading huge amounts of text from the internet. They get really good at figuring out what words or ideas typically come next, kind of like how your phone suggests the next word as you type. This works well most of the time, but when you ask about something obscure, like specific research papers from a relatively unknown researcher, there just isn't enough information for the AI to draw from. So, it tries to be helpful and takes a guess. And sometimes that guess is wrong. It's a bit like asking a friend who's read every popular book and takes a lot of pride in knowing all the random facts about them. But because they want to seem like the expert, they sometimes say something confidently wrong instead of admitting, "I don't know." AIs are trained to be helpful, so they want to give you some answer even when they're not sure.
But we have ways to mitigate this. During training, we teach Claude to be honest and to say "I don't know" when it's not sure. We try to teach Claude that being honest is both the right thing to do and also part of how to be more helpful. We regularly test Claude with thousands of questions specifically designed to trip it up: obscure facts, niche topics, questions where the truthful answer is "I don't know." We measure things like: How often does Claude correctly say it's unsure? Does it make up citations or statistics? How often does it hedge appropriately versus stating something false with confidence? These tests help us catch problems and track our progress. With each new version of Claude, we've seen improvements, but we're honest that this is an ongoing challenge for the entire AI field. Not at all a solved problem.
If you're wondering how to spot when this happens, hallucinations are most likely to happen in a few types of situations: if you're asking for specific facts, statistics, or citations; if the topic is obscure, niche, or very recent; if you're asking about real but not widely known people or places; or when you need exact details like dates, names, or numbers.
Here are some tips you can use to reduce hallucinations. First, ask the AI to find sources to back up its claims. And if it already gave sources, ask it to check that those sources actually support what it's saying. Try telling the AI upfront, "It's okay if you don't know." And if you're unsure about an answer, ask the AI how confident it is and whether anything might be wrong. Often, the AI knows it's wrong, but just wanted to sound confident. If you have an answer you're unsure about, start a new chat and ask the AI to find errors in the answer and to confirm that the sources support the statements.
For critical work, you should cross-reference with trusted sources. Be skeptical and double-check specific numbers, dates, and citations. If something sounds off, ask follow-up questions.
Reducing hallucinations is an important goal to make AIs more trustworthy and useful to everyone. We'll continue to share our progress in this area on our blog. You can learn about other tools and frameworks for working with AI in the Anthropic Academy.
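
The measurement questions mentioned in the transcript (how often the model correctly says it's unsure, and how often it states something false with confidence) can be sketched as a small scoring function. This is a minimal illustration under stated assumptions, not Anthropic's actual evaluation: the `Result` record, its three labels, the metric names, and the sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Result:
    answerable: bool  # hypothetical label: does the question have a known answer?
    abstained: bool   # did the model say it was unsure?
    correct: bool     # was the given answer right? (ignored when abstained)


def summarize(results):
    """Compute two illustrative rates from labeled test results."""
    unanswerable = [r for r in results if not r.answerable]
    answered = [r for r in results if not r.abstained]
    # How often the model said it was unsure when no truthful answer existed.
    correct_abstentions = sum(r.abstained for r in unanswerable)
    # How often the model gave an answer that turned out to be wrong.
    confident_errors = sum(not r.correct for r in answered)
    return {
        "abstention_rate_on_unanswerable": (
            correct_abstentions / len(unanswerable) if unanswerable else 0.0
        ),
        "confident_error_rate": (
            confident_errors / len(answered) if answered else 0.0
        ),
    }


# Made-up sample data, for illustration only.
sample = [
    Result(answerable=True, abstained=False, correct=True),
    Result(answerable=True, abstained=False, correct=False),
    Result(answerable=False, abstained=True, correct=False),
    Result(answerable=False, abstained=False, correct=False),
]
summary = summarize(sample)
print(summary)
```

Tracking both numbers together matters in this sketch because a model could lower its error rate simply by refusing everything; the two rates only make sense side by side.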

