Why “Hallucination”? Examining the History, and Stakes, of How We Label AI’s Undesirable Output

Joshua Pearson examines the history of the term “hallucination” in the development and promotion of AI technology.

NAVIGATING THE DISCOURSE around “artificial intelligence” is infuriating, amusing, and exhausting by turns. In this rhetorical space, the most distorting aspects of our contemporary information system converge: the myopia of STEM writing, the vacuity of business lingo, and the hype-chasing prevalent in journalism, all compounded by the seemingly universal tendency to borrow the most sensationalizing examples science fiction has to offer.

Many scholars and (more responsible) journalists have pushed back against the term “artificial intelligence” itself, demonstrating repeatedly that AI is a mystifying umbrella term giving false cohesion to a wide variety of computational techniques. Their work establishes that it is vital, first, to distinguish between the still science-fictional concept of “general AI” (or artificial general intelligence, a.k.a. AGI) and the realities of currently existing “narrow AIs” and, second, to identify the specific technology being discussed (machine learning, natural language processing, large language models, etc.). The presence or absence of such disambiguation is an easy smell test you can apply to any discussion of AI you encounter. Another red flag in AI commentary is recourse to “not if, but when” teleology, which displaces discussion of what AI actually does in the present into speculation about what its perfected future form might do. These promised futures obscure the specific use cases, limitations, and adverse impacts of existing tech.

The mystifying element of AI discourse that I focus on here is the pervasive use of the term “hallucination” to describe output from generative AI that does not match the user’s desires or expectations, as when large language models confidently report inaccurate or invented material. Many people have noted that “hallucination” is an odd choice of label for this kind of inaccurate output. This wildly evocative term pulls associated concepts of cognition, perception, intentionality, and consciousness into our attempt to understand LLMs and their products, making a murky subject even harder to navigate. As Carl T. Bergstrom and Brandon Ogbunu argue, not only does the term’s analogy with human consciousness invite a general mystification and romanticization of AI tech, but it also specifically blurs the crucial distinction between existing “narrow” AI and the SF fantasy of “general AI.” Every time we use the term “hallucination,” it is harder to remember that Clippy is a better mental model for existing AI than Skynet is.

The other important effect of using the term “hallucination” to mark undesirable outputs, and the one I focus on here, is the way it frames such outcomes as an exception to AI’s general ability to both recognize and report “real” information about the world. “Hallucination,” as Bergstrom and Ogbunu argue, implies that the AI accidentally reported something unreal as if it were real. “Hallucination” is a way to frame an acknowledgment that AI output isn’t totally trustworthy while emphasizing the idea that its output is still a generally accurate reporting of reality. This exculpatory function of “hallucinate” as the label of choice is made more apparent when we consider the alternate term that Bergstrom and Ogbunu propose: “Bullshit.”

On one hand, we must recognize how this proposal draws on profanity’s power to cut through mystifications and abstractions of all kinds. But the rigorous definition of BS that Bergstrom (with co-author Jevin D. West) provides in Calling Bullshit: The Art of Skepticism in a Data-Driven World (2020) goes further by breaking through the false framing of truth/hallucination, because the nature of bullshit lies in its indifference to truth. Bullshit doesn’t seek to be right; it seeks to sound right, and this imperative of compelling resemblance maps better onto the actual processes of generative AI models than hallucination does. Using bullshit rather than hallucination as our operative term reminds us that everything chatbots output is an attempt to look and sound right, a compelling resemblance that may or may not correspond to the true state of things. What hallucination frames as the exception, bullshit reveals as the nature of the beast.

This juxtaposition with bullshit emphasizes the mystifying qualities of “hallucination,” sharpening our interrogation of why a term that actively works to make generative AI harder to understand became the industry standard. If you are a melancholic Marxist like me, your gut is telling you that the answer is ultimately reducible to some kind of bad-faith capitalist fuckery. Naomi Klein gives a clear voice to such suspicions in her excellent article “AI Machines Aren’t ‘Hallucinating.’ But Their Makers Are.” She argues that “hallucinate” is “the term that architects and boosters of generative AI have settled on to characterize […] the fallibility of their machines, [while] simultaneously feeding the sector’s most cherished mythology.” Like Bergstrom and Ogbunu, Klein highlights the intense usefulness of the term “hallucinate” for the implicit goal of tech discourse: the alchemical transformation of inquiry into hype, critical thought into speculative valuation.

I found Klein’s approach so satisfying that I concluded confirmation bias had to be involved. As with most lefty critiques of “hallucination” as a term that mystifies more than it reveals, Klein’s framing contains a problematic implicit assumption about the term’s origin and purpose, an inferred narrative in the form of a conspiracy theory: Because the term has such utility in mystifying and mythologizing generative AI, that utility must have been a key factor in the adoption of that term during the development of the technology.

Assuming bad faith was intentionally baked into the term from the start indulges my distrust of those promoting AI tech today. However, it also demands an impressive level of both reflective awareness and cynical planning on the part of the researchers who developed the underlying technology. Certainly, “hallucination” has been a boon for hucksters during AI’s star turn in the hype cycle. But was the term invented for this purpose, or are hucksters just exploiting possibilities within a tool that lies ready to hand?

To find out, we would need to ask specific questions: When did this term actually originate in computing discourse? What use cases did the term “hallucination” originally have, and how did its originators frame “hallucination” in relation to the products of early generative network models? Can we detect precursors to the marketing-focused usage at these origin points? Or, more likely, can we see how these early uses of the term sow the seeds for the ambiguity, contradiction, and misrecognition in “hallucination” that facilitate tech-bro hype today?

Despite the staggering volume of “content” out there about generative AI, there have been surprisingly few attempts to answer these questions. Those that identify different points of emergence are scarce, and their discussion of the rhetorical framing and theoretical implications of the term in these original sources is cursory at best, largely projecting the current boosterish understanding of the term backward onto the historical sources.

A good example is the attempt by Ben Zimmer, The Wall Street Journal’s resident linguist. In asking how “a word used for illusory human perceptions g[o]t applied to computer-synthesized responses,” Zimmer points a very incriminating finger indeed: “In a 2015 blog post, Andrej Karpathy, a founding member of OpenAI, wrote about how models can ‘hallucinate’ text responses, like making up plausible URLs and mathematical proofs. The term was picked up in a 2018 conference paper by AI researchers working with Google.” Zimmer is not alone in pointing to the 2015 post or the 2018 paper, and you can see why. If “hallucinate” was coined by a founder of OpenAI, whose business model relies on hype-driven venture capital investment, and was developed by Google, which has so visibly abandoned its pledge “not to be evil,” then that would seem to confirm the assumption that coining “hallucinate” was a conscious act of mystification. Key sections from the 2018 paper can be read in that light, as when the authors warn that “even if hallucinations occur only occasionally, the NMT model may lose user trust and/or lead the user to a false sense of confidence in a very incorrect translation.” For these researchers, it seems, the negative impact on the business model of bad optics regarding AI’s bullshit output is at least as concerning as the hazard to the end user. Maybe more, as the threat to “user trust” is listed first. So, has Zimmer found the smoking gun?

Unfortunately for Zimmer (and the many others who focused on these high-profile sources), this is not the origin point of AI hallucination. Computer scientists coined and recoined the term decades before this. The oldest source of the term commonly referenced in popular journalism, and in much recent scholarly work, is Simon Baker and Takeo Kanade’s presentation “Hallucinating Faces” at the IEEE International Conference on Automatic Face and Gesture Recognition in 2000 (based on their 1999 paper of the same name). But the term goes back further than this. Bergstrom and Ogbunu identify the oldest sources on “hallucinating” algorithms I have seen referenced: John Irving Tait’s 1982 technical report “Automatic Summarising of English Texts,” and Eric Mjolsness’s 1985 thesis “Neural Networks, Pattern Recognition, and Fingerprint Hallucination.” But they don’t actually dig into Tait and Mjolsness’s understanding of the term and how it tracks with usage today.

Working my way through these and other early papers and technical reports, I found two very different genealogies of the term “hallucination” in text and image processing. Let’s start with the text branch. Describing the limitations of an early text-parsing technique called FRUMP, Tait writes, “I have called this problem the hallucination of matches because in effect what FRUMP has done is to see in the incoming text a text which fits its expectations, regardless of what the input text actually says” (my emphasis). We can see already, in 1982, a familiar framing of problems on the text-focused branch: bogus output doesn’t indicate AI deception but rather misapprehension, a “difficulty processing the text” that is figured as a failure of vision. What we don’t see, though, is today’s framing of “hallucinated matches” as exceptional—bugs in an otherwise perfectible system. Indeed, Tait writes:

I cannot see that a system can be called robust when it produces complete misanalyses and does so without leaving any indication that it has had difficulty processing the text. It would probably be better if the system produced no analysis at all and indicated that it could not process the text.

Like many AI skeptics today, Tait locates the problem not in the faulty results but in the way the system is designed to make those outputs indistinguishable from accurate ones. While the text-processing side of the genealogy of “hallucination” does provide useful framing for today’s AI boosters, there is also more skepticism and concern for the end user than modern commentators might expect.

Results from the image-processing side are even more surprising. Mjolsness’s 1985 thesis proposes a system for “cleaning up” images of fingerprints to assist automated matching using a two-stage approach:

[T]he approach used will be first to obtain a network which produces plausible clean fingerprint-like patterns from random input, and then modify it so that real fingerprint patterns as input produce “nearby” but clean (noiseless) fingerprint patterns as output. […] The first stage, the production of clean fingerprint-like patterns from random input, is referred to as “hallucination.”

Here, Mjolsness’s logic in choosing “hallucination” is clear, because the explicit task of the system is to impose a desired pattern onto random data, literally seeing something that isn’t there. While I couldn’t find a direct citation of Mjolsness in Baker and Kanade’s 1999 paper, we find there a similar approach, and a similar framing of “hallucination.” Their work focuses on image resolution enhancement algorithms (think of the classic Blade Runner “enhance” scene). When Baker and Kanade discuss “our hallucination algorithm,” it is not just a label for mistaken or unwanted output. Instead, their description characterizes both the process and the entire output of their system. When extrapolating from a low-pixel image of a human face to a higher-pixel version, “[t]he additional pixels are, in effect, hallucinated” by their algorithm. It performs well when “enhancing” images of faces, no matter how degraded, precisely because it distorts all inputs into face-like patterns in a manner that seems hallucinatory.

In this genealogy of usage within image processing and generation, “hallucination” describes technologies that are, as Bergstrom and Ogbunu emphasized, characterized by their indifference to truth. The compelling resemblances that they produce may be more or less useful, but all are equally “hallucinated.” This usage of “hallucination,” encompassing process and product, is still current in the specific domain of facial recognition and reconstruction but doesn’t seem to cross over to either the specialist discourses of text processing or the popular discourses shaped primarily by chatbot boosters (though this may change as LLM developers come to terms with the fact that “hallucination is inevitable”).

Examining the real history of AI’s “architects” coining and recoining the term “hallucination,” then, shows that they left multiple tools to hand within the term—alternatives to those elevated by the hype cycle today. While the technical details of image processing and text generation performed by LLMs differ in important ways, I believe that the genealogy of “hallucination” developed in image processing still offers a better, more accurate, and more useful public-facing model for understanding all forms of AI output than the text processing–derived usage that currently dominates popular discourse. It might be too late to withdraw the election of “hallucination” to term du jour in the public discourse, but we may still be able to draw on Mjolsness, Baker, and Kanade’s version of the term to recast AI’s output as the bullshit it truly is.


Featured image: László Moholy-Nagy. Circle and Bar, 1921–23. Gift of Collection Société Anonyme. Yale University Art Gallery (1941.578). CC0, artgallery.yale.edu. Accessed January 18, 2024. Image has been cropped.

LARB Contributor

Dr. Joshua Pearson teaches courses in science fiction, media, and cultural studies at California State University, Los Angeles. His scholarship explores the intersections of economics, identity, and social agency in science fiction.

