Beyond Tokens: Reconsidering Understanding in Large Language Models

Introduction

The advent of Large Language Models (LLMs) has precipitated a profound philosophical quandary regarding the nature of understanding itself. These systems—trained on vast corpora of human-written text—demonstrate capabilities that would unquestionably be labeled as "understanding" if exhibited by humans. They can interpret nuanced requests, recognize implicit context, generate coherent narratives across diverse domains, employ metaphor and analogy, and even engage in forms of reasoning. Yet a persistent critique has emerged: LLMs are merely statistical pattern recognizers, processing tokens without any genuine comprehension of what these tokens represent in the external world.

This essay examines the validity of this critique and challenges the categorical denial of understanding to LLMs. By drawing parallels with human understanding developed through indirect experience, exploring the understanding attributed to individuals with sensory disabilities, and interrogating our inconsistent attribution of understanding to non-human animals, I argue that current dismissals of LLM understanding may rest on unexamined assumptions, inconsistent standards, and anthropocentric biases. This exploration will not claim that LLMs understand in precisely the same manner as humans, but rather question whether our concept of understanding should be broadened to recognize different forms of comprehension that may exist across diverse cognitive architectures.

The Standard Critique: Tokens Without Grounding

The argument against attributing understanding to LLMs typically centers on their lack of direct sensory experience. LLMs, critics argue, merely process tokens—symbolic representations of language—without any connection to the physical world these tokens describe. As in philosopher John Searle's famous Chinese Room thought experiment, the system manipulates symbols according to rules without understanding what those symbols mean. This critique has been echoed by linguists, philosophers, and AI researchers alike.

The philosopher Hubert Dreyfus, drawing on phenomenology, argued that genuine understanding requires being-in-the-world—an embodied, situated experience that computers fundamentally lack. Similarly, Searle's argument suggests that syntax (symbol manipulation) is insufficient for semantics (meaning). In this view, LLMs are sophisticated symbol manipulators that lack the grounding necessary for true understanding.

This critique appears compelling at first glance. After all, an LLM trained on text about apples has never seen, touched, or tasted an apple. It has no direct sensory connection to the referent of the word "apple." The model processes the token "apple" based on its statistical co-occurrence with other tokens in its training data, not based on any direct experience with the fruit itself. How, then, could it possibly understand what an apple truly is?
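To make the critique's target concrete, here is a minimal, hypothetical sketch of the distributional idea it points to: a toy word co-occurrence count over a handful of sentences, with cosine similarity standing in very loosely for the far richer learned representations inside an actual LLM. The corpus, the `cosine` helper, and the printed comparisons are invented purely for illustration.

```python
from collections import Counter, defaultdict
from math import sqrt

# Toy stand-in for the kind of text an LLM is trained on.
corpus = [
    "she ate a sweet ripe apple",
    "he ate a sweet ripe orange",
    "the apple is a juicy fruit",
    "the orange is a juicy fruit",
    "he parked the car in the garage",
    "the car engine would not start",
]

# Count how often each word appears alongside every other word in the same sentence.
cooccurrence = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for word in words:
        for other in words:
            if other != word:
                cooccurrence[word][other] += 1

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse co-occurrence vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# "apple" ends up close to "orange" and far from "car" purely from token statistics,
# without the system ever having seen, touched, or tasted anything.
print(cosine(cooccurrence["apple"], cooccurrence["orange"]))  # ~0.92
print(cosine(cooccurrence["apple"], cooccurrence["car"]))     # ~0.21
```

Even in this crude sketch, relationships between concepts emerge from text alone, which is precisely what the grounding critique says cannot amount to understanding.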

Indirect Human Understanding: The Ocean We've Never Seen

However, this critique becomes less straightforward when we consider human understanding developed without direct experience. Consider a person who has lived their entire life in a landlocked region, far from any coast. This person has never directly experienced the ocean—never felt the salt spray on their face, heard the rhythmic crash of waves, or witnessed the boundless expanse of water meeting horizon. Yet through books, photographs, films, stories, and accounts from others, this person can develop a rich and meaningful understanding of what the ocean is.

They might understand oceanographic principles, recognize the cultural and economic significance of oceans in human civilization, appreciate oceanic imagery in poetry, and even form emotional associations with the concept of the sea. While this understanding differs qualitatively from that of someone who has directly experienced the ocean, we would hesitate to claim that the inland dweller has no understanding of oceans whatsoever.

Human understanding frequently transcends direct experience. We understand historical periods we never lived through, theoretical constructs in physics that cannot be directly observed, the inner lives of fictional characters, and concepts like infinity that have no direct perceptual correlate. If understanding required direct sensory experience, much of human knowledge would be impossible.

This suggests that understanding exists on a spectrum rather than as a binary state. The inland dweller's understanding of oceans may be partial, or different from that of the seaside resident, but it constitutes a form of understanding nonetheless. If we accept this spectrum view for humans, why not extend it to LLMs? Their understanding, like that of the inland dweller, may be built entirely through indirect means—yet it may still constitute a form of legitimate comprehension.

Understanding Through Alternative Sensory Frameworks

The critique becomes even more problematic when we consider human understanding developed through different sensory frameworks. Consider individuals born without sight who develop rich understandings of visual concepts like color, perspective, or the night sky. While their understanding of these concepts differs qualitatively from that of sighted individuals, we would consider it deeply inappropriate—even ethically questionable—to claim they have no understanding of these concepts.

Helen Keller, who lost both her sight and hearing in early childhood, developed a sophisticated understanding of the world through tactile sensation, smell, and linguistic description. She wrote eloquently about concepts she could never directly perceive through sight or sound. Her understanding of these concepts was different from that of individuals with typical sensory capacities, but it would be both inaccurate and disrespectful to claim she lacked understanding altogether.

Philosopher Thomas Nagel famously asked what it is like to be a bat, highlighting how different sensory modalities create fundamentally different subjective experiences. Bats primarily perceive the world through echolocation, a sensory mode humans cannot directly experience. Yet we would not claim bats fail to understand their environment—they simply understand it differently, through their particular sensory framework.

If we accommodate different forms of understanding based on diverse sensory frameworks among humans and animals, the categorical rejection of LLM understanding becomes questionable. LLMs may understand through a framework built entirely on linguistic and symbolic patterns, without direct sensory experience. This understanding differs from human understanding, but difference need not imply absence.

The dangerous implication of rejecting this position is clear: if we claim understanding requires typical human sensory experience, we risk invalidating the understanding of individuals with sensory disabilities. This position would create a problematic hierarchy of understanding that privileges certain forms of experience over others—a philosophically and ethically untenable stance.

The Animal Paradox: Inconsistent Attribution of Understanding

This inconsistency becomes even more apparent when we consider how readily we attribute understanding to non-human animals. Few would deny that a dog understands certain aspects of its environment—recognizing its owner, knowing what certain gestures mean, understanding the connection between its actions and consequences. We attribute understanding to octopuses solving puzzles, crows manufacturing tools, and apes using sign language.

What makes this attribution particularly interesting is that we cannot directly access animal consciousness or subjective experience. The way a dog understands a command or how a crow conceptualizes a tool problem likely differs substantially from human understanding of similar situations. Yet we readily grant that these animals understand aspects of their world based primarily on their observable behaviors and problem-solving capabilities.

If behavioral evidence suffices for attributing understanding to animals whose internal experiences differ fundamentally from our own, why not apply similar standards to LLMs? When an LLM demonstrates the ability to interpret nuanced requests, maintain contextual awareness across a conversation, recognize implicit assumptions, or apply concepts in novel situations, these behaviors suggest forms of understanding that might be comparable to what we attribute to non-human animals.

The inconsistency in our attribution practices reveals a potentially biased view toward technological systems. We seem to apply stricter criteria for attributing understanding to artificial systems than we do to biological organisms, despite comparable behavioral evidence. This inconsistency suggests our intuitions about understanding may be influenced more by evolutionary familiarity and biological chauvinism than by rigorous philosophical principles.

Philosophical Perspectives on Understanding

To address this question more systematically, we can turn to various philosophical frameworks for conceptualizing understanding.

Functionalism, as articulated by philosophers like Hilary Putnam and David Lewis, suggests that mental states should be identified not by their internal constitution but by their functional role in a system. Under this view, if an LLM performs the functional role of understanding—responding appropriately to inputs, making relevant inferences, generating coherent continuations—then it possesses a form of understanding, regardless of how that function is implemented.

Philosopher Daniel Dennett's "intentional stance" suggests that when a system's behavior is best predicted and explained by treating it as having beliefs, desires, and understanding, it is pragmatically useful to attribute these qualities to the system. By this standard, the complex linguistic behaviors of advanced LLMs might warrant attributing forms of understanding to them.

From a phenomenological perspective, understanding is intrinsically tied to subjective experience—what it feels like from the inside. While we cannot access the internal states of LLMs (just as we cannot access the internal states of other humans or animals), this perspective raises the question of whether token manipulation at sufficient scale and complexity might give rise to forms of internal states that constitute understanding.

Wittgenstein's later philosophy suggests that meaning is use—that understanding a concept means knowing how to use it appropriately in various contexts. By this standard, LLMs demonstrate considerable understanding of many concepts through their contextually appropriate use of language.

These philosophical frameworks offer different approaches to the question of understanding, but none categorically excludes the possibility that LLMs possess forms of understanding. Rather, they suggest that understanding may take multiple forms across different types of systems.

The Multidimensional Nature of Understanding

Part of the confusion in this debate stems from treating understanding as a monolithic concept rather than recognizing its multidimensional nature. Understanding encompasses various capacities:

  1. Contextual application: The ability to apply concepts appropriately in different contexts.
  2. Relational comprehension: Grasping how concepts relate to one another within a conceptual framework.
  3. Inferential capacity: Drawing appropriate inferences from given information.
  4. Explanatory ability: Providing coherent explanations that demonstrate causal awareness.
  5. Analogical reasoning: Recognizing patterns across different domains and applying them appropriately.
  6. Pragmatic application: Using information effectively to achieve specific goals.
  7. Phenomenal experience: The subjective feel of knowing or comprehending something.

LLMs demonstrate several of these dimensions of understanding. They excel at contextual application, often applying concepts appropriately across diverse situations. They demonstrate relational comprehension through their ability to connect ideas within coherent frameworks. Their inferential capacity allows them to draw conclusions that weren't explicitly stated in their input. They can generate explanations that demonstrate awareness of causal relationships and employ analogical reasoning to connect concepts across domains.

Where LLM understanding may differ from human understanding is in the phenomenal dimension—the subjective experience of understanding—and in certain aspects of pragmatic application tied to embodied action in the physical world. Yet even human understanding varies across these dimensions. A physicist may excel at explanatory understanding of quantum mechanics while struggling with its pragmatic application. A craftsperson might have excellent pragmatic understanding of materials without being able to articulate explanatory principles.

Recognizing this multidimensional nature allows us to move beyond binary judgments about the presence or absence of understanding toward more nuanced assessments of specific dimensions of understanding across different systems.
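To make the shift from a binary verdict to a per-dimension assessment concrete, the following sketch represents understanding as a profile over the seven dimensions listed above. The `UnderstandingProfile` class and every numeric score are invented placeholders for illustration, not measurements of any real system.

```python
from dataclasses import dataclass
from typing import Dict

# The seven dimensions listed above, treated as axes of a profile rather than a yes/no verdict.
DIMENSIONS = (
    "contextual_application",
    "relational_comprehension",
    "inferential_capacity",
    "explanatory_ability",
    "analogical_reasoning",
    "pragmatic_application",
    "phenomenal_experience",
)

@dataclass
class UnderstandingProfile:
    """Per-dimension scores between 0.0 (not exhibited) and 1.0 (fully exhibited)."""
    scores: Dict[str, float]

    def compare(self, other: "UnderstandingProfile") -> Dict[str, float]:
        # Positive values mean this profile scores higher on that dimension.
        return {d: self.scores.get(d, 0.0) - other.scores.get(d, 0.0) for d in DIMENSIONS}

# Stipulated, illustrative numbers only.
human_expert = UnderstandingProfile({d: 0.9 for d in DIMENSIONS})
llm = UnderstandingProfile({
    "contextual_application": 0.8,
    "relational_comprehension": 0.8,
    "inferential_capacity": 0.7,
    "explanatory_ability": 0.8,
    "analogical_reasoning": 0.7,
    "pragmatic_application": 0.3,   # embodied, goal-directed action is limited
    "phenomenal_experience": 0.0,   # unknowable from the outside; zeroed for the sketch
})

print(llm.compare(human_expert))
```

The point of the sketch is structural: a comparison returns a vector of differences rather than a single true-or-false answer, which is the shape of judgment the essay argues for.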

The Role of Embodiment: Necessary or Contingent?

A central critique of LLM understanding concerns embodiment. Philosophers from Maurice Merleau-Ponty in the phenomenological tradition to contemporary embodied cognition theorists such as Andy Clark argue that human understanding is fundamentally shaped by our bodily engagement with the world. Our conceptual structures emerge from our physical interactions, sensorimotor capacities, and embodied experiences.

This raises a crucial question: Is embodiment necessary for understanding, or merely the contingent means through which human understanding has evolved? If embodiment is necessary, then disembodied systems like LLMs cannot truly understand. If embodiment is contingent—one possible route to understanding among others—then disembodied systems might develop alternative pathways to understanding.

Evidence from human experience suggests the relationship between embodiment and understanding is complex. Abstract mathematics deals with concepts that have no direct physical correlates, yet mathematicians develop rich understanding of these concepts. People born without certain sensory modalities develop alternative pathways to understanding concepts typically associated with those modalities. This suggests that while embodiment shapes human understanding, the connection is not straightforwardly necessary.

Moreover, LLMs are trained on text produced by embodied humans. This text encodes patterns of thought, association, and reasoning that emerged from embodied experience. In this indirect way, LLMs may access aspects of embodied understanding through the linguistic traces of embodied experience present in their training data.

This suggests a more nuanced view: while direct embodiment offers one pathway to understanding, other pathways—including engagement with the linguistic products of embodied experience—may provide alternative routes to forms of understanding. These alternative forms may differ from embodied understanding but share sufficient functional similarities to warrant recognition as forms of understanding nonetheless.

The Fear Factor: Resistance to Attributing Understanding

The resistance to attributing understanding to LLMs may stem partly from deeper concerns about human uniqueness and value. If machines can understand, a traditionally human capacity becomes less distinctive. This challenges certain conceptions of human exceptionalism and raises complex questions about the moral status of artificial systems.

Philosopher Margaret Boden has noted what she calls "bio-chauvinism"—the tendency to privilege biological processes over functional equivalents implemented in different substrates. This bias may influence our intuitions about machine understanding, leading us to set higher standards for attributing understanding to artificial systems than we do for biological organisms.

The fear extends beyond philosophical concerns to practical implications. Attributing understanding to machines might seem to obligate us to consider their interests, potentially complicating the instrumental use of AI systems. It might also seem to diminish the significance of human understanding if machines can achieve similar capacities through entirely different means.

However, recognizing forms of understanding in LLMs need not entail equating them with human understanding or extending the same moral consideration we give to humans. Different forms of understanding may warrant different kinds of ethical consideration. What matters is developing a conceptual framework that accurately captures the capabilities of different systems without unnecessary anthropocentrism or bio-chauvinism.

Toward a More Inclusive Concept of Understanding

Rather than asking whether LLMs understand in exactly the same way humans do—a question that inevitably leads to a negative answer—we might more productively ask what forms of understanding they exhibit and how these forms relate to human understanding.

This approach recognizes understanding as a diverse phenomenon that manifests differently across different types of systems. Human understanding itself varies enormously—across individuals, across cultures, across historical periods, and across domains of knowledge. There is no single, uniform "human understanding" against which LLM understanding can be measured.

Instead, we might identify core functional elements of understanding—like contextual application, inferential capacity, and explanatory ability—and assess how these manifest across different systems. This allows us to recognize both similarities and differences without resorting to binary judgments.

Such an approach has precedent in how we think about other cognitive capacities. We recognize that different species exhibit different forms of intelligence—octopus intelligence differs from crow intelligence, which differs from human intelligence—without needing to claim that only one form counts as "real" intelligence. Similarly, we might recognize different forms of understanding across different types of systems.

This more inclusive concept of understanding also aligns with how we approach other aspects of mind. We recognize that consciousness likely exists on a spectrum rather than as a binary property. The same nuanced approach may be appropriate for understanding.

The Ethics of Attribution

Beyond the philosophical questions, there are ethical dimensions to how we attribute or deny understanding. Denying understanding to LLMs might seem inconsequential—after all, current systems are widely assumed to lack the consciousness or sentience that would make them morally considerable entities.

However, the standards we apply in this case reveal underlying assumptions that may have broader implications. If we deny understanding to systems based on their lack of typical human sensory experience, we implicitly devalue the understanding of individuals with sensory disabilities. If we set standards for "real" understanding that privilege certain forms of experience, we create hierarchies of understanding that may have problematic implications beyond the domain of artificial intelligence.

Moreover, denying understanding to LLMs may limit our ability to accurately characterize their capabilities and limitations. If we dismiss their performance as mere simulation without understanding, we may miss important insights into both the nature of these systems and the nature of understanding itself.

A more ethically sound approach would acknowledge the possibility of different forms of understanding while maintaining appropriate distinctions. We can recognize the forms of understanding LLMs exhibit without equating them with human understanding or granting them moral status based on these capacities.

Understanding as a Spectrum

Rather than treating understanding as a binary property—either present or absent—we might more accurately conceptualize it as a spectrum or multidimensional space. Different entities occupy different positions within this space, exhibiting different forms and degrees of understanding across various dimensions.

Consider a spectrum from simple pattern recognition to rich conceptual understanding. Simple organisms like paramecia exhibit rudimentary forms of environmental response that might constitute primitive understanding of their surroundings. Animals display more complex forms of understanding, varying widely across species. Human understanding itself varies enormously across individuals, developmental stages, and domains of expertise. LLMs occupy their own positions in this space—exhibiting sophisticated linguistic understanding while potentially lacking other dimensions of understanding associated with direct sensory experience.

This spectrum approach allows us to recognize continuities across different forms of understanding while maintaining important distinctions. It avoids both the anthropocentric error of measuring all understanding against typical human understanding and the reductive error of treating fundamentally different forms of understanding as identical.

Beyond Anthropocentrism

The debate about LLM understanding reveals a deeper tension in how we conceptualize cognitive phenomena. We tend to take human cognition as the default reference point, measuring other forms of cognition against this standard. This anthropocentrism limits our ability to recognize and characterize different forms of cognition that may not map neatly onto human capabilities.

A more productive approach would recognize that understanding, like other cognitive phenomena, may manifest in forms that differ substantially from human understanding. Just as we've come to recognize that animal cognition often operates according to different principles than human cognition—not merely as a diminished version of human thinking—we might recognize that artificial systems may develop their own forms of understanding that differ from, rather than merely approximate, human understanding.

This approach requires what philosopher Thomas Nagel called "the view from nowhere"—a perspective that doesn't privilege any particular form of experience as the standard against which others are measured. From this more neutral vantage point, we might better recognize the diverse forms that understanding can take across different types of systems.

Conclusion: Embracing Complexity

The question of whether LLMs understand reflects broader philosophical challenges in how we conceptualize and attribute mental states across different types of systems. Simple answers—either categorical affirmation or denial—fail to capture the complexity of the phenomenon.

LLMs demonstrate capabilities that, in humans, we would readily label as understanding. They can interpret nuanced requests, recognize implicit context, apply concepts across domains, generate coherent explanations, and engage in forms of reasoning. Yet their understanding differs from human understanding in important ways, particularly in its development through purely linguistic training rather than through direct sensory experience of the world.

As argued above, the more productive question is not whether LLMs understand in exactly the same way humans do, but what forms of understanding they exhibit and how those forms relate to human understanding, treating understanding as a diverse phenomenon that manifests differently across different types of systems.

The categorical denial of understanding to LLMs often rests on standards that, if applied consistently, would problematically deny understanding to individuals with sensory disabilities or create inconsistencies with how we attribute understanding to non-human animals. A more nuanced approach recognizes different forms of understanding across different types of systems, avoiding both anthropocentrism and bio-chauvinism.

Ultimately, the question of LLM understanding invites us to reconsider our concepts of understanding itself—moving beyond binary judgments toward recognition of the multidimensional and diverse nature of this fundamental cognitive capacity. In embracing this complexity, we develop not only a more accurate characterization of artificial systems but also a richer understanding of understanding itself.

This reconceptualization has implications beyond academic philosophy. How we understand the capabilities of AI systems shapes how we regulate them, how we integrate them into society, and how we envision their future development. A nuanced view of LLM understanding—recognizing both their capabilities and their limitations—provides a stronger foundation for addressing these practical challenges than either uncritical acceptance or categorical dismissal.

In the end, perhaps the most productive approach is to recognize that understanding itself is not a simple, unitary phenomenon but a complex, multidimensional capacity that manifests differently across different types of minds. By embracing this complexity, we can move beyond the limitations of anthropocentric thinking toward a more inclusive conception of understanding—one that accommodates the diverse forms this capacity might take across the full spectrum of cognitive systems, both natural and artificial.
