Reasoning about human & LLM reasoning
LLMs don't 'think' like humans do, and it's a mistake to assume that they do
This post is going to be a bit more abstract and theoretical than my usual fare. We’re going to explore a common error I see people make when they try to understand why an LLM like ChatGPT does a certain thing. People frequently try to analogize an LLM’s output to a person’s thinking, and they make erroneous inferences based on the assumption that an LLM reasons as a human reasons.
Let’s make this more concrete. Consider the following prompt: Explain why analogizing LLM reasoning to human reasoning is a flawed analogy.
Before we review its output, an important note: in a couple of the paragraphs that follow, GPT-4 appears to imply that it is human. When describing humans’ ability to think, it writes “we”. I would conclude that this is not some earth-shattering indication of incipient AGI or consciousness but, rather, an example of how LLM reasoning differs from human reasoning. ChatGPT doesn’t understand the difference between “we” and “they” (or any other set of pronouns). It is merely making a statistical choice: “we” precedes “can” about as often as “they” precedes “can”.
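To make that “statistical choice” a bit more concrete, here is a minimal sketch using the open GPT-2 model from Hugging Face’s transformers library as a stand-in (GPT-4’s internals aren’t publicly inspectable, and the prompt sentence below is my own, chosen only for illustration). It asks the model how probable “ We” and “ They” each are as the next token after a sentence about humans:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

# A sentence of the kind GPT-4 was writing; the wording here is my own.
prompt = "Humans possess consciousness and self-awareness."
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Probability distribution over the vocabulary for the token that would come next.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)

# Compare the probability assigned to " We" vs. " They" as that next token.
for word in [" We", " They"]:
    token_id = tokenizer.encode(word)[0]
    print(f"P({word!r}) = {next_token_probs[token_id].item():.4f}")
```

Whichever pronoun comes out ahead for this particular sentence, the mechanics are the same: the model is scoring candidate tokens by probability, not deciding whether it belongs to the group the pronoun names.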
When you read a human say of their own thinking “I do…” or “We do…”, you are reading an acknowledgment of the human’s humanness. It does not follow from the output quoted here that ChatGPT doing the same thing represents a claim that it is human, that it is conscious, or that it is intelligent in the way that humans are intelligent.
Here’s its output:
Analogizing Large Language Model (LLM) reasoning, such as GPT-4, to human reasoning can indeed be a flawed analogy for several reasons. Here are some key differences between human reasoning and LLM reasoning:
Understanding and Meaning: Humans, when reasoning, are capable of understanding context, nuance, and the meaning behind the words. They not only process the literal meaning but also understand metaphorical or idiomatic expressions based on context. On the other hand, LLMs don’t “understand” text in the same way humans do. They generate responses based on patterns they have learned during their training, but they don’t comprehend the semantics or implications behind these patterns.
Consciousness and Self-Awareness: Humans possess consciousness and self-awareness, which play a critical role in our reasoning process. We can introspect, analyze our thoughts, and make decisions based on our conscious understanding. In contrast, LLMs don’t have consciousness or self-awareness. They can’t reflect on their “thoughts” or make decisions based on conscious thought.
Experience and Learning: Human reasoning is influenced by personal experiences and continuous learning. We have the ability to learn from each situation we encounter, adapt our reasoning over time, and incorporate new knowledge dynamically into our thought processes. However, LLMs do not have personal experiences or the ability to learn from new data after they have been trained. Their reasoning is confined to the data they were trained on, and they cannot adapt or learn from user interactions post-training.
Emotions and Empathy: A lot of human reasoning is affected by emotions and empathy. We can understand and respond to emotional states, both our own and others’, which often guides our decision-making process. LLMs, on the other hand, lack the ability to feel or understand emotions. Any “empathy” displayed by an LLM is a mimicry of patterns it has learned, rather than a genuine emotional understanding.
Ethics and Morality: Humans have a sense of ethics and morality, which play a crucial role in our reasoning. We consider moral values, societal norms, and ethical implications when making decisions. LLMs, while they can be programmed to avoid certain responses, do not have a moral or ethical compass and cannot genuinely contemplate the ethical implications of their outputs.
Therefore, while LLMs can mimic some aspects of human reasoning to an impressive degree, there are fundamental differences that make the analogy between the two somewhat flawed. LLMs are tools designed to assist with information and tasks, but they don’t reason or think in the way humans do.
If a human described dogs’ behavior by writing “We like to smell the ground for the scent of other animals, including other dogs,” you would find that construction to be strange. You would expect something different: “Dogs like to smell the ground for the scent of other animals, including other dogs.” Or possibly “They like to smell the ground for the scent of other animals…” The point here is that you expect a human writing about dogs to use pronouns other than “I”, “we” or “our”. And yet, GPT-4 above uses these pronouns when describing human reasoning. This is a perfect example of how LLM reasoning is not at all like human reasoning. Assuming that these two styles of reasoning have much in common leads us astray, and inhibits our understanding of what LLMs are doing when they generate output from our prompts.
I love that we set out to create a machine that could think, but now we're just trying to figure out what "thinking" really is.