Think Again - Why ChatGPT Isn't Truly Intelligent


Artificial intelligence: it seems like everywhere I go, people are talking about it. It's Silicon Valley's new gold rush. Tech giants like Meta, Google, Nvidia, and Microsoft are pouring billions into large language models (LLMs), massive computer systems that can write essays, code apps, and generate art. Each week brings a new headline: a $100 billion deal between OpenAI and Nvidia, a $300 billion deal between Oracle and OpenAI, billions and billions of dollars invested in technology that many don't truly understand. Those interested in finance may see parallels between the current state of artificial intelligence and the dot-com bubble of the late 1990s and early 2000s. Yet for all the money and excitement, today's AI systems are far from what many imagine. In truth, despite their impressive fluency and speed, LLMs haven't gotten us any closer to achieving genuine artificial intelligence, often called artificial general intelligence (AGI): the kind that thinks, reasons, and understands the world like humans do.

Understanding why we aren't close to AGI starts with understanding how LLMs like ChatGPT work. LLMs are computer models trained on immense amounts of data: trillions of words of human text, in the case of models like ChatGPT largely scraped from the internet. They are designed to learn patterns in how words and sentences relate to each other. The model first breaks text into small chunks called tokens and turns each one into numbers it can process mathematically. It then passes these tokens through a transformer network, a special type of neural network that helps the model figure out which words are important and which words go together. Transformers are, in essence, giant, complex mathematical functions that learn patterns across layers and layers of calculations.
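To make the idea of "learning patterns and predicting the next token" a little more concrete, here is a toy sketch in Python. To be clear, this is not how ChatGPT works internally; there is no transformer here, just word counts over a made-up sentence or two. But it shows the same basic loop in miniature: turn text into numeric tokens, learn statistical patterns from the data, and use those patterns to predict what comes next.

```python
from collections import Counter, defaultdict

# A tiny training "corpus": real models train on trillions of words, not one line.
corpus = "the cat sat on the mat . the cat ate . the dog chased the cat .".split()

# Step 1: map each word to an integer id (a "token"), so the model can do math on text.
vocab = {word: i for i, word in enumerate(dict.fromkeys(corpus))}
print("token ids:", vocab)

# Step 2: "learn" patterns, here simply by counting which word follows which.
follows = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    follows[current_word][next_word] += 1

# Step 3: predict the next word as the one seen most often after the current word.
def predict_next(word: str) -> str:
    return follows[word].most_common(1)[0][0]

print(predict_next("the"))  # -> "cat" (the word that most often follows "the" above)
print(predict_next("sat"))  # -> "on"
```

A real LLM replaces the counting with billions of learned parameters, but the output of both is the same kind of thing: a guess at the most likely next token.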

This is a strange concept to fathom, so here's a more intuitive way to think about it: in basic algebra, a line can describe the relationship between two points. But if you add a third point that doesn't align with the other two, you need a more complex, nonlinear equation to capture all three. Transformers do something similar. They treat each token as a point and apply linear and nonlinear transformations to "fit" those points to multidimensional surfaces that model their relationships. Because language is so complex, a transformer network fits points in hundreds or thousands of dimensions, far beyond the two-dimensional equations we encounter in algebra. Once a transformer has learned these patterns, it becomes very good at predicting what comes next in a sentence, much like an equation lets us predict the value of y for a given x.
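If you want to see the two-dimensional version of this analogy in code, here is a minimal NumPy sketch. It fits a line to two points, then a curve to three, and uses each fit to predict an unseen value. It is only an illustration of curve fitting, not of an actual transformer, which fits far more flexible functions in far more dimensions.

```python
import numpy as np

# Two points that lie on a straight line: a degree-1 fit (a line) captures them exactly.
x = np.array([1.0, 2.0])
y = np.array([2.0, 4.0])
line = np.polynomial.Polynomial.fit(x, y, deg=1)
print(line(3.0))  # the line's prediction for an unseen x (about 6.0)

# Add a third point that breaks the line: now a nonlinear (degree-2) fit is needed.
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 9.0])
curve = np.polynomial.Polynomial.fit(x, y, deg=2)
print(curve(4.0))  # the curve's prediction for an unseen x (about 17.0)
```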

But though the predictive nature of LLMs allows them to generate human-like text, it may also be what's preventing them from reaching "true" intelligence. LLMs are fundamentally pattern recognizers: they excel at predicting what comes next in a sentence because they have learned the structure of language, but they don't "understand" the meaning behind the words. Unlike humans, they have no grounding in the physical world; they've never actually experienced anything. For example, ask an LLM what happens if you put a bowling ball on top of a cardboard box. It can guess based on patterns in its data, but it has no concept of how objects actually behave; it's just throwing words out there. This is a big problem, because it means that LLMs can't actually function in the real world. They have less intuition for the world than a toddler.

Some researchers see a solution to this in world models, systems that are trained not just on text, but also on videos and sensory data, giving them more "physical intuition". And big tech companies are certainly pursuing world models, attempting to create virtual environments in which to train computer models to understand objects, forces, and cause-and-effect relationships in three-dimensional space. However, even these models might be far from reaching AGI. Simply feeding more data or scaling models doesn't automatically produce intelligence. True intelligence includes a capacity for abstraction. For example, if you know what a "book" is and you know what a "table" is, you can understand what happens when you put the book on the table without having seen that exact scenario before. Humans can generalize from limited data: that's why toddlers can learn what a cat is with just a few encounters, while it takes a computer millions of labeled photos to recognize one.

While artificial intelligence is advancing at an incredible pace, we remain far from true artificial general intelligence. Today's models can mimic language but lack understanding, reasoning, and genuine common sense. Paradoxically, even as we scale up models and feed them more data, progress seems to be slowing. A recent MIT study warns that focusing on scaling up LLMs may soon hit a wall of diminishing returns. The verdict seems clear: scaling up models and feeding AI more data isn't enough. Achieving AGI will require algorithmic breakthroughs in abstraction, grounding, and understanding. In a world driven by hype, it's best to stay cognizant of what technology can do and what it cannot. Because if you think ChatGPT can truly think, think again.
