No one yet knows how ChatGPT and its artificial intelligence cousins will transform the world, and one reason is that no one really knows what is going on inside them. Some of these systems' capabilities go far beyond what they were trained to do—and even their inventors are confused as to why. A growing number of tests suggest that these AI systems develop internal models of the real world, just as our own brains do, although the machines' engineering is different.
"Anything we want to do with them to make them better or safer or anything like that seems to me like a ridiculous thing to ask ourselves to do if we don't understand how they work," said Ellie Pavlick of Brown University. one of the researchers working to fill the explanatory void.
At one level, she and her colleagues understand GPT (short for generative pretrained transformer) and other large language models, or LLMs, very well. The models rely on a machine-learning system called a neural network. Such networks have a structure loosely modeled after the connected neurons of the human brain. The code for these programs is relatively simple and takes up only a few screens. It sets up an autocorrect algorithm that selects the most likely word to complete a passage based on painstaking statistical analysis of hundreds of gigabytes of Internet text. Additional training ensures that the system presents its results in the form of dialogue. In this sense, all it does is reproduce what it has learned—it is a "stochastic parrot," in the words of Emily Bender, a linguist at the University of Washington. But LLMs have also managed to ace the bar exam, explain the Higgs boson in iambic pentameter, and attempt to break up their users' marriages. Few would have expected a fairly straightforward autocorrect algorithm to acquire such broad capabilities.
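The statistical core of that autocomplete idea can be sketched with a toy bigram model: count which word tends to follow which, then always pick the most frequent successor. This is a drastic simplification of what an LLM actually does (real models use neural networks over vast vocabularies and long contexts), and the miniature corpus here is invented purely for illustration:

```python
from collections import Counter, defaultdict

# A toy corpus standing in for the hundreds of gigabytes of Internet text.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Tally how often each word follows each other word (bigram statistics).
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def complete(word):
    """Return the statistically most likely next word after `word`."""
    return follows[word].most_common(1)[0][0]

print(complete("the"))  # "cat" follows "the" most often in this corpus
```

An LLM does this not with a lookup table but with billions of learned connection weights, which is precisely why its internal workings are so hard to inspect.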
That GPT and other AI systems are performing tasks they were not trained to do, giving them "emergent abilities," has surprised even researchers who have generally been skeptical of the hype over LLMs. "I don't know how they do it, or if they could do it more generally like humans do — but they've challenged my views," says Melanie Mitchell, an AI researcher at the Santa Fe Institute.
"It's definitely much more than a stochastic parrot, and it definitely builds some representation of the world - although I don't think it's exactly like how humans build an internal world model," says Yoshua Bengio, an AI researcher at the University of Montreal.
At a conference at New York University in March, philosopher Raphaël Millière of Columbia University offered another astonishing example of what LLMs can do. The models had already demonstrated the ability to write computer code, which is impressive but not too surprising because there is so much code out there on the Internet to emulate. Millière went a step further and showed that GPT can also execute code. The philosopher typed in a program to calculate the 83rd number in the Fibonacci sequence. "It's multistep reasoning at a very high level," he says. And the bot did it. However, when Millière asked directly for the 83rd Fibonacci number, GPT got it wrong, which suggests that the system was not simply parroting the Internet. Rather, it was performing its own calculations to arrive at the correct answer.
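A program of the kind Millière typed in can be written in a few lines. The sketch below is an illustrative reconstruction, not his exact prompt; it uses the convention fib(1) = fib(2) = 1:

```python
def fib(n):
    """Iteratively compute the n-th Fibonacci number (fib(1) = fib(2) = 1)."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

print(fib(83))  # 99194853094755497
```

Tracing this loop 83 times, carrying two ever-growing numbers, is exactly the kind of multistep bookkeeping a plain autocomplete system was not expected to perform.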
Although an LLM runs on a computer, it is not itself a computer. It lacks essential computational elements, such as working memory. In a tacit acknowledgment that GPT on its own shouldn't be able to run code, its inventor, the technology company OpenAI, has since introduced a specialized plug-in—a tool ChatGPT can use when answering a query—that makes it possible for it to do so. But that plug-in was not used in Millière's demonstration. Instead, he hypothesizes that the machine improvised a memory by exploiting its mechanisms for interpreting words according to their context—a situation similar to how nature repurposes existing capacities for new functions.
This improvisational ability shows that LLMs develop an internal complexity that goes far beyond a superficial statistical analysis. Researchers are finding that these systems appear to achieve genuine understanding of what they have learned. In a study presented last week at the International Conference on Learning Representations (ICLR), Ph.D. student Kenneth Li of Harvard University and his AI research colleagues—Aspen K. Hopkins of the Massachusetts Institute of Technology, David Bau of Northeastern University, and Fernanda Viégas, Hanspeter Pfister and Martin Wattenberg, all at Harvard—created their own smaller replica of the GPT neural network so they could study its inner workings. They trained it on millions of matches of the board game Othello by feeding in long sequences of moves in text form. Their model became an almost perfect player.
To study how the neural network encoded information, they used a technique that Bengio and Guillaume Alain, also at the University of Montreal, devised in 2016. They created a miniature "probe" network to analyze the main network layer by layer. Li compares this approach to neuroscientific methods. "This is similar to when we put an electrical probe into the human brain," he says. In the case of the AI, the probe showed that its "neural activity" matched the representation of an Othello game board, albeit in a convoluted form. To confirm this, the researchers ran the probe in reverse to implant information into the network—for example, by flipping one of the game's black marker pieces to white. "Basically, we're hacking into the brains of these language models," says Li. The network adjusted its moves accordingly. The researchers concluded that it was playing Othello roughly like a human: by keeping a game board in its "mind's eye" and using this model to evaluate moves. Li says he thinks the system learns this skill because it is the most parsimonious description of its training data. "If you are given a whole bunch of game scripts, trying to figure out the rule behind it is the best way to compress," he adds.
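In spirit, the probing technique amounts to training a small classifier to read information straight off a layer's activations: if a simple readout can recover the board state, the network must be encoding it. The sketch below is a minimal illustration with synthetic activations and invented shapes, not the researchers' actual code; the "occupancy direction" is fabricated so that one board square's state is linearly encoded:

```python
import numpy as np

rng = np.random.default_rng(0)

# Pretend hidden activations (n samples, d-dimensional) from one layer of the
# Othello model. Here we synthesize them so that whether one board square is
# occupied is linearly encoded; all names and shapes are illustrative.
n, d = 500, 64
square_occupied = rng.integers(0, 2, size=n)   # 0 = empty, 1 = occupied
direction = rng.normal(size=d)                  # hidden "occupancy" direction
activations = rng.normal(size=(n, d)) + np.outer(square_occupied, direction)

# The probe: the simplest possible linear readout, fit by least squares,
# trained to recover the square's state from the activations alone.
X = np.hstack([activations, np.ones((n, 1))])   # add a bias column
w, *_ = np.linalg.lstsq(X, square_occupied * 2.0 - 1.0, rcond=None)
predictions = (X @ w > 0).astype(int)

accuracy = (predictions == square_occupied).mean()
print(f"probe accuracy: {accuracy:.2f}")        # near 1.0 if state is encoded
```

A probe that succeeds on real activations, as Li's did, is evidence that the board state lives somewhere in the network; running the logic in reverse, as the team also did, means editing the activations and watching the model's behavior change.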
This ability to infer the structure of the outside world is not limited to simple game moves; it also shows up in dialogue. Belinda Li (no relation to Kenneth Li), Maxwell Nye, and Jacob Andreas, all at M.I.T., studied networks that played a text-based adventure game. They fed in sentences such as "The key is in the treasure chest," followed by "You take the key." Using a probe, they found that the networks encoded variables corresponding to "chest" and "you," each with the property of possessing a key or not, and updated these variables sentence by sentence. The system had no independent way of knowing what a chest or key is, yet it picked up the concepts it needed for this task. "There is some representation of the state hidden inside of the model," says Belinda Li.
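The hidden state the M.I.T. probe uncovered can be written out explicitly as a small bookkeeping program: boolean properties per entity, updated as each sentence arrives. The pattern matching below is a toy stand-in for what the network does implicitly inside its activations; the sentences and variable names follow the example in the study:

```python
# Explicit version of the implicit state: who or what currently has the key.
state = {"chest": {"has_key": False}, "you": {"has_key": False}}

def update(sentence):
    """Update the state from one sentence (toy pattern matching)."""
    s = sentence.lower()
    if "key is in the treasure chest" in s:
        state["chest"]["has_key"] = True
    elif "you take the key" in s:
        state["chest"]["has_key"] = False
        state["you"]["has_key"] = True

for sentence in ["The key is in the treasure chest.", "You take the key."]:
    update(sentence)

print(state)  # {'chest': {'has_key': False}, 'you': {'has_key': True}}
```

The surprising part is that no one wrote such bookkeeping code into the LLM; the probe shows the network invented the equivalent variables on its own, just from predicting text.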
Researchers wonder how much LLMs are able to learn from text. Pavlick and her then Ph.D. student Roma Patel found that these networks absorb color descriptions from Internet text and construct internal representations of color. When they see the word "red," they treat it not just as an abstract symbol but as a concept that has some relation to maroon, crimson, fuchsia, rust, and so on. Demonstrating this was somewhat tricky. Instead of inserting a probe into a network, the researchers studied its response to a series of text prompts. To check whether it was merely echoing color relationships from online references, they tried misdirecting the system by telling it that red is in fact green, as in the old philosophical thought experiment in which one person's red is another person's green. Rather than parroting back an incorrect answer, the system's color evaluations changed appropriately to maintain the correct relationships.
Taken with the idea that, in order to perform its autocorrect function, the system searches for the underlying logic of its training data, machine-learning researcher Sébastien Bubeck of Microsoft Research suggests that the wider the range of the data, the more general the rules the system will discover. "Maybe we're seeing such a big jump because we've reached a diversity of data that's big enough that the only underlying principle of it all is that intelligent beings produced it," he says. "And so the only way to explain all this data is [for the model] to become intelligent."
In addition to extracting the underlying meaning of language, LLMs are able to learn on the fly. In the field of AI, the term "learning" is usually reserved for the computationally intensive process in which developers expose the neural network to gigabytes of data and adjust its internal connections. By the time you type a query into ChatGPT, the network should be fixed; unlike humans, it should not continue to learn. So it came as a surprise that LLMs do, in fact, learn from their users' prompts—a capability known as "in-context learning." "It's a different sort of learning that wasn't really understood to exist before," says Ben Goertzel, founder of AI company SingularityNET.
One example of how an LLM learns comes from the way humans interact with chatbots such as ChatGPT. You can give the system examples of how you want it to respond, and it will obey. Its outputs are determined by the last several thousand words it has seen. What it does, given those words, is prescribed by its fixed internal connections—but the word sequence nonetheless offers some adaptability. Entire websites are devoted to "jailbreak" prompts that overcome the system's "guardrails"—restrictions that stop the system from telling users how to make a pipe bomb, for example—typically by directing the model to pretend to be a system without guardrails. Some people use jailbreaking for sketchy purposes, yet others deploy it to elicit more creative answers. "It will answer scientific questions, I would say, better" than if you just ask it directly, without the special jailbreak prompt, says William Hahn, co-director of the Machine Perception and Cognitive Robotics Laboratory at Florida Atlantic University. "It's better at scholarship."
Another type of in-context learning happens via "chain of thought" prompting, which means asking the network to spell out each step of its reasoning—a tactic that makes it do better at logic or arithmetic problems requiring multiple steps. (But one thing that made Millière's example so surprising is that the network found the Fibonacci number without any such coaching.)
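A chain-of-thought prompt can be as simple as appending worked intermediate steps to the question. The strings below are illustrative; no model is actually called, and the arithmetic is just an example problem:

```python
# Two ways of posing the same arithmetic question to an LLM chatbot.
direct_prompt = "What is 17 * 24?"

chain_of_thought_prompt = (
    "What is 17 * 24? Let's work through it step by step:\n"
    "1. 17 * 20 = 340\n"
    "2. 17 * 4 = 68\n"
    "3. 340 + 68 = ?"
)

# The final step the model is being led toward:
print(340 + 68)  # 408
```

The intermediate lines give the model's fixed machinery a scratchpad in the prompt itself, which is why decomposing a problem this way reliably improves multistep answers.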
In 2022, a team at Google Research and the Swiss Federal Institute of Technology in Zurich—Johannes von Oswald, Eyvind Niklasson, Ettore Randazzo, João Sacramento, Alexander Mordvintsev, Andrey Zhmoginov, and Max Vladymyrov—showed that in-context learning follows the same basic computational procedure as standard learning, known as gradient descent. This procedure was not programmed in; the system discovered it without help. "It has to be a learned skill," says Blaise Agüera y Arcas, a vice president at Google Research. In fact, he thinks LLMs may have other latent abilities that no one has discovered yet. "Every time we test for a new ability that we can quantify, we find it," he says.
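Gradient descent, the standard procedure the team compared in-context learning to, can be shown in miniature. This sketch is purely illustrative: it fits a one-parameter model y = w·x to data generated with a true slope of 3, nudging the parameter downhill against the squared error at each step:

```python
# Gradient descent on a one-parameter model y = w * x.
# Data generated with a true slope of 3.0 (illustrative values).
data = [(x, 3.0 * x) for x in range(1, 6)]

w = 0.0    # initial guess for the slope
lr = 0.01  # learning rate (step size)
for _ in range(200):
    # Gradient of the mean squared error with respect to w.
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    w -= lr * grad  # step downhill

print(round(w, 3))  # converges toward 3.0
```

During ordinary training, developers run this kind of update over billions of weights; the Zurich and Google result is that a frozen network, fed examples in its prompt, ends up behaving as if it were running such updates internally.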
Although LLMs have enough blind spots not to qualify as artificial general intelligence, or AGI—the term for a machine that attains the resourcefulness of animal brains—these emergent abilities suggest to some researchers that tech companies are closer to AGI than even optimists had guessed. "They're indirect evidence that we are probably not that far off from AGI," Goertzel said in March at a conference on deep learning at Florida Atlantic University. OpenAI's plug-ins have given ChatGPT a modular architecture a little like that of the human brain. "Combining GPT-4 [the latest version of the LLM that powers ChatGPT] with various plug-ins might be a route toward a humanlike specialization of function," says M.I.T. researcher Anna Ivanova.
At the same time, however, researchers worry the window may be closing on their ability to study these systems. OpenAI has not divulged the details of how it designed and trained GPT-4, in part because it is locked in competition with Google and other companies—not to mention other countries. "Probably there's going to be less open research from industry, and things are going to be more siloed and organized around building products," says Dan Roberts, a theoretical physicist at M.I.T., who applies the techniques of his profession to understanding AI.
And this lack of transparency doesn't just hurt researchers; it also hinders efforts to understand the social consequences of the rush to adopt AI technology. "Transparency around these models is key to ensuring safety," says Mitchell.
ABOUT THE AUTHOR(S)
George Musser is a contributing editor at Scientific American and author of Spooky Action at a Distance (Farrar, Straus and Giroux, 2015) and The Complete Idiot's Guide to String Theory (Alpha, 2008). Follow him on Mastodon @email@example.com. Credit: Nick Higgins