Limits to Super-Intelligence

Kevin Zatloukal
6 min read · Apr 3, 2023

While the community struggles to sort out how the capabilities of LLMs compare to those of humans, it may be easier to put limits on the capabilities of LLMs by comparing them to other AIs. In fact, I think it is provably true that GPT[x] will never be better at chess than AlphaZero.

As I will explain below, this claim does not require any prediction about what capabilities will emerge in future versions of GPT. Rather, it follows from the fact that the capabilities of AlphaZero subsume those of ChatGPT when playing chess. While it is logically possible (though unlikely) that ChatGPT could match the ability of AlphaZero, it cannot ever be better at chess because AlphaZero can do anything that ChatGPT can do (and in reality, it can almost certainly do more).

ChatGPT versus AlphaZero

As a quick reminder, ChatGPT can take a text description of a chess game, feed that through its deep neural network, and output the most likely next move. The picture is a stack of layers, with text going in one side and next-token probabilities coming out the other (the real network has many more hidden layers than any diagram can show).
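To make that concrete, here is a minimal sketch of "predict the most likely next move". The function `llm_logits` is a hypothetical stand-in for the real network, and the move vocabulary is invented for illustration:

```python
import math

def llm_logits(prompt: str, vocab: list[str]) -> list[float]:
    # Stand-in for a forward pass through the transformer.
    # A real model returns one logit per vocabulary token.
    return [0.0 for _ in vocab]

def most_likely_next_move(game_so_far: str, vocab: list[str]) -> str:
    logits = llm_logits(game_so_far, vocab)
    # Softmax turns logits into probabilities; then take the argmax.
    exps = [math.exp(z) for z in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(vocab)), key=lambda i: probs[i])
    return vocab[best]

print(most_likely_next_move("1. e4 e5 2. Nf3", ["Nc6", "d6", "Qh4"]))
```

One forward pass, one move out. That single pass is the entire computation.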

AlphaZero also contains a deep neural network. When you feed it a chess position — now encoded in some sort of internal representation, rather than text — it will also spit out probabilities for the next move.

For AlphaZero, however, that is just the beginning. That is only one step of its chess calculation. Next, it calculates (in its internal representation) the board position after each of those moves, and then recursively examines each of those board positions.

It continues in this manner, performing a deep tree search of the space of possible moves, often looking 20 or even 30 moves ahead.
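AlphaZero actually implements this with a Monte Carlo tree search guided by its network. To make the structural point concrete, here is a heavily simplified sketch, with stand-in functions (`value_net`, `legal_moves`, `apply`) in place of the real components and plain depth-limited search in place of MCTS:

```python
# Simplified picture of search-on-top-of-a-network. The structural
# point: the network is consulted at every node of the tree, not
# just at the root.

def value_net(position) -> float:
    # Stand-in for AlphaZero's value head: how good is this position?
    return 0.0

def legal_moves(position) -> list:
    # Stand-in for the move generator.
    return []

def apply(position, move):
    # Stand-in: the internal representation after making `move`.
    return position

def search(position, depth: int) -> float:
    moves = legal_moves(position)
    if depth == 0 or not moves:
        return value_net(position)  # leaf: ask the network
    # Recurse: the best we can do is the worst we leave our opponent.
    return max(-search(apply(position, m), depth - 1) for m in moves)
```

The network evaluation that is ChatGPT's entire computation appears here only as the leaf case of a much larger recursion.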

In AlphaZero terms, ChatGPT and all future GPTs think only "one move ahead". They could potentially analyze an individual position at the same level as a future AlphaZero, but they can never exceed its capabilities because AlphaZero's deep neural network can do the same analysis. In all likelihood, AlphaZero will remain substantially better at chess than future GPTs because of its ability to analyze many moves ahead.

Caveats

Since I’ve made a bold claim, let me add the appropriate caveats. In this analysis, I am assuming that future AlphaZeros can grow their neural networks as much as future GPTs can, and can change their network architectures as well. I’m also assuming that future AlphaZeros are trained on all the same data, specifically, the same set of chess games.

One might argue that future GPTs have an advantage from being trained on text as well. (Perhaps something they learn from Shakespeare will give them an edge in chess!) However, I will again assume the same data is available to future AlphaZeros. In that case, you could train a larger model on both text and chess positions and then simply not use the text inputs at runtime. Under this configuration, sketched below, future AlphaZeros' neural networks for analyzing individual moves remain at least as capable as anything future GPTs can do.
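Here is one way to picture that configuration. The split into text features and board features, and every name below, is my own illustration rather than a description of any real system:

```python
import numpy as np

# Hypothetical combined model: one weight matrix over concatenated
# text features and board features, learned jointly at training time.

rng = np.random.default_rng(0)
TEXT_DIM, BOARD_DIM, OUT_DIM = 8, 8, 4
W = rng.normal(size=(OUT_DIM, TEXT_DIM + BOARD_DIM))  # trained on both

def move_scores(text_features, board_features):
    x = np.concatenate([text_features, board_features])
    return W @ x

# At runtime, simply feed zeros for the text inputs:
board = rng.normal(size=BOARD_DIM)
print(move_scores(np.zeros(TEXT_DIM), board))
```

Whatever the text training contributed is baked into the weights; nothing stops a future AlphaZero from being trained the same way.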

Internal Representations

The ability of humans to think ahead is not limited to chess, and in all cases, doing so depends on having an internal representation for the state of the world after potential moves have been made.

Some people like to argue that LLMs lack an internal representation, but this is hard to know for sure. Some bits of their hidden layers could, in principle, be encoding the chess position in the same way AlphaZero does.

Later hidden layers might even be calculating the position after potential moves. I can’t say one way or the other.
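This is testable in principle. One standard technique is a linear probe: train a simple linear map to read the board state back out of a layer's hidden activations. The sketch below uses random placeholder arrays in place of real activations and labels:

```python
import numpy as np

# Sketch of a "linear probe": if some hidden layer encodes the board,
# a simple linear map should be able to read it back out. The arrays
# below are random placeholders standing in for real activations and
# real board labels (e.g., one number per square).

rng = np.random.default_rng(0)
HIDDEN_DIM, N_EXAMPLES, N_SQUARES = 64, 1000, 64

H = rng.normal(size=(N_EXAMPLES, HIDDEN_DIM))  # hidden activations
Y = rng.normal(size=(N_EXAMPLES, N_SQUARES))   # board-state labels

# Fit the probe by least squares. In a real experiment, high accuracy
# on held-out positions would be evidence that the layer encodes them.
W, *_ = np.linalg.lstsq(H, Y, rcond=None)
predictions = H @ W
```

But whether such a representation exists or not does not change the argument that follows.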

Here’s what I can say. First, ChatGPT cannot look arbitrarily far into the future because it does not do a tree search. Its architecture does not allow for that. No amount of training will make a tree search magically appear inside its neural network. Second, whatever calculation ChatGPT’s neural network can do, AlphaZero’s can also do… and then it adds tree search on top of that.

The discussion above also omits some extra difficulties for ChatGPT. Its architecture considers any possible word coming next, including words that are not chess moves at all and moves that are illegal. ChatGPT must be trained to avoid illegal moves. (And this appears to be difficult, given how often it makes illegal moves now!) In contrast, AlphaZero has ordinary software that calculates all the moves that could follow, and that code never produces an illegal one. The humans who built it made sure that illegal moves are not possible. No training necessary.
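For a sense of how little machinery this takes, here is what move generation looks like with the open-source python-chess library (not AlphaZero's code, just an ordinary example of the same idea):

```python
import chess  # the python-chess library: pip install chess

# Ordinary move-generation code produces only legal moves by
# construction; no training is involved.
board = chess.Board()
board.push_san("e4")
board.push_san("e5")

for move in board.legal_moves:
    print(board.san(move))  # every move printed here is legal
```

The set of legal moves is computed, not learned, so illegal moves never enter the search at all.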

Super-Intelligence?

Some are claiming that LLMs will, once large enough and trained on enough data, become "super-intelligences". Such a thing should, I think, be able to play chess better than any human. Since future GPTs will always, in the sense described above, think only one move ahead, the claim that super-intelligence will emerge is a claim that single-move analysis will eventually become so good that there is no need to do a tree search. I think that is a very bold claim, and not a bet I would make myself.

Experience with chess engines that use deep neural networks trained on massive data sets runs contrary to this. As far as I know, no one has ever found evidence that adding tree search does not hugely improve performance.

It is also interesting to note that, if some future GPT were able to figure out that a certain position was, say, mate in 14, the "thinking" required to figure that out must happen at training time, not at run time (since it lacks a tree search). If you consider the number of possible positions that are mate in 10 or more moves, and the amount of calculation that would be needed at training time to analyze them all, I imagine we would discover that it greatly exceeds the amount of training we could ever do for an LLM. If that is true, then real LLMs will be strictly worse than AlphaZero for that reason alone.

Conclusion

As we have seen, while comparing LLMs to human reasoning remains difficult, comparing LLMs to other deep neural networks lets us find limitations fairly easily. We know that AlphaZero will never make illegal moves, while future GPTs will need to be trained not to. We know that whatever analysis of a chess position future GPTs can do, for a future AlphaZero that same analysis is only the first step of thinking "one move ahead".

Finally, we can see that the claim of super-intelligence emerging within an LLM implies the claim that LLMs will eventually do single-position chess analysis so well that they do not need a tree search. That is, in my opinion, a very bold claim that is not supported by experience and, even if logically possible (a big "if"), it would likely require computational resources exceeding any that will ever exist. If that assessment is correct, then we can rule out the emergence of such super-intelligences.
