In No Time Machine Will Think 🤖
Let us kick off with some historical perspective. Charles Babbage's Analytical Engine, developed in the mid-1830s, was an early milestone in computing. Almost from the start, those around Babbage described his inventions in ways that suggested they believed the machines possessed, or at least closely modeled, authentic mental powers. Lady Byron recorded in her diary in mid-June 1833 that they went to see what she called "the thinking machine" at Babbage's house. This is the first of many such references1.
Nearly 200 years later, the quest to mimic human reasoning is still with us. I am sure you can think of even more examples.
It has been more than half a year since "o1", the first Reasoning Language Model (RLM), was publicly released as a product2, and the hype train is only gaining momentum. Since then, we've seen the rise of models claiming to "reason" — but what does that really mean?
According to the Cambridge Dictionary, reasoning is "the process of thinking about something in order to make a decision"3.
In practice, however, the working definition is often stated as "whatever the reasoning model is doing"4 or "the ability to solve more complex tasks". These loose definitions shift the focus from reasoning as a process to reasoning as a result. Just press the think button, and you have it.
But is this actually reasoning or just an illusion supported by complex pattern matching?
A recent study by Apple suggests the latter5. Using a set of newly crafted synthetic puzzles with controllable difficulty, the researchers showed that (1) standard LLMs outperform RLMs at low complexity, (2) RLMs are better at moderate complexity, and (3) both fail catastrophically at high complexity: in other words, a "0% success rate" on unseen problems, even with an unlimited compute budget. The research has its limitations. Still, it highlights a gap: performance on existing public benchmarks keeps improving, while true, generalizable problem-solving on new problems lags behind. New versus existing is what matters.
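To make "controllable difficulty" concrete, here is a minimal Python sketch in the spirit of the paper's setup, using Tower of Hanoi (one of its puzzle families). The function names and grading step are my own illustration, not the authors' code: the number of disks n is the single complexity knob, the rules never change, and the optimal solution has exactly 2^n - 1 moves, so difficulty scales exponentially.

```python
def hanoi_moves(n: int, src: str = "A", aux: str = "B", dst: str = "C") -> list[tuple[str, str]]:
    """Return the optimal move sequence for n disks as (from_peg, to_peg) pairs."""
    if n == 0:
        return []
    return (
        hanoi_moves(n - 1, src, dst, aux)    # park n-1 disks on the auxiliary peg
        + [(src, dst)]                       # move the largest disk to the target
        + hanoi_moves(n - 1, aux, src, dst)  # restack the n-1 disks on top of it
    )

def is_valid_solution(n: int, moves: list[tuple[str, str]]) -> bool:
    """Check a candidate solution against the rules (the puzzle-grading step)."""
    pegs = {"A": list(range(n, 0, -1)), "B": [], "C": []}  # peg A holds disks n..1
    for src, dst in moves:
        if not pegs[src]:
            return False                       # moving from an empty peg
        if pegs[dst] and pegs[dst][-1] < pegs[src][-1]:
            return False                       # a larger disk onto a smaller one
        pegs[dst].append(pegs[src].pop())
    return pegs["C"] == list(range(n, 0, -1))  # all disks stacked on the target peg

for n in (3, 7, 10):
    moves = hanoi_moves(n)
    assert is_valid_solution(n, moves)
    print(f"n={n:2d}: optimal solution length = {len(moves)} (= 2^{n} - 1)")
```

A deterministic checker like this is also what makes exact, scalable grading of model outputs possible: no benchmark contamination, no fuzzy answer matching, just rule-by-rule verification.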
I also believe we're not quite there yet, though this perception may well stem from a kind of confirmation bias. Reasoning is not just pattern recognition; it is hard. Current models simulate the appearance of reasoning, not its underlying machinery. In some sense, we are building mirrors, not minds.
Here are a few books from my short list that I'd like to read to deepen my understanding of this complex topic:
- 🧠 "The Emperor's New Mind" by R. Penrose
- ✍️ "Psychology of Invention in the Mathematical Field" by J. Hadamard
Would love your thoughts or any reading recommendations on the topic. 🙏
PS. Babbage himself, however, seems to have steadfastly refrained from making public claims regarding the putative mentality of his machines.
References:
1. https://www.yorku.ca/christo/papers/babbage-berlin.htm
2. https://en.wikipedia.org/wiki/Reasoning_language_model
3. https://dictionary.cambridge.org/dictionary/english/reasoning
4. Podcast with Nicholas Carlini: https://www.youtube.com/watch?v=n4ipEJ6uJ44
5. Apple ML paper: https://machinelearning.apple.com/research/illusion-of-thinking