Large Reasoning Models and Intelligence
Toward the end of 2024, a new type of language model began to appear: the Large Reasoning Model, or LRM. Examples include OpenAI's o1, Qwen's QwQ, and DeepSeek's R1. Unlike traditional Large Language Models (LLMs), these models improve their accuracy by spending additional compute at inference time (test-time compute), generating long reasoning chains before producing a final answer.
In the prior discussion on the measure of intelligence, the intelligence of a system was defined as its skill-acquisition efficiency given a set of priors and experience: a more intelligent system ends up with greater skill than a less intelligent one after the same amount of experience. In essence, this measures the system's ability to generalize within a particular domain.
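As a rough schematic (a deliberate simplification of that definition, not the full formalism), skill-acquisition efficiency can be written as the skill attained relative to the priors and experience it took to reach it:

```latex
% Schematic only: intelligence as skill attained per unit of priors and
% experience consumed (higher efficiency corresponds to higher intelligence).
\[
  \text{Intelligence} \;\propto\;
  \frac{\text{skill attained}}{\text{priors} + \text{experience}}
\]
```

Under this framing, simply achieving high skill is not enough; the question is how little prior knowledge and experience were needed to get there.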
This article explores whether LRMs, and their approach of generating long reasoning chains, represent a path toward higher intelligence as defined above.