Last week a controversial model called Reflection was released that was supposedly both open and better than all frontier models 😮 The trick behind it was supposedly teaching the model to think more and reflect on its answers 🤔
As it turned out, the Reflection model was too good to be true, but the idea behind it is legit 👌 In fact, there has been a series of models that have followed that direction, starting from WizardLM 🧙 all the way to Orca 🐳. While not exactly the same, those approaches also teach the model to think by including the thought process of a teacher model in their training. The catch is that the resulting models do not surpass their teachers ‼️
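To make the distillation idea concrete, here is a minimal sketch of how such a training sample might be built. The `teacher` client and the field names are hypothetical stand-ins, not the actual WizardLM or Orca pipelines:

```python
# Sketch of reasoning distillation: a teacher model's step-by-step
# explanation is baked into the student's training data.
# `teacher.complete(prompt) -> str` is a hypothetical stand-in for
# any chat-completion API.

def build_distillation_sample(question: str, teacher) -> dict:
    # Ask the teacher to show its work, not just the final answer.
    explanation = teacher.complete(
        f"Explain step by step, then answer:\n{question}"
    )
    # The student is fine-tuned to reproduce the full reasoning trace,
    # so it learns *how* to think, not just *what* to answer.
    return {"prompt": question, "completion": explanation}
```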
Reflecting on answers is not a new idea either; it is the backbone of most agentic workflows 🕵️ and it is also behind the new o1 model from OpenAI, which seems to be what Reflection wanted to be and more ✨ So let’s talk about o1…
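The reflection pattern itself fits in a few lines. Here is a minimal sketch, assuming a generic `llm(prompt) -> str` function as a hypothetical stand-in for any chat-completion API:

```python
# Sketch of a reflect-and-revise loop, the pattern behind most
# agentic workflows: draft an answer, critique it, then improve it.

def answer_with_reflection(llm, task: str, rounds: int = 2) -> str:
    draft = llm(f"Solve this task:\n{task}")
    for _ in range(rounds):
        # The model critiques its own draft...
        critique = llm(
            f"Task:\n{task}\n\nDraft answer:\n{draft}\n\n"
            "List any mistakes or weaknesses in the draft."
        )
        # ...and then rewrites the answer using that critique.
        draft = llm(
            f"Task:\n{task}\n\nDraft:\n{draft}\n\nCritique:\n{critique}\n\n"
            "Write an improved answer."
        )
    return draft
```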
o1 represents another leap from OpenAI, similar to GPT-4, towards what they have defined as Level 2 AGI: the ability to problem-solve 🧩 In some ways this is a small leap; GPT-4 can already problem-solve, and when forced to think and reflect it does even better, so what is the novelty here?
In short, scaling inference instead of training 📈 As the Bitter Lesson teaches us, there are only two methods that scale:
🧠 learning
🔍 search
Scaling to larger models trained with more data was all about scaling learning 🧠 o1, on the other hand, is all about scaling inference, most likely through search 🔍 and a model that is better at guiding that search towards better solutions 🚀 We are at the beginning of understanding the impact this may have on industry applications, but if developing software is a good proxy for the abilities that will be unlocked, AI went from being a poor coder to being better than most experienced programmers 🔥
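OpenAI has not published how o1 actually searches, but the simplest instance of scaling inference is best-of-N sampling scored by a verifier model: spend more compute at inference time by drawing more candidates. Here is a hedged sketch, with `llm` and `verifier` as hypothetical stand-ins:

```python
# Best-of-N sampling: the simplest way to trade inference compute
# for answer quality. `llm(prompt) -> str` generates a candidate
# answer; `verifier(answer) -> float` scores it. Both are
# hypothetical stand-ins; o1's actual search procedure is unpublished.

def best_of_n(llm, verifier, task: str, n: int = 16) -> str:
    # More samples = more inference compute = a better chance that
    # at least one candidate scores highly.
    candidates = [llm(f"Think step by step:\n{task}") for _ in range(n)]
    return max(candidates, key=verifier)
```

A better verifier steers the same search budget towards better solutions, which is likely part of what makes o1 more than just "sample more" 🔍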
It will definitely have a big impact on agentic workflows, and it’s an intermediate step before the next one: AI that can take actions on our behalf, essentially the assistant we are all waiting for 🍿