Last year was supposed to be the year of AI, and it definitely was the year of reasoning AI. I believe this year will be the year of agents, but before we get there let's review what happened last year that led to today.

Data 💿

The amount of data used to train open models, at least, increased dramatically, from 2 trillion tokens for Llama 2 all the way to 15 trillion for Llama 3. That, in turn, narrowed the performance gap between open- and closed-source models, excluding the o series. We also seem to have hit a ceiling on real-world data, which is unsurprising since the total amount of high-quality training data has been estimated at about 100 trillion tokens. Note that this excludes synthetic data as well as non-text data, both of which will continue to drive progress forward.

Token Cost 💰

The cost of AI decreased significantly over the year, from $10 per 1 million tokens at the start to $2.50 at the end, and half that again when using caching. Performance at ever-smaller model sizes became a hotly contested area, with distillation from larger models becoming the norm: models such as Phi 4 (14B) come close to GPT-4 performance, and the latest Llama 70B is almost on par with the 405B, their best model to date.
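To make the price drop concrete, here is a minimal sketch of the cost arithmetic, using only the figures mentioned above ($10 per million tokens at the start of the year, $2.50 at the end, and half price for cached tokens); the function name and the half-price-for-cached assumption are illustrative, not any provider's actual billing API:

```python
def request_cost(tokens: int, price_per_million: float,
                 cached_fraction: float = 0.0) -> float:
    """Cost in dollars for a request, assuming cached tokens
    are billed at half the normal rate (an illustrative assumption)."""
    cached = tokens * cached_fraction
    uncached = tokens - cached
    return (uncached * price_per_million + cached * price_per_million / 2) / 1_000_000

# 1M tokens at start-of-year pricing vs end-of-year pricing, with and without caching:
start_of_year = request_cost(1_000_000, 10.0)                    # $10.00
end_of_year = request_cost(1_000_000, 2.5)                       # $2.50
fully_cached = request_cost(1_000_000, 2.5, cached_fraction=1.0)  # $1.25
```

On these numbers, a fully cached request at year-end prices costs roughly 8x less than the same request at the start of the year.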

Speed ๐ŸŽ๏ธ

AI is also running faster, thanks to innovations from Groq and Cerebras, with speeds exceeding 1,000 tokens per second, at least one order of magnitude faster than the fastest generation at the start of the year. As reasoning and agentic AI take off, generation speed will become ever more important, because both require more compute to tackle more complex tasks.

AI integration 🖱️

Many companies experimented with AI features last year, with varying levels of success. From summarising unread messages and notifications on your phone and in communication platforms, to helping you write and rewrite content and find relevant information and answers faster, most of those attempts were less transformational than they seemed. Cracking AI use cases is proving to be a bit harder than initially thought.

Reasoning 🤔

I think the biggest development of the year was reasoning AI, with models such as the o series from OpenAI. The impact of those models has been flying a bit under the radar; there was no ChatGPT moment, since the chat interface is probably not the best way for those models to shine. One way to grasp the impact is this: AI went from being an average programmer to being the 175th-best programmer in the world. Reasoning is in many ways a necessary but not sufficient step towards more useful AI that can automate human tasks. Now that reasoning has improved, we can more reliably move to the next phase of AI, which is agents.

The year ahead

At the very least we should expect the same incremental improvements we saw last year, such as cheaper and faster models and more AI in the apps we use. On top of that, though, I believe this will be the year of agents, where initially small tasks will be delegated to AI, like checking you in, or sending that document a colleague asked for in an email, simply by prompting a model. At the same time, there will be a lot of innovation and disruption inside companies, which will experiment even more with agentic workflows to do more with less. This should be an exciting year, not that any year is particularly boring in AI.