GPT, Claude, Llama? How to tell which AI model is best
Beware model-makers marking their own homework
When Meta, the parent company of Facebook, announced its latest open-source large language model (LLM) on July 23rd, it claimed that the most powerful version of Llama 3.1 had “state-of-the-art capabilities that rival the best closed-source models” such as GPT-4o and Claude 3.5 Sonnet. Meta’s announcement included a table showing the scores achieved by these and other models on a series of popular benchmarks with names such as MMLU, GSM8K and GPQA.