DeepSeek has unveiled the latest version of its groundbreaking R1 AI large language model, DeepSeek-R1-0528. The company broke into the AI world with the launch of its V3 and R1 models, both of which delivered top-ten AI performance while being trained more cheaply and in less time than competing models from companies like OpenAI and Google.
The latest R1 model was tested against the following AI benchmarks:
- American Invitational Mathematics Examination (AIME) 2024
- American Invitational Mathematics Examination (AIME) 2025
- Google-Proof Q&A (GPQA)
- LiveCodeBench
- Aider AI coding
- Humanity's Last Exam
Although DeepSeek-R1-0528 improves on the original R1 release across all of these benchmarks, it answers only 17% of the questions correctly on the notoriously difficult Humanity's Last Exam. Since its top competitors also score poorly on that exam, the gains in the latest R1 likely come from additional training time and tuning rather than from any fundamental breakthrough. Importantly, the latest R1 also hallucinates less, making it less likely to generate misleading or false replies.
Readers who want to tinker with the open-source R1 model can run a distilled, eight-billion-parameter version on an Nvidia RTX 4090 GPU with 24 GB of memory.
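As a rough sanity check on why a 24 GB card suffices, here is a back-of-the-envelope estimate of the VRAM consumed by the model weights alone. This is a sketch, not a measurement: actual usage also depends on the KV cache, activations, framework overhead, and any quantization applied.

```python
def weight_memory_gib(n_params: float, bytes_per_param: int = 2) -> float:
    """Approximate GiB of VRAM needed just to hold the model weights.

    bytes_per_param defaults to 2, i.e. FP16/BF16 weights.
    """
    return n_params * bytes_per_param / 1024**3

# 8 billion parameters at 2 bytes each is roughly 14.9 GiB of weights,
# which leaves headroom on a 24 GB RTX 4090 for activations and KV cache.
print(round(weight_memory_gib(8e9), 1))
```

The same arithmetic shows why 4-bit quantization (0.5 bytes per parameter, about 3.7 GiB) lets the distilled model run on far smaller GPUs.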