Last week, AMD claimed the Radeon RX 7900 XTX could beat Nvidia's GeForce RTX 4090 in a DeepSeek benchmark. However, the test left out Team Green's newest Blackwell-based GeForce RTX 5090, opting for last-gen cards such as the RTX 4080 Super instead. Now, Nvidia has shown off benchmarks of its own which (unsurprisingly) portray its offerings in a much better light.
Unlike AMD, Nvidia actually labelled its Y-axis properly (tokens/second). It ran the tests with Llama-bench using int4 quantization. In the first test, with a 7-billion-parameter model, the Radeon RX 7900 XTX generates a little over 100 tokens per second. Relative to that baseline, the RTX 4090 is 46% faster (~150 tokens/second) and the RTX 5090 is 103% faster (~200 tokens/second).
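For readers who want to check the arithmetic behind "X% faster", here is a minimal Python sketch. The function names are made up for illustration, and the 100 tokens/second baseline is an assumed round figure standing in for the "a little over 100" quoted above, so the outputs are approximations rather than Nvidia's exact chart values.

```python
def percent_faster(candidate_tps: float, baseline_tps: float) -> float:
    """Return how much faster the candidate is than the baseline, in percent."""
    if baseline_tps <= 0:
        raise ValueError("baseline throughput must be positive")
    return (candidate_tps / baseline_tps - 1.0) * 100.0


def throughput_from_lead(baseline_tps: float, lead_percent: float) -> float:
    """Return the throughput implied by a percentage lead over the baseline."""
    if baseline_tps <= 0:
        raise ValueError("baseline throughput must be positive")
    return baseline_tps * (1.0 + lead_percent / 100.0)


# ASSUMPTION: a round 100 tokens/second for the RX 7900 XTX; the article
# only says "a little over 100", so treat the results as rough estimates.
ASSUMED_BASELINE_TPS = 100.0

# Percentage leads as quoted for the 7-billion-parameter test.
QUOTED_LEADS = {"RTX 4090": 46.0, "RTX 5090": 103.0}

if __name__ == "__main__":
    for card, lead in QUOTED_LEADS.items():
        implied = throughput_from_lead(ASSUMED_BASELINE_TPS, lead)
        print(f"{card}: {lead:.0f}% faster implies about {implied:.0f} tokens/second")
```

Note that "103% faster" means slightly more than double the baseline throughput, not 1.03 times it, which is why the RTX 5090 lands at roughly 200 tokens/second.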
The situation is more or less identical with an 8-billion-parameter model, and with a 32-billion-parameter model, the RTX 5090's lead shoots up to 124% (~50 tokens/second). As always, these are first-party benchmarks and should be treated with scepticism. Plus, both companies' test methodologies seem tailored to tilt the outcome in their own favour. That said, it comes as no surprise that the RTX 5090 is faster than the two-year-old RX 7900 XTX, especially in a field where Nvidia reigns supreme.