Nvidia and AMD TFLOPs war to see Lovelace AD102 RTX 4090 hitting 100 TFLOPs: 2.5x more compute than RTX 3090 Ti and 10x more than PlayStation 5
The next generation of desktop GPUs, Nvidia Lovelace AD102 and AMD RDNA 3, is still some time away, but we have been seeing a steady stream of leaks for these GPUs lately. It looks like the upcoming generation will be a race between the GPU heavyweights to reach the 100 TFLOPs mark. We reported that Navi 31 RX 7900 XT could deliver up to 92 TFLOPs of FP32 performance. Now, it looks like Nvidia will not be taking things lying down either.
This information comes from the usual sources, @kopite7kimi and @greymon55, who have been pretty spot on with GPU leaks so far. According to both of them, Nvidia and AMD are likely to square off on FP32 performance this time, alongside a host of other parameters including ray tracing and super resolution.
The upcoming Nvidia Lovelace AD102 GPU is currently speculated to offer 18,432 CUDA cores with 96 MB of L2 cache, paired with 24 GB of 21 Gbps GDDR6X VRAM on a 384-bit bus. These cores will be arranged in 12 graphics processing clusters (GPCs), with each GPC offering six texture processing clusters (TPCs) that in turn include two streaming multiprocessors (SMs) each, for a total of 144 SMs. The RTX 4090 (?) could draw up to 600 W of power.
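The rumored hierarchy can be sanity-checked with a few lines of arithmetic. This is only a sketch: the GPC, TPC, and SM counts are the leaked figures quoted above, while the 128 FP32 CUDA cores per SM is an assumption carried over from Ampere (it is the only value that reconciles 144 SMs with 18,432 cores), and the variable and function names are purely illustrative.

```python
# Leaked AD102 layout as quoted in the article (unconfirmed).
GPCS = 12            # graphics processing clusters
TPCS_PER_GPC = 6     # texture processing clusters per GPC
SMS_PER_TPC = 2      # streaming multiprocessors per TPC

# Assumption, not stated in the article: 128 FP32 CUDA cores per SM,
# the same ratio Nvidia uses on Ampere GA102.
CORES_PER_SM = 128


def total_sms(gpcs: int, tpcs_per_gpc: int, sms_per_tpc: int) -> int:
    """Multiply out the GPC -> TPC -> SM hierarchy."""
    return gpcs * tpcs_per_gpc * sms_per_tpc


sms = total_sms(GPCS, TPCS_PER_GPC, SMS_PER_TPC)
cuda_cores = sms * CORES_PER_SM

print(f"SMs: {sms}, CUDA cores: {cuda_cores}")
```

Under that per-SM assumption, the 144 SMs line up with the rumored 18,432 CUDA cores.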
In a leak earlier this year, the RTX 4090's clock was speculated to be around 2.5 GHz, which would afford roughly 90 TFLOPs of FP32 performance. In the most recent leak, AD102's staggering CUDA core count is rumored to be capable of 100 TFLOPs of FP32 performance with a boost clock of at least 2.7 GHz, though it is likely to be even higher in practice.
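For readers who want to see where these figures come from, peak FP32 throughput is conventionally estimated as CUDA cores x 2 operations per clock (one fused multiply-add) x clock speed. The sketch below applies that convention to the leaked numbers; the helper names are hypothetical, and the result is a theoretical peak rather than a measured one.

```python
def fp32_tflops(cuda_cores: int, clock_ghz: float, ops_per_clock: int = 2) -> float:
    """Theoretical peak FP32 TFLOPs.

    cores * ops_per_clock * GHz gives GFLOPs; divide by 1000 for TFLOPs.
    ops_per_clock=2 counts one fused multiply-add as two operations.
    """
    return cuda_cores * ops_per_clock * clock_ghz / 1000.0


def clock_for_tflops(cuda_cores: int, target_tflops: float, ops_per_clock: int = 2) -> float:
    """Clock (GHz) needed to reach a target theoretical TFLOPs figure."""
    return target_tflops * 1000.0 / (cuda_cores * ops_per_clock)


AD102_CORES = 18_432  # rumored full-die core count from the article

for clock in (2.5, 2.7):
    print(f"{clock} GHz -> {fp32_tflops(AD102_CORES, clock):.2f} TFLOPs")

print(f"100 TFLOPs needs ~{clock_for_tflops(AD102_CORES, 100.0):.2f} GHz")
```

By this estimate, 2.5 GHz works out to a little over 92 TFLOPs (which the earlier leak rounds to about 90), and a full die needs a boost clock just above 2.7 GHz to cross the 100 TFLOPs line.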
As we saw in the case of the Ampere GA102 die, the first AD102 chip will likely be a partially disabled version, with Nvidia possibly reserving the full die for a later refresh. Even if AD102 is capable of hitting 100 TFLOPs, we will have to wait and see whether Nvidia markets that number at the initial launch later this year or uses it to entice buyers into a refreshed RTX 4090 Ti (?) down the line. Overall, AD102 and Navi 31 look likely to run at similar clocks, with a difference of perhaps 10 TFLOPs between them.
The full picture will only emerge once more concrete specifications are known. In any case, the upcoming GPU generation will once again push PC gaming well beyond the capabilities of current-gen consoles.
Though FP32 gains do not necessarily translate into real-world benefits in gaming, a 100 TFLOPs GPU would theoretically offer about 2.5x the FP32 throughput of an RTX 3090 Ti, roughly 8x that of the Xbox Series X, and close to 10x that of a PlayStation 5.
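These multipliers are simple ratios of theoretical peak throughput. The sketch below reproduces them using the commonly quoted FP32 figures for each device (40 TFLOPs for the RTX 3090 Ti, 12.15 TFLOPs for the Xbox Series X, 10.28 TFLOPs for the PlayStation 5); those reference values are not given in the leaks themselves and are supplied here as assumptions.

```python
TARGET_TFLOPS = 100.0  # rumored AD102 peak from the leaks

# Commonly quoted theoretical FP32 peaks (assumed here, not from the leaks).
REFERENCE_TFLOPS = {
    "RTX 3090 Ti": 40.0,
    "Xbox Series X": 12.15,
    "PlayStation 5": 10.28,
}

# Ratio of the rumored peak to each reference device.
ratios = {name: TARGET_TFLOPS / tflops for name, tflops in REFERENCE_TFLOPS.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x")
```

The ratios compare paper specifications across different architectures, so they say little about actual frame rates.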
"To be honest, I don't have much information about AMD. Maybe Lisa and Jensen's competition will give us a 100TFLOPS gaming war in a few months." (kopite7kimi, @kopite7kimi, April 29, 2022)
"I can only say that the two products have improved a lot compared to their predecessors, but if you want to ask me directly which one is better, I'm sorry I can't answer, because no one knows the specific improvement by percentage." (Greymon55, @greymon55, April 30, 2022)