Nvidia releases more RTX 3000 spec details through Reddit Q&A session
Nvidia managed to impress even the more skeptical fans with the RTX 3000 announcement. Even so, some of the specs, like the CUDA core count or the number of pins on the power connectors, appeared to contradict the information provided by certain AIB partners. To clear up the confusion, Nvidia delegated some of its employees to answer the most pressing fan concerns in a Reddit Q&A session.
Here is a list of the more relevant clarifications from the Reddit Q&A session:
- The HDMI 2.1 ports support the full 48 Gbps bandwidth, and the Nvidia drivers will allow users to switch between 8-bit, 10-bit and 12-bit color depth, which is good news for those who own TVs whose HDMI 2.1 ports are limited to 40 Gbps.
- “RTX IO will accelerate SSD performance regardless of how fast it is, by reducing the CPU load required for I/O, and by enabling GPU-based decompression, allowing game assets to be stored in a compressed format and offloading potentially dozens of CPU cores from doing that work. Compression ratios are typically 2:1, so that would effectively amplify the read performance of any SSD by 2x.”
- RTX IO “does not allow the SSD to replace frame buffer memory, but it allows the data from the SSD to get to the GPU, and GPU memory much faster, with much less CPU overhead.”
- Regarding the confusing doubling of the CUDA core count, Nvidia states that the new Ampere microarchitecture was designed “to achieve twice the throughput for FP32 operations compared to the Turing Streaming Multiprocessors (SM). To accomplish this goal, the Ampere SM includes new datapath designs for FP32 and INT32 operations. One datapath in each partition consists of 16 FP32 CUDA Cores capable of executing 16 FP32 operations per clock. Another datapath consists of both 16 FP32 CUDA Cores and 16 INT32 Cores. As a result of this new design, each Ampere SM partition is capable of executing either 32 FP32 operations per clock, or 16 FP32 and 16 INT32 operations per clock. All four SM partitions combined can execute 128 FP32 operations per clock, which is double the FP32 rate of the Turing SM, or 64 FP32 and 64 INT32 operations per clock.”
- The performance difference between PCIe 3.0 and PCIe 4.0 “is typically less than a few percent,” so Intel users stuck on PCIe 3.0 are not really missing out on any substantial performance gains.
- Nvidia did not announce anything about DLSS 3.0, but the new RTX 3000 cards will support DLSS 2.1, which includes a new ultra performance mode for 8K resolutions, plus support for VR titles and dynamic resolution.
- Nvidia recommends the use of two individual 8-pin cables to power the new RTX 3080 cards.
- Nvidia did not really improve the video encoding capabilities of the Ampere cards, but it added AV1 decode support.
- Nvidia Reflex, the new feature that reduces rendering latency in competitive games, will work with GTX 900-series GPUs and newer.
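
The arithmetic behind two of the points above, the per-SM FP32/INT32 throughput and the RTX IO read amplification, can be checked with a short Python sketch. The lane counts and the 2:1 ratio come from Nvidia's quoted answers; the function names and the 3.5 GB/s raw SSD figure are assumptions made purely for illustration, not anything Nvidia published.

```python
# Per-partition datapaths as described in Nvidia's quoted answer.
FP32_ONLY_LANES = 16    # datapath 1: FP32 operations only
SHARED_LANES = 16       # datapath 2: runs either FP32 or INT32 in a given clock
PARTITIONS_PER_SM = 4   # "all four SM partitions combined"


def sm_ops_per_clock(shared_path_runs_int32: bool) -> dict:
    """Return FP32 and INT32 operations per clock for one Ampere SM.

    Hypothetical helper: it only models the two cases named in the quote
    (shared datapath fully on FP32, or fully on INT32).
    """
    fp32_lanes = FP32_ONLY_LANES + (0 if shared_path_runs_int32 else SHARED_LANES)
    int32_lanes = SHARED_LANES if shared_path_runs_int32 else 0
    return {
        "fp32": fp32_lanes * PARTITIONS_PER_SM,
        "int32": int32_lanes * PARTITIONS_PER_SM,
    }


def effective_read_gbps(raw_ssd_gbps: float, compression_ratio: float = 2.0) -> float:
    """Effective asset read rate when data is stored compressed.

    Assumes the "typical" 2:1 ratio from the quote and ignores any
    decompression cost, so treat the result as a best-case figure.
    """
    return raw_ssd_gbps * compression_ratio


if __name__ == "__main__":
    print(sm_ops_per_clock(shared_path_runs_int32=False))  # pure FP32 workload
    print(sm_ops_per_clock(shared_path_runs_int32=True))   # mixed FP32/INT32 workload
    print(effective_read_gbps(3.5))  # 3.5 GB/s is an assumed example drive speed
```

The pure FP32 case gives the 128 operations per clock that Nvidia cites, and the mixed case gives 64 FP32 plus 64 INT32, which is why the advertised CUDA core count doubles even though the full FP32 rate is only reached when no INT32 work is scheduled on the shared datapath.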