AMD announces CDNA-based Instinct MI100 GPU with 120 CUs for HPC, promises up to 2.1x more performance per dollar compared to the NVIDIA A100

AMD Instinct MI100 HPC accelerator. (Image Source: AMD)
AMD has announced what it calls the world's fastest HPC GPU, the Instinct MI100 based on the CDNA architecture. The Instinct MI100 offers up to 11.5 TFLOPs of FP64 compute performance when paired with 2nd gen EPYC processors. The MI100 is slated to offer better performance per dollar compared to the NVIDIA A100 GPU along with support for the new ROCm 4.0 software platform.

AMD has announced the Instinct MI100, a GPU based on the new CDNA architecture and targeted at machine learning (ML) and high performance computing (HPC) workloads. The MI100 is slated to offer 10 TFLOPS of FP64 performance, rising to 11.5 TFLOPS when paired with second gen AMD EPYC processors.

During the presentation, AMD also confirmed that the 3rd gen EPYC processors based on Zen 3 codenamed Milan are now being sampled to select OEMs and are slated for a Q1 2021 launch.

AMD said that it is developing different architectures tailored for specific applications with some overlap. While RDNA will cater to gaming, CDNA is more focused towards compute and HPC applications. The Instinct MI100 offers a Matrix Core Technology that enables single and mixed precision matrix operations such as FP32, FP16, bFloat16, Int8, and Int4.

The MI100 features second gen Infinity Fabric and 32 GB of HBM2 memory clocked at 1.2 GHz, delivering 1.23 TB/s of memory bandwidth.

The following table illustrates the specifications of the AMD Instinct MI100:

| Specification | Value |
| --- | --- |
| Design | Full-height, dual-slot, 10.5 in. long |
| Compute Units | 120 |
| Stream Processors | 7,680 |
| FP64 TFLOPs (Peak) | 11.5 |
| FP32 TFLOPs (Peak) | 23.1 |
| FP32 Matrix TFLOPs (Peak) | 46.1 |
| FP16/FP16 Matrix TFLOPs (Peak) | 184.6 |
| Int4/Int8 TOPS (Peak) | 184.6 |
| bFLOAT16 TFLOPs (Peak) | 92.3 |
| HBM2 ECC Memory | 32 GB |
| Memory Interface | 4,096-bit |
| Memory Clock | 1.2 GHz |
| Memory Bandwidth | 1.23 TB/s |
| PCIe Support | Gen4 |
| Infinity Fabric Links/Bandwidth | 3 / 276 GB/s |
| TDP | 300 W |
| Cooling | Passively cooled |
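
Several of the figures in the table can be cross-checked against each other with simple arithmetic. The Python sketch below is illustrative only and rests on two assumptions that the article does not state: that each compute unit holds 64 stream processors, and that the HBM2 memory transfers data twice per clock cycle (double data rate). The function names are hypothetical.

```python
# Figures taken from the specification table above.
COMPUTE_UNITS = 120
BUS_WIDTH_BITS = 4096
MEMORY_CLOCK_HZ = 1.2e9
FP64_PEAK_TFLOPS = 11.5
TDP_WATTS = 300

# Assumptions (not stated in the article).
STREAM_PROCESSORS_PER_CU = 64  # assumed width of one compute unit
TRANSFERS_PER_CLOCK = 2        # assumed double data rate for HBM2


def stream_processors(compute_units, per_cu=STREAM_PROCESSORS_PER_CU):
    """Total stream processors for a given number of compute units."""
    return compute_units * per_cu


def memory_bandwidth_tb_s(bus_bits, clock_hz, transfers=TRANSFERS_PER_CLOCK):
    """Peak memory bandwidth in TB/s (decimal terabytes)."""
    bytes_per_second = bus_bits / 8 * clock_hz * transfers
    return bytes_per_second / 1e12


def gflops_per_watt(tflops, watts):
    """Peak compute efficiency in GFLOPS per watt."""
    return tflops * 1000 / watts


print(stream_processors(COMPUTE_UNITS))                                    # 7680
print(round(memory_bandwidth_tb_s(BUS_WIDTH_BITS, MEMORY_CLOCK_HZ), 2))    # 1.23
print(round(gflops_per_watt(FP64_PEAK_TFLOPS, TDP_WATTS), 1))              # 38.3
```

Under these assumptions the computed stream processor count and memory bandwidth match the 7,680 and 1.23 TB/s listed in the table. The efficiency figure is a derived value based on peak FP64 throughput and TDP, not a number quoted by AMD.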

While the MI100 is designed to work well with EPYC processors, AMD confirmed that the new GPU supports Intel processors as well. Overall, the MI100 can be expected to deliver up to 7x the FP16 performance of previous generation AMD HPC GPUs.

The Instinct MI100 delivers up to 64 GB/s of Infinity Fabric bandwidth between the CPU and the GPU without the need for any PCIe switches. Each card has three Infinity Fabric links that together offer up to 276 GB/s of throughput, so a quad-GPU hive of MI100s can yield up to 1.1 TB/s of total bandwidth. According to AMD, these features give the MI100 significant leads over the NVIDIA A100 in FP16/FP32 workloads while also offering higher performance per dollar (see slides below).
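
The 1.1 TB/s quad-GPU figure is consistent with simply adding up the 276 GB/s of Infinity Fabric throughput of each card. The short Python sketch below shows that arithmetic. Treating the hive total as a plain sum is an assumption made here for illustration, and AMD's slides may define the figure differently. The function name is hypothetical.

```python
# Figures quoted in the article.
INFINITY_FABRIC_GB_S_PER_GPU = 276  # three links per card combined
GPUS_PER_HIVE = 4


def hive_bandwidth_tb_s(per_gpu_gb_s, gpus):
    """Aggregate bandwidth in TB/s, assuming per-GPU figures simply add up."""
    return per_gpu_gb_s * gpus / 1000


total = hive_bandwidth_tb_s(INFINITY_FABRIC_GB_S_PER_GPU, GPUS_PER_HIVE)
print(f"{total:.1f} TB/s")  # 1.1 TB/s
```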

The Instinct MI100 supports the new ROCm 4.0 ecosystem, which AMD pegs as a complete exascale solution for ML and HPC workloads. ROCm 4.0 now uses an open source compiler and supports OpenMP 5.0 and HIP. Additionally, PyTorch and TensorFlow are now optimized for ROCm 4.0.
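
For readers who want to check whether a PyTorch installation is a ROCm build, the following Python sketch is one possible approach. It relies on two behaviours of PyTorch: ROCm builds report a HIP version in `torch.version.hip`, and they expose AMD GPUs through the `torch.cuda` interface. The helper name `describe_pytorch_backend` is hypothetical, and nothing here is specific to ROCm 4.0 or the MI100.

```python
def describe_pytorch_backend():
    """Return a short description of the PyTorch build and visible GPU."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed"

    # ROCm builds set torch.version.hip to a version string; other builds leave it as None.
    hip_version = getattr(torch.version, "hip", None)
    if hip_version is None:
        return "PyTorch build without ROCm/HIP support"

    # On ROCm builds, AMD GPUs are reached through the torch.cuda namespace.
    if not torch.cuda.is_available():
        return f"ROCm/HIP build ({hip_version}), but no GPU is visible"

    return f"ROCm/HIP build ({hip_version}) on {torch.cuda.get_device_name(0)}"


print(describe_pytorch_backend())
```

The output depends entirely on the local installation, so no expected result is shown.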

The AMD Instinct MI100 is expected by the end of this year in major OEM and ODM systems from the likes of Dell, Gigabyte, HP, and SuperMicro.

AMD Instinct MI100 - Die Shot. (Image Source: AMD)
AMD Instinct MI100 - Left. (Image Source: AMD)
AMD Instinct MI100 - Bottom. (Image Source: AMD)
AMD Instinct MI100 - Right. (Image Source: AMD)
AMD Instinct MI100 - Back. (Image Source: AMD)
AMD Instinct MI100 - Top. (Image Source: AMD)

Here are some of the slides from AMD's press briefing.

Source(s)

AMD Press Release

Vaidyanathan Subramaniam, 2020-11-16 (Update: 2020-11-16)