Notebookcheck

AMD announces CDNA-based Instinct MI100 GPU with 120 CUs for HPC, promises up to 2.1x more performance per dollar compared to the NVIDIA A100

AMD Instinct MI100 HPC accelerator. (Image Source: AMD)
AMD has announced what it calls the world's fastest HPC GPU, the Instinct MI100 based on the CDNA architecture. The Instinct MI100 offers up to 11.5 TFLOPs of FP64 compute performance when paired with 2nd gen EPYC processors. The MI100 is slated to offer better performance per dollar compared to the NVIDIA A100 GPU along with support for the new ROCm 4.0 software platform.

AMD has announced the Instinct MI100, based on the new CDNA architecture and targeted at machine learning (ML) and high-performance computing (HPC) workloads. The MI100 is slated to offer 10 TFLOPS of FP64 performance, rising to 11.5 TFLOPS when paired with second-gen AMD EPYC processors.

During the presentation, AMD also confirmed that the 3rd gen EPYC processors, based on Zen 3 and codenamed Milan, are now being sampled to select OEMs and are slated for a Q1 2021 launch.

AMD said that it is developing different architectures tailored for specific applications, with some overlap between them. While RDNA will cater to gaming, CDNA is focused on compute and HPC applications. The Instinct MI100 introduces Matrix Core Technology, which accelerates single- and mixed-precision matrix operations in formats such as FP32, FP16, bFloat16, Int8, and Int4.
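The article does not describe how the Matrix Cores work internally, so the following is only a generic sketch of what "mixed precision" usually means: low-precision inputs combined with a wider accumulator. The helper names (`to_fp16`, `to_fp32`, `dot_fp16_only`, `dot_mixed`) and the test vector are hypothetical, and this is plain Python, not AMD's implementation.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def dot_fp16_only(a, b):
    """Dot product where products and the running sum are both kept in FP16."""
    total = 0.0
    for x, y in zip(a, b):
        total = to_fp16(total + to_fp16(x * y))
    return total


def dot_mixed(a, b):
    """Dot product with FP16 inputs and an emulated FP32 accumulator."""
    total = 0.0
    for x, y in zip(a, b):
        total = to_fp32(total + x * y)
    return total


# Hypothetical test data: 4,096 copies of 0.1, stored as FP16 values.
n = 4096
a = [to_fp16(0.1)] * n
b = [to_fp16(0.1)] * n

# Reference result computed in Python's native double precision.
reference = n * a[0] * b[0]

err_fp16 = abs(dot_fp16_only(a, b) - reference)
err_mixed = abs(dot_mixed(a, b) - reference)

print(f"FP16-only accumulation error: {err_fp16:.4f}")
print(f"Mixed-precision error:        {err_mixed:.6f}")
```

In this toy case the FP16 running sum eventually becomes too coarse to register each small product, which is the kind of error a wider accumulator is meant to avoid.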

The MI100 pairs second-gen Infinity Fabric with 32 GB of HBM2 memory clocked at 1.2 GHz, delivering 1.23 TB/s of memory bandwidth.

The following table illustrates the specifications of the AMD Instinct MI100:

Design: Full-height, dual-slot, 10.5 in. long
Compute Units: 120
Stream Processors: 7,680
FP64 TFLOPs (Peak): 11.5
FP32 TFLOPs (Peak): 23.1
FP32 Matrix TFLOPs (Peak): 46.1
FP16/FP16 Matrix TFLOPs (Peak): 184.6
Int4/Int8 TOPS (Peak): 184.6
bFLOAT16 TFLOPs (Peak): 92.3
HBM2 ECC Memory: 32 GB
Memory Interface: 4,096-bit
Memory Clock: 1.2 GHz
Memory Bandwidth: 1.23 TB/s
PCIe Support: Gen4
Infinity Fabric Links/Bandwidth: 3 / 276 GB/s
TDP: 300 W
Cooling: Passively cooled
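The figures in the table can be cross-checked against one another with some simple arithmetic. The sketch below does so under two assumptions that are not stated in the article: that HBM2 transfers two bits per pin per memory clock, and that each stream processor retires one fused multiply-add (counted as 2 FLOPs) per clock. The engine clock it prints is only what those assumptions imply, not a quoted specification.

```python
# Figures copied from the specification table.
COMPUTE_UNITS = 120
STREAM_PROCESSORS = 7_680
MEMORY_BUS_BITS = 4_096
MEMORY_CLOCK_GHZ = 1.2
PEAK_RATES = {
    "FP64": 11.5,
    "FP32": 23.1,
    "FP32 Matrix": 46.1,
    "bFloat16": 92.3,
    "FP16 Matrix": 184.6,
}

# Assumption (not in the article): double data rate on the memory pins.
TRANSFERS_PER_CLOCK = 2
# Assumption (not in the article): one FMA per stream processor per clock.
FLOPS_PER_SP_PER_CLOCK = 2

# Stream processors per compute unit.
sp_per_cu = STREAM_PROCESSORS // COMPUTE_UNITS

# Bus width in bytes times transfers per nanosecond gives GB/s.
bandwidth_gb_s = MEMORY_BUS_BITS / 8 * MEMORY_CLOCK_GHZ * TRANSFERS_PER_CLOCK

# Engine clock (GHz) that the FP32 peak would imply under the FMA assumption.
implied_clock_ghz = PEAK_RATES["FP32"] * 1e3 / (
    STREAM_PROCESSORS * FLOPS_PER_SP_PER_CLOCK
)

# Each peak rate expressed as a multiple of the FP32 vector rate.
ratios_vs_fp32 = {
    name: rate / PEAK_RATES["FP32"] for name, rate in PEAK_RATES.items()
}

print(f"Stream processors per CU: {sp_per_cu}")
print(f"Memory bandwidth: {bandwidth_gb_s:.1f} GB/s")
print(f"Implied engine clock: {implied_clock_ghz:.3f} GHz")
for name, ratio in ratios_vs_fp32.items():
    print(f"{name}: {ratio:.2f}x FP32")
```

Under these assumptions the memory figure works out to about 1,229 GB/s, in line with the 1.23 TB/s listed, and the quoted peak rates sit at roughly 0.5x, 2x, 4x, and 8x the FP32 vector rate.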

While the MI100 is designed to work well with EPYC processors, AMD confirmed that the new GPU supports Intel processors as well. Overall, the MI100 can be expected to deliver up to 7x the FP16 performance of previous-generation AMD HPC GPUs.

The Instinct MI100 delivers up to 64 GB/s of Infinity Fabric bandwidth between the CPU and the GPU without the need to use any PCIe switches. There are a total of three Infinity Fabric links that offer up to 276 GB/s throughput. Essentially, a quad-GPU hive of the MI100 can yield up to 1.1 TB/s of total bandwidth. According to AMD, these features give the MI100 significant leads over the NVIDIA A100 in FP16/FP32 loads while also offering higher performance per dollar (see slides below).
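The quad-GPU figure follows from the per-card number if the four cards' Infinity Fabric bandwidths are simply added together. That summation, and the even split across the three links, are assumptions made here for illustration; the article does not spell out how AMD counts the total.

```python
# Figures quoted in the article.
LINKS_PER_GPU = 3
FABRIC_GB_S_PER_GPU = 276
GPUS_IN_HIVE = 4

# Assumption: the per-GPU bandwidth is split evenly across its links.
per_link_gb_s = FABRIC_GB_S_PER_GPU / LINKS_PER_GPU

# Assumption: the hive total is the plain sum of each GPU's bandwidth.
hive_total_gb_s = FABRIC_GB_S_PER_GPU * GPUS_IN_HIVE
hive_total_tb_s = hive_total_gb_s / 1000

print(f"Per link: {per_link_gb_s:.0f} GB/s")
print(f"Quad-GPU hive: {hive_total_gb_s} GB/s (about {hive_total_tb_s:.1f} TB/s)")
```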

The Instinct MI100 supports the new ROCm 4.0 ecosystem, which AMD pegs as a complete exascale solution for ML and HPC workloads. ROCm 4.0 now uses an open source compiler and supports OpenMP 5.0 and HIP. Additionally, PyTorch and TensorFlow are now optimized for ROCm 4.0.

The AMD Instinct MI100 is expected by the end of this year in major OEM and ODM systems from the likes of Dell, Gigabyte, HP, and SuperMicro.

AMD Instinct MI100 - Die Shot. (Image Source: AMD)
AMD Instinct MI100 - Left. (Image Source: AMD)
AMD Instinct MI100 - Bottom. (Image Source: AMD)
AMD Instinct MI100 - Right. (Image Source: AMD)
AMD Instinct MI100 - Back. (Image Source: AMD)
AMD Instinct MI100 - Top. (Image Source: AMD)

Here are some of the slides from AMD's press briefing.

Source(s)

AMD Press Release

Vaidyanathan Subramaniam, 2020-11-16 (Update: 2020-11-16)