Date: 2020-11-16

AMD has launched a graphics processor optimised for high performance computing (HPC) rather than graphics.

The Instinct MI100 accelerator marks the divergence of the two types of GPU, delivering 11.5 teraflops of 64-bit floating point performance from an array of 120 compute units and 7680 stream processors in a 300W power envelope for supercomputers and data centres.
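
As a rough check on how the headline figure relates to the unit counts, the sketch below derives a peak throughput from the number of stream processors. The clock speed, the two-operations-per-cycle convention and the half-rate FP64 ratio are assumptions for illustration and are not stated in this article.

```python
# Figures quoted in the article.
COMPUTE_UNITS = 120
STREAM_PROCESSORS = 7680
QUOTED_FP64_TFLOPS = 11.5

# Assumptions, not stated in the article.
PEAK_CLOCK_GHZ = 1.502  # assumed peak engine clock
FLOPS_PER_FMA = 2       # a fused multiply-add is counted as two operations
FP64_RATE = 0.5         # assumed FP64 throughput relative to FP32


def peak_tflops(lanes: int, clock_ghz: float, rate: float = 1.0) -> float:
    """Peak throughput in TFLOPS for `lanes` FMA lanes running at `clock_ghz`."""
    return lanes * FLOPS_PER_FMA * clock_ghz * rate / 1000.0


# Stream processors per compute unit, from the two quoted counts.
lanes_per_cu = STREAM_PROCESSORS // COMPUTE_UNITS

fp32 = peak_tflops(STREAM_PROCESSORS, PEAK_CLOCK_GHZ)
fp64 = peak_tflops(STREAM_PROCESSORS, PEAK_CLOCK_GHZ, FP64_RATE)

print(f"{lanes_per_cu} stream processors per compute unit")
print(f"FP32 peak: {fp32:.2f} TFLOPS, FP64 peak: {fp64:.2f} TFLOPS")
```

Under those assumptions the FP64 result rounds to the quoted 11.5 teraflops. These are theoretical peaks, not measured application performance.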

“Each era has unique characteristics for compute,” said Brad McCready, VP for data centre GPU accelerators at AMD. “We have moved from CPUs carrying the weight of the computation as we needed a boost to keep performance moving forward using general purpose GPUs. We believe we need another boost to move into the exascale era. AI is driving new workloads, again diversifying the workloads that GPUs are carrying. This is the first GPU to break the 10 TFLOP barrier,” he said.

“The Matrix Core moves into hardware the matrix operations for supercomputing workloads – that provides a 7x performance improvement,” said McCready.

The chips use the latest PCI Express connections to servers, along with AMD’s separate Infinity Fabric, which allows up to four GPU accelerator cards to be connected together in a topological cube.

“Four chips can be connected using our Infinity architecture with coherency rather than PCIe Gen4 for a fully connected cube. We physically implement it with a bridge card that goes across the top of the rack with 576Gb/s,” he said. This provides up to 340 GB/s of aggregate bandwidth per card.
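
One way the 340 GB/s per-card figure could be made up is sketched below: in a fully connected group of four, each card needs a direct link to each of the other three, and the PCIe connection to the host is added on top. The per-link and PCIe bandwidth values are assumptions for illustration and are not given in this article.

```python
from itertools import combinations

# Figures quoted in the article.
GPUS = 4                     # up to four cards connected together
QUOTED_AGGREGATE_GB_S = 340  # aggregate bandwidth per card, GB/s

# Assumptions, not stated in the article.
LINK_GB_S = 92               # assumed peak GB/s per Infinity Fabric link
PCIE_GEN4_X16_GB_S = 64      # assumed bidirectional GB/s for a x16 slot

# Fully connected: one direct link between every pair of cards.
links = list(combinations(range(GPUS), 2))
links_per_gpu = GPUS - 1

# Per-card aggregate: peer-to-peer links plus the host connection.
peer_to_peer = links_per_gpu * LINK_GB_S
aggregate = peer_to_peer + PCIE_GEN4_X16_GB_S

print(f"{len(links)} links in total, {links_per_gpu} per card")
print(f"Per card: {peer_to_peer} GB/s peer-to-peer, {aggregate} GB/s aggregate")
```

With those assumed values the total matches the quoted 340 GB/s. The sketch does not attempt to reproduce the bridge card figure given in the quote.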

Memory bandwidth is also important. “We also get 20 percent memory improvement using HBM2,” said McCready. The accelerator cards support 32 GB of HBM2 memory at a clock rate of 1.2 GHz and deliver 1.23 TB/s of memory bandwidth to support large data sets and help eliminate bottlenecks in moving data in and out of memory.
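
The relationship between the 1.2 GHz memory clock and the 1.23 TB/s figure can be sketched as follows. The 4096-bit interface width and the two transfers per clock are assumptions for illustration and are not stated in this article.

```python
# Figures quoted in the article.
MEMORY_CLOCK_GHZ = 1.2
QUOTED_TB_S = 1.23

# Assumptions, not stated in the article.
BUS_WIDTH_BITS = 4096    # assumed total HBM2 interface width
TRANSFERS_PER_CLOCK = 2  # assumed double data rate signalling


def bandwidth_gb_s(clock_ghz: float, bus_bits: int, transfers: int) -> float:
    """Peak memory bandwidth in GB/s: transfers per second times bytes per transfer."""
    return clock_ghz * transfers * bus_bits / 8


bandwidth = bandwidth_gb_s(MEMORY_CLOCK_GHZ, BUS_WIDTH_BITS, TRANSFERS_PER_CLOCK)
print(f"Peak memory bandwidth: {bandwidth:.1f} GB/s ({bandwidth / 1000:.2f} TB/s)")
```

Under those assumptions the result is about 1228.8 GB/s, which rounds to the quoted 1.23 TB/s.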

The chip is being used by Dell, Gigabyte, HPE and Supermicro for cards alongside AMD’s EPYC