Ampere plans 3nm 256 core AI chip, teams with Qualcomm

Business news
By Nick Flaherty

Ampere has launched a 3nm variant of its data centre AI chip with 256 cores and is working with Qualcomm on AI inference technology.

The joint development will pair Qualcomm Cloud AI 100 inference solutions with Ampere CPUs for AI inferencing.

The latest Ampere 3nm 256-core variant uses the same air-cooled thermal solutions as the existing 192-core AmpereOne CPU and delivers over 40% more performance than any CPU on the market today, without exotic platform designs. The company’s 192-core, 12-channel memory platform is still expected later this year.

“We started down this path six years ago because it is clear it is the right path,” said Renee James, CEO of Ampere. “Low power used to be synonymous with low performance. Ampere has proven that isn’t true. We have pioneered the efficiency frontier of computing and delivered performance beyond legacy CPUs in an efficient computing envelope.”

“The current path [on AI power usage] is unsustainable. We believe that the future data centre infrastructure has to consider how we retrofit existing air-cooled environments with upgraded compute, as well as build environmentally sustainable new data centres that fit the available power on the grid. That is what we enable at Ampere.”

Ampere says this calls for a flexible CPU that can handle different workloads rather than higher-power GPUs, an approach that has also been taken by Tachyum with its Prodigy universal processor.

“Our Ampere CPUs can run a range of workloads – from the most popular cloud native applications to AI. This includes AI integrated with traditional cloud native applications, such as data processing, web serving, media delivery, and more,” said Ampere’s Chief Product Officer Jeff Wittich.

The Qualcomm Cloud AI 100 Ultra aims to tackle large language model (LLM) inferencing on the industry’s largest generative AI models. It has up to 576 MB of on-die SRAM and 64 AI cores per PCIe card, and is programmable for a wide range of workloads and acceleration techniques, including computer vision frameworks and LLMs.

Meta’s Llama 3 LLM is now running on Ampere CPUs at Oracle Cloud. The 128-core Altra CPU with no GPU delivers the same performance as an Nvidia A10 GPU paired with an x86 CPU, while using a third of the power.

