Tachyum to build 50 exaFLOP supercomputer
Tachyum is to build a large-scale supercomputer based on its 5nm Prodigy Universal Processor chip for a US customer.
The Tachyum supercomputer will have over 50 exaFLOP performance, 25 times faster than today’s systems, and will support AI models potentially 25,000 times larger, with access to hundreds of petabytes of DRAM and exabytes of flash-based primary storage.
The Prodigy chip enables a significant increase in the memory, storage and compute architectures for datacentre, AI and HPC workloads in government, research and academia, business, manufacturing and other industries.
Earlier this year the Slovak/US company detailed a 20 exaFLOP supercomputer architecture using the Prodigy chip, which is expected to sample next year.
Installation of the Prodigy-enabled supercomputer will begin in 2024 and reach full capacity in 2025. The system will provide 8 zettaflops of AI training for large language models and 16 zettaflops of image and video processing. According to Tachyum, that is enough to fit more than 100,000x PALM2 530B parameter models or 25,000x ChatGPT4 1.7T parameter models in base memory, and 100,000x ChatGPT4 with 4x of base DRAM.
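As a rough sanity check on those capacity figures, the sketch below estimates the memory needed just to hold the model weights. The bytes-per-parameter values are assumptions for illustration, not figures from Tachyum, and the function name `weights_petabytes` is hypothetical; activations, optimizer state and other overheads are ignored.

```python
# Back-of-the-envelope check of the model-capacity figures quoted above.
# Assumption (not from the article): weights only, stored at a fixed
# number of bytes per parameter.

PETABYTE = 10**15  # decimal petabyte, in bytes


def weights_petabytes(copies: int, params_per_model: float, bytes_per_param: float) -> float:
    """Return the memory, in petabytes, needed to hold the weights of `copies` models."""
    return copies * params_per_model * bytes_per_param / PETABYTE


# Model counts and parameter sizes as quoted in the article.
SCENARIOS = {
    "100,000 x 530B-parameter models": (100_000, 530e9),
    "25,000 x 1.7T-parameter models": (25_000, 1.7e12),
    "100,000 x 1.7T-parameter models (4x base DRAM)": (100_000, 1.7e12),
}

# Hypothetical precisions: 1 byte (8-bit) and 2 bytes (16-bit) per parameter.
for label, (copies, params) in SCENARIOS.items():
    for bytes_per_param in (1, 2):
        pb = weights_petabytes(copies, params, bytes_per_param)
        print(f"{label} at {bytes_per_param} B/param: {pb:,.1f} PB")
```

Under these assumptions, the first two scenarios need tens of petabytes for weights at one byte per parameter, and the third needs four times as much as the second, which matches the quoted 4x DRAM configuration.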
The 4-socket, liquid-cooled nodes are connected to 400G RoCE Ethernet, with the capability to double to an 800G all non-blocking and non-overprovisioned switching fabric.
“The unprecedented scale and computational power required as part of this installation simply could not be provided by any chip manufacturer on the market today,” said Dr. Radoslav Danilak, founder and CEO of Tachyum.
“While there are startups receiving billions of dollars, based on their promise of achieving similar capabilities sometime in the future, only Tachyum is positioned to deliver the capability to economically build order-of-magnitude bigger machines that potentially enable the transition to cognitive AI, beginning later this year. This purchase order is a testament to our first-to-market position and our ability to provide a positive impact to worldwide AI markets.”
The Universal Processor works with all workloads without needing a separate GPU, so servers can seamlessly and dynamically switch between computational domains (such as AI/ML, HPC, and cloud) on a single architecture. This eliminates the need for expensive dedicated AI hardware and dramatically increases server utilization, reducing CAPEX and OPEX significantly.