Compute architecture achieves 1 PetaOp/s on a single chip

By Rich Pell

The new Tensor Streaming Processor (TSP) architecture, says the company, is also capable of up to 250 trillion floating-point operations per second (250 TFLOPS).
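To put the two headline figures on a common scale, the short Python sketch below converts them to operations per second and takes their ratio. The constant and variable names are illustrative, and the comparison is arithmetic only: the article does not say which data types or precisions each figure refers to.

```python
# Headline figures quoted in the article, expressed in operations per second.
PETA = 10**15
TERA = 10**12

peak_ops_per_s = 1 * PETA   # 1 PetaOp/s
peak_flops = 250 * TERA     # 250 trillion FLOPS (250 TFLOPS)

# Ratio of the two quoted peaks (pure arithmetic, not a benchmark).
ratio = peak_ops_per_s / peak_flops

print(f"{peak_ops_per_s:.1e} ops/s vs {peak_flops:.1e} FLOPS, ratio {ratio:.0f}x")
```

On these numbers, the quoted peak operation rate is four times the quoted peak floating-point rate.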

“Top GPU companies have been telling customers that they’d hoped to be able to deliver one PetaOp/s performance within the next few years; Groq is announcing it today, and in doing so setting a new performance standard,” says Jonathan Ross, Groq’s co-founder and CEO. “The Groq architecture is many multiples faster than anything else available for inference, in terms of both low latency and inferences per second. Our customer interactions confirm that.”

The architecture, says the company, provides a new paradigm for achieving both compute flexibility and massive parallelism without the synchronization overhead of traditional GPU and CPU architectures. It can support both traditional and new machine learning models, and is currently in operation at customer sites in both x86 and non-x86 systems.

Inspired by a software-first mindset, the new, simpler processing architecture is designed specifically for the performance requirements of computer vision, machine learning, and other AI-related workloads. Execution planning happens in software, freeing up silicon real estate otherwise dedicated to dynamic instruction execution.
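As a rough illustration of what planning execution in software can mean, the Python sketch below assigns every operation a start cycle ahead of time, so that the hardware would only need to replay a fixed timetable. It is a toy model, not Groq's compiler or instruction set: the `Op` class, the `plan` function, the operation names, the latencies, and the assumption of unlimited functional units are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    name: str
    latency: int       # fixed cycle count, assumed known in advance
    deps: tuple = ()   # names of ops whose results this op consumes


def plan(ops):
    """Assign each op a start cycle ahead of time.

    Assumes ops are listed in dependency order and that there is
    no contention for functional units (a simplification).
    """
    finish = {}        # op name -> cycle at which its result is ready
    schedule = []
    for op in ops:
        start = max((finish[d] for d in op.deps), default=0)
        finish[op.name] = start + op.latency
        schedule.append((start, op.name))
    return sorted(schedule), max(finish.values())


# Hypothetical five-step program with made-up latencies.
program = [
    Op("load_a", 2),
    Op("load_b", 2),
    Op("matmul", 4, ("load_a", "load_b")),
    Op("relu", 1, ("matmul",)),
    Op("store", 2, ("relu",)),
]

schedule, total_cycles = plan(program)
for start, name in schedule:
    print(f"cycle {start:2d}: issue {name}")
print("total cycles:", total_cycles)
```

Because every start cycle is fixed before the program runs, repeated runs of the same plan take the same number of cycles, which is the sense in which a statically planned design can be called deterministic.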

The tight control afforded by the architecture yields deterministic processing, which is especially valuable for applications where safety and accuracy are paramount. Compared to complex traditional architectures based on CPUs, GPUs and FPGAs, says the company, the chip also streamlines qualification and deployment.

“Groq’s solution is ideal for deep learning inference processing for a wide range of applications,” says Dennis Abts, Chief Architect at Groq, “but even beyond that massive opportunity, the Groq solution is designed for a broad class of workloads. Its performance, coupled with its simplicity, makes it an ideal platform for any high-performance, data- or compute-intensive workload.”

For more, see the company’s white paper on the new architecture: “Tensor Streaming Architecture Delivers Unmatched Performance for Compute-Intensive Workloads.”


Related articles:
Groq says it will reveal potent artificial intelligence chip in 2018
New Nvidia GPU architecture achieves ‘Holy Grail’ of computer graphics
Steep growth of AI chip market will produce new winners
