Cerebras looks to optical interconnect for 4000x AI boost

By Nick Flaherty

Wafer-scale processor developer Cerebras is working on an optical substrate to improve the performance of its systems by a factor of 4000, and is calling for industry collaboration and standardisation.

The current Cerebras WSE3 processor is built on a single 300mm wafer with 4 trillion transistors across 900,000 cores, and consumes 20kW of power. The California-based company had to develop its own wafer-scale packaging for the I/O, power delivery and cooling, and is now working on an optical interconnect.

The company's chief system architect was speaking at the Leti Innovation Days in Grenoble, France, this week about ways to tackle scalability challenges with chiplet and 3D heterogeneous packaging technologies.

“It’s not chiplets but it’s still a candidate for 3D integration,” says JP Fricker, co-founder and chief system architect at Cerebras. “This technique would be transformative.”

However, a key limitation for performance, scaling and power consumption is the off-chip I/O.

“I/O is a limitation in big compute and prevents you from getting to very large systems. The technologies exist today but we need to invent technologies to put them together. We are developing these technologies and our goal is to build supercomputers that are 4000x faster than today with 1000 wafers connected together.”
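As a rough reading of those figures, the per-wafer gain implied by the target can be estimated with simple arithmetic. The sketch below assumes, which Cerebras has not stated, that the 4000x figure is measured against a single present-day wafer and that performance scales with wafer count; the `scaling_efficiency` parameter is a hypothetical knob for less-than-ideal scaling.

```python
def implied_per_wafer_speedup(total_speedup: float,
                              wafers: int,
                              scaling_efficiency: float = 1.0) -> float:
    """Per-wafer speedup needed to reach total_speedup across `wafers` wafers.

    scaling_efficiency is a hypothetical fraction (0 to 1] of ideal linear
    scaling that the multi-wafer system actually achieves.
    """
    if wafers <= 0:
        raise ValueError("wafers must be positive")
    if not 0.0 < scaling_efficiency <= 1.0:
        raise ValueError("scaling_efficiency must be in (0, 1]")
    return total_speedup / (wafers * scaling_efficiency)


if __name__ == "__main__":
    # Figures quoted in the article: 4000x with 1000 wafers.
    print(implied_per_wafer_speedup(4000, 1000))
    # Hypothetical case: only 80% of ideal scaling is achieved.
    print(implied_per_wafer_speedup(4000, 1000, scaling_efficiency=0.8))
```

Under ideal scaling the quoted numbers imply each wafer running about four times faster than today, with the remainder of the gain coming from connecting 1000 of them.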

“Today the I/O is on two edges of the chip but it would be better if the I/Os were distributed across the chip. When you reduce the channel length you can reduce the size of the SERDES, saving space and power.”

“We’d like to have a very large number of optical engines,” he said. “At the moment they are external but eventually we will get those lasers into the chip.” These would be used for multiple communication lanes at reasonable data rates of 100 to 200Gbit/s rather than fat pipes, he says.
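To illustrate why many modest lanes can stand in for a few fat pipes, the aggregate bandwidth is simply the lane count multiplied by the per-lane rate. In the sketch below only the 100 and 200Gbit/s rates come from the article; the lane count is a hypothetical value chosen for illustration, as Cerebras has not given one.

```python
def aggregate_bandwidth_tbps(lanes: int, gbps_per_lane: float) -> float:
    """Total bandwidth in Tbit/s for `lanes` lanes at `gbps_per_lane` Gbit/s each."""
    if lanes < 0 or gbps_per_lane < 0:
        raise ValueError("lanes and gbps_per_lane must be non-negative")
    return lanes * gbps_per_lane / 1000.0


if __name__ == "__main__":
    hypothetical_lanes = 1000  # assumption, not a Cerebras figure
    for rate in (100, 200):    # per-lane rates quoted in the article
        total = aggregate_bandwidth_tbps(hypothetical_lanes, rate)
        print(f"{hypothetical_lanes} lanes at {rate}Gbit/s: {total} Tbit/s")
```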

“We have our waferscale engine and we take a third party waferscale programmable optical interconnect and combine them, using the entire surface of the wafer to connect to a wafer,” he said. “This needs heterogeneous wafer on wafer packaging.”

Companies such as Celestial AI and Lightmatter have been developing these optical interconnect technologies specifically with hyperscaler and AI chip companies in mind.

“But we need to invent or repurpose technologies. The current interconnect pitch is too coarse, and we cannot get access to fabs that are willing to integrate this as it is so niche, so we need to create a different process. Hybrid bonding enables a much finer pitch under 12µm and higher assembly yields but it is only available in a given fab with limited pairs of processes in the fab, for example 5nm to 5nm wafer but not with a different foundry, and also two years later.”
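The effect of a finer pitch on connection density can be estimated with a back-of-envelope calculation. The sketch below assumes a uniform square grid of bond pads, which is a simplification, and the 50µm comparison pitch is a hypothetical stand-in for a coarser interconnect; only the 12µm figure comes from the quote above.

```python
def pads_per_mm2(pitch_um: float) -> float:
    """Bond pads per square millimetre on a square grid with the given pitch in µm."""
    if pitch_um <= 0:
        raise ValueError("pitch_um must be positive")
    pads_per_mm = 1000.0 / pitch_um  # 1 mm = 1000 µm
    return pads_per_mm ** 2


if __name__ == "__main__":
    fine = pads_per_mm2(12)    # hybrid bonding pitch quoted in the article
    coarse = pads_per_mm2(50)  # hypothetical coarser pitch for comparison
    print(f"12µm pitch: {fine:.0f} pads/mm2")
    print(f"50µm pitch: {coarse:.0f} pads/mm2")
    print(f"density ratio: {fine / coarse:.1f}x")
```

Because density goes with the inverse square of the pitch, halving the pitch quadruples the number of connections available in the same area.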

There are also challenges in the process steps.

“To do hybrid bonding a fab stops at one of the last copper layers that cannot be easily probed but that makes it difficult to ship to another fab.”

“We would like to develop a new technique to standardise the finish of wafers with a common top layer, and use this layer as a standard interface for wafer stacking so that different wafers can be made differently but the last set of interfaces are common for bonding across different fabs. This also means bonding could be done by a third party, not just high volume fabs,” he said.

The marks left by the test probes on the copper layer are also an issue for planarization, so either those marks have to be removed or a non-contact test system has to be used.

But there are significant advantages, he says.

“We can transfer power through the optical wafer as the components are more sparse with many through silicon vias (TSVs) and very short channels and these are in a single layer by using multiple wavelengths. This allows power to be delivered from the top and heat to be removed from the bottom in the same system.”

“In our case the network we have on the compute wafer is based on a configurable fabric that is set up before a workload is run on the wafer. When you do this in the optical domain with circuit switching you can evolve your electrical switching into the optical domain but you don’t need to do it that often.”
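The set-up-once behaviour Fricker describes is the defining property of circuit switching: paths are established before traffic flows and are then reused without any per-message routing decision. The toy model below is only a sketch of that idea; the `CircuitSwitch` class, its port numbering and its method names are hypothetical and do not represent Cerebras's fabric or any vendor's optical switch.

```python
class CircuitSwitch:
    """Toy circuit switch: paths are fixed up front, then reused for every transfer."""

    def __init__(self) -> None:
        self._circuits: dict[int, int] = {}

    def configure(self, circuits: dict[int, int]) -> None:
        """Install a full set of input-port to output-port circuits before a workload."""
        outputs = list(circuits.values())
        # Each output port can terminate at most one circuit.
        if len(outputs) != len(set(outputs)):
            raise ValueError("two circuits share an output port")
        self._circuits = dict(circuits)

    def forward(self, in_port: int) -> int:
        """Return the output port for in_port by following the preset circuit."""
        if in_port not in self._circuits:
            raise KeyError(f"no circuit configured for input port {in_port}")
        return self._circuits[in_port]


if __name__ == "__main__":
    switch = CircuitSwitch()
    switch.configure({0: 2, 1: 3})  # done once, before the workload runs
    print(switch.forward(0))
    print(switch.forward(1))
```

Reconfiguration happens only when `configure` is called again, which mirrors the point that the switching does not need to change often.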

