ARM extends compute subsystem to custom data centre chips
ARM has developed a new range of compute subsystems (CSS) based on its Neoverse processor cores for data centre and 5G chips.
The first product in the range is the ARM Neoverse CSS N2, detailed at the HotChips Conference in the US this week. This is a tile of up to 64 N2 cores based on the ARMv9 instruction set, with Cortex-M7 microcontrollers for housekeeping plus interconnect and memory interfaces. It can be scaled for monolithic or chiplet-based designs with up to 256 processor cores running at up to 3.6GHz.
The CSS includes the IP selection, system configuration, floorplanning, verification, validation, third-party IP and fab integration for TSMC’s 5nm process. This can save design teams up to 80 engineering years compared to an IP licensing model, says ARM.
Designers can implement up to 8x 40b DDR5 or LPDDR5 channels per die, at speeds up to DDR5-5600. CSS N2 supports up to 4x x16 PCIe/CXL combo PHYs and controllers, each with 4-way bifurcation down to 4x x4 lanes.
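As a rough back-of-envelope check on those figures, the sketch below works out the peak theoretical memory bandwidth and the PCIe lane budget for one die. It assumes each 40b channel carries 32 data bits plus 8 ECC bits, as in a standard DDR5 subchannel; ARM's material as quoted here does not confirm that split, and the function and constant names are illustrative only.

```python
# Back-of-envelope figures for one CSS N2 die, using the numbers quoted above.
# ASSUMPTION: each 40-bit channel is 32 data bits + 8 ECC bits (the standard
# DDR5 subchannel layout); the article does not state the split.

CHANNELS_PER_DIE = 8
TRANSFER_RATE_MT_S = 5600       # DDR5-5600: 5600 mega-transfers per second
DATA_BITS_PER_CHANNEL = 32      # assumed data width of a 40b channel

PCIE_PHYS = 4                   # x16 PCIe/CXL combo PHYs per die
LANES_PER_PHY = 16
MAX_BIFURCATION = 4             # each x16 PHY can split into up to four x4 links


def peak_memory_bandwidth_gb_s(channels=CHANNELS_PER_DIE,
                               rate_mt_s=TRANSFER_RATE_MT_S,
                               data_bits=DATA_BITS_PER_CHANNEL):
    """Peak theoretical data bandwidth in GB/s (decimal), ignoring ECC bits."""
    bytes_per_transfer = data_bits // 8
    return channels * rate_mt_s * bytes_per_transfer / 1000  # MB/s -> GB/s


def total_pcie_lanes(phys=PCIE_PHYS, lanes=LANES_PER_PHY):
    """Total PCIe/CXL lanes available across all combo PHYs."""
    return phys * lanes


def max_x4_links(phys=PCIE_PHYS, bifurcation=MAX_BIFURCATION):
    """Maximum number of independent x4 links when every PHY is fully bifurcated."""
    return phys * bifurcation


if __name__ == "__main__":
    print(f"Peak memory bandwidth: {peak_memory_bandwidth_gb_s():.1f} GB/s")
    print(f"PCIe/CXL lanes: {total_pcie_lanes()}")
    print(f"Max x4 links: {max_x4_links()}")
```

Under that assumption the peak works out to about 179 GB/s per die, with 64 PCIe/CXL lanes that can be split into as many as 16 x4 links. Sustained bandwidth in a real design would be lower.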
“The CSS is an RTL deliverable with extra goodies, implementation package, scripts, physical IP, libraries and floorplan for the design, as well as a full software reference stack with power management, runtime and security to make sure that the software development starts on day one,” said Jeff Defilippi, director of product management at ARM, based in Austin, Texas.
“The other part of the offering is the innovation zone to customise and tune the design, and accelerators can be added as monolithic or chiplet,” he said. “This is saving partners an estimated 80 engineering years and we have feedback from customers that it’s of that order.”
This is essentially a complete chip, and has led to so-far unfounded concerns that ARM is moving into making chips as the CSS will have been built as a chip on TSMC’s 5nm process to prove the IP in silicon.
The 256 core array is also a way to avoid export restrictions that apply to the Neoverse V2 core that is used in Nvidia’s Grace datacentre chip. ARM also detailed the performance of the V2 core at HotChips this week.
For cloud applications where a high core count is desired, CSS N2 supports scaling of up to 256 cores across two sockets. High-speed chip-to-chip links, using UCIe or a partner-specific PHY, can link up to 128 cores in a single socket, and two sockets can be coherently connected using CXL PHYs and an SMP protocol. In both cases, the AMBA CXS protocol is used to bridge between the UCIe/CXL physical and data link layers and the AMBA CHI-based CMN-700 interconnect mesh.
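The scaling arithmetic can be sketched as follows. This is an illustration only: it assumes a 128-core socket is built from two 64-core dies joined by chip-to-chip links, which is consistent with the totals ARM gives but is not spelled out, and the names used are hypothetical.

```python
# Core-count scaling for CSS N2, using the limits quoted in the article.
# ASSUMPTION: a 128-core socket is two 64-core dies joined by chip-to-chip
# links; the source gives only the per-tile, per-socket and overall totals.

CORES_PER_DIE = 64        # "a tile of up to 64 N2 cores"
MAX_DIES_PER_SOCKET = 2   # assumed: 128 cores per socket / 64 cores per die
MAX_SOCKETS = 2           # two sockets connected coherently


def total_cores(dies_per_socket: int, sockets: int) -> int:
    """Return the core count of a configuration, rejecting ones beyond the quoted limits."""
    if not 1 <= dies_per_socket <= MAX_DIES_PER_SOCKET:
        raise ValueError(f"dies_per_socket must be 1..{MAX_DIES_PER_SOCKET}")
    if not 1 <= sockets <= MAX_SOCKETS:
        raise ValueError(f"sockets must be 1..{MAX_SOCKETS}")
    return CORES_PER_DIE * dies_per_socket * sockets


if __name__ == "__main__":
    for dies, sockets in [(1, 1), (2, 1), (2, 2)]:
        print(f"{dies} die(s) x {sockets} socket(s): {total_cores(dies, sockets)} cores")
```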
On-chip accelerators can be incorporated using ARM’s NI-700 packetized network-on-chip interconnect with interrupt and address translation support. For off-chip acceleration, CSS N2 supports combo PCIe Gen5/CXL 1.1 PHYs, enabling attachment of GPUs, TPUs, DPUs and other high-speed devices. This includes support for CXL Type 3 connections – useful for memory expansion, pooling and tiering use cases.
The CSS also includes embedded Cortex-M7 cores for system control and management. The System Control Processor (SCP) is a trusted core controlling system functions such as clock control and the power and voltage domains. The Manageability Control Processor (MCP) interfaces with an external BMC for on-chip management, RAS, event logging, and communication alerts.
Software support is also key to getting a chip design up and running quickly. The CSS N2 is SystemReady SR certified and comes with a reference firmware stack and a fixed virtual platform (FVP) model. This allows teams to quickly develop platform firmware, integrate OS and services, and tune boot flows, security, and power management before taping out final silicon.