
First 3nm Gen 6 PCIe switch enables scalable, efficient data centres for AI workloads


Feature articles
By Jean-Pierre Joosting



To help prepare for the next generation of traffic management and data centres, Microchip has launched the first Gen 6 PCIe switch using 3nm technology. According to Microchip, the 3nm node is ideally suited to the power-versus-performance needs of advanced AI data centres.

In addition, the devices benefit from advanced security features, including a hardware root of trust and secure boot. These utilise post-quantum safe cryptography compliant with the Commercial National Security Algorithm Suite (CNSA) 2.0. Further, the ChipLink diagnostic suite, which has received very positive customer feedback, is available for the Switchtec Gen 6 family to help speed debug and time-to-market for customers and ecosystem partners in the ODM and OEM community.

The Switchtec Gen 6 PCIe switch family addresses the surge in AI workloads driven by Generative AI and Large Language Models (LLMs) that manipulate and create images and video, leading to soaring demand for compute cycles in data centres and massive power consumption.

Brian Carlson, Corporate Vice President of the Data Center Solutions Business Unit at Microchip, comments, “Microchip wanted to help solve some of that problem by jumping to three nanometer technology first, which gives a very significant power advantage, saving as much based on our estimates of north of 15% power per lane in these switches compared to even just a five nanometer node product, let alone going back to older generations. The Gen 6 PCIe switches enable modern data centres to increase their switching capabilities and accelerate the processing of advanced workloads through data centres without having to take such a heavy toll on power consumption.”

The Switchtec Gen 6 family is designed to support up to 160 lanes for high-density AI system connectivity on a single monolithic die. It is also possible to use a multi-chip package to combine several dies.

Brian adds, “This extra set of lanes is a significant increase over previous generations of Switchtec technology at Microchip, and formerly Microsemi before its acquisition by Microchip in 2018. It is also an increase in the total number of lanes available relative to the competition as well.”

Previous PCIe generations created bandwidth bottlenecks as data transferred between CPUs, GPUs, memory and storage, leading to underutilisation and wasted compute cycles. PCIe 6.0 doubles the bandwidth of PCIe 5.0 to 64 GT/s per lane, providing the necessary data pipeline to keep the most powerful AI accelerators consistently supplied. Providing high-speed connectivity between CPUs, GPUs, SoCs, AI accelerators and storage devices, the switches are designed to help enable next-generation AI and cloud infrastructure.

One of the first engineering wafers produced for the 3nm Gen 6 PCIe switch shown by Steve Sanghi, CEO and President, Microchip Technology Inc.


Solving the data transfer bottlenecks

Due to the high requirements for AI/ML applications, data centres and hyperscalers are looking to the 1.6T Ethernet interface to get data into and out of the data centre. It offers twice the bandwidth of current 800G implementations, which is critical for preventing bottlenecks in GPU clusters and accelerating the training of large AI models. Within the data centre, PCIe provides a high-speed backbone for various components and is critical to supporting 1.6T by alleviating internal bottlenecks.

Earlier Gen 5 and Gen 4 PCIe switches were based on 100 lanes each, maxing out at 32 GT/s (gigatransfers per second) per lane for Gen 5. The increase to 160 lanes, each delivering up to 64 GT/s, is ideal for providing sufficient data to and from the 1.6T interface. Using all 160 lanes, each switch can handle an aggregate of up to about 10.2 teratransfers per second (160 × 64 GT/s).
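
The lane arithmetic above can be checked with a short sketch. It assumes raw signalling rates only, treating one transfer as one bit per lane and ignoring FLIT, FEC and protocol overhead. The function names are illustrative and are not part of any Microchip tool.

```python
import math

GEN6_RATE_GT_S = 64  # per-lane rate for PCIe 6.0, as quoted in the article
GEN5_RATE_GT_S = 32  # per-lane rate for PCIe 5.0, as quoted in the article


def aggregate_rate_gt_s(lanes: int, rate_gt_s: float) -> float:
    """Raw aggregate transfer rate across all lanes, in GT/s."""
    return lanes * rate_gt_s


def lanes_for_ethernet(eth_gbit_s: float, rate_gt_s: float) -> int:
    """Minimum lane count whose raw rate covers one Ethernet port direction.

    Assumption: 1 transfer = 1 bit per lane, with no protocol overhead.
    """
    return math.ceil(eth_gbit_s / rate_gt_s)


if __name__ == "__main__":
    total = aggregate_rate_gt_s(160, GEN6_RATE_GT_S)
    print(f"160 lanes at 64 GT/s: {total / 1000:.2f} TT/s aggregate")
    print("Gen 6 lanes for 1.6T:", lanes_for_ethernet(1600, GEN6_RATE_GT_S))
    print("Gen 5 lanes for 1.6T:", lanes_for_ethernet(1600, GEN5_RATE_GT_S))
```

Under these simplified assumptions, a 1.6T port needs half as many Gen 6 lanes as Gen 5 lanes, which leaves more of the switch free for GPUs and storage.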

Brian comments, “Having the highest lane count in the market that’s sampling at 160 lanes versus the competition maxing out at 144 provides an even bigger boost to data rates in the data centre.”

Gen 6 PCIe switches provide a superhighway for switching internal network traffic on boards, for example, between GPUs and CPUs, as well as between boards and other resources such as storage. For example, a GPU performing retrieval augmentation that needs quick access to storage or memory will use a fast PCIe switch. Fast switching speeds and high lane counts are key enablers of the system’s overall performance. Further, delivering fast switching at even lower power means GPUs and CPUs can benefit from a higher power budget.

In addition, the PCIe 6.0 standard also introduces Flow Control Unit (FLIT) mode, a lightweight Forward Error Correction (FEC) system and dynamic resource allocation. These changes make data transfer more efficient and reliable, especially for small packets, which are common in AI workloads. These updates lead to higher overall throughput and lower effective latency.
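
To give a feel for the fixed overhead of FLIT mode, the sketch below computes the share of each FLIT that carries transaction data. The 256-byte layout used here (236 bytes of TLP, 6 bytes of data link payload, 8 bytes of CRC, 6 bytes of FEC) is an assumption taken from public descriptions of PCIe 6.0, not from this article, and should be checked against the specification.

```python
# Assumed 256-byte FLIT layout (not stated in the article; verify against
# the PCIe 6.0 specification before relying on these numbers).
FLIT_LAYOUT_BYTES = {
    "tlp": 236,  # transaction layer packet bytes
    "dlp": 6,    # data link layer payload
    "crc": 8,    # cyclic redundancy check
    "fec": 6,    # forward error correction
}


def flit_size(layout: dict) -> int:
    """Total FLIT size in bytes."""
    return sum(layout.values())


def tlp_efficiency(layout: dict) -> float:
    """Fraction of each FLIT available for transaction layer bytes."""
    return layout["tlp"] / flit_size(layout)


def effective_lane_rate_gbit_s(raw_gt_s: float, layout: dict) -> float:
    """Per-lane rate left for TLP bytes after the fixed FLIT overhead."""
    return raw_gt_s * tlp_efficiency(layout)


if __name__ == "__main__":
    print("FLIT size (bytes):", flit_size(FLIT_LAYOUT_BYTES))
    print(f"TLP share of FLIT: {tlp_efficiency(FLIT_LAYOUT_BYTES):.1%}")
    rate = effective_lane_rate_gbit_s(64, FLIT_LAYOUT_BYTES)
    print(f"TLP-carrying rate per lane: {rate:.1f} Gb/s")
```

Because the overhead is a fixed slice of every FLIT in this model, it does not grow as packets get smaller, which is consistent with the efficiency claim for small packets.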

Addressing the power challenge

The shift to 3nm helps reduce power consumption, but other methods are also applicable. Central to the design of the PCIe switch is the ability to turn blocks within the silicon on or off.

Tam Do, Technical Engineer, Product Marketing at Microchip, Data Center Solutions, comments, “Power savings are dependent on traffic rate and management. For example, the 160-lane PCIe switch is designed so that not all lanes need to run at PCIe Gen 6, though they can if needed. For lanes that are not running at Gen 6 but may be running at Gen 5 or Gen 4 rates, those electronic blocks will essentially scale down the amount of power consumed. If there’s no traffic running in a block, it will shut off the non-active block.”


Both hardware and software are involved via a configuration file, as the Switchtec Gen 6 PCIe switch has an integrated microcontroller that not only controls power consumption but also optimises traffic flow. This combination of firmware and hardware keeps power consumption as low as possible.


Brian adds, “A typical AI data centre cluster with 10,000 GPUs generally has at least 10,000 switches—one per GPU—with some configurations requiring up to four switches per GPU. Improving switch efficiency by just 10 W per switch can save about 100,000 W across the cluster, resulting in significant power savings over years of operation. While switch power is a small share of total data centre consumption compared to GPUs or cooling, every watt saved reduces both operating and cooling costs, which is critical at this scale.”
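
The cluster-level figure in the quote can be reproduced with a few lines of arithmetic. The switch count and the 10 W saving come from the quote. The conversion to annual energy assumes continuous operation for 8,760 hours a year, which is an illustrative assumption.

```python
HOURS_PER_YEAR = 24 * 365  # assumption: switches run continuously


def cluster_savings_w(switches: int, watts_saved_per_switch: float) -> float:
    """Total power saved across the cluster, in watts."""
    return switches * watts_saved_per_switch


def annual_energy_kwh(power_w: float, hours: int = HOURS_PER_YEAR) -> float:
    """Energy corresponding to a constant power draw over a year, in kWh."""
    return power_w / 1000 * hours


if __name__ == "__main__":
    gpus = 10_000
    for switches_per_gpu in (1, 4):  # the two ratios mentioned in the quote
        saved_w = cluster_savings_w(gpus * switches_per_gpu, 10)
        print(
            f"{switches_per_gpu} switch(es) per GPU: {saved_w:,.0f} W saved, "
            f"{annual_energy_kwh(saved_w):,.0f} kWh per year"
        )
```

The sketch counts switch power only. Any knock-on reduction in cooling load would come on top of these figures.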

The enhanced switch efficiency delivers consistent operational and electricity cost savings for data centres. It also helps in designing more efficient data centres, because each switch generates less thermal load for the volume of data it processes, which enables more efficient infrastructure and streamlined cooling requirements. As a result, even modest reductions in board power have a notable impact on overall power and cooling expenses.

Why is PAM 4 signalling important?

A key innovation in next-generation PCIe is the use of PAM 4 signalling. Unlike PCIe Gen 5 and earlier, which transmitted one bit per symbol (either 0 or 1), Gen 6 carries two bits per symbol (00, 01, 10 or 11) thanks to four distinct signal levels. PAM 4 effectively doubles the data rate over the same physical path without raising the symbol rate. The eye diagram on the right shows the four signal levels, visually represented by the open “eyes.” While PAM 4 significantly boosts throughput, implementing it introduces engineering challenges.
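
A minimal sketch of the idea follows: the same bit stream needs half as many symbols when each symbol carries two bits. The Gray-coded mapping of bit pairs to levels 0 to 3 is an illustrative assumption chosen so that neighbouring levels differ by one bit. It is not taken from the article, and the sketch ignores real voltage levels, precoding and FEC.

```python
# Illustrative Gray-coded mapping of bit pairs to four signal levels.
# Assumption: level numbers 0..3 stand in for real voltage levels.
GRAY_PAM4_LEVELS = {(0, 0): 0, (0, 1): 1, (1, 1): 2, (1, 0): 3}
LEVEL_TO_BITS = {level: bits for bits, level in GRAY_PAM4_LEVELS.items()}


def nrz_encode(bits: list) -> list:
    """Two-level signalling: one symbol per bit."""
    return list(bits)


def pam4_encode(bits: list) -> list:
    """Four-level signalling: one symbol per pair of bits."""
    if len(bits) % 2:
        raise ValueError("PAM4 needs an even number of bits")
    return [GRAY_PAM4_LEVELS[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]


def pam4_decode(symbols: list) -> list:
    """Recover the original bit stream from PAM4 symbols."""
    bits = []
    for level in symbols:
        bits.extend(LEVEL_TO_BITS[level])
    return bits


if __name__ == "__main__":
    data = [0, 0, 0, 1, 1, 1, 1, 0]
    print("NRZ symbols: ", len(nrz_encode(data)))
    print("PAM4 symbols:", len(pam4_encode(data)), pam4_encode(data))
```

The engineering cost is visible even in this toy model: four levels share the same voltage swing, so each "eye" is smaller and more sensitive to noise than in two-level signalling.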

Switchtec Gen 6 PCIe switches are offered in 160- and 144-lane variants and feature an integrated MIPS processor with bifurcation options at x8 and x16. These products are focused on accelerators for artificial intelligence use cases.

A lower-lane-count device will be sampling in the coming months, available in 64- and 48-lane configurations with bifurcation options down to x2. The device is also suitable for AI use cases at the lower segment of the high-end market, such as mainstream enterprise use cases. It also features CNSA 2.0 post-quantum-safe cryptography support and is ideal for server and storage architectures.

Hyperscaler data retrieval applications

Hyperscalers focused on data retrieval can have a switch-to-GPU ratio of up to 4 to 1.

Brian explains, “If we consider the analogy where the GPU and CPU are the left and right lobes of the brain, then the PCIe switch is the spinal cord. The PCIe switch provides GPUs and CPUs lightning-fast access to all the resources that they need to perform their functions, including remote sensors, DPUs (Data Processing Units), accelerators, storage clusters and NVMe flash.”

For high-performance storage fabrics, the 48- or 64-lane-count devices that will be sampling next year are ideal for connecting massive clusters of storage via a high-speed fabric that provides GPUs or CPUs with direct access to speed up retrieval, thereby minimising GPU idle time waiting for the next packet of information.

For example, the ability to build composable infrastructure with this non-blocking switch architecture supports dynamic resource orchestration with shared accelerators and memory pools—enabling the development of leading-edge AI data centres with efficient, scalable designs.

A key feature of the Gen 6 PCIe switch family is its multicast capability. Typically, when storing data across multiple disk drives, every read and write command has to be duplicated by the CPU. Multicast uses a single command to read or write to many devices simultaneously, both upstream and downstream, significantly boosting system-level efficiency. Multicast not only increases data throughput but also reduces the CPU and data transaction requirements.
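
The saving can be illustrated with a toy model that counts how many commands the host has to issue. `ToySwitch` and its methods are hypothetical names invented for this sketch. They do not represent the Switchtec programming interface, and the model ignores addressing, completions and errors.

```python
from dataclasses import dataclass, field


@dataclass
class ToySwitch:
    """Toy fan-out model; a hypothetical stand-in, not a real switch API."""
    drives: list
    host_commands: int = 0
    delivered: dict = field(default_factory=dict)

    def unicast_write(self, drive: str, data: bytes) -> None:
        """Host issues one command per target drive."""
        self.host_commands += 1
        self.delivered.setdefault(drive, []).append(data)

    def multicast_write(self, data: bytes) -> None:
        """Host issues one command; the switch replicates it to every drive."""
        self.host_commands += 1
        for drive in self.drives:
            self.delivered.setdefault(drive, []).append(data)


def write_with_unicast(drives: list, data: bytes) -> ToySwitch:
    switch = ToySwitch(drives)
    for drive in drives:
        switch.unicast_write(drive, data)
    return switch


def write_with_multicast(drives: list, data: bytes) -> ToySwitch:
    switch = ToySwitch(drives)
    switch.multicast_write(data)
    return switch


if __name__ == "__main__":
    targets = ["drive0", "drive1", "drive2", "drive3"]
    print("Unicast host commands:  ", write_with_unicast(targets, b"blk").host_commands)
    print("Multicast host commands:", write_with_multicast(targets, b"blk").host_commands)
```

In both cases every drive receives the same data. The difference is where the replication happens: in the host for unicast, in the switch for multicast.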

Development, design and security

Switchtec Gen 6 PCIe switches feature 20 ports and 10 stacks, with each port featuring hot- and surprise-plug controllers. The switches also support NTB (Non-Transparent Bridging) to connect and isolate multiple host domains and multicast for one-to-many data distribution within a single domain. NTB is a critical feature that enables cross-domain communication, overcoming the isolation typical of traditional PCIe architectures. By allowing CPUs and GPUs in separate domains to interact directly via switch cross-links, NTB significantly improves data centre efficiency and resource collaboration.
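
The port count lines up with the lane count and the x8 and x16 bifurcation options mentioned earlier, as the sketch below shows. The helper names are illustrative, and the check is deliberately simplified: it only tests total lanes and port count, and does not model how ports are assigned to individual stacks.

```python
ALLOWED_WIDTHS = (8, 16)  # bifurcation options quoted for the 160/144-lane parts


def max_ports(total_lanes: int, width: int) -> int:
    """Number of ports available if every port uses the same width."""
    if width not in ALLOWED_WIDTHS:
        raise ValueError(f"unsupported port width: x{width}")
    return total_lanes // width


def config_fits(port_widths: list, total_lanes: int = 160, max_port_count: int = 20) -> bool:
    """Simplified feasibility check for a mixed set of port widths.

    Assumption: only the lane total and port count matter; per-stack
    placement rules of the real device are not modelled.
    """
    if any(width not in ALLOWED_WIDTHS for width in port_widths):
        return False
    return sum(port_widths) <= total_lanes and len(port_widths) <= max_port_count


if __name__ == "__main__":
    print("160 lanes as x8 ports: ", max_ports(160, 8))
    print("160 lanes as x16 ports:", max_ports(160, 16))
    mixed = [16] * 4 + [8] * 12  # e.g. four wide links plus twelve narrower ones
    print("Mixed configuration fits:", config_fits(mixed))
```

Splitting all 160 lanes into x8 links gives the 20 ports quoted above, while running everything at x16 gives 10 ports, one per stack if each stack is assumed to hold 16 lanes.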

The PCIe switch also supports live firmware and security updates, making systems easier to maintain and evolve. The switch can be updated over both in-band PCIe and sideband interfaces, such as I²C, while normal traffic continues to run without taking systems offline. Live updates, for example, are especially useful for security as post-quantum cryptography needs to be continuously updated.

Advanced error containment, comprehensive diagnostics, debug capabilities, and a wide breadth of I/O interfaces are available. Input and output reference clocks are based on PCIe stacks with four input clocks per stack.

ChipLink diagnostic tools provide the Switchtec Gen 6 PCIe switches with comprehensive debug, diagnostics, configuration and analysis through an intuitive graphical user interface (GUI). ChipLink connects via in-band PCIe or sideband signals such as UART, TWI and EJTAG, enabling flexible, efficient monitoring and troubleshooting throughout design and deployment. The switches are also supported by the PM61160-KIT Switchtec Gen 6 PCIe Switch Evaluation Kit, which offers multiple interfaces.

Accelerating debugging and system troubleshooting, especially in new prototypes, ChipLink provides direct access and monitoring of internal chip functions, block by block, including viewing eye diagrams and link status. Diagnostic data can be captured without external analysers, even at high data rates of 64 GT/s, to a laptop for further analysis. Further, ChipLink can communicate with other Microchip devices, such as memory controllers, via a USB interface to a laptop, providing visual insight into what is happening when something is not working.

Building the data centre of the future

The first 3nm Gen 6 PCIe switch is one of the key building blocks for a new generation of data centres specifically targeting emerging AI workloads. Key priorities for data centres in the future will be power consumption, high-speed communications, scalability and flexibility.

By delivering up to 160 PCIe 6.0 lanes with PAM4, FLIT/FEC, and multicast, Microchip PCIe switches eliminate bottlenecks between CPUs, GPUs, and peripheral equipment such as storage and DPUs. The switches also help reduce power consumption and cooling requirements. Integrated PQC-capable security and ChipLink-enabled diagnostics further simplify the secure deployment, debugging, and scaling of next‑generation AI infrastructure.

www.microchip.com
