Partners in the project have built a groundbreaking compute node prototype that combines 3DIC and multi-chip-module integration technologies, heterogeneous compute elements (Arm cores with FPGA acceleration) and the UNIMEM memory system, all powered by a high-performance, high-productivity software stack.
The ExaNoDe prototype is part of the disruptive change required to provide the compute density and power efficiency needed for an operational exascale machine. Built on an innovative interposer developed by CEA, ExaNoDe combines multiple system-on-chip (SoC) chiplets to form a three-dimensional integrated circuit (3DIC). This delivers several advantages: higher chip fabrication yields thanks to the smaller chip size, lower customization costs, greater flexibility to slot in compute elements, and shorter inter-chip communication distances, which improve energy efficiency.
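The yield argument can be illustrated with a back-of-the-envelope calculation. The sketch below uses the simple Poisson yield model, Y = exp(-A * D), where A is the die area and D is the defect density. The defect density, die area and chiplet count are assumed, illustrative values, not figures from the ExaNoDe project, and the model ignores interposer and assembly yield.

```python
import math


def poisson_yield(area_mm2: float, defects_per_mm2: float) -> float:
    """Fraction of defect-free dies under the simple Poisson yield model."""
    return math.exp(-area_mm2 * defects_per_mm2)


# Assumed, illustrative numbers (not taken from the ExaNoDe project).
DEFECT_DENSITY = 0.001   # defects per mm^2
MONOLITHIC_AREA = 400.0  # mm^2, one large monolithic die
CHIPLET_COUNT = 4        # the same silicon area split into four chiplets

chiplet_area = MONOLITHIC_AREA / CHIPLET_COUNT
monolithic_yield = poisson_yield(MONOLITHIC_AREA, DEFECT_DENSITY)
chiplet_yield = poisson_yield(chiplet_area, DEFECT_DENSITY)

# If each die is tested before assembly, the usable fraction of
# fabricated silicon equals the per-die yield.
print(f"monolithic die yield: {monolithic_yield:.3f}")
print(f"single chiplet yield: {chiplet_yield:.3f}")
```

Under these assumptions a defect discards only one small chiplet rather than the whole large die, so a larger share of the fabricated silicon remains usable.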
The researchers involved have modelled an ExaNoDe compute node with state-of-the-art 7 nm Arm-core-based chiplets and HBM2 memory on a silicon interposer. Simulations show that such an implementation would enable modular, cost-effective and energy-efficient multi-teraflops heterogeneous compute nodes.
The UNIMEM memory system, which was created in the EUROSERVER project and is being brought to scale in the EuroEXA project, allows memory to be shared among multiple compute nodes. The UNIMEM shared memory is accessible through a non-coherent global address space and is exposed to the programmer via a native UNIMEM API, standard MPI-3.0 and GPI-2. Advances in OmpSs-2@Cluster and OpenStream allow programmers to exploit the ExaNoDe architecture through a multi-node task-based programming model. To increase the resilience and improve the manageability of the compute node, the software stack also includes virtualization, with checkpointing and virtualization of the UNIMEM capabilities.
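To make the idea of a global address space spanning several nodes more concrete, here is a minimal toy model in which each node owns one equal-sized memory segment and a global address is translated into a node and a local offset. It is a sketch under stated assumptions only: the class ToyGlobalMemory, its methods and the address layout are hypothetical and do not represent the UNIMEM API or its actual address mapping. The model has no caches, so it does not capture the non-coherent aspect, under which software is responsible for synchronizing accesses.

```python
from typing import Tuple


class ToyGlobalMemory:
    """Toy model (hypothetical, not UNIMEM): one memory segment per node."""

    def __init__(self, num_nodes: int, segment_size: int) -> None:
        self.segment_size = segment_size
        self.segments = [bytearray(segment_size) for _ in range(num_nodes)]

    def locate(self, global_addr: int) -> Tuple[int, int]:
        """Translate a global address into (owning node, local offset)."""
        node, offset = divmod(global_addr, self.segment_size)
        if not 0 <= node < len(self.segments):
            raise IndexError(f"address {global_addr} is outside the address space")
        return node, offset

    def put(self, global_addr: int, data: bytes) -> None:
        """Write data into the memory of whichever node owns the address."""
        node, offset = self.locate(global_addr)
        if offset + len(data) > self.segment_size:
            raise ValueError("write would cross a segment boundary")
        self.segments[node][offset:offset + len(data)] = data

    def get(self, global_addr: int, length: int) -> bytes:
        """Read bytes from the memory of whichever node owns the address."""
        node, offset = self.locate(global_addr)
        if offset + length > self.segment_size:
            raise ValueError("read would cross a segment boundary")
        return bytes(self.segments[node][offset:offset + length])


# Usage: four nodes with 1 KiB each; write into the segment owned by node 2.
memory = ToyGlobalMemory(num_nodes=4, segment_size=1024)
address = 2 * 1024 + 16
memory.put(address, b"halo")
owner, local_offset = memory.locate(address)
payload = memory.get(address, 4)
```

In this picture, any node can read or write remote memory by address alone, which is the style of access that one-sided communication in MPI-3.0 and GPI-2 exposes to the programmer.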