European server project promotes ARM on FDSOI

By eeNews Europe

The three-year project, which started September 2013, is expected to deliver an innovative scalable computer system architecture this year and a hardware-software prototype implementation by the end of 2015.

The rise of cloud-enabled smart devices, cloud-based client services and the Internet of Things (IoT) is expected to create an opportunity by driving a shift in the needs of IT infrastructure, which is already under pressure to reduce power consumption even as it scales up to serve an increasing number of applications.

The Euroserver project is advocating the use of low-power ARM processors in a server architecture that uses 3D integration to scale processors, memory and I/O, all managed by system-wide virtualization and efficient use of resources by cloud applications. The group is aiming for a factor of ten improvement in energy efficiency over traditional server and microserver architectures.

Microservers are typically servers designed for workloads that don’t individually require high levels of computing performance but that may have to be run in large numbers and/or may be latency-critical. In the past servers tended to aim at ever higher performance and in recent years were almost exclusively the domain of Intel’s x86 processor architecture. Lower-power microservers are now expected to take up an increasingly diverse range of data handling opportunities. And microservers have long been the ground on which ARM has chosen to fight Intel in the data center.

John Goodacre, director of technology and systems at ARM and a visiting professor of computer architectures at the University of Manchester, said that the groups collaborating in Euroserver reflect the microserver profile. "TU Dresden is interested in the handling of databases in embedded telecom. Eurotech is a systems company looking at more deeply embedded applications," said Professor Goodacre. He also noted that Spain’s Barcelona Supercomputing Center is present, which reflects an interest in scaling up to take on high performance computing.

The Euroserver project is leveraging the availability of an octa-core processor chip that ST makes on its FDSOI manufacturing process. It also sees importance in using the latest 3D manufacturing techniques, courtesy of project participant CEA-Leti, to build the best power-performance trade-off it can. "We’ve tried to take a holistic view of the challenge. It’s a mixture of both hardware and software," said Professor Goodacre.

Unit of compute, move the task

One of the foundation stones of the Euroserver project is an idea that Professor Goodacre laid out in a keynote speech at the DATE conference in 2013 – the ‘Unit of Compute.’

Figure 1: Unit of Compute. Source: ARM

This is the minimum a compute node requires to allow its memory to be used coherently by the outside world with a minimum of overhead. A ‘unit of compute’ is managed by a single symmetric multiprocessing (SMP) operating system within a coherent region of memory. It has a processor system (from one to many cores) and local memory, provides a coherent view of its memory to the outside world and has a path to access remote memory attached to other ‘units of compute.’
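As a rough illustration of the description above, a unit of compute can be modeled as a small data structure. This is only a sketch under stated assumptions: the class name `UnitOfCompute`, its fields and the `connect` and `read` methods are hypothetical and do not come from the Euroserver project or from ARM.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class UnitOfCompute:
    """Hypothetical model: cores and local memory under one SMP operating system."""
    name: str
    cores: int
    local_memory: dict = field(default_factory=dict)  # address -> value
    remote_units: dict = field(default_factory=dict)  # name -> UnitOfCompute

    def connect(self, other):
        """Give both units a path to each other's memory."""
        self.remote_units[other.name] = other
        other.remote_units[self.name] = self

    def read(self, owner, address):
        """Read from local memory, or from the unit that owns the address."""
        if owner == self.name:
            return self.local_memory[address]
        return self.remote_units[owner].local_memory[address]


# Two octa-core units; unit "a" reads a value held in unit "b"'s memory.
a = UnitOfCompute("a", cores=8)
b = UnitOfCompute("b", cores=8)
a.connect(b)
b.local_memory[0x10] = 42
value_seen_by_a = a.read("b", 0x10)
```

The sketch captures only the ownership idea: each address has exactly one home unit, and other units reach it through an explicit path rather than through a shared, system-wide cache.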

The net result simplifies memory handling compared with traditional server architectures and is scalable. And such simplification can result in energy savings, argues Professor Goodacre.

The project has developed a universal memory model, or Unimem, to build on this by making memory a key focus of the architecture and dispensing with some of the cache coherency requirements that have traditionally been implemented in servers.

In a presentation Professor Goodacre observed that there is no need for sequential consistency to be imposed in most data center workloads. Applications tend to partition datasets and it is best to place the processor and its cache near the dataset of a particular application task.

In other words, rather than moving data sets around at great energy expense and imposing cache coherency requirements, it is more energy efficient to keep the data set still and move the task to a processor that is near the required data set. "We have tried to come up with a software-centric architecture," said Professor Goodacre.
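The trade-off can be sketched with a toy placement rule that counts bytes crossing the interconnect. Everything here is an assumption made for illustration: the function names, the node and partition names and the byte counts are hypothetical and are not taken from the project.

```python
def place_task(partition, partition_home):
    """Pick the node that already holds the partition the task needs."""
    return partition_home[partition]


def bytes_moved(partition, run_on, partition_home, partition_bytes, task_bytes):
    """Bytes sent over the interconnect when the task runs on `run_on`.

    Assumes the task descriptor is always dispatched over the interconnect;
    the data set moves only if the task runs away from the data's home node.
    """
    if partition_home[partition] == run_on:
        return task_bytes
    return task_bytes + partition_bytes[partition]


# Hypothetical layout: two partitions, each resident on one node.
partition_home = {"users": "node0", "orders": "node1"}
partition_bytes = {"users": 4_000_000, "orders": 9_000_000}
TASK_BYTES = 2_000

near_node = place_task("orders", partition_home)
moved_with_task_placement = bytes_moved(
    "orders", near_node, partition_home, partition_bytes, TASK_BYTES)
moved_with_data_movement = bytes_moved(
    "orders", "node0", partition_home, partition_bytes, TASK_BYTES)
```

With these made-up sizes, sending the task to the data moves a few kilobytes, while running it on the other node moves the whole partition as well.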

The Unimem approach not only maintains consistent and coherent access from each compute node to its local DRAM but also manages access to the system-wide memory resource. "Most importantly it can be implemented using available ARM technology with little additional hardware overhead," Professor Goodacre said.

Chiplets

The project will make use of 64-bit ARM cores but argues that at present levels of integration server chip costs are at the level of $400 to $800 per unit and are likely to double as production moves on to FinFET processes below the 20nm node. For reasons of yield the project sees a benefit in minimizing die size and implementing in leading-edge processes only what needs to be. These processor dies become "chiplets" in the Euroserver nomenclature and sit on top of an interposer that carries peripheral circuitry.

In the physical implementation each chiplet will be an octa-core Cortex-A53 part implemented in 28nm FDSOI. And four of these chiplets will go on top of an interposer in a packaged part.
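The yield argument for small dies can be illustrated with the simple Poisson yield approximation, Y = exp(-A × D), a common textbook model that the article itself does not cite. In the sketch below the core and chiplet counts come from the article, while the defect density and die areas are hypothetical values chosen only to show the shape of the effect.

```python
import math

CORES_PER_CHIPLET = 8     # from the article: octa-core Cortex-A53
CHIPLETS_PER_PACKAGE = 4  # from the article: four chiplets per interposer

# Hypothetical process numbers, for illustration only.
DEFECT_DENSITY = 0.2      # defects per square centimeter
SMALL_DIE_CM2 = 1.0       # area of one chiplet
BIG_DIE_CM2 = SMALL_DIE_CM2 * CHIPLETS_PER_PACKAGE  # one monolithic die


def poisson_yield(area_cm2, defects_per_cm2):
    """Fraction of dies with zero defects under the Poisson yield model."""
    return math.exp(-area_cm2 * defects_per_cm2)


cores_per_package = CORES_PER_CHIPLET * CHIPLETS_PER_PACKAGE
small_yield = poisson_yield(SMALL_DIE_CM2, DEFECT_DENSITY)
big_yield = poisson_yield(BIG_DIE_CM2, DEFECT_DENSITY)
```

Under this model a defect scraps only the small die it lands on rather than a die four times the size, so a larger share of the processed silicon ends up usable. Real yield models and interposer assembly losses are more involved than this sketch.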

Figure 2: Server in a package, potentially including hybrid memory cubes

The Euroserver project started in September 2013 and for its first year has been working to flesh out and validate the computer architecture. One of the aims of the architecture is to minimize the use of long-distance interconnect and of bus standards such as PCIe, which were largely designed and optimized for performance rather than for energy consumed per bit transferred.

"We’ve spent time looking at software access patterns and the communication between the islands of coherence. We can see how to achieve 100 nanoseconds compared with typical traditional figures of 500 microseconds," said Professor Goodacre.
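Because the two figures are quoted in different units, the size of the claimed gap is easy to miss. Taking the quoted numbers at face value, a quick conversion to a common unit gives the ratio; the variable names are illustrative only.

```python
NS_PER_US = 1_000

target_ns = 100                    # quoted target: 100 nanoseconds
traditional_ns = 500 * NS_PER_US   # quoted traditional figure: 500 microseconds

improvement_factor = traditional_ns // target_ns
```

On those figures the target is a 5,000-fold reduction in communication latency.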

Professor Goodacre said that the use of the ARM processor and the FDSOI process was not the most critical factor in achieving a highly efficient and scalable architecture compared with the main thrust: how data is handled. However, having a low-power processor on an intrinsically low-power process helps.

"Using ARM allows us to design a low-power system," said Professor Goodacre. "And 28nm FDSOI gives us a very interesting power management lever with back biasing and retention modes. So it’s the FDSOI, the chiplets, the 3D stacking, the software that all together make the difference."

While European Commission-funded projects are not supposed to be used as a means of subsidizing commercial operations, if Euroserver should spark European-based commercial success in data centers at the expense of Intel’s x86 ecosystem, few tears will be shed across Europe, which has seen the strength and depth of its electronics base decline for many years.

The total cost of the project is €12,925,771 (about US$15.6 million), of which European taxpayers are expected to provide €8,599,929 (about US$10.4 million).

Related links and articles:

www.euroserver-project.eu
