ARM to boost processor performance by 50x with new AI instructions

By Nick Flaherty

The DynamIQ cluster technology will allow up to eight completely different cores to be used in a big.LITTLE style. The move is aimed at a wide range of applications, including driverless cars and automotive driver assistance systems as well as enterprise servers.

“By 2020 we expect to see a lot of artificial intelligence deployed from autonomous driving platforms to mixed reality,” said Nandan Nayampally, General Manager of ARM’s Compute Products Group. “Even with 5G you cannot purely rely on the cloud for machine learning or AI so as performance continues to grow it needs to fit into ever smaller power envelopes.”

Cluster technology is at the heart of the ARM strategy for future devices, he says.

“We started cluster with the ARM11 4-core cluster ten years ago, and then big.LITTLE was six years ago, and we used the CoreLink SoC [fabric] to scale these into larger systems,” said Nayampally. “DynamIQ is the next stage, complementary to the existing technology, with up to 8 cores in a single cluster to bring a larger level of performance. Every core in this cluster can be a different implementation and a different core and that brings substantially higher levels of performance and flexibility. Along with this we have an optimised memory subsystem with faster access and power saving features,” he said.

This would allow several small cores and several large cores to operate independently and switch code between the different cores depending on the processing requirements. “For example, 1+3 or 1+7 DynamIQ big.LITTLE configurations with substantially more granular and optimal control are now possible. This boosts innovation in SoCs designed with right-sized compute with heterogeneous processing that deliver meaningful AI performance at the device itself,” he said.
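As a rough illustration of the configurations described above, the following Python sketch models a single cluster of up to eight cores in a 1+3 or 1+7 layout and steers work to a big or little core. It is not ARM's scheduler or any real API: the names Core, make_cluster and pick_core, the "big"/"little" labels and the selection rule are all assumptions made for this example. Only the eight-core limit and the 1+3 and 1+7 layouts come from the article.

```python
from __future__ import annotations

from dataclasses import dataclass

MAX_CORES_PER_CLUSTER = 8  # per-cluster limit quoted in the article


@dataclass(frozen=True)
class Core:
    name: str
    kind: str  # "big" or "little" (illustrative labels, not ARM terminology for a product)


def make_cluster(big: int, little: int) -> list[Core]:
    """Build a hypothetical big+little cluster, e.g. make_cluster(1, 3) for 1+3."""
    if big < 0 or little < 0:
        raise ValueError("core counts must be non-negative")
    if not 1 <= big + little <= MAX_CORES_PER_CLUSTER:
        raise ValueError("a cluster holds between 1 and 8 cores")
    cores = [Core(f"big{i}", "big") for i in range(big)]
    cores += [Core(f"little{i}", "little") for i in range(little)]
    return cores


def pick_core(cluster: list[Core], demanding: bool) -> Core:
    """Assumed policy: demanding work prefers a big core, light work a little one."""
    wanted = "big" if demanding else "little"
    for core in cluster:
        if core.kind == wanted:
            return core
    return cluster[0]  # fall back to whatever core type the cluster has
```

Calling make_cluster(1, 7) yields the eight-core 1+7 layout, while make_cluster(2, 7) raises ValueError because it exceeds the per-cluster limit. A real system would make this decision in the operating system scheduler using load and power data rather than a single flag.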

Each core could also have its own tightly coupled accelerators, which would bring new levels of responsiveness to AI, machine learning (ML) and computer vision workloads, says Nayampally, while external accelerators or AI designs would interface through the existing CoreLink ports.

“The compute need for AI or ML is not just uniform, it needs different types of compute, but the CPU will remain fundamental to AI,” he said. “The ARM compute libraries launched last month are optimised for vision and ML and these offer convolution filters and convolutional neural network (CNN) components, and DynamIQ builds on this further with additional dedicated processor instructions that will deliver 50x improvements in AI workloads over today’s systems, and this is easily available through standard development tools and the ARM libraries,” he said.

The new instructions will be available as an extension to ARMv8 rather than waiting for the next-generation ARM instruction set, and will be included in new Cortex-A processor cores later this year. Partners such as NVIDIA have already modified ARM processor cores for AI applications.

“We have yet to announce processors that will work with DynamIQ and you will hear more in the short to medium term,” said Nayampally. “DynamIQ supports a variant of the ARMv8 that is fully backwards compatible and we work closely with our lead partners to develop these technologies.”
