The architecture includes maths capabilities that could be used by other software as part of a heterogeneous system architecture. That could include neural network software, but ARM executives stressed that Bifrost is first and foremost an architecture for raster, tile-based graphics processing units (GPUs).
The previous architecture, Midgard, underlies ARM's T-series Mali GPUs and supports up to 16 unified shader cores with a SIMD [single-instruction, multiple-data] instruction set architecture (ISA). Bifrost supports up to 32 unified shader cores with a scalar ISA, full hardware cache coherency and a feature called clause execution.
Top-level architecture of Bifrost showing up to 32 universal shader cores. Source: ARM.
Inside the shader core showing quad-thread fragment management and execution engines. Source: ARM.
The primary goal, according to Sean Ellis, GPU architect with ARM, was to achieve more performance per square millimeter of silicon and per line of "real-world" shader code. This has been achieved to the tune of about 50 percent through the use of a new scalar, clause-based ISA with quad-based arithmetic units.
Whereas Midgard GPUs use SIMD vectorization, Bifrost GPUs use quad vectorization, in which four scalar threads from a 2-by-2 pixel quad are executed in lockstep. Each thread fills one 32-bit lane of the hardware, so four threads doing a vec3 FP32 add take three cycles, one per vector component. In short, quad vectorization is compiler-friendly and improves resource utilization.
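The cycle count can be illustrated with a toy model. The Python sketch below is not ARM code: the function names, the 4-lane width used for the SIMD comparison and the one-operation-per-cycle timing are all simplifying assumptions made for illustration. It only shows why four threads adding vec3 values finish in three cycles with every lane busy, whereas a per-thread 4-wide SIMD unit would leave one lane idle on each vec3 operation.

```python
LANES = 4  # assumption: one 32-bit lane per thread of the 2-by-2 quad


def quad_vec3_add(a, b):
    """Toy model of quad vectorization.

    a, b: four vec3 values, one per thread in the quad.
    Each cycle issues ONE scalar add across all four lanes in lockstep.
    """
    assert len(a) == len(b) == LANES
    result = [[0.0] * 3 for _ in range(LANES)]
    cycles = 0
    for component in range(3):        # x, then y, then z
        for lane in range(LANES):     # the four lanes work in the same cycle
            result[lane][component] = a[lane][component] + b[lane][component]
        cycles += 1
    utilization = (LANES * 3) / (cycles * LANES)  # useful lane-ops / issued lane-ops
    return result, cycles, utilization


def simd_vec3_add(a, b, width=4):
    """Toy model of per-thread SIMD (hypothetical 4-wide unit).

    Each thread issues one width-wide vector add per cycle; a vec3
    operand fills only three of the lanes.
    """
    result = [[x + y for x, y in zip(va, vb)] for va, vb in zip(a, b)]
    cycles = len(a)                   # one cycle per thread
    utilization = 3 / width
    return result, cycles, utilization


if __name__ == "__main__":
    a = [[1.0, 2.0, 3.0]] * LANES
    b = [[0.5, 0.5, 0.5]] * LANES
    print(quad_vec3_add(a, b)[1:])    # cycles and lane utilization, quad model
    print(simd_vec3_add(a, b)[1:])    # cycles and lane utilization, SIMD model
```

In this model both approaches produce the same sums, but the quad version keeps all four lanes occupied regardless of vector width, which is the sense in which the compiler no longer has to pack operations to fill a SIMD unit.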