Intel with an old take on big.LITTLE for Alder Lake

August 23, 2021 // By Nick Flaherty
Intel has revealed a dramatic change in its general-purpose chip architecture, with a version of the big.LITTLE approach adopted by ARM a decade ago.

Intel’s next-generation desktop chip, code-named Alder Lake, is the company’s first hybrid architecture to integrate two core types: the Performance-core and the Efficient-core. This is similar to ARM’s big.LITTLE approach, which paired a small core optimised for low power consumption, with lower performance, alongside a larger, higher performance core. Both cores could run the same code depending on the context, avoiding the problems of having a scheduler allocate tasks to multiple cores. Such scheduling has traditionally been a limiting factor for the system-level performance of multicore chip designs.

Intel’s hybrid approach works at the level of threads, using what it calls the Thread Director. This is an improved scheduling technology that adds more monitoring of the core to determine the context. Intel hopes this increased monitoring, combined with the thread-based approach and three independent fabrics, will avoid a potential performance bottleneck.

The compute fabric can support up to 1,000 GBps (1 Tbyte/s), which is 100 GBps per core or per cluster, and connects the cores and graphics through the last-level cache to the memory. It has a high dynamic frequency range and can dynamically select the data path to optimise for latency or bandwidth based on actual fabric loads. It also dynamically adjusts the last-level cache policy to be inclusive or non-inclusive depending on the utilisation.
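
As a quick check on how those two figures relate, the short sketch below divides the aggregate bandwidth by the per-core (or per-cluster) figure. It assumes that 1 Tbyte/s is meant in decimal units, that is 1,000 GBps; the variable names are illustrative only and do not come from Intel.

```python
# Figures quoted for the compute fabric (decimal units assumed).
AGGREGATE_GBPS = 1000   # "up to 1 Tbyte/s", taken here as 1,000 GBps
PER_AGENT_GBPS = 100    # quoted peak per core or per cluster

# Number of cores/clusters that could run at the quoted peak simultaneously.
agents_at_peak = AGGREGATE_GBPS // PER_AGENT_GBPS
print(agents_at_peak)
```

Under that assumption, the aggregate figure corresponds to ten cores or clusters each drawing the quoted 100 GBps at the same time.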

The I/O fabric supports up to 64 GBps and connects the different types of I/O as well as internal devices. It can change speed seamlessly without interfering with a device’s normal operation, selecting the fabric speed to match the required amount of data transfer.

The memory fabric can deliver up to 204 GBps of data and can dynamically scale its bus width and speed to support multiple operating points for high bandwidth, low latency or low power.

These fabrics connect the different types of processor cores, with thread placement guided by the Thread Director. This is built directly into the hardware and provides low-level telemetry on the state of the core and the instruction mix of the thread. Thread Director is dynamic and adaptive, adjusting scheduling decisions to real-time compute needs rather than using simple, static rules determined at compilation time. This allows the operating system to place the right thread on the right core at the right time.

Traditionally, the operating system would make decisions based on the limited statistics available, such as whether a task is in the foreground or background. Thread Director uses the hardware telemetry to direct threads that require higher performance to the right Performance-core at that moment. By monitoring the instruction mix, the state of the core and other relevant microarchitecture telemetry at a granular level, it allows the operating system to make more intelligent scheduling decisions.
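
The idea of telemetry-driven placement can be illustrated with a toy model. The sketch below is not Intel's algorithm or any operating system's scheduler: the telemetry fields, the scoring heuristic and the function names are all hypothetical, chosen only to show how per-thread hints could rank threads for a limited number of Performance-cores.

```python
from dataclasses import dataclass


@dataclass
class ThreadTelemetry:
    """Hypothetical per-thread hints; real Thread Director telemetry differs."""
    name: str
    heavy_instr_ratio: float  # assumed: share of demanding instructions (0..1)
    waiting_ratio: float      # assumed: share of time spent stalled or idle (0..1)


def perf_gain_score(t: ThreadTelemetry) -> float:
    # Hypothetical heuristic: a demanding instruction mix gains more from a
    # Performance-core, while a thread that mostly waits gains little.
    return t.heavy_instr_ratio * (1.0 - t.waiting_ratio)


def place_threads(threads: list[ThreadTelemetry], p_cores: int) -> dict[str, str]:
    """Give the highest-scoring threads the P-cores; the rest go to E-cores."""
    ranked = sorted(threads, key=perf_gain_score, reverse=True)
    return {
        t.name: ("P-core" if i < p_cores else "E-core")
        for i, t in enumerate(ranked)
    }


threads = [
    ThreadTelemetry("render", heavy_instr_ratio=0.8, waiting_ratio=0.1),
    ThreadTelemetry("indexer", heavy_instr_ratio=0.2, waiting_ratio=0.6),
    ThreadTelemetry("encoder", heavy_instr_ratio=0.6, waiting_ratio=0.2),
]
placement = place_threads(threads, p_cores=2)
print(placement)
```

In this toy run the two threads with the most demanding mix take the two Performance-cores and the mostly waiting thread falls back to an Efficient-core. Re-running the placement as the hints change mimics, very loosely, the dynamic behaviour described above.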

Intel has also extended the ‘PowerThrottling’ API with an EcoQoS classification, which tells the scheduler when a thread prefers power efficiency so that it can be scheduled on the Efficient-cores rather than the Performance-cores.
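
How such a hint might influence core selection can be shown with a small model. The sketch below does not call the Windows API; the `QoS` enum and the `preferred_core` function are hypothetical stand-ins for the EcoQoS classification and the scheduler's choice, and the fallback rule for busy Performance-cores is an assumption.

```python
from enum import Enum


class QoS(Enum):
    """Hypothetical stand-in for a thread's quality-of-service class."""
    DEFAULT = "default"
    ECO = "eco"  # models a thread that has opted in to EcoQoS


def preferred_core(qos: QoS, p_core_free: bool) -> str:
    # A thread that prefers power efficiency is steered to an Efficient-core.
    if qos is QoS.ECO:
        return "E-core"
    # Assumed fallback: other threads take a Performance-core when one is free.
    return "P-core" if p_core_free else "E-core"


print(preferred_core(QoS.ECO, p_core_free=True))
print(preferred_core(QoS.DEFAULT, p_core_free=True))
```

The point of the model is that the hint comes from the software itself: a background task can declare that it prefers efficiency, rather than leaving the scheduler to infer that from its behaviour.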
