Exploiting parallelism in neural network workloads for scalable acceleration hardware

February 17, 2021 // By Tony King-Smith

The reality is that almost all automotive neural network applications comprise a series of smaller NN workloads. By considering the many forms of parallelism inherent in automotive NN inference, a far more flexible approach using multiple NN acceleration engines can deliver superior results with far greater scalability, cost effectiveness and power efficiency.

When designing a hardware platform capable of executing the AI workloads needed for automated driving, many factors must be considered. The biggest, however, is uncertainty: what capabilities does the hardware actually need to execute the worst-case NN workload? How much performance is needed to execute that workload safely and reliably? And how much of the time does that worst case actually have to run?
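To see why this uncertainty matters, consider a back-of-envelope utilization calculation, sketched below in Python. All of the figures used (worst-case demand, typical demand, duty cycle) are illustrative assumptions, not measurements from any real platform:

```python
# Hypothetical figures for illustration only; not real measurements.
worst_case_tops = 100.0  # compute demand of the worst-case NN workload
typical_tops = 25.0      # demand of the typical steady-state workload mix
worst_case_duty = 0.02   # fraction of runtime the worst case actually occurs

# A monolithic accelerator must be provisioned for the worst case...
accelerator_tops = worst_case_tops

# ...so its average utilization is dominated by the typical workload:
avg_demand = worst_case_duty * worst_case_tops + (1 - worst_case_duty) * typical_tops
print(f"Average utilization: {avg_demand / accelerator_tops:.1%}")  # 26.5%
```

Under these assumed numbers, hardware sized for a rare worst case sits roughly three-quarters idle, which is precisely the cost and power-efficiency problem a multi-engine approach targets.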

Many automotive system designers, when considering suitable hardware platforms for executing high-performance neural networks (NNs), determine the total compute power by simply adding up each NN's requirements; the total then defines the capabilities of the NN accelerator needed. Tony King-Smith of AIMotive looks at techniques for exploiting parallelism in NN workloads to realize scalable, high-performance acceleration hardware.
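The additive sizing approach described above, and the multi-engine alternative, can be sketched in a few lines of Python. The workload names, TOPS figures and three-engine split are hypothetical, chosen only to illustrate the idea:

```python
# Hypothetical per-network compute requirements in TOPS; illustrative only.
nn_workloads = {
    "camera_detection": 20.0,
    "lane_segmentation": 12.0,
    "radar_fusion": 6.0,
    "driver_monitoring": 4.0,
}

# Additive sizing: one accelerator specified as the sum of all requirements.
print(f"Monolithic accelerator: {sum(nn_workloads.values())} TOPS")

# Alternative: the workloads are independent, so they can run concurrently
# on several smaller engines (greedy largest-first assignment shown here).
def assign_to_engines(workloads: dict[str, float], num_engines: int) -> list[float]:
    loads = [0.0] * num_engines
    for tops in sorted(workloads.values(), reverse=True):
        loads[loads.index(min(loads))] += tops  # place on least-loaded engine
    return loads

print(f"Per-engine loads (3 engines): {assign_to_engines(nn_workloads, 3)} TOPS")
```

In this sketch each engine only needs to cover its own share plus headroom, rather than a single giant engine having to cover the grand total on its own.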
