
The reality is that almost all automotive neural network applications comprise a series of smaller NN workloads. Exploiting the many forms of parallelism inherent in automotive NN inference allows a far more flexible approach, using multiple NN acceleration engines, that can deliver superior results with far greater scalability, cost effectiveness and power efficiency.
Many factors must be weighed when designing a hardware platform capable of executing the AI workloads needed for automated driving. The biggest one, however, is uncertainty: what capabilities does the hardware actually need to execute the worst-case NN workload, and how much performance is needed to execute it safely and reliably? And how much of the time does that worst-case workload have to run?
Many automotive system designers, when considering suitable hardware platforms for executing high-performance neural networks (NNs), frequently determine the total compute power by simply adding up each NN’s requirements; the total then defines the capabilities of the NN accelerator needed. Tony King-Smith at AIMotive looks at the techniques to exploit parallelism in NN workloads to realize scalable, high performance acceleration hardware.
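To make the sizing question concrete, the following minimal sketch contrasts the "add everything up" estimate with the peak demand that actually occurs when workloads do not all run at the same time. The workload names, compute figures (in arbitrary units) and time slots are hypothetical and chosen only for illustration; they do not come from any real system.

```python
# Hypothetical NN workloads: (name, compute demand in arbitrary units,
# set of time slots in which the workload is active). All values are
# illustrative assumptions, not measurements.
WORKLOADS = [
    ("lane_detection", 4, {0, 1, 2, 3}),
    ("object_detection", 10, {0, 2}),
    ("traffic_sign_recognition", 3, {1, 3}),
    ("driver_monitoring", 2, {0, 1, 2, 3}),
]


def naive_total(workloads):
    """Sum every workload's demand, as if all of them always ran together."""
    return sum(demand for _, demand, _ in workloads)


def peak_concurrent(workloads):
    """Return the largest combined demand observed in any single time slot."""
    all_slots = set().union(*(slots for _, _, slots in workloads))
    return max(
        sum(demand for _, demand, slots in workloads if slot in slots)
        for slot in all_slots
    )


if __name__ == "__main__":
    print("naive sum of requirements:", naive_total(WORKLOADS))
    print("peak concurrent demand:  ", peak_concurrent(WORKLOADS))
```

With these assumed figures the naive sum exceeds the peak concurrent demand, because the two heaviest intermittent workloads never share a time slot. How large that gap is in practice depends entirely on the real workload mix and its timing.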