Cadence Design Systems has launched a scalable series of configurable machine learning accelerator IP blocks for system-on-chip (SoC) designs, with NXP among the first customers.
The Tensilica AI Platform has three product families optimized for varying data and on-device AI requirements. The blocks include random sparse compute to improve performance, run-time tensor compression to decrease memory bandwidth, and pruning plus clustering to reduce model size for embedded designs.
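Pruning and clustering shrink a model by making its weight tensors sparse and low-entropy, so they compress well in embedded memory. A minimal pure-Python sketch of the idea (illustrative only, not Cadence's implementation): pruning zeroes small-magnitude weights, and clustering snaps the survivors to a small codebook so only centroid indices need storing.

```python
# Illustrative sketch of pruning + weight clustering -- not the Tensilica
# toolchain, just the underlying compression idea.

def prune(weights, threshold=0.1):
    """Zero out weights whose magnitude is below the threshold (pruning)."""
    return [0.0 if abs(w) < threshold else w for w in weights]

def cluster(weights, centroids):
    """Snap each weight to its nearest centroid (clustering), so the model
    stores a small codebook plus a per-weight index instead of full values."""
    return [min(centroids, key=lambda c: abs(c - w)) for w in weights]

weights = [0.03, -0.92, 0.48, -0.05, 0.51, 0.89]
pruned = prune(weights)                      # small weights become zeros
clustered = cluster(pruned, [0.0, 0.5, -0.9, 0.9])
# clustered now contains at most 4 distinct values, however large the model
```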
The families are based around a new companion AI neural network engine (NNE) that consumes 80 percent less energy per inference and delivers more than 4X TOPS/W compared to Tensilica’s standalone DSPs.
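The two figures are mutually consistent: cutting energy per inference by 80 percent means each joule does five times as much work, in line with the "more than 4X TOPS/W" claim. A quick sanity check:

```python
# Illustrative arithmetic only -- relates the two quoted efficiency figures.
energy_reduction = 0.80                               # 80% less energy per inference
efficiency_multiple = 1.0 / (1.0 - energy_reduction)  # work per joule vs. baseline
assert round(efficiency_multiple, 6) == 5.0           # 5x, i.e. more than 4X TOPS/W
```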
The accelerators are aimed at intelligent sensor, internet of things (IoT) audio, mobile vision/voice AI, IoT vision and advanced driver assistance systems (ADAS) applications.
The AI Base family includes Tensilica’s HiFi DSPs for audio/voice, Vision DSPs, and ConnX DSPs for radar/lidar and communications, combined with AI instruction-set architecture (ISA) extensions to add voice control or speech recognition.
NXP is using the Tensilica HiFi neural network library as part of its eIQ Machine Learning Software Development Environment in its i.MX RT600 crossover microcontroller.
“Integrating a Cadence Tensilica HiFi 4 DSP into the NXP i.MX RT600 crossover MCU not only provides high-performance DSP capabilities for a broad range of audio and voice processing applications, but also increases inference performance, enabling AI even in very low-power, battery-operated products. The HiFi neural network library allows NXP to take full advantage of the AI capabilities of the HiFi 4 DSP and integrate it into NXP’s eIQ Machine Learning Software Development Environment supporting the TensorFlow Lite Micro and Glow ML inference engines,” said Cristiano Castello, senior director of microcontrollers product innovation at NXP Semiconductors.
The AI Boost family adds a companion NNE, initially the Tensilica NNE 110 AI engine, which scales from 64 to 256 GOPS and provides concurrent signal processing and efficient inferencing.
The AI Max family is based around the scalable NNA 1xx AI multicore blocks, which scale from the single-core NNA 110 accelerator to the dual-core NNA 120, quad-core NNA 140 and octa-core NNA 180. The cores are connected via a network on chip (NoC) that is compatible with the AXI bus running across the rest of the chip.
The cores support common machine learning frameworks via the Tensilica Neural Network Compiler, which supports TensorFlow, ONNX, PyTorch, Caffe2, TensorFlow Lite and MXNet for automated end-to-end code generation; the Android Neural Network Compiler; TFLite Delegates for real-time execution; and TensorFlow Lite Micro for microcontroller-class devices.
“AI SoC developers are challenged to get to market faster with cost-effective, differentiated products offering longer battery life and scalable performance,” said Sanjive Agarwala, corporate vice president and general manager of the IP Group at Cadence. “With our mature, extensible and configurable platform based on our best-in-class Tensilica DSPs and featuring common AI software, Cadence allows AI SoC developers to minimize development costs and meet tight market windows. By enabling AI across all performance and price points, Cadence is driving the rapid deployment of AI-enabled systems everywhere.”
"Scaling low power on-device AI capabilities requires extremely efficient multi-sensory compute. Cadence and the TensorFlow Lite for Microcontrollers (TFLM) team have been working together for many years to co-develop solutions that enable the most cutting-edge, low-footprint use cases in the AI space. The trend for real-time audio networks to use LSTM-based neural nets for the best performance and efficiency is a key example. Working closely with Cadence, we are integrating a highly optimized LSTM operator on Tensilica HiFi DSPs that enables the next level of performance improvements for key use cases like voice-call noise suppression,” said Pete Warden, Technical Lead of TensorFlow Lite Micro at Google.
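For readers unfamiliar with the operator being optimized: an LSTM cell carries hidden and cell state across audio frames, which is what makes it effective for real-time tasks like noise suppression, and also what makes it worth hand-optimizing on a DSP. A minimal pure-Python sketch of one time step (names and weights are illustrative assumptions, not the HiFi library's API):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h, c, W):
    """One LSTM time step with scalar state (illustrative only).
    W maps gate name -> (input weight, recurrent weight, bias)."""
    gates = {}
    for name in ("i", "f", "o", "g"):
        wx, wh, b = W[name]
        pre = wx * x + wh * h + b
        gates[name] = math.tanh(pre) if name == "g" else sigmoid(pre)
    c_new = gates["f"] * c + gates["i"] * gates["g"]  # blend old state with new input
    h_new = gates["o"] * math.tanh(c_new)             # gated hidden-state output
    return h_new, c_new

# Carry state across a short frame stream; this recurrence is what a
# DSP-side LSTM operator accelerates (weights here are hypothetical).
W = {k: (0.5, 0.25, 0.0) for k in ("i", "f", "o", "g")}
h, c = 0.0, 0.0
for x in [0.1, -0.2, 0.3]:
    h, c = lstm_step(x, h, c, W)
```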
“On-device AI deployment on our KL720—a 1.4 TOPS AI SoC targeted for vehicles, smart home, smart security, industrial control applications, healthcare and AI of things (AIoT)—is key to both our customers’ success and our mission to enable AI everywhere, for everyone. Cadence’s high-performance, low-power Tensilica Vision DSPs pack a lot of compute capacity with AI ISA extensions plus the necessary AI software to tackle the latest AI challenges,” said Albert Liu, founder and CEO of chip designer Kneron.
The NNE 110 AI engine and the NNA 1xx AI accelerator family are expected to be in general availability in the fourth quarter of 2021.