The prototype was developed as part of a research project on “Updatable and Low Power AI-Edge LSI Technology Development” commissioned by the New Energy and Industrial Technology Development Organization (NEDO) of Japan.
Quantized DNN Engine
Socionext’s new architecture is based on “quantized DNN technology”, which reduces the number of bits used to represent parameters and activations in deep learning. The architecture provides better AI processing performance at lower power consumption. In addition to the conventional 8-bit format, it supports reduced bit widths, including 1-bit (binary) and 2-bit (ternary), and incorporates the company’s original parameter compression technology, enabling a large amount of computation with fewer resources and less data.
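To illustrate what 1-bit and ternary weight representations mean in practice, here is a minimal pure-Python sketch. It is not Socionext’s method: the function names, the mean-absolute-value scale, and the 0.7 threshold ratio are assumptions borrowed from commonly used heuristics, chosen only to make the idea concrete.

```python
def binarize(weights):
    """1-bit (binary): keep only the sign of each weight.

    A single shared scale (mean absolute value, an assumed heuristic)
    lets the signs approximate the original magnitudes.
    """
    scale = sum(abs(w) for w in weights) / len(weights)
    codes = [1 if w >= 0 else -1 for w in weights]
    return codes, scale


def ternarize(weights, threshold_ratio=0.7):
    """Ternary: map each weight to -1, 0 or +1.

    Weights whose magnitude falls below the threshold become 0.
    threshold_ratio is a hypothetical tuning parameter.
    """
    mean_abs = sum(abs(w) for w in weights) / len(weights)
    threshold = threshold_ratio * mean_abs
    codes = [0 if abs(w) <= threshold else (1 if w > 0 else -1)
             for w in weights]
    kept = [abs(w) for w, c in zip(weights, codes) if c != 0]
    scale = sum(kept) / len(kept) if kept else 0.0
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate floating-point weights from the codes."""
    return [c * scale for c in codes]


weights = [0.9, -0.05, 0.4, -0.7, 0.02, -0.3]
bin_codes, bin_scale = binarize(weights)
ter_codes, ter_scale = ternarize(weights)
print(bin_codes, ter_codes)
```

Each weight is stored as a one- or two-bit code plus one shared scale, instead of a 32-bit float, which is where the savings in memory and data movement come from.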
To pair with the AI engine, Socionext has also developed a new on-chip memory technology to provide efficient data delivery, which cuts the need for large amounts of on-chip or external memory.
The newly prototyped chip integrates both new technologies to confirm their functionality and performance. The prototype ran “YOLO v3” object detection at 30 fps while consuming less than 5 W of power, making it 10 times more efficient than conventional, general-purpose GPUs. The chip also features a high-performance, low-power Arm Cortex-A53 quad-core CPU.
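The stated figures can be turned into an energy-per-frame bound with simple arithmetic. The sketch below uses only the 30 fps and 5 W numbers from the announcement; the variable names are illustrative, and reading "efficient" as energy per frame is an assumption.

```python
fps = 30            # YOLO v3 frame rate stated for the prototype
max_power_w = 5.0   # stated upper bound on power consumption, in watts

# Power divided by frame rate gives energy per frame (joules).
# Because 5 W is an upper bound, this is also an upper bound.
energy_per_frame_j = max_power_w / fps

# The inverse gives a lower bound on frames processed per joule.
frames_per_joule = fps / max_power_w

print(f"under {energy_per_frame_j:.3f} J per frame")
print(f"at least {frames_per_joule:.1f} frames per joule")
```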
Deep Learning Software Development Environment
To complement the chip, Socionext has also developed a deep learning software development environment that uses TensorFlow as its base framework. The new environment lets developers perform the company’s original low-bit “quantization-aware training” or “post-training quantization”. Users can apply the most suitable quantization technique to a variety of neural networks and run them with high accuracy.
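The two approaches share the same underlying arithmetic. The pure-Python sketch below shows generic 8-bit affine quantization as used in post-training quantization, along with the quantize-then-dequantize step that quantization-aware training typically inserts into the forward pass. It does not use Socionext’s environment or any TensorFlow API; all function names here are hypothetical.

```python
def calibrate(values, num_bits=8):
    """Derive scale and zero point from observed calibration values."""
    qmin, qmax = 0, 2 ** num_bits - 1
    # Extend the range to include 0.0 so zero maps to an exact code.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0
    zero_point = round(qmin - lo / scale)
    return scale, zero_point, qmin, qmax


def quantize(values, scale, zero_point, qmin, qmax):
    """Map floats to integer codes, clamping to the representable range."""
    return [min(qmax, max(qmin, round(v / scale) + zero_point))
            for v in values]


def dequantize(codes, scale, zero_point):
    """Map integer codes back to approximate floats."""
    return [(q - zero_point) * scale for q in codes]


def fake_quantize(values, num_bits=8):
    """Quantize then dequantize, so training sees the rounding error."""
    scale, zero_point, qmin, qmax = calibrate(values, num_bits)
    codes = quantize(values, scale, zero_point, qmin, qmax)
    return dequantize(codes, scale, zero_point)


samples = [-1.0, 0.0, 0.5, 3.0]
scale, zero_point, qmin, qmax = calibrate(samples)
codes = quantize(samples, scale, zero_point, qmin, qmax)
restored = dequantize(codes, scale, zero_point)
print(codes, restored)
```

Post-training quantization applies `calibrate` and `quantize` once to an already trained model, while quantization-aware training applies something like `fake_quantize` during training so the network can adapt to the reduced precision.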