Ceva goes non-DSP with neural processor
Ceva (Mountain View, Calif.) looked at more than 120 neural network software architectures as it devised its hardware, according to Liran Bar, director of product marketing for Ceva’s imaging and vision DSP product line. These include the well-known AlexNet, GoogLeNet and ResNet.
Although the NeuPro family is well suited to image-based applications, for which Ceva also offers the DSP-based Ceva-XM, it also addresses multiple other classification and pattern-recognition applications and is described as suitable for AI at the edge. These applications include natural language processing, real-time translation, authentication, workflow management and other learning-based applications. NeuPro is not intended to perform the training of neural networks, which is usually done offline in a data center, with the resulting weights then downloaded to edge equipment.
The NeuPro architecture is based on a sea of 8bit by 8bit MACs that can be paired up to perform 16bit by 8bit and 16bit by 16bit calculations.
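The pairing works the way wide multiplies are usually built from narrow ones. The sketch below is a generic illustration, not Ceva’s implementation, and is restricted to unsigned values; it shows how a 16bit by 8bit product can be assembled from two 8bit by 8bit partial products (a 16bit by 16bit product would take four).

```python
# Minimal sketch (generic, unsigned-only; not Ceva's implementation) of how
# two 8bit x 8bit multiplies compose into one 16bit x 8bit multiply.

def mul_8x8(a, b):
    """Primitive 8bit x 8bit multiply, the unit the MAC array provides."""
    assert 0 <= a < 256 and 0 <= b < 256
    return a * b

def mul_16x8(x, b):
    """Build a 16bit x 8bit product from two 8x8 partial products."""
    lo = x & 0xFF          # low byte of the 16-bit operand
    hi = (x >> 8) & 0xFF   # high byte of the 16-bit operand
    return (mul_8x8(hi, b) << 8) + mul_8x8(lo, b)

assert mul_16x8(0x1234, 0x56) == 0x1234 * 0x56
```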
There are four AI processors:
the NP500 is the smallest processor, including 512 8bit MAC units and targeting IoT, wearables and cameras;
the NP1000 includes 1024 MAC units and targets mid-range smartphones, ADAS, industrial applications and AR/VR headsets;
the NP2000 includes 2048 MAC units and targets high-end smartphones, surveillance, robots and drones;
the NP4000 includes 4096 MAC units for high-performance edge processing in enterprise surveillance and autonomous driving.
NeuPro AI processor family and typical target applications. Source: Ceva Inc.
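For a rough sense of how the four cores scale, peak throughput grows linearly with the MAC count. The snippet below computes theoretical 8bit operations per second under an assumed 1.5GHz clock; the clock figure is a placeholder for illustration, not a Ceva specification.

```python
# Back-of-the-envelope peak throughput: ops = MACs x 2 (multiply + accumulate) x clock.
# The 1.5 GHz clock is an assumed placeholder, not a figure from Ceva.
CLOCK_HZ = 1.5e9

cores = {"NP500": 512, "NP1000": 1024, "NP2000": 2048, "NP4000": 4096}

for name, macs in cores.items():
    tops = macs * 2 * CLOCK_HZ / 1e12
    print(f"{name}: {macs} MACs -> {tops:.1f} TOPS (8bit, assumed {CLOCK_HZ / 1e9:.1f} GHz)")
```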
“The trend in AI is towards lower 8bit or even 4bit or binary networks, but these lower resolution cases have not been proved beyond academic exercises. We support 8bit and 16bit fixed point data. The 16bit accuracy is for retraining purposes,” said Bar.
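As a rough illustration of what 8bit fixed-point support implies for the weights such a core consumes, the sketch below quantizes floating-point weights to int8 with a single per-tensor scale. This is a generic textbook scheme, not a description of Ceva’s CDNN quantizer.

```python
import numpy as np

# Generic symmetric int8 quantization sketch -- a textbook scheme, not Ceva's
# CDNN quantizer. Weights trained offline in floating point are mapped to
# 8bit integers plus one scale factor per tensor.

def quantize_int8(weights):
    scale = np.abs(weights).max() / 127.0   # largest magnitude maps to 127
    q = np.clip(np.round(weights / scale), -128, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.randn(64, 64).astype(np.float32) * 0.1   # stand-in weight tensor
q, scale = quantize_int8(w)
print("max abs quantization error:", np.abs(dequantize(q, scale) - w).max())
```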
To provide a seamless handover between vision processing and neural network analytics, the NeuPro includes a vision processing unit (VPU) based on the Ceva-XM. The VPU also runs the CDNN (Ceva Deep Neural Network) software and provides software-based support for new advances in AI workloads. CDNN is Ceva’s neural network software framework that lets developers generate and port their proprietary neural networks to the processor, and it supports the full gamut of layer types and network topologies.
The NeuPro hardware architecture is designed to reduce bandwidth to external DDR memory and includes dedicated support for convolution, fully connected, activation and pooling layers. It also performs “on-the-fly” pooling and activation in a pipelined fashion.
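The bandwidth argument for on-the-fly pooling and activation can be seen with a small estimate. The figures below use assumed layer dimensions, not Ceva data; the point is simply that fusing the stages keeps intermediate feature maps on chip instead of writing them to DDR.

```python
# Rough illustration (assumed layer sizes, not Ceva data) of why fusing
# activation and pooling into the convolution pipeline cuts DDR traffic.

H, W, C = 56, 56, 256            # hypothetical conv output dimensions
BYTES = 1                        # 8bit activations
feature_map = H * W * C * BYTES
pooled = (H // 2) * (W // 2) * C * BYTES   # after 2x2 pooling

# Unfused: conv writes its output to DDR, activation reads and rewrites it,
# pooling reads it again and writes the pooled result.
unfused_traffic = feature_map + 2 * feature_map + feature_map + pooled

# Fused ("on-the-fly"): only the final pooled result is written to DDR.
fused_traffic = pooled

print(f"unfused: {unfused_traffic / 1e6:.2f} MB, fused: {fused_traffic / 1e6:.2f} MB")
```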
The IP cores are being aimed at 16nm manufacturing process technologies and have been benchmarked as providing 30 times the performance of the Ceva-XM4 and 20 times the performance of the Ceva-XM6 when running ResNet-50.
Initial licensing is scheduled for 2Q18 with general release in 3Q18.
Related links and articles:
News articles:
CEVA and Brodmann17 partner to make AI pervasive
Neurons firing at the interim stage
Movidius upgrades VPU with on-chip neural compute
Imagination launches flexible neural network IP
ST preps second neural network IC