The library, a collection of low-level building blocks for imaging, vision and machine learning, will become available as open source software at the end of March. It was being demonstrated on ARM’s booth at the Mobile World Congress in Barcelona.
The library provides common functions for machine learning frameworks, covering neural networks, colour manipulation, feature detection, image reshaping and General Matrix-to-Matrix Multiplication (GEMM), which can be at the heart of implementing convolutional neural networks on maths-capable processors.
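To illustrate why GEMM can sit at the heart of a convolutional layer, the following is a minimal sketch in plain Python. It is not code from the ARM Compute Library, and the function names (`im2col`, `gemm`, `conv2d_via_gemm`) are hypothetical. It assumes the widely used "im2col" approach, in which every image patch is unrolled into a row so that applying all kernels becomes a single matrix product.

```python
def im2col(image, kh, kw):
    """Unroll every kh x kw patch of a 2D image into one row."""
    h, w = len(image), len(image[0])
    rows = []
    for i in range(h - kh + 1):
        for j in range(w - kw + 1):
            rows.append([image[i + di][j + dj]
                         for di in range(kh) for dj in range(kw)])
    return rows


def gemm(a, b):
    """Plain matrix product C = A x B on lists of lists."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][c] for k in range(inner))
             for c in range(cols)]
            for row in a]


def conv2d_via_gemm(image, kernels):
    """Apply several equally sized kernels using one matrix product.

    No padding, stride 1, no kernel flip (cross-correlation, as is
    usual in neural networks). Returns one row per patch position
    and one column per kernel.
    """
    kh, kw = len(kernels[0]), len(kernels[0][0])
    patches = im2col(image, kh, kw)  # shape: (num_patches, kh*kw)
    flat = [[k[di][dj] for di in range(kh) for dj in range(kw)]
            for k in kernels]
    weights = [list(col) for col in zip(*flat)]  # shape: (kh*kw, num_kernels)
    return gemm(patches, weights)


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernels = [[[1, 0], [0, 1]],   # sums the main diagonal of each patch
           [[1, 1], [1, 1]]]   # sums the whole patch
result = conv2d_via_gemm(image, kernels)
```

Because the heavy lifting collapses into one `gemm` call, a well-optimised matrix multiply routine speeds up the whole layer, which is presumably why such a routine is listed among the library's building blocks.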
The “show-and-tell” on the booth was by way of an application running on a standard mobile phone that attempts to estimate the calorific content of foodstuffs in a picture, such as popcorn, chocolate or seeds.
How many calories in a bowl full of seeds? How deep is the bowl?
The demo cuts away the background from the foodstuff in the image and then estimates the volume of the foodstuff. It uses image recognition on a trained neural network to decide what that foodstuff is and then goes to an on-device look-up table to find the per-volume calorific content of the identified food. Because everything is done on the smartphone, data bandwidth and latency from communicating with the cloud are not an issue, although battery life on the mobile phone might be.
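The final step described above, turning a recognised food label and an estimated volume into a calorie figure, can be sketched as follows. This is an illustration only, not ThunderView's code: the table name, the function name and the per-volume numbers are hypothetical placeholders rather than real nutritional data.

```python
# Hypothetical on-device look-up table: kilocalories per millilitre.
# The numbers are placeholders chosen for illustration only.
KCAL_PER_ML = {
    "popcorn": 0.5,
    "chocolate": 5.0,
    "seeds": 3.0,
}


def estimate_calories(label, volume_ml):
    """Multiply the estimated volume by the per-volume value for the label."""
    if volume_ml < 0:
        raise ValueError("volume must be non-negative")
    if label not in KCAL_PER_ML:
        raise KeyError(f"no table entry for food label: {label!r}")
    return KCAL_PER_ML[label] * volume_ml


# The label would come from the neural network and the volume from the
# segmentation step; both are hard-coded here.
calories = estimate_calories("seeds", 100.0)
```

The sketch makes the error sources visible: the result scales linearly with the volume estimate, so any error in segmentation or depth carries straight through to the calorie count.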
The demonstration was prepared by ThunderView, a division of Chinese software developer Thunder Software Technology Co. Ltd. (Beijing, China), otherwise known as ThunderSoft, which is developing the calorie counting application.
There would appear to be a few potential sources of inaccuracy here. For example, the background cut-away routine seemed to underestimate the size of a subject such as seeds, and in particular one must question the ability to determine the volume of the material in shot. But as a demonstration of the ARM Compute Library it served its purpose.
In an article on ARM’s website Roberto Mijat, a software product manager at ARM, quoted Yu Yang, the CEO of ThunderView, as saying: “Our engineers were able to improve performance of the critical path of our software, in particular convolution, using the ARM-optimised CPU and GPU routines, and this has helped us achieve our performance targets and getting our solution to market sooner.” (see Watch your weight blog).
ARM is claiming that in a given test improvements of about 14x or 15x can be achieved running routines from the Compute Library when compared with running OpenCV on Neon.
OpenCV (Open Source Computer Vision Library) is an open-source computer vision and machine learning software library. Neon is the 128-bit SIMD (Single Instruction, Multiple Data) architecture extension for the Cortex-A series processors.
ARM information from MWC demo.
The ARM Compute Library runs on any ARMv7 and ARMv8 CPU and any Mali Midgard and Bifrost GPU, an ARM spokesperson said. The performance varies depending on core implementation, the amount of support for ML and the number of cores. Both single and multicore processing are supported, although it remains unclear whether heterogeneous computation is supported.
The spokesperson said: “The library includes all the key building blocks for ML, such as SGEMM, we are working on integration with popular frameworks such as TensorFlow and Caffe.”
The variable availability of resources at run time is not something catered for automatically. “This needs to be taken care of by the developer/platform,” the spokesperson said in email correspondence.
And the library is free. “Will be released at the end of March under an MIT license; a very permissive open source license,” the spokesperson said.
A couple of questions remain:
1) How does Compute Library achieve such a striking result compared with OpenCV on Neon?
2) And the biggest question is whether this is a software holding position being adopted by ARM to help it define the best form of future hardware support for machine learning? ARM could then introduce a configurable family of licensable hardware IP. Or is it the case that computer vision and machine learning needs are either so diverse – or so close to mainstream processing – as to not require direct hardware acceleration?
Certainly, Qualcomm has spoken about machine learning support within its Snapdragon line of application processors. In 2016 it came up with an SDK for neural network software that piggybacks on the existing Kryo CPU, Adreno GPU and Hexagon DSP cores inside the Snapdragon 820 processor. Meanwhile, plenty of startups are claiming to have, or to be working on, best-in-class hardware for machine learning. Besides the likes of Synopsys, Cadence and Ceva offering machine learning support, there are startups such as TeraDeep, Graphcore, BrainChip and KnuEdge with their own machine learning processors.