The Vision-In-Package (VIP) system, as they call it, combines a low-power processor (ARM Cortex-M4/M7 with 8 MB of RAM), a high-dynamic-range imager, optics, and a communication interface in one camera system. It occupies only about 4 cm³ and weighs less than 20 g, including a battery cell, and it runs a complete facial analysis pipeline in real time, fully embedded within the device.
The software is compact and stand-alone, with no external dependencies. It comprises a minimal version of the uKOS operating system (developed under the μKernel project, www.ukos.ch) and a face analysis package running on top of it. Unlike existing systems that run on powerful hardware architectures, the VIP system requires several orders of magnitude less CPU time and memory, and its analysis pipeline runs at around 4-5 frames per second at QVGA resolution (320 × 240 pixels).
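To put these figures in perspective, the short calculation below estimates how much memory one QVGA frame takes and how much time each frame may spend in the pipeline at 4-5 frames per second. It is only a back-of-the-envelope sketch: the 8-bit grayscale pixel format and the reading of "8 MB" as 8 × 1024 × 1024 bytes are assumptions, and the function names are hypothetical.

```python
# Back-of-the-envelope budget for one QVGA frame on the VIP system.
QVGA_WIDTH, QVGA_HEIGHT = 320, 240   # QVGA resolution by definition
BYTES_PER_PIXEL = 1                  # ASSUMPTION: 8-bit grayscale frames
RAM_BYTES = 8 * 1024 * 1024          # ASSUMPTION: "8 MB" read as 8 MiB


def frame_bytes(width=QVGA_WIDTH, height=QVGA_HEIGHT, bpp=BYTES_PER_PIXEL):
    """Size in bytes of one uncompressed frame."""
    return width * height * bpp


def frame_budget_ms(fps):
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


if __name__ == "__main__":
    size = frame_bytes()
    print(f"frame size: {size} bytes ({size / 1024:.0f} KB)")
    print(f"share of RAM: {100.0 * size / RAM_BYTES:.1f} %")
    for fps in (4, 5):
        print(f"{fps} fps -> {frame_budget_ms(fps):.0f} ms per frame")
```

Under these assumptions a single frame takes 75 KB, a little under 1 % of the available RAM, and the whole pipeline has roughly 200-250 ms per frame to complete.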
First, all faces in an acquired frame are detected; this step typically takes less than 100 ms and requires only a few hundred KB of RAM. Next, facial attributes such as the corners of the eyes and the nose are located within each detected face region, and the face undergoes a normalization step: a rough geometric transformation that aligns the eyes horizontally and scales the face to a standard size, together with a photometric normalization that removes non-linear intensity variations caused by shadows and non-uniform illumination. Finally, the actual face recognition takes place: descriptive features are extracted at the landmark locations and used to uniquely identify people in a database of registered faces. New individuals can be registered in this database instantly at any time, with a single click and without any retraining.
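The geometric part of the normalization step can be illustrated with a similarity transform (rotation, uniform scale, translation) computed from the two eye positions. The sketch below is not the VIP implementation, which is not described in detail here; the function names, the 64-pixel output size, and the target eye placement are all assumptions chosen for illustration.

```python
import math


def eye_alignment_transform(left_eye, right_eye, out_size=64,
                            eye_dist_frac=0.5, eye_row_frac=0.35):
    """Return a 2x3 similarity matrix that maps image coordinates to a
    normalized face of out_size x out_size pixels.

    ASSUMPTIONS (illustrative only): in the output, the eyes lie on a
    horizontal line at eye_row_frac * out_size, centered horizontally,
    and are eye_dist_frac * out_size pixels apart.
    """
    (lx, ly), (rx, ry) = left_eye, right_eye
    dx, dy = rx - lx, ry - ly
    dist = math.hypot(dx, dy)
    if dist == 0:
        raise ValueError("eye positions coincide")

    # Uniform scale that brings the eye distance to the target value.
    scale = eye_dist_frac * out_size / dist
    # Rotation by minus the eye-line angle makes the eye line horizontal.
    a = scale * dx / dist   # scale * cos(angle)
    b = scale * dy / dist   # scale * sin(angle)

    # Translation that sends the eye midpoint to its target position.
    mx, my = (lx + rx) / 2.0, (ly + ry) / 2.0
    tx = out_size / 2.0 - (a * mx + b * my)
    ty = eye_row_frac * out_size - (-b * mx + a * my)
    return [[a, b, tx], [-b, a, ty]]


def apply_transform(m, point):
    """Apply a 2x3 affine matrix to an (x, y) point."""
    x, y = point
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])


if __name__ == "__main__":
    # Hypothetical eye positions from a landmark detector (tilted face).
    m = eye_alignment_transform((100, 120), (140, 150))
    for eye in ((100, 120), (140, 150)):
        x, y = apply_transform(m, eye)
        print(round(x, 1), round(y, 1))
```

With the example coordinates, both eyes land on the same output row, 32 pixels apart. A full normalization would then resample the face pixels through the inverse of this matrix and apply the photometric correction, neither of which is shown here.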