The company has a 16nm chip design in progress that is due to tape out at the end of 2020. It is aiming to deliver the highest number of frames processed per second per watt, and machine-learning-supported surveillance is one opportunity. The company claims that cycle-accurate simulations indicate MLSoC will be able to achieve 1000fps/watt for ResNet-50 working with 224-by-224 frames from conventional image sensors.

The company claims this is a 10x to 30x improvement over alternatives. The chip, when it appears, will offer performance ranging from 50TOPS at 5W to 200TOPS at 20W, which works out to 10TOPS/W in both cases.
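The efficiency figure follows directly from the two quoted operating points. A minimal sketch of that arithmetic, where the function name `tops_per_watt` and the list layout are illustrative choices rather than anything from the company:

```python
def tops_per_watt(tops: float, watts: float) -> float:
    """Return efficiency in tera-operations per second per watt."""
    return tops / watts


# Operating points quoted in the article: (TOPS, watts).
operating_points = [(50, 5), (200, 20)]

# Both points work out to the same efficiency of 10 TOPS/W.
efficiencies = [tops_per_watt(tops, watts) for tops, watts in operating_points]
print(efficiencies)
```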

Applications include: semi-autonomous and fully autonomous vehicles; untethered robots; secure diagnostics; and secure computer vision.

Kavitha Prasad, vice president of system solutions at the company, presented MLSoC at the Linley Spring Processor Conference on April 7 and opened by saying sima means ‘edge’ in Sanskrit.

The design is targeting a 1GHz clock frequency in a 16nm manufacturing process and includes up to four camera lines; a video pipeline including licensed image signal processor and computer vision processor cores; an ARM subsystem; and LPDDR4 or LPDDR5 data connections out. Prasad said the choice of DRAM interface is still under consideration, as is the ARM core. This will be either a Cortex-A6x, Cortex-A72 or Cortex-A75, Prasad said.

The MLSoC includes a security block that performs encryption and a safety block that enables designs that meet ISO 26262 and ASIL automotive standards. An Arteris network-on-chip connects all these subsystems.


Sima’s true value add is in the machine learning accelerator (MLA) block. This takes a tile-based approach, with multiple tiles able to be connected either off-chip in component arrays or on-chip by a proprietary AXI-based interconnect. This makes the architecture scalable.

This block runs complex neural networks at much lower power than GPUs. The tool chain is also being aimed at the mainstream, with support for TensorFlow, PyTorch, ONNX and other frameworks.

Prasad reported a number of favourable benchmarks for MLSoC. The company expects the MLSoC to achieve 2,280 images per second (IPS) on ResNet-50 inference when running at a batch size of one (batch=1), which is typical for real-time video analysis, while consuming just 4W.
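To put the quoted benchmark in per-watt terms, the sketch below derives images per second per watt and energy per image from the two figures given (2,280 IPS and 4W). The variable names are illustrative, and the calculation assumes the 4W figure is the average power drawn while sustaining that throughput, which the article does not state explicitly.

```python
# Figures quoted in the article for ResNet-50 at batch=1.
images_per_second = 2280.0
power_watts = 4.0

# Throughput normalized by power (images per second per watt).
ips_per_watt = images_per_second / power_watts

# Energy spent per image in millijoules: watts are joules per second,
# so dividing power by throughput gives joules per image.
energy_per_image_mj = power_watts / images_per_second * 1000.0

print(f"{ips_per_watt:.0f} IPS/W, {energy_per_image_mj:.2f} mJ per image")
```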

Multiple benchmark results for one Mosaic tile at 50TOPS and 5W.

In the case of untethered robots, Prasad estimated that using MLSoC could extend time away from the docking station from 45 minutes to 8 hours.

When asked how the company intended to compete with established incumbents such as Intel-Mobileye and Nvidia, Prasad said that bringing the power profile down is key because customers wish to extend their workloads and are power-constrained.

Prasad admitted that the Mosaic tile is optimized for matrix multiplication, and for convolutional neural networks in particular, but said it could also support recurrent neural networks (RNNs) and long short-term memory (LSTM) networks.

Krishna Rangasayee is the founder and CEO of the company, which has recently welcomed Moshe Gavrielov, a director at TSMC, onto its board of directors.


