Ambarella shows transformers on edge AI chip

Technology News | By Nick Flaherty



US chip designer Ambarella is demonstrating multi-modal large language models (LLMs) running on its new N1 system-on-chip (SoC).

Ambarella aims to bring generative AI to edge endpoint devices and on-premise hardware by supporting the transformer models used for LLMs and video analysis across a wide range of applications, from video security analysis to robotics and industrial automation.

Ambarella will initially offer optimized generative AI processing on its mid- to high-end edge AI SoCs, from the existing CV72 for on-device performance under 5W, through to the new N1 series for server-grade performance under 50W.

The N1 is a derivative of Ambarella's CV3-HD architecture, which was initially developed for autonomous driving applications. The N1 series repurposes this performance for running multi-modal LLMs in an extremely low-power footprint.

Ambarella claims its chips are up to 3x more power-efficient per generated token than GPUs and other AI accelerators, while enabling immediate and cost-effective deployment in products.

For example, the N1 SoC runs the Llama2-13B model at up to 25 output tokens per second in single-stream mode while drawing under 50W. Combined with the ease of integration of pre-ported models, this allows OEMs to quickly deploy generative AI into any power-sensitive application, from an on-premise AI box to a delivery robot.
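For a sense of what those figures mean per token, a back-of-the-envelope check in Python (the 3x-less-efficient baseline is hypothetical, used only to illustrate the efficiency claim above):

# Energy cost per generated token, from the figures quoted above:
# 25 output tokens per second at under 50W in single-stream mode.

def joules_per_token(power_watts: float, tokens_per_second: float) -> float:
    """Energy per token is simply power divided by throughput."""
    return power_watts / tokens_per_second

n1_cost = joules_per_token(50.0, 25.0)
print(f"N1: {n1_cost:.1f} J per token")  # 2.0 J per token

# A hypothetical accelerator that is 3x less power-efficient per token
# would burn about 6 J per token, i.e. roughly 150W for the same 25 tokens/s.
print(f"3x baseline: {3 * n1_cost:.1f} J per token")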

The ability to support both language models and video is key at the edge, and Ambarella says its architecture is natively well-suited to process video and AI simultaneously at very low power.

Generative AI promises a step change in computer vision processing, bringing context and scene understanding to a variety of devices, from security installations and autonomous robots to industrial applications. Examples of the on-device LLM and multi-modal processing enabled by the new Ambarella offering include smart contextual searches of security footage; robots that can be controlled with natural-language commands; and AI assistants that can perform anything from code generation to text and image generation.

All of Ambarella's AI SoCs are supported by the company's new Cooper Developer Platform. Ambarella has pre-ported and optimized popular LLMs such as Llama-2, as well as the Large Language and Vision Assistant (LLaVA) model, which runs on the N1 for multi-modal vision analysis of up to 32 camera sources. These pre-trained and fine-tuned models will be available for partners to download from the Cooper Model Garden.
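To make those use cases concrete, the sketch below shows what a natural-language search across multiple camera feeds might look like. It is a hypothetical illustration only: the VisionLanguageModel class and its answer method are placeholders for an on-device LLaVA-style model, not Ambarella's Cooper SDK.

from dataclasses import dataclass

@dataclass
class Frame:
    camera_id: int
    image: bytes  # encoded frame from one of up to 32 camera sources

class VisionLanguageModel:
    """Placeholder for an on-device multi-modal model such as LLaVA."""
    def answer(self, image: bytes, prompt: str) -> str:
        raise NotImplementedError("vendor-specific inference goes here")

def search_footage(model: VisionLanguageModel,
                   frames: list[Frame], query: str) -> list[int]:
    """Return camera IDs whose latest frame matches a natural-language query."""
    prompt = f"Does this image show {query}? Answer yes or no."
    return [f.camera_id for f in frames
            if model.answer(f.image, prompt).strip().lower().startswith("yes")]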

“Generative AI networks are enabling new functions across our target application markets that were just not possible before,” said Les Kohn, CTO and co-founder of Ambarella. “All edge devices are about to get a lot smarter, with our N1 series of SoCs enabling world-class multi-modal LLM processing in a very attractive power/price envelope.”

“Virtually every edge application will get enhanced by generative AI in the next 18 months,” said Alexander Harrowell, Principal Analyst, Advanced Computing at Omdia. “When moving genAI workloads to the edge, the game becomes all about performance per watt and integration with the rest of the edge ecosystem, not just raw throughput.”

Both the N1 SoC and a demonstration of its multi-modal LLM capabilities are on display this week at CES 2024 in Las Vegas.

www.ambarella.com

 

 
