
Edge AI accelerator for challenging Generative AI applications
EdgeCortix® Inc., a leading fabless semiconductor company specializing in energy-efficient AI processing at the edge, has unveiled its next-generation SAKURA-II Edge AI accelerator.
This state-of-the-art platform, paired with EdgeCortix’s innovative second-generation Dynamic Neural Accelerator (DNA) architecture, is engineered to tackle the most challenging Generative AI tasks in the industry. Designed for flexibility and power efficiency, SAKURA-II empowers users to handle a wide range of complex workloads, including Large Language Models (LLMs), Large Vision Models (LVMs), and multi-modal transformer-based applications, even within the stringent environmental constraints at the edge. Featuring low latency, best-in-class memory bandwidth, high accuracy, and compact form factors, SAKURA-II delivers unparalleled performance and cost-efficiency across the diverse spectrum of edge AI applications.
Well-suited for numerous use cases across the manufacturing, Industry 4.0, security, robotics, aerospace, and telecommunications industries, SAKURA-II features EdgeCortix’s latest-generation runtime-reconfigurable neural processing engine, DNA-II. Leveraging this highly configurable intellectual property block, the edge AI accelerator delivers power efficiency and real-time processing while simultaneously executing multiple deep neural network models with low latency. SAKURA-II can deliver up to 60 trillion operations per second (TOPS) of effective 8-bit integer performance and 30 trillion 16-bit brain floating-point (BF16) operations per second (TFLOPS), while also supporting built-in mixed precision for handling the rigorous demands of next-generation AI tasks.
The SAKURA-II platform, with its sophisticated MERA software suite, features a heterogeneous compiler platform, advanced quantization, and model calibration capabilities. The software suite includes native support for leading development frameworks such as PyTorch, TensorFlow Lite, and ONNX. MERA’s flexible host-to-accelerator unified runtime scales across single-chip, multi-chip, and multi-card systems at the edge, significantly streamlining AI inferencing and shortening deployment times for data scientists. Furthermore, integration with the MERA Model Library, which interfaces seamlessly with Hugging Face Optimum, offers users access to an extensive range of the latest transformer models, ensuring a smooth transition from training to edge inference.
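To make the framework-to-accelerator flow described above concrete, the sketch below shows how a PyTorch model might be prepared for such a toolchain. The ONNX export step uses standard PyTorch APIs; the final compile-and-run calls are hypothetical placeholders standing in for MERA's compiler and unified runtime, not the actual API.

```python
# Illustrative sketch only: ONNX export uses standard PyTorch/torchvision APIs;
# the "mera_compile"/"mera_run" names are hypothetical placeholders for the
# MERA compiler and host-to-accelerator runtime described in the announcement.
import torch
import torchvision

# 1. Start from a framework-native model (PyTorch shown here; TensorFlow Lite
#    and ONNX models are also supported according to the announcement).
model = torchvision.models.resnet50(weights="IMAGENET1K_V2").eval()
dummy_input = torch.randn(1, 3, 224, 224)

# 2. Export to ONNX, one of the interchange formats the toolchain accepts.
torch.onnx.export(model, dummy_input, "resnet50.onnx", opset_version=17)

# 3. Hypothetical MERA-style steps: quantize/calibrate, compile for the
#    SAKURA-II target, and run through the unified runtime. These function
#    names are assumptions for illustration, not EdgeCortix's real API.
# compiled = mera_compile("resnet50.onnx", target="sakura-ii", precision="int8")
# outputs = mera_run(compiled, dummy_input.numpy())
```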
“SAKURA-II’s impressive 60 TOPS performance within 8 W of typical power consumption, combined with its mixed-precision and built-in memory compression capabilities, positions it as a pivotal technology for the latest Generative AI solutions at the edge,” said Sakyasingha Dasgupta, CEO and Founder of EdgeCortix. “Whether running traditional AI models or the latest Llama 2/3, Stable Diffusion, Whisper, or Vision Transformer models, SAKURA-II provides deployment flexibility at superior performance per watt and cost-efficiency.”
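The quoted figures allow a simple back-of-envelope efficiency estimate, sketched below; it combines only the 60 TOPS and 8 W numbers stated above, and actual efficiency will of course vary with workload, precision, and operating conditions.

```python
# Back-of-envelope efficiency estimate from the figures quoted above
# (60 TOPS effective INT8 at roughly 8 W typical power). Real-world
# efficiency depends on workload, precision, and operating conditions.
int8_tops = 60        # effective 8-bit integer TOPS (stated figure)
typical_power_w = 8   # typical power consumption in watts (stated figure)

tops_per_watt = int8_tops / typical_power_w
print(f"Approximate efficiency: {tops_per_watt:.1f} TOPS/W")  # ~7.5 TOPS/W
```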
The SAKURA-II Edge AI accelerator will be offered as a stand-alone device, as two M.2 modules with different DRAM capacities, and as single- and dual-device low-profile PCIe cards. Customers can reserve M.2 modules and PCIe cards for delivery in the second half of 2024.
