SambaNova AI chip runs models with up to 5 trillion parameters

Technology News
By Nick Flaherty

SambaNova Systems has announced an AI chip to power its full stack large language model (LLM) platform.

The SambaNova SN40L chip supports both dense and sparse computing through a reconfigurable architecture, and can use either large DRAM or fast HBM memory interfaces to run and train AI models with up to 5 trillion parameters.

“Today, SambaNova offers the only purpose-built full stack LLM platform, the SambaNova Suite, now with an intelligent AI chip,” said Rodrigo Liang, co-founder and CEO of SambaNova Systems. “We’re now able to offer these two capabilities within one chip, the ability to address more memory, with the smartest compute core, enabling organizations to capitalize on the promise of pervasive AI, with their own LLMs to rival GPT-4 and beyond.”

The new chip is just one element of SambaNova’s full-stack LLM platform for generative AI.

“We’ve started to see a trend towards smaller models, but bigger is still better and bigger models will start to become more modular,” said Kunle Olukotun, co-founder of SambaNova Systems. “Customers are requesting an LLM with the power of a trillion-parameter model like GPT-4, but they also want the benefits of owning a model fine-tuned on their data. With the SN40L, our most advanced AI chip to date, integrated into a full stack LLM platform, we’re giving customers the key to running the largest LLMs with higher performance for training and inference, without sacrificing model accuracy.”

The SN40L can serve a 5 trillion parameter model, with 256k+ sequence length possible on a single system node. This enables higher quality models, with faster inference and training at a lower total cost of ownership.
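To get a feel for why memory capacity matters at this scale, the storage needed for the weights of a 5 trillion parameter model can be estimated with simple arithmetic. The sketch below is an illustration, not a SambaNova figure: the bytes-per-parameter values are assumed common numeric formats, the function name is hypothetical, and the estimate ignores activations, optimizer state and attention caches.

```python
def weight_memory_terabytes(num_parameters: float, bytes_per_parameter: float) -> float:
    """Return the storage needed for model weights alone, in terabytes (10^12 bytes)."""
    return num_parameters * bytes_per_parameter / 1e12


PARAMETERS = 5e12  # the 5 trillion parameter figure quoted in the article

# Assumed numeric formats; the article does not say which precision is used.
ASSUMED_FORMATS = [("fp32", 4), ("bf16", 2), ("int8", 1)]

for name, bytes_per_parameter in ASSUMED_FORMATS:
    terabytes = weight_memory_terabytes(PARAMETERS, bytes_per_parameter)
    print(f"{name}: {terabytes:.0f} TB")
# fp32: 20 TB
# bf16: 10 TB
# int8: 5 TB
```

Even at one byte per parameter the weights run to several terabytes, which suggests why a large-capacity memory tier is paired with a faster one.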

The platform also includes a new embeddings model for vector-based retrieval augmented generation. Customers can embed their documents into vector embeddings, which are retrieved during the Q&A process so that answers do not result in hallucinations. The LLM then takes the retrieved results to analyze, extract, or summarize the information. The platform also includes an automated speech recognition model to transcribe and analyze voice data.
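The retrieval step in retrieval augmented generation can be sketched in a few lines. This is a minimal illustration under stated assumptions, not SambaNova's implementation: `embed` is a toy bag-of-words stand-in for a real embeddings model, the function names are hypothetical, and the sample documents are made up.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embeddings model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, documents: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k documents most similar to the query."""
    query_vector = embed(query)
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query_vector, embed(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(query: str, documents: list[str], top_k: int = 1) -> str:
    """Place the retrieved text in the prompt so the LLM answers from it."""
    context = "\n".join(retrieve(query, documents, top_k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Made-up sample documents, for illustration only.
DOCUMENTS = [
    "The SN40L supports dense and sparse computing.",
    "Quarterly revenue grew in the third quarter.",
    "The cafeteria opens at eight in the morning.",
]

print(build_prompt("when does the cafeteria open", DOCUMENTS))
```

In a production system the document vectors would be computed once and stored in a vector index rather than re-embedded on every query.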

