IBM shows first dedicated AI inference chip

By Nick Flaherty

IBM has shown details of its first AI inference chip, built on Samsung’s 7nm process with 22bn transistors.

Telum is the first processor from the IBM Research AI Hardware Center in Albany, New York, and the first to use on-chip acceleration for AI inferencing rather than going off chip to a separate processor or GPU. The chip has eight Z processor cores with a deep superscalar, out-of-order instruction pipeline running at 5GHz, and all cores can access the AI accelerator and memory.

The three-year project redesigned the cache and chip-interconnection infrastructure that IBM uses to provide 32MB of cache per core, and allows clients to scale up to 32 chips. The chip, with 17 layers of metal, measures 530mm².

A Telum-based system is planned for the first half of 2022.

Telum is intended to operate close to mission-critical data and applications, conducting high-volume inferencing for real-time, latency-sensitive transactions, particularly in finance, without invoking off-platform AI chips that could impact performance.

IBM Research also points to its 2nm chip design from the neighbouring Albany Nanotech Complex.
