IBM shows first dedicated AI inference chip
IBM has shown details of its first AI inference chip, built on Samsung’s 7nm process with 22bn transistors.
Telum is the first processor from the IBM Research AI Hardware Centre in Albany, New York, and is IBM's first to use on-chip acceleration for AI inferencing rather than having to go off chip to a separate processor or GPU. The chip has eight Z processor cores with a deep superscalar, out-of-order instruction pipeline, running at 5GHz, and all cores can access the AI accelerator and memory.
The three-year project redesigned IBM's cache and chip-interconnection infrastructure to provide 32MB of cache per core and to let clients scale up to 32 chips. The chip, with 17 layers of metal, measures 530 mm².
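As a rough back-of-the-envelope check, the figures quoted above can be combined into a few derived numbers. This is only a sketch: the totals are simple multiplication of the per-core cache, core count and chip count given in the article, not published IBM specifications, and the function and variable names are illustrative.

```python
# Figures quoted in the article.
CORES_PER_CHIP = 8
CACHE_PER_CORE_MB = 32
MAX_CHIPS = 32
TRANSISTORS = 22_000_000_000
DIE_AREA_MM2 = 530


def derived_figures():
    """Combine the quoted figures; assumes every core has the full 32MB."""
    cache_per_chip_mb = CORES_PER_CHIP * CACHE_PER_CORE_MB
    cache_at_max_scale_mb = cache_per_chip_mb * MAX_CHIPS
    # Average density in millions of transistors per square millimetre.
    density_m_per_mm2 = TRANSISTORS / DIE_AREA_MM2 / 1e6
    return {
        "cache_per_chip_mb": cache_per_chip_mb,
        "cache_at_max_scale_mb": cache_at_max_scale_mb,
        "density_m_per_mm2": round(density_m_per_mm2, 1),
    }


if __name__ == "__main__":
    for name, value in derived_figures().items():
        print(f"{name}: {value}")
```

On these assumptions the per-core cache adds up to 256MB per chip and 8,192MB across a 32-chip system, with an average density of roughly 41.5 million transistors per mm².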
A Telum-based system is planned for the first half of 2022.
- IBM shows first chip built on a 2nm process
- GlobalFoundries and IBM in $2.5bn legal fight
- UK, IBM to invest £210 million in AI, quantum computing centre
- IBM, AMD team on confidential computing for AI
Telum is intended to operate close to mission-critical data and applications, conducting high-volume inferencing for real-time, sensitive transactions, particularly in finance, without invoking off-platform AI chips that could degrade performance.
IBM Research also points to its 2nm chip design from the neighbouring Albany Nanotech Complex.