Xilinx creates “UltraScale” FPGA architecture for move to 20-nm process
Xilinx has implemented a new architecture it terms the “industry’s first ASIC-class programmable architecture”, UltraScale. The UltraScale architecture was developed to scale from 20nm planar, through 16nm (and beyond) FinFET technologies, and from monolithic through 3D ICs. It not only addresses the limitations to scalability of total system throughput and latency, but directly attacks what the company calls the number-one bottleneck to chip performance at advanced nodes: the interconnect.
At 28 nm, Xilinx says, the ability to place 2 million logic cells on a device took it into new customer sectors, especially those concerned with very high data rates and throughputs, at low latency. UltraScale is designed to serve that constituency; it is structured, Xilinx says, for 3D chips. The company confirms that, for the immediate future, “3D” means what has otherwise been termed 2.5D – separate dice laid side-by-side on a passive silicon interposer. UltraScale increases routing on each die and also provides for very wide interconnect between chips via the interposer, and to off-chip memory. It is also easier to route, the company says, adding, “An innovative architectural approach is required to manage multi-hundred gigabit-per-second levels of system performance with smart processing at full line rate, scaling to terabits and teraflops. The mandate is not simply to increase the performance of each transistor or system block, or scale the number of blocks in the system, but to fundamentally improve the communication, clocking, critical paths, and interconnect to address the massive data flow and real-time packet, DSP, and/or image processing. The UltraScale architecture addresses these challenges by applying leading-edge ASIC techniques in a fully programmable architecture. It supports massive data flow with optimised wide buses that support multi-terabit throughput; has multi-region ASIC-like clocking, power management, and next-generation security; and has highly optimised critical paths and built-in high-speed memory, with cascading to remove bottlenecks in DSP and packet processing. It provides massive I/O and memory bandwidth with latency reduction and 3D IC wide memory-optimised interface.”
Xilinx says that, in its largest current devices, it had become progressively more difficult to utilise all the on-chip resources for these high-data-throughput designs. By removing the routing bottlenecks and congestion, and with upgrades to the Vivado toolset, utilisation of over 90% of device resources is once again practical.
Serial-data bandwidth on large FPGAs is now very high, so the problem shifts on-chip, where “you have to get the work done”. Increases in real and effective routing tracks help with growing complexity. These high-speed designs need buses from 512 to 2048 bits wide; clock skew may be as much as half the clock period, so UltraScale uses ASIC-like clock domains, breaking the design into smaller clock regions and improving clock distribution over the chip.
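To give a feel for why such wide buses arise, the short Python sketch below works through the arithmetic. It is an illustration only, not a Xilinx figure: the 400 Gbit/s target is assumed here as a representative line rate, protocol and framing overheads are ignored, and the function names are hypothetical. It computes the fabric clock a bus of a given width would need to sustain that rate, and how much time a skew of half the clock period would represent.

```python
def required_clock_hz(throughput_bps: float, bus_width_bits: int) -> float:
    """Clock rate needed to move throughput_bps over a bus of the given width,
    assuming one full-width transfer per clock cycle (no overhead modelled)."""
    return throughput_bps / bus_width_bits


def clock_period_ns(clock_hz: float) -> float:
    """Clock period in nanoseconds."""
    return 1e9 / clock_hz


def skew_ns(clock_hz: float, fraction_of_period: float = 0.5) -> float:
    """Skew in nanoseconds if it consumes the given fraction of the period
    (0.5 corresponds to 'half the clock period')."""
    return fraction_of_period * clock_period_ns(clock_hz)


if __name__ == "__main__":
    target_bps = 400e9  # assumed illustrative line rate, not a quoted spec
    for width in (512, 1024, 2048):
        f = required_clock_hz(target_bps, width)
        print(
            f"{width:>4}-bit bus: {f / 1e6:8.2f} MHz, "
            f"period {clock_period_ns(f):.2f} ns, "
            f"half-period skew {skew_ns(f):.2f} ns"
        )
```

Under these assumptions, widening the bus lowers the clock the fabric must run at, but every extra bit is another signal that has to be routed and clocked coherently across the die, which is where routing capacity and clock distribution become the limiting factors.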
Typical applications that Xilinx will seek for the new parts include 400G OTN with intelligent packet processing and traffic management; 4X4 Mixed Mode LTE and WCDMA Radio with smart beamforming; 4K2K and 8K displays with smart image enhancement and recognition; and the highest-performance systems for intelligence, surveillance and reconnaissance (ISR).
Xilinx will be building both its Virtex and Kintex families in 20-nm technology, and expects to have samples of one – it will not say which – by year-end. The Artix family will stay in 28-nm technology, and extensions of the Zynq family will transition at some future time.
Xilinx, www.xilinx.com