Redesigned FPGA fabrics solve tough mid-range challenges

October 23, 2018 // By Ted Marena
New cost, power, and performance demands on FPGAs in a growing variety of mainstream, mid-range systems applications have led to fundamental changes in their design. Most FPGA vendors tend to focus on data-center workload applications, but a large percentage of users require different architectures for mainstream applications.

Vertical markets needing mid-range FPGAs include networking, cellular infrastructure, defense, commercial aviation, industry 4.0, and other traditional FPGA applications. Such applications are driving a new set of dynamics.

System designers must achieve a combination of low power and cost without forfeiting performance and security. This means looking at FPGAs differently, using new process technology choices, device architectures, transceiver strategies, and built-in security measures. Also important is a new fabric design that’s able to meet mainstream performance requirements while minimizing power and cost.

 

Optimizing FPGA fabric for performance, power, and cost

To meet mid-range system requirements, an alternative class of FPGAs is being manufactured on silicon-oxide-nitride-silicon (SONOS) non-volatile (NV) technology on a 28-nm node. These FPGAs typically consume one-tenth the static power of alternative SRAM FPGAs, and half the total power. Some attributes of these FPGAs, such as the non-volatile configuration memory, reduce static power directly; in other cases, total power falls indirectly as a result of reduced die area.

These FPGAs also use a traditional fabric, one of the key features of which is a four-input LUT (LUT-4) as the logic element. Six-input LUTs can provide speed benefits, which is important for data-center acceleration, but four-input LUTs are the better choice for power- and cost-sensitive mainstream markets. It is well established that four-input LUTs make more efficient use of die area than six-input LUTs: a given user design can be implemented in less silicon area with a 4-LUT architecture than with a 6-LUT architecture.

One contributing factor is that a six-input LUT requires 4X more configuration memory bits (64 versus 16), but can only accommodate about 1.6X as much logic as a four-input LUT. This traditional observation applies even more strongly to advanced fabrication technologies because SRAM configuration memory hasn’t scaled as fast as ordinary logic, due to the need to mitigate the risk of single-event upsets (SEUs). In contrast, FPGAs that use a SONOS configuration cell are immune to SEUs.
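The memory-versus-logic argument above is simple arithmetic, and a minimal Python sketch makes it concrete. The function name `lut_config_bits` is hypothetical, and the 1.6X logic figure is taken from the text rather than derived. The sketch counts only LUT truth-table bits and ignores routing configuration.

```python
def lut_config_bits(num_inputs: int) -> int:
    """A k-input LUT stores one truth-table bit per input combination."""
    return 2 ** num_inputs


# Relative logic capacity of a 6-input LUT versus a 4-input LUT.
# Assumption: this is the approximate figure quoted in the article.
LOGIC_RATIO_6_VS_4 = 1.6

bits4 = lut_config_bits(4)
bits6 = lut_config_bits(6)

# How much more configuration memory the 6-input LUT needs.
memory_ratio = bits6 / bits4

# Logic delivered per configuration bit, relative to the 4-input LUT.
logic_per_bit_ratio = LOGIC_RATIO_6_VS_4 / memory_ratio

print(f"4-LUT configuration bits: {bits4}")
print(f"6-LUT configuration bits: {bits6}")
print(f"Memory ratio (6-LUT / 4-LUT): {memory_ratio:.1f}X")
print(f"Logic per configuration bit (6-LUT vs 4-LUT): {logic_per_bit_ratio:.2f}")
```

Under these assumptions, a six-input LUT delivers roughly 0.4 times as much logic per configuration bit as a four-input LUT, which is the inefficiency the paragraph describes.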


Fig. 1: This cluster of three four-input LUTs has eight inputs and two outputs.


Fig. 2: This cluster of two six-input LUTs also has eight inputs and two outputs.

Consider an FPGA cluster of 12 four-input LUTs versus a cluster of 8 six-input LUTs. The total logic capability of the cluster (i.e., the amount of user logic that the cluster can accommodate) is similar in each case. The larger fan-in of the six-input LUT means fewer levels of logic may be traversed by the critical path within each cluster, potentially reducing the total contribution of intra-cluster delay to the critical path. However, from the outside, the two clusters appear similar; they have a similar typical number of incoming and outgoing signals, and so the total length and delay contributed by the inter-cluster wiring is similar in both cases.

Figures 1 and 2 show clusters with different LUTs, but similar numbers of incoming and outgoing signals.
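The comparison of a 12 x LUT-4 cluster with an 8 x LUT-6 cluster can be checked with the same rough numbers. The sketch below is illustrative only: the `Cluster` class and its fields are hypothetical, the 1.6 relative-logic figure is the approximate value quoted earlier in the article, and only LUT truth-table bits are counted (routing and other configuration memory are ignored).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    """Hypothetical model of an FPGA logic cluster built from identical LUTs."""
    lut_inputs: int
    lut_count: int
    # Logic capacity of one LUT relative to a single 4-input LUT (assumed).
    relative_logic_per_lut: float

    @property
    def config_bits(self) -> int:
        # Each k-input LUT needs 2**k truth-table bits.
        return self.lut_count * 2 ** self.lut_inputs

    @property
    def logic_capacity(self) -> float:
        # Total logic capacity in "4-LUT equivalents".
        return self.lut_count * self.relative_logic_per_lut


cluster_lut4 = Cluster(lut_inputs=4, lut_count=12, relative_logic_per_lut=1.0)
cluster_lut6 = Cluster(lut_inputs=6, lut_count=8, relative_logic_per_lut=1.6)

for name, cluster in (("12 x LUT-4", cluster_lut4), ("8 x LUT-6", cluster_lut6)):
    print(f"{name}: capacity {cluster.logic_capacity:.1f} 4-LUT equivalents, "
          f"{cluster.config_bits} configuration bits")
```

With these assumed figures, the two clusters offer similar logic capacity (12.0 versus 12.8 four-input-LUT equivalents), while the six-input cluster needs 512 truth-table bits against 192 for the four-input cluster.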
