Struggles ahead for corner-case, statistical simulations
So I have been thinking about the design environment for analog designers and the challenges that plague mixed-signal design engineers as we move to process nodes with significant layout-dependent effects (LDEs) as well as strong correlation among devices.
The challenge from a design perspective is complicated. In the past, the designer typically relied on corner simulations for feedback on how a circuit will behave in silicon and on whether it meets the design-for-manufacturability goals set by the company. So the question is: what happens when, in these new technologies, the designer cannot rely on corner models but must run extracted simulations even at the block level because of all the LDEs, correlations, and interdependencies among devices?
I co-authored a paper with some colleagues that addresses this concern about corner models and touches on part of the problem I will discuss in this article. The paper is “Corner Models: Inaccurate at Best, and it Only Gets Worst…” (Proc. IEEE CICC, 2013). Here is a brief statement from its conclusions that highlights some of the issues: corner models “cannot accurately bracket +/- 3σ variation in every performance measure for every circuit; ‘appropriate’ corner models are not just circuit-dependent, they also vary with the device sizes and biases used within a single circuit, and can be different for different measures of circuit performance for the same circuit.”
The paper highlights some of the reasons why corner models will not work for complex mixed-signal design. But why? Look at this image from a paper I presented at a modeling conference (Figure 1).
Figure 1: Effects on global and local parameters as the technology shrinks.
As shown in the image, as we shrink the technology, global variation and local mismatch merge, and it is no longer valid to simply run corners. Furthermore, as the technology shrinks, the electrical performance of transistors and other devices shows more interdependencies through LDEs, including the effect of pattern density on device performance.
Simply put, if we look at the variance of the sum of two random variables in these smaller technologies, correlation exists between the parameters. Equation 1 gives the variance of the sum of two generic parameters X and Y. When the variance is computed for two correlated parameters, the result (Equation 1) shows a covariance term that cannot be ignored for high-performance analog.
Equation 1: var(X + Y) = var(X) + var(Y) + 2·cov(X, Y)
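Equation 1 is easy to verify numerically. As a quick sanity check (my own sketch, not from the CICC paper), the short Python snippet below draws correlated Gaussian samples for X and Y and compares var(X + Y) against the right-hand side of Equation 1; the sigma and correlation values are arbitrary assumptions chosen only for illustration.

```python
import numpy as np

# Assumed, purely illustrative statistics for two correlated parameters X and Y
sigma_x, sigma_y, rho = 1.0, 2.0, 0.6
cov_xy = rho * sigma_x * sigma_y
cov_matrix = [[sigma_x**2, cov_xy],
              [cov_xy, sigma_y**2]]

rng = np.random.default_rng(0)
x, y = rng.multivariate_normal([0.0, 0.0], cov_matrix, size=200_000).T

# Left-hand side of Equation 1, estimated from the samples
lhs = np.var(x + y, ddof=1)

# Right-hand side: var(X) + var(Y) + 2*cov(X, Y), also estimated from the samples
rhs = np.var(x, ddof=1) + np.var(y, ddof=1) + 2.0 * np.cov(x, y)[0, 1]

print(f"var(X + Y)                          = {lhs:.3f}")
print(f"var(X) + var(Y) + 2*cov(X, Y)       = {rhs:.3f}")
print(f"var(X) + var(Y), covariance ignored = {np.var(x, ddof=1) + np.var(y, ddof=1):.3f}")
```

Dropping the covariance term underestimates the combined variation noticeably, which is exactly the error made when correlated device parameters are treated as independent.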
Therefore, with all of these effects that can change performance, designers of complex high-performance analog circuits are going to have to run statistical simulations to have confidence that the circuit behaves as desired and meets its performance targets. But how, when there are so many devices? This is the topic I would like to focus on next.
As stated, the smaller technologies have significant LDEs that cause devices to behave differently from a simple schematic representation of a circuit. The full effects of the LDE parameters are usually only accessible through some type of extraction of the final layout, which typically means the extracted circuit is a flat netlist representation of the schematic.
So, as already stated, because of the smaller geometries and the correlation between devices, we need to run statistical simulations to understand the true performance of a circuit with strong interdependencies through layout and device-parameter correlations.
So what does this mean for the designer who must run extracted statistical simulations? In the landscape of tools available to designers who run extracted simulations, there are limited options for selecting devices from extracted netlists for inclusion in a statistical simulation. This means that selecting devices for statistical simulations from the schematic is no longer a viable option. So what can be done? To address these issues, many designers may just select the top-level instantiation of the circuit in question with a wildcard such as /I*/I*. This lets the designer avoid poring through the netlist to pick the transistors to vary in a random manner (a rough sketch of this kind of wildcard selection follows below).
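As a rough illustration of that wildcard-style selection (my own sketch, not a feature of any particular EDA tool), the Python snippet below pulls instance names matching a hierarchical pattern out of a flat extracted netlist. The netlist format, file name, and '/'-separated instance naming are assumptions made only for illustration; real flows would do this through the simulator's own statistical-setup interface.

```python
import fnmatch

def select_instances(netlist_path, pattern):
    """Collect instance names from a flat, SPICE-like extracted netlist whose
    hierarchical names match a wildcard pattern such as '/I*/I*/M*'.
    Assumes one element per line with the instance name as the first token."""
    selected = []
    with open(netlist_path) as netlist:
        for line in netlist:
            line = line.strip()
            if not line or line.startswith("*"):   # skip blank and comment lines
                continue
            inst_name = line.split()[0]
            if fnmatch.fnmatch(inst_name, pattern):
                selected.append(inst_name)
    return selected

# Hypothetical usage: grab every MOS device two levels below the top of the block
devices = select_instances("block_extracted.sp", "/I*/I*/M*")
print(f"{len(devices)} devices selected for statistical variation")
```

Even a crude count like this makes the scale of the problem obvious: a top-level wildcard can easily sweep in thousands of devices, and every one of them adds statistical parameters to the run.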
Will this work? In some cases, maybe, but the sheer number of devices and parameters creates another problem that will be discussed shortly. What if the designer instead pores through the netlist and selects the particular devices to vary? I would say that, given the interdependencies stated earlier, selecting the correct devices to vary is problematic unless the designer knows the sensitivities of his or her circuit. Let’s investigate further.
If the designer chooses to run statistical simulations by selecting I* to avoid poring through the netlist, without knowing the device interdependencies, the number of parameters increases significantly and can lead to what Colin McAndrew calls spurious correlations. Let me illustrate.
When more devices are chosen for statistical sampling than the number of runs, the probability of what are called spurious correlations is high. This means that as the simulator moves parameters in a random manner, there is a high probability that parameters will appear to move in unison, since there are not enough simulations to cover the whole statistical space. For instance, in the scenario shown in Figure 2, the parameters run along the x-axis and the run number along the y-axis. As can be seen in this simple example, with so few simulation runs there is a real chance that all the parameters move in the same direction in some run (indicated by either all +1s or all -1s).
This is a valid point in the space, but it is not necessarily a valid ±3σ point. It could be an extreme far out in the tails of the Gaussian distribution, at some large sigma value we would never see in production. The problem is that with fewer samples than parameters, these spurious correlations could indicate that your results are great when in reality the exact opposite is true. The results could also show failures that are real but sit at extremes well beyond 3σ and would not occur in production. A bad result is still worth investigating, however, to check that the circuit is not deficient in some obvious way (for instance bandwidth, slew rate, speed, or gain). See Figure 2.
Figure 2: Simple example of statistical runs and extremes that can occur for a limited sample space.
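To put rough numbers on the Figure 2 scenario (the figure's actual parameter and run counts are not stated, so the sizes below are assumptions), here is a short Python sketch that assigns independent +1/-1 moves to a handful of parameters over a small number of runs and estimates how often at least one run pushes every parameter in the same direction.

```python
import numpy as np

rng = np.random.default_rng(1)
n_params, n_runs, n_trials = 4, 10, 100_000   # assumed sizes for illustration

hits = 0
for _ in range(n_trials):
    # Each run assigns an independent +1 or -1 move to every parameter
    moves = rng.choice([-1, 1], size=(n_runs, n_params))
    # Did any run push all parameters in the same direction?
    if np.any(np.all(moves == 1, axis=1) | np.all(moves == -1, axis=1)):
        hits += 1

print(f"Estimated P(some run moves all parameters together) = {hits / n_trials:.2f}")
print(f"Exact probability                                   = {1 - (1 - 2 / 2**n_params)**n_runs:.2f}")
```

With only a few parameters and runs, the odds of landing on an "everything moves together" sample are high (about 0.74 for the assumed sizes), yet such a point says nothing about where the true ±3σ performance actually lies.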
So what happens when the number of statistical samples is small for a given Monte Carlo run varying the device parameters? Colin McAndrew has provided a nice image (Figure 3) showing the correlations to expect from limited statistical samples when the number of parameters exceeds the number of statistical runs (the number of samples in the plot), for different methods of randomizing the parameters. The plot clearly shows that for smaller numbers of samples there is strong apparent correlation between parameters, which can produce the spurious correlations already mentioned and lead to erroneous outcomes.
Figure 3: Spurious correlations as the number of statistical samples changes.
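The trend in Figure 3 can be reproduced in spirit with a few lines of Python (for plain random sampling only; the figure's other randomization methods are not modeled here, and the parameter count is an assumption). For independent Gaussian parameters, every correlation we measure is spurious by construction, and its worst-case size shrinks only slowly as the sample count grows.

```python
import numpy as np

rng = np.random.default_rng(2)
n_params = 50                                  # assumed number of statistical parameters
sample_counts = [10, 30, 100, 300, 1000, 3000]

for n_samples in sample_counts:
    # Independent standard-normal draws: any measured correlation is spurious
    samples = rng.standard_normal((n_samples, n_params))
    corr = np.corrcoef(samples, rowvar=False)
    worst = np.abs(corr[np.triu_indices(n_params, k=1)]).max()
    print(f"{n_samples:5d} samples -> worst spurious |correlation| = {worst:.2f}")
```

The worst-case apparent correlation between supposedly independent parameters falls off only roughly as 1/sqrt(N), which is why selecting thousands of devices while running only a few hundred samples almost guarantees spurious correlations in the results.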
Given this plot and the need for statistical simulations, it becomes easy to see the difficult problem analog designers face in these new technologies. Because statistical simulations are required, the designer needs to think clearly about his or her circuit sensitivities in order to specify a statistical sample large enough to produce valid, spurious-correlation-free results. This means more design thought about circuit sensitivities, and smarter choices about which devices to include in statistical simulations. Furthermore, the designer will need to know his or her circuit far better than in the past, because the simulation cost from a pure compute-resource point of view is prohibitive. Then there is the question of what happens if the designer arbitrarily chooses I* from the top level of the block to run statistical simulations. If the designer chooses all of the devices in a decent-sized block, the simulations will take a long time to finish, if they finish at all, and almost certainly not in a timely manner.
I present this article to challenge the design community to arrive at a solution and to make analog designers aware of the real challenges of designing high-performance analog circuits in the new nanometer technologies. Having a large compute farm to run massive statistical simulations is not the answer. Designers must become smarter and more aware of their design sensitivities, or the results in silicon could be very different from what they expect. Getting it wrong is very costly: designers usually get one, maybe two, shots to get the circuit correct, or else have to live with reduced performance, which companies introducing new products will rarely find acceptable.
I would like to acknowledge and thank Colin McAndrew (Fellow NXP) for his valuable insight and contributions to this article.
Are you aware of these challenges?
What do you think is needed for satisfactory simulations results?
Brandt Braswell is a distinguished member of the technical staff at NXP Semiconductors and focuses on the development of data converters, with an emphasis on delta-sigma conversion.
This article first appeared on EE Times’ Planet Analog website.
Related links and articles:
In analog, EDA tools cannot replace common sense
Taking analog circuits to lower voltages
The perils of figure of merit based circuit selection
Analog synthesis remains remote