In this video, we will introduce the subject of statistical variability. We will discuss how the parameters of devices vary, how to model such variation, and what effect such variations have on device characteristics. Statistical variability is due to imperfections in fabrication processing and to atomic-level variations, for example edge roughness. Rather than having perfectly smooth edges, in reality we have edges that are to some extent rough. Statistical variability can have serious consequences in circuit design. I'll give you an example. You will recall that we've been plotting the log of ID versus VGS, between 0 and the maximum value, VDD. In digital operation, 0 would be used to turn the device off, and VDD to turn it on as much as possible and draw the maximum current from it. And we have been using a log ID axis in order to reveal several orders of magnitude of current, encompassing weak inversion, moderate inversion, and strong inversion. So we used to draw a single curve that was a straight line in weak inversion and then curved in moderate and strong inversion. But if the parameters of the transistor vary, then for each transistor you measure, you're likely to get a different curve. So you might end up with something like this. Now, this can be disastrous, and the reason is that some devices will have an off-current that is pretty small, but some of them will have orders of magnitude more current. Recall that this is a logarithmic axis over here. And this means that you cannot really turn those devices off. If you had a threshold shift of one or two tenths of a volt, you could have a change in the off-current by orders of magnitude. Now, some terminology. First of all, chips are made on silicon disks, which can be as large as a large plate. Here I show you one such disk. It's called a wafer. And on it, every square is one chip. For example, this one.
And chips are often called dies. One chip is one die; dies is the plural, but sometimes die is also used for the plural. Now, variability can be observed between two transistors on the same chip. This will be called variability within a die. Or you could have two different chips on a wafer, so this would be chip-to-chip or die-to-die variation. Now, wafers are made in lots. Several wafers are fabricated together, and in a given lot, let's say we have two wafers, there could be wafer-to-wafer variations: the same device, but on different wafers. Of course, you do expect to see some variation from one to the other. Or you could have one lot fabricated at a given time and another lot fabricated at a different time, possibly a different day or even a different month. You do expect to have lot-to-lot variations. And finally, you also have fab-to-fab variations. You may have different fabrication facilities, both of them supposed to run the same process, but of course you do expect variations from one fab to another. The types of variations we get are global and local. Global means that there are variations in the average value of parameters between, let's say, wafers or lots. Or you would have local variations, due to random variation even in adjacent transistors, and sometimes this is called mismatch. And there, the dimensions of the devices play a key role. So let me give you an example in very simple terms. Let us say we have two pairs of neighboring square-gate devices, these two and these two. These are large devices, and these are small devices. Now, the question is, which pair shows better matching characteristics? Are these two devices better matched than these two devices? We see here that we have some lack of smoothness. We have some roughness in the surface of the channel in the two cases. And likewise, for the bottom of the silicon gate, we also have some roughness. And these are local, random variations. Of course, the variations here cannot be repeated there.
They're different, but their average value is similar. In other words, if you would like to define an average oxide thickness in the two cases, it will be similar for these two devices. And that would lead you to a similar oxide capacitance per unit area, which will lead you to a similar threshold voltage for these two devices. Now, if you go to these two devices, things can be very different, right? Because the local variations cannot be expected to average out if you only have such small dimensions. So that means that this device can have an oxide capacitance per unit area which is significantly different from that of this device. And consequently, the thresholds can be significantly different. You can say similar things about other parameters, like doping concentration. So we do expect to see the gate area figuring in an important way in the matching characteristics of transistors due to local variations. So let's talk about the threshold variation with respect to its mean value. What I will do now is plot, on two axes, the thresholds of two devices next to each other, identically laid out as before. On the horizontal axis I will have the deviation of the threshold of device 1 from its mean value, and on the vertical axis I will do the same for the second device. So, here is one situation. In this case we have large W and L. This is the deviation from the mean of the threshold voltage for device 1, and this is the deviation from the same mean for device 2. Let's take one pair, represented by this asterisk. This means that for device 1, if we go down vertically, the threshold is found to be about 15 millivolts above the mean. And if you go like this, you may find that the second device has a threshold that is, let's say, 17 millivolts above the mean. You can see that both of them deviate from the mean by a similar amount.
Similarly, if you go to a device here, you can see the deviation for both of them is something like minus 20 millivolts. So there is correlation of the threshold in the two devices. Now, different symbols here represent different lots. So the asterisks represent devices all obtained from the same lot. The open circles represent devices obtained from a second lot, and so on. You can see that all devices within a lot tend to have similar threshold deviations. And all devices from another lot also have similar threshold deviations, but the two differ. So the mean value will differ from lot to lot, but the local variations are rather small, and the reason is that we have a large W and a large L. We explained intuitively what happens then: averaging takes place, and therefore the local variations do not have much of an effect. Now, if instead you have small W and L, then the plot looks something like this. By the way, these are measurements. So this is the deviation of device 1 from the mean, and the deviation of device 2 from the mean, in terms of the threshold voltage. And you can have situations like, let's say, this one, where it shows that for this pair, device 1 deviates from the mean by something like 30 millivolts and device 2 deviates from the same mean by something like minus 50 millivolts. You can see that the correlation is much smaller, and the reason is you have a very small W and L. So the first plot shows you what happens if the main effect is global variations in the average value of the physical parameters, whereas the second plot shows you what happens when local random variations dominate and you have large mismatch between adjacent, identically laid out devices. This can be a serious issue, especially in analog design, where you really try to match the characteristics of two devices next to each other. Now, how do we model variability? The ideal way is to focus on independent physical parameters, for example, oxide thickness and substrate doping.
And for some of these parameters you use relative variations. For example, the oxide thickness is equal to some nominal value plus some deviation because of global effects, plus some deviation because of local effects. For these local effects, we have already seen that, in order to suppress them, you need to have a large gate area. And in fact it turns out that the variance of this is proportional to one over the gate area, where variance is the square of the standard deviation. You can do similar things for substrate doping and mobility; you can model them again using relative variations, just like we did here. Now let us take some other parameters, for example, the flatband voltage. Here it doesn't make sense to talk about relative variations, because, let's say, if the flatband voltage is equal to zero nominally, then any variation from it would correspond to an infinite percent variation. So we talk instead about absolute variations. VFB has a nominal value, plus a change due to global variations, plus a change due to local variations, which are not normalized to the mean. But the variance of the local variation still turns out to be inversely proportional to the gate area. Now let's take delta W as another example. Delta W, I'll remind you, is the correction you need to apply to the mask width of a transistor in order to arrive at its real channel width. And delta W turns out, again, to have a nominal value, plus some change due to global variations, and some change due to local variations. So, let's say the device looks like that. Now, the local variations here cannot be expected to depend on the gate area, simply because of the nature of delta W; we're talking about how W is different.
If the device has some bumpiness along this edge and this edge, then the longer the channel is, the more this bumpiness will average out, and then you expect the variation due to local effects to be small. So it turns out that the variance of this quantity is inversely proportional to L, for the reason I just mentioned, not to WL. Similarly, delta L, which is the corresponding correction we need to apply to the mask length to arrive at the real length of the device, has some local variations that turn out to have a variance inversely proportional to W. Now, for independent statistical variables, we can add variances. For example, for the flatband voltage, assuming that the global and local variations are independent, we can take the sum of the two variances to arrive at the total variance of the flatband voltage. Now, for two devices we can define a correlation coefficient, as is done in statistics. For example, we have the correlation between the flatband voltages of two devices, 1 and 2. If we have very large devices, then the local variations will be small, much smaller than the global variations, and then the correlation coefficient between the two is approximately 1. This is close to what we saw for large devices in the delta VT plots that I showed you a couple of slides ago. On the other hand, for very small devices, the local variations become large, and then you have an almost zero correlation coefficient, which is close to the case for the small devices in the same delta VT plot that I showed you. Similarly for other parameters. Now, there are some important composite parameters, which are not fundamental parameters like the oxide thickness and substrate doping; they are not independent parameters. One is the threshold voltage. The threshold voltage will depend on oxide thickness, substrate doping, flatband voltage, and so on. So the threshold voltage is modeled the same way as the flatband voltage.
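For a parameter modeled this way, that is, a nominal value plus a global term shared by both devices plus a local term that is independent for each device, the correlation coefficient between two adjacent devices follows directly from adding variances. Here is a minimal sketch of that arithmetic. The function name and the numerical values are illustrative assumptions, not taken from the lecture, and the formula holds only under the stated assumption that the global and local terms are independent.

```python
def correlation_coefficient(sigma_global, sigma_local):
    """Correlation between the same parameter on two adjacent devices.

    Assumed model (hypothetical names): x_i = nominal + g + l_i, where g is
    shared by both devices and l_1, l_2 are independent of g and of each other.
    Then cov(x_1, x_2) = var(g) and var(x_i) = var(g) + var(l).
    """
    var_global = sigma_global ** 2
    var_local = sigma_local ** 2
    return var_global / (var_global + var_local)


# Large devices: local variation much smaller than global, correlation near 1.
rho_large = correlation_coefficient(sigma_global=10.0, sigma_local=1.0)

# Small devices: local variation dominates, correlation near 0.
rho_small = correlation_coefficient(sigma_global=1.0, sigma_local=10.0)

print(rho_large, rho_small)
```

The two limiting cases correspond to the two scatter plots: points hugging the diagonal when the global term dominates, and a nearly round cloud when the local term dominates.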
The threshold voltage has a nominal value, plus a variation due to global effects, and a variation due to local effects. The variation due to local effects turns out to have a variance inversely proportional to the gate area, for basically the reasons I mentioned before. And this constant, A sub VT, is measured, and it is an important parameter, at least in analog design. Another important parameter is the so-called beta, which is W over L times mu times C'ox. This is the coefficient of proportionality in front of all of our drain current equations. This one is modeled in the relative sense, so you have the nominal value, plus the nominal value times the relative variation due to global effects, plus the nominal value times the relative variation due to local effects. Now the local effects, again, have a variance that is inversely proportional to the gate area. And this A sub beta is an important parameter that circuit designers like to know. Both of these lead to the conclusion that if you want two devices matched well, both in terms of threshold and beta, you need to make their dimensions large. This is why, when you look at the layout of an analog chip, you'll often find devices that are significantly larger than the devices you find in digital circuits. I would like to briefly mention something about the correlation between different parameters. Let us take the threshold voltage. The threshold voltage is given by this formula. We have derived this formula. This is the body effect coefficient, and it is inversely proportional to the oxide capacitance per unit area. The beta parameter I showed you a moment ago is proportional to the oxide capacitance per unit area. Now you can see that the two quantities, VT0 and beta, are correlated, because they both depend on oxide capacitance per unit area and therefore on oxide thickness. If you treat these two parameters as independent parameters, then you may underpredict the effect of variations on your circuit.
And the reason is that the two parameters are correlated. To give you a very simple example, let us say we're trying to calculate the on-current in digital operation. For a device in saturation, the simple model had predicted this type of current, right? Now let's say that, on a given day in a given [INAUDIBLE], the oxide thickness turned out to be a little larger than the nominal, ideally expected value. That would make the oxide capacitance per unit area smaller, both here and there. This means that the threshold will become larger than expected, and beta will become smaller than expected. Now, if beta is smaller than expected, it will have an effect on the current: the current will be smaller than expected. But at the same time, because the threshold is larger than expected, VGS minus VT will be smaller, and that also will contribute to the current being smaller. So you can see that oxide thickness affects this and this in such a way that they combine in the worst possible way to give you a smaller current. So you need to take such correlations into account. I would also like to say that correlations sometimes exist between different types of devices, for example, between nMOS and pMOS devices if their oxide is formed the same way, and sometimes even between transistors and other devices, such as resistors made of polysilicon, because the gate of the transistors and the body of the resistors are made in the same way. Now, how do we simulate statistical variability? A popular way to do that is called Monte Carlo simulation, in which you assign common random values to the global parameters for many transistors, and separate random values to the local ones for each transistor. So let's say you have two adjacent devices: you give the same value to their global parameters, and a different random value to their local parameters. And then you run these simulations, each time with a different combination.
You do this tens or hundreds of times, and you superimpose the results, and you get an idea of the effect of variability on the characteristics of the transistor, or the characteristics of the circuit you're running. Now, of course, this is a time-consuming process, and sometimes, instead, people use the so-called corner simulations, in which they combine device parameters in various worst-case combinations. For example, they may take all of the variations that contribute to making your current small. And then they combine variations in such a way as to make your current large. So you may end up, for example, with one set of corner parameters that makes the device the fastest and another one that makes it the slowest. Such simple corner combinations may be adequate for much of digital design. But for analog design, the type of performance that you are seeking may not be adequately expressed in terms of what the current is or what the speed is. You may have additional combinations of parameters to take into account. So in analog and in radio-frequency circuits, corner combinations, although they are used, may not be enough by themselves. Finally, let's talk very briefly about measurement of variability. Special test structures are used, and they are often placed in scribe lines. Scribe lines are the lines that separate one chip from another on a wafer. This is where we cut to separate the individual chips. And since that area is wasted after you cut it, you might as well use it for something. So it is used to place test structures there and measure certain parameters. Sometimes dependent parameters are measured, for example, threshold and body-effect coefficient. We know that both of these depend on a number of parameters that may be common, for example oxide thickness. But indirectly from this, there are ways to find the statistical variation of other parameters that are independent, for example oxide thickness and substrate doping, through a process called backward propagation of variance.
So, in this video we have briefly talked about the subject of statistical variability in device characteristics. We distinguished two main effects, the global effect and the local effect. And we talked about how the local effect depends on the geometry, specifically on the size of things like the gate area, or W, or L, depending on the parameter. As I mentioned, variability has become a subject of great importance. And the reason is, as you saw already, that we need to make the devices large in order to reduce the local variability effects, but on the other hand, the push is to make devices smaller and smaller. So we have reached a point where variability is a prime concern.
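To make the global and local effects concrete, here is a rough Monte Carlo sketch of the threshold deviation plots discussed earlier. It is only an illustration under stated assumptions. Each lot gets one common random threshold shift (the global effect), and each device gets its own random shift whose standard deviation is a constant divided by the square root of the gate area (the local effect). The function names, the number of lots, and the numerical values for the global sigma and for the A sub VT constant are all hypothetical, chosen for illustration rather than taken from measurements.

```python
import math
import random


def simulate_pairs(width_um, length_um, n_lots=50, pairs_per_lot=20,
                   sigma_global_mv=15.0, a_vt_mv_um=5.0, seed=0):
    """Return (dVT1, dVT2) deviations in mV for adjacent device pairs.

    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Local sigma shrinks as the gate area W*L grows.
    sigma_local_mv = a_vt_mv_um / math.sqrt(width_um * length_um)
    pairs = []
    for _ in range(n_lots):
        # Global effect: one common random shift for every device in the lot.
        lot_shift = rng.gauss(0.0, sigma_global_mv)
        for _ in range(pairs_per_lot):
            # Local effect: a separate random shift for each device.
            d1 = lot_shift + rng.gauss(0.0, sigma_local_mv)
            d2 = lot_shift + rng.gauss(0.0, sigma_local_mv)
            pairs.append((d1, d2))
    return pairs


def pearson(pairs):
    """Sample correlation coefficient between the two devices of each pair."""
    n = len(pairs)
    mean1 = sum(a for a, _ in pairs) / n
    mean2 = sum(b for _, b in pairs) / n
    s12 = sum((a - mean1) * (b - mean2) for a, b in pairs)
    s11 = sum((a - mean1) ** 2 for a, _ in pairs)
    s22 = sum((b - mean2) ** 2 for _, b in pairs)
    return s12 / math.sqrt(s11 * s22)


large = simulate_pairs(width_um=10.0, length_um=10.0)  # large W and L
small = simulate_pairs(width_um=0.1, length_um=0.1)    # small W and L
print("correlation, large devices:", pearson(large))
print("correlation, small devices:", pearson(small))
```

With these assumed numbers, the large devices should come out strongly correlated and the small ones only weakly, which is the qualitative difference between the two measured plots. A real flow would draw the random values for the independent physical parameters and run a circuit simulator for each sample instead of using a single threshold number.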