Next we're going to talk about how to use ordinary bivariate generating functions to calculate moments. Now, this section has quite a bit of calculation in it. There's quite a bit of math on the slides, and it's a little more for reference than for me to talk through. But still, many of the calculations are familiar, and I think you'll see that it's worthwhile to include all the details. Really, I don't want all that math to obscure the bottom line: once we have a good expression for the ordinary bivariate generating function, it's actually very easy, in many cases, to calculate moments. So I'll go through the details in this section, and then in the next section we'll see lots of examples that illustrate how simple the calculation can be. We start off with the ordinary bivariate generating function (OBGF) that we defined for combinatorial classes with parameters: we sum, over all objects p in the class, z to the size of p times u to the cost of p. And we have the elementary identities that I'll write over on the right — the classical probability expressions — just for reference and for checking the kinds of generating-function manipulations that we'll be doing. So the first thing is enumeration: how many objects are there of size n? That's the sum on k of p_nk — sum over all costs k the number of objects of size n and cost k, and you get the number of objects of size n. From the OBGF, how do you get that? It's just the coefficient of z^n in the bivariate generating function evaluated at u = 1. In P(z, 1) the u goes away — it's all ones — and it's just the sum, over everything in the class, of z to the size of the object. Collecting all the objects of the same size, that's exactly the same as the sum of p_n z^n. So p_n is the coefficient of z^n in the ordinary bivariate generating function evaluated at u = 1.
Or you can do the sum and substitute p_nk in there and get back to P(z, u). Okay, so that's enumeration. And again, this math over here is not really needed except to check your understanding of what's going on. Now let's look at the cumulated cost. The way that we compute averages when using generating functions is to take the total cost of the objects of size n and divide that by the number of objects of size n; that's the average. That's a slightly different calculation than working with probabilities, but since we have this powerful mechanism for counting — and for getting counting results — it's natural to do it this way, and we can see from the OBGF how it works. So we're interested in the total cost of the objects of size n. Again, for each value of k, we multiply p_nk, the number of objects of size n and cost k, by k; that gives the cost for all of those objects. Then we sum over all values of k, and we'll call that q_n. What's that value in terms of the generating function? What we do is differentiate with respect to u — that's the marker variable for the cost — and then evaluate at u = 1. So take the partial with respect to u of the ordinary bivariate generating function, evaluated at u = 1. From the definition, that's the sum, over all objects, of the cost of the object times z to the size of the object; and again, if you collect all the objects of a given size n, that's exactly the same as the sum of q_n z^n. So q_n is the coefficient of z^n in the partial with respect to u of the OBGF, evaluated at u = 1. Or you can look at the representation at the top: if you differentiate with respect to u, it brings down a k, so you get k p_nk, and that's the familiar way to think of the total cost. In terms of the bivariate generating function, it's a simple partial derivative and then evaluating at 1.
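As a concrete sanity check on these identities — not part of the lecture, just an illustration — here's a short Python sketch that brute-forces the counts p_nk for binary strings of length n, with cost taken to be the number of 0 bits, and verifies that the cumulated cost q_n equals the sum of k·p_nk:

```python
from itertools import product

def pnk_table(n):
    """Build p_nk: number of length-n bit strings with exactly k zero bits."""
    table = {}
    for s in product("01", repeat=n):
        k = s.count("0")                    # the "cost" parameter
        table[k] = table.get(k, 0) + 1
    return table

n = 6
table = pnk_table(n)
p_n = sum(table.values())                   # number of objects of size n
q_n = sum(k * c for k, c in table.items())  # cumulated cost: sum of k * p_nk
print(p_n, q_n)                             # 2^6 = 64 objects, 6 * 2^5 = 192 zero bits total
```

Dividing q_n by p_n gives the average cost, here 192/64 = 3 = n/2, matching the direct probabilistic computation.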
So that gives us what we need to calculate the mean cost of objects of size n: take the coefficient of z^n in the partial with respect to u of the OBGF, evaluated at u = 1, and divide by the coefficient of z^n in the OBGF evaluated at u = 1. In many cases that's a simple calculation. And again, that's the average, which may be more familiar to you in the form over here: the sum of k p_nk divided by p_n. We can calculate the variance in the same way. The definition of the variance is the sum on k of (k minus the mean) squared, times the probability. In terms of the ordinary bivariate generating function, it's not hard to check that you take the second partial with respect to u, evaluate at u = 1, then add the mean and subtract the mean squared, just as with ordinary probability calculations. So that's an overview of calculating moments from an ordinary bivariate generating function: just partial derivatives with respect to the variable that marks the cost, and then evaluating at that variable equal to 1. And we'll look at an example next. So this is our bit-strings example: what's the number of 0 bits in a random binary bit string? We talked in the last section about the construction giving the OBGF P(z, u) = 1/(1 − z(1 + u)). So now we'll start with this equation and calculate the average number of 0 bits in a bit string. And again, we know the answer: it's going to be n/2, half the bits. But this will illustrate the calculation, and we'll use the very same straightforward method for more difficult problems. So how many binary bit strings are there? We evaluate at u = 1: that's 1/(1 − 2z), and then take the coefficient of z^n, which is 2^n, as expected. To do the cumulated cost — the total number of 0 bits in all bit strings of length n — we need to compute the partial with respect to u of the OBGF.
So, to compute the partial with respect to u of 1/(1 − z − zu): all that's relevant is the −zu term. We have (1 − z − zu) to the −1 power, so differentiating gives minus that quantity squared in the denominator, times −z, and the two minuses cancel. So the partial with respect to u is, over here on the right, z/(1 − z − zu)². To calculate the cumulated cost, we just evaluate that at u = 1, which gives z/(1 − 2z)². So for the cumulated cost we want the coefficient of z^n in z/(1 − 2z)², and that's an elementary calculation from standard power series: it's n·2^(n−1). And now the average is just the ratio of those two, and that's n/2, as expected. So it's just computing the partial with respect to u — which, once you've practiced a few times and remembered the definitions, is not a difficult calculation at all, and it's certainly one that can be done automatically nowadays. We're not going to compute the variance, because there's an easier way to do it that we'll talk about in just a second. Okay, now, I mentioned the idea of horizontal and vertical generating functions that go with bivariate generating functions, and now I want to take a look at moment calculation using those. So let's look at the moment calculations in terms of the horizontal ordinary generating function. That's a generating function describing the cost for objects of a given size. What we do is take the coefficient of z^n in the bivariate generating function, and we'll call that p_n(u). It's a generating function for the cost over all the objects of size n: it's just the sum, over all objects of size n, of u to the cost of that object. And on the cheat sheet over here, in terms of p_nk, that's the sum on k of p_nk u^k. So it's the generating function for cost.
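The coefficient extraction a moment ago — [z^n] z/(1 − 2z)² = n·2^(n−1) — can be double-checked with elementary power-series arithmetic. Here's a small Python sketch (just an illustration, assuming nothing beyond the geometric series 1/(1 − 2z) = Σ 2^n z^n):

```python
def convolve(a, b):
    """Coefficients of the product of two power series, truncated to the shorter length."""
    N = min(len(a), len(b))
    return [sum(a[i] * b[n - i] for i in range(n + 1)) for n in range(N)]

N = 10
geo = [2**n for n in range(N)]   # 1/(1 - 2z) = sum_n 2^n z^n
sq = convolve(geo, geo)          # 1/(1 - 2z)^2
coeffs = [0] + sq[:-1]           # multiplying by z shifts the coefficients up by one
ok = all(coeffs[n] == n * 2**(n - 1) for n in range(1, N))
print(ok)                        # True: [z^n] z/(1-2z)^2 = n * 2^(n-1)
```

Dividing by the enumeration count 2^n recovers the mean n/2 directly.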
That has only one variable, and since we're interested in knowing the number of objects of size n and the average cost of an object of size n, we can use that generating function directly to get the answers that we want. First, enumeration: if you evaluate at u = 1, you get the number of objects of size n. Evaluating at u = 1 is summing, over all values of k, the number of objects of size n and cost k, which equals the number of objects of size n. So p_n(u) is useful both for enumeration and for cumulated cost: if you just differentiate the function and evaluate at 1, you get the sum of k p_nk, which is the cumulated cost q_n. So: evaluate p_n(u) at 1 to enumerate, differentiate and evaluate at 1 to get the cumulated cost, and just divide those two to get the mean. Usually that's an even simpler calculation. And the calculation for the variance just involves the second derivative; again, that's easy to check from the definition of the variance. This method, using horizontal ordinary generating functions, is the one that we use most often to calculate average values and other moments of parameters. So let's apply this to the example of 0 bits in bit strings. Again, our starting point is the two-variable ordinary bivariate generating function. The horizontal version is the coefficient of z^n in that, which is (1 + u)^n: the OBGF equals the sum over n of (1 + u)^n z^n, so the coefficient of z^n is (1 + u)^n. That's our horizontal ordinary generating function. Now we can do the two steps to get our average. First, enumeration: evaluate at u = 1, and you get 2^n immediately. To get the cumulated cost, we differentiate with respect to u and then evaluate at 1. Differentiating with respect to u brings down an n, giving n(1 + u)^(n−1), and evaluating at 1 gives n·2^(n−1), immediately.
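To see the horizontal method in code: here's a brief Python sketch (an illustration, not the lecture's notation) that takes p_n(u) = (1 + u)^n through the two steps — evaluate at 1 to enumerate, differentiate and evaluate at 1 for the cumulated cost — using the binomial coefficients C(n, k) as the p_nk:

```python
from math import comb

def horizontal_ogf_coeffs(n):
    """Coefficients of p_n(u) = (1+u)^n: p_nk = C(n,k) bit strings with k zero bits."""
    return [comb(n, k) for k in range(n + 1)]

def mean_cost(n):
    p = horizontal_ogf_coeffs(n)
    count = sum(p)                               # p_n(1): number of objects of size n
    total = sum(k * c for k, c in enumerate(p))  # p_n'(1): cumulated cost q_n
    return total / count

print(mean_cost(10))   # 5.0 — the mean number of zero bits is n/2
```

Evaluating a polynomial's derivative at 1 is just the weighted coefficient sum Σ k·p_nk, which is why the code never needs symbolic differentiation.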
And so the average is the ratio of those two, which is n/2, as we expect. Then, taking the second derivative, adding the mean, and subtracting the mean squared, we immediately get n/4 for the variance. That's an easy calculation, and it tells us that the standard deviation is proportional to the square root of n while the average is n/2, and what that means is that the distribution is concentrated: a typical value is going to be close to n/2. I'll talk in just a minute about what that means. For now, the point is that, given a two-variable OBGF, if you can extract the coefficient of z to the size, then you get a generating function that's easy to work with to calculate moments. Now, we can also do moment calculations with vertical ordinary generating functions. I put this slide up for completeness; I'm not going to spend a lot of time on it, because in these lectures we're not going to do any derivations this way. But it's the same idea, just on the other variable — vertical instead of horizontal. If you take the coefficient of u^k in the ordinary bivariate generating function, you get the generating function for the objects of cost k, counted by size: you're just holding the cost constant. That's the sum on n of p_nk z^n, which we'll call q_k(z). To enumerate, we do it the same way, with the bivariate generating function. But the cumulated cost is the coefficient of z^n in the sum on k of k·q_k(z), where q_k(z) is the vertical generating function. So we have to extract another coefficient, but there are some problems where it's easier to do the calculation this way. And it gets down to the mean again, and you could do the variance the same way, but I'm going to skip it. Again, this is just for completeness — this is not the way to determine the average number of 0 bits in a bit string, but it's mechanical, and it illustrates the idea.
So if we pull out the coefficient of u^k, we get the vertical generating function: in this case it's z^k/(1 − z)^(k+1). We enumerate the same way as before to get the 2^n. The cumulated cost is the coefficient of z^n in the sum on k of k times the vertical OGF. That's a familiar geometric-type series with an extra factor, and here are the details for those of you who are used to those sums: the sum evaluates to z/(1 − 2z)². The coefficient of z^n in that is n·2^(n−1), so we get n/2 as before. In this case there's more work to get the same answer, but there are cases where it's less work; it's just a matter of the order in which we extract the coefficients. Usually it's better to use the horizontal version, where we extract the coefficient of z^n first. Now, in this simple example, we're always getting n/2. So why the focus on the moments? Well, we're always interested in the average — what value can we expect? This is classical in probability, so I'm just going to take a slide to mention these classical results. If we have a random object of size N with mean mu_N and standard deviation sigma_N, then we have these famous inequalities: the Markov inequality and the Chebyshev inequality. They say that the probability of being much larger or smaller than the mean must decay, and an upper bound on the rate of decay is measured in units given by the standard deviation — that's why the standard deviation is important. The Markov inequality bounds the probability of being many times larger than the mean, and the Chebyshev inequality says that the probability that you're t standard deviations away from the mean is less than 1/t². Actually, we can get much more accurate results than these for many of the things we study, because we have the full distribution.
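As an illustration of the Chebyshev inequality on a distribution we know exactly — the binomial distribution of the number of 0 bits in a random bit string — here's a small Python sketch (the parameters n = 100 and t = 3 are just example choices):

```python
from math import comb, sqrt

def chebyshev_check(n, t):
    """Exact P(|X - mu| >= t*sigma) for X ~ Binomial(n, 1/2),
    the number of 0 bits in a random n-bit string."""
    mu, sigma = n / 2, sqrt(n) / 2
    tail = sum(comb(n, k) for k in range(n + 1) if abs(k - mu) >= t * sigma)
    return tail / 2**n

n, t = 100, 3
p = chebyshev_check(n, t)
print(p <= 1 / t**2)   # True: the exact tail is within the Chebyshev bound 1/t^2
```

As the lecture notes, the bound is far from tight here: knowing the full binomial distribution gives a much smaller tail probability than 1/t².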
But still, the basic idea is the same, and what we say is: if the standard deviation is of smaller order of growth than the mean, then we call the distribution concentrated, because that means that as n gets larger, the random value tends towards the mean. We can think of the mean value as a typical value. We could calculate specific bounds, but usually it suffices, in the analysis, to learn that the distribution is concentrated, and we move on, knowing that we could do a more detailed analysis if we wanted to. This is just a formal statement of that: if a distribution is concentrated, then the ratio of your random value to the mean approaches 1 in probability, in a very specific sense. So when it's concentrated, we say the expected value is typical — it's the one that you really do expect to get. And this is just an example for random bits. If you have 100 million random bits, you expect about half of them to be 1 bits and about half to be 0 bits. The standard deviation of the number of 0 bits is the square root of N over 2, which here is only about 5,000. So the probability that the random value you get is between 49,900,000 and 50,100,000 is better than 99%. That's the kind of result that matters in practice: with really high probability, we have a good idea of what the value of the random variable is going to be. That's what we care about with concentration. So this is just a plot of the distribution, which is the binomial distribution: there's another curve for each n, and what it shows is that as n gets higher, the curves come tighter and tighter together towards the mean. And that's true in many, many cases — in fact, this particular curve describes lots of distributions; we'll get to that later. All right, here's another example, really a generalization of our number-of-0-bits example: let's talk about the class of all M-words.
That's just strings where every letter is an integer between 1 and M — a sequence of sets, as we talked about last time. The size, again, is going to be the number of letters, the length of the word, but let's take as a parameter the number of occurrences of a particular letter in the word. So, to construct the bivariate generating function, we use u to mark one particular letter, and we don't mark all the others — that's (M − 1)z. We had M = 2 before, where we used z + uz; now we have general M, so it's (M − 1)z plus uz. And that immediately translates to 1/(1 − (M − 1)z − uz). So now that's our ordinary bivariate generating function. If we want to enumerate, we just evaluate at u = 1: then we want the coefficient of z^n in 1/(1 − Mz), which is M^n. So that checks. And if we want the cumulated cost, we differentiate with respect to u and find the coefficient of z^n. That's the same kind of calculation that we just did for M = 2, and we get n·M^(n−1). So the average number of occurrences of a given letter in a random M-word with n letters is n/M, and again, that's as expected. We can also go ahead and compute the variance, and it's n/M − n/M². The square root of that is proportional to the square root of n, so it's concentrated. And that's a direct analysis using the symbolic method, followed by calculations using partial derivatives — the basic method that we're going to use to evaluate parameters. And I mentioned the practical application of hashing algorithms; this is from our algorithms books, or Knuth's books, or any book on algorithms.
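The n/M mean and n/M − n/M² variance for M-words can be confirmed by brute-force enumeration for small cases; here's a quick Python sketch (the parameters n = 5 and M = 3 are arbitrary example choices):

```python
from itertools import product

def letter_count_moments(n, M):
    """Brute-force mean and variance of the number of occurrences of one
    fixed letter (here, letter 0) over all M^n words of length n."""
    counts = [w.count(0) for w in product(range(M), repeat=n)]
    total = len(counts)                                # M^n words
    mean = sum(counts) / total
    var = sum((c - mean) ** 2 for c in counts) / total
    return mean, var

n, M = 5, 3
mean, var = letter_count_moments(n, M)
print(mean, var)   # mean = n/M = 5/3, variance = n/M - n/M^2 = 10/9
```

The exhaustive sum over M^n words is only feasible for tiny n and M, of course — which is exactly why the generating-function calculation is the method of choice.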
We organize a table of values by hashing them to random values between 0 and M − 1: hash n items into M lists. We're interested in the lengths of the lists, because we're going to have to search a list sequentially. So given that strategy, what we're interested in is the average number of keys on a list. And again, it's trivial that the average is n/M, but that's also what we just proved with the symbolic method. The extra analysis to get the standard deviation tells us that the distribution is concentrated, which means that as n increases, that value is going to be typical. And that's useful to know in practice: we're not likely to have the case where all the keys fall on one list and none fall on the others. So that's just one practical example. In many, many of the applications that we look at, what we want to do is find the average value and show that the standard deviation is of small order, and that's the case in many, many of the applications that we study. So that's the process of calculating moments from the ordinary bivariate generating function — again, with an illustration from familiar binary strings, but also one practical application. Next, we'll look at applying these same kinds of calculations in other applications.
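To close, the concentration claim for hashing can be seen in a quick simulation — a sketch with made-up parameters, using a pseudorandom generator as a stand-in for a real hash function:

```python
import random

def hash_list_lengths(n, M, seed=42):
    """Throw n keys into M lists with a (simulated) random hash; return list lengths."""
    rng = random.Random(seed)
    lengths = [0] * M
    for _ in range(n):
        lengths[rng.randrange(M)] += 1
    return lengths

n, M = 100_000, 100
lengths = hash_list_lengths(n, M)
print(sum(lengths) / M)   # average list length is exactly n/M = 1000.0
print(max(lengths))       # concentration: even the longest list stays close to 1000
```

The longest list is only a few standard deviations above n/M, nowhere near the worst case of all n keys landing in one list — the practical payoff of knowing the distribution is concentrated.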