We're going to finish off by looking at the idea of. Counting directly with generating functions. This is going to be first step to ease us into really important role that generating functions play, in analytic combinatorics. so it's a, so it's a really an alternate view of what generating functions can do for us, and it is very combinatorial. I will be much more formal, thorough and extensive coverage of this, but still it is good to look at the same problem then, we just looked at from this point of view in this lecture. So what we're going to do is define a class of combinatorial objects that have an associated size function. And the generating function's going to be a sum over all members of the class. We'll still get to the same point of getting a, an equation that's, a generating function must satisfy. And from that point on, the analysis will be the same. but getting that equation is much more natural and fundamental with this kind of correspondence. It seems kind of abstract. But, so, let's look at a specific example. So lets say T is a set of all binary trees. then we're going to define a size function. that looks like absolute value. if you have any binary tree, it's going to be the number of internal nodes in that tree. [COUGH] Now, what we're interested in is a counting sequence and that's what we're interested in before. It's the number of binary trees from the set of all binary trees that have size function equal to n. So, the, that's the number of binary trees with n internal nodes. That what we were counting before. This is just a, a formal definition, tree by tree. Now, what's that good for? Well. If we define the generating function to be the sum over all possible trees, of Z to the size of the tree, then that, treats each term individually [COUGH]. Each tree contributes individually to a term in the generating function, but it's exactly the same as if the trees of size N were collected together and counted by T sub N. So that's a fundamental idea on, of this equation we, some over all possible trees. but since it's Z to the size of the tree, all the ones of the same size are going to contribute to the coefficient of that size, say N, and give us T sub N. so it's the same generating function looked at a different way. And so and the reason that's helpful is that [COUGH] we can look at the way that we define the combinatorial structures. to give us the equation that we want for the generating function. and so, this is how it works for binary trees so definition of a binary tree. with [COUGH] is that it's a root node with a binary tree on the left and a binary tree on the right. and say on the left there's T sub L internal nodes, on the right there's T sub R internal nodes. if we take that sum of all possible trees and it's got all possible left nodes. All possible right nodes then but we had [COUGH] Z to the size of the whole tree that's the size of the left, the size of the right, +1. So, the decomposition that we use to define what we mean by, what is a binary tree? immediately gives this equation on the generating function. there's, it starts out with a one. That's, for the empty tree that doesn't compose. So the double sum is for non empty trees. so that's a first key step to understand that the, And the decomposition, the recursive decomposition, the way that we define binary trees, immediately translates to an equation on the generated function. so now that equation, those T sub N and T sub R are just afformal variables and they're independent, so, we can immediately split that sum up into Z times sum overall Teesabellsi to the T sub L over all T sub R the T sub R and then get the answer that T of Z equals, one plus Z, T of Z squared. Just the same as we found before by worrying about all the counting, but this is much more direct. Really the equation says that, a binary tree is empty or it's a root connected to two binary trees. And actually we're going to see a very formal way to really go right from the [COUGH] description of what it is right down to that generating function equation. so here's another way to look at it, just to make sure so the generating function is really got a term for each tree. So those are all the possible trees and it just keeps going. So each tree of size 3 is represented. It's a sum over all trees, Z to that tree size. now when we're worrying about the counts, we're just collecting all the terms for the same exponent, we're just doing the algebra. so now but if you multiply TTfz, of Z * T of Z, it's like taking two trees and composing them. Well that's what we mean by binary tree, take two trees and compose them. . so T1+zT(z^2), of Z = 1 + ZT^2, If we write the things out symbolically as trees then you can see immediately that this tree here comes from one of those, and one of those and so forth. it's in fact, some mathematicians prefer to work with, the symbols in, in some cases in, in commonotorics, people, go very far working with completely symbolic representations. It's a good way to think about it, it's, there's a term in that generating function corresponding to each tree, when we're trying to find out how many there were of each size, we're just doing the algebra of, collecting by size. Now, that's, important, but there's, another idea that is related to this, that, I want to introduce, and that's all about cost. So a lot of times, it's not just about the size, it's about some other property of the, of the structure. and so so, we'll use, we'll do two examples. One's a very easy example. How many one bits are there in a random bit stream? Well everybody knows it should be about half. but still that'll be a warm up. We'll get the right answer for that. A more complicated thing is that's [COUGH] if we're using a binary tree in a computer representation, it's important to know how many of the nodes have both leaves external. you can save space in that way in a lot of situations. so if we have a random binary tree, how many leaves are there? then maybe that's not so easy. in fact with generating functions we can have a unified approach to studying perameters. and again that's one of the key benefits of the way, analytic combinatorics way of looking at things. and I want to do these examples to show the advantages of using generating functions to help us do calculations and analysis like this. so again it's the same kind of idea. We're going to have a class it's, the same idea. We're going to have a class of combinatorial objects. our model is going to be that all objects of each size are equally likely. and that's realistic in a lot of situations. so lets look at how the calculations look when we're trying to find the value of a parameter. so let's say the all, set of all objects in the class is p and then we have a size function which, for every object in the class, we have a defined size. then we have the counting function so that's the number of objects that have a size n. so that's what we've been talking about up to this point. but now let's say we also have a cost associated with each object. And then, in that case we're going to be interested in the number of objects that have a given size and a given cost. And, were, want to know that, so we can do the calculation of the average value. So that is the expected cost of an object of size n, is going to be the probability that the object cost of an object of size n is k. which is the number that F costs K divided by the total number. then we're assuming their all equally likely. times k so that's the definition of the expected cost. and so that's all a familiar calculation if we know all these quantities. But what. This point of view buys for us is, well let's notice that we can factor out the p sub n, and we can just count up the total cost. of all objects of size n. That's called the cumulative, the accumulated cost. so every object's got a cost associated with it. and then we take all the objects of size n. Add up all their costs. And divide that by the number of objects of size n that's the expected cost. It's a trivial calculation but still it's an, an important distinction. so the, the idea is we, if we can compute the accumulated costs, then we can get the expected costs just by dividing the accumulated cost by the number of objects. now what's that have to do with generating functions? Well again [COUGH]. we start out with the, same situation. let's take a look at the generating functions. So we have already talked about the counting generating function. If we sum over all objects in the class Z to their size, then just algebra collects their terms to get the [COUGH] coefficient of Z to the N as the number of objects of size N. but we can have a similar. Generating function to compute the cumulated cost. If we look at the function c of z which is the sum for every object in the class. Its cost times Z to the size. Then, those things are going to collect by size. And so the coefficient of z^n in that is going to be nothing other than for all values of k, the sum of k times p and k. It just collects the objects of cross k by size. And it will get them all. So the coefficient of z^n in that function c(z) is exactly the accumulated cost. So that gives us the average cost. We just extract coefficients from those two generating functions. the bottom line is. If we want to compute the expected value of a cost, we have two GF counting problems to solve. and their similar. we know how to solve we've already done examples where we can solve GF counting problems. And now we can get average values of parameters by solving GF counting problems. so again, it's a, it's a. We, at trivial calculation, to say, we're going to compute the average by computing the total and divide by the number. But still, it's fundamental because now everything is counting. And we're going to have very powerful mechanisms for counting. So let's see how it works for the two examples that I mentioned. How many one bits in a random bit stream? Okay so these are sort of orbit strings. the number of bits is our size function so that's going to be the size function for any bit string then in with the cost functional B will call ones B that's the number of one bits in a given bit string. number of bit strings of size n that's besoben. and that's 2n. [COUGH], and then CeceVan, we can use GS to get that, but, no let's no, no hit, and we're happy with that. And then, the'cumulated cost function Cece Van, is the total number of one bits in all bit strings of size N. So counting g f, so that's b of z, sum over all bitstrings z to its size and that's equal to two to the n, z to the n. One over two, one over 1-2z. so that's, b of z. And actually, we can, get that formally just from the definitions. But, I'm sure you believe that one. what about the cumulative cost GF? So, [COUGH]. So that's that's the function. so now, what we want to do is similar to what we did for the Catalan. Is to use a, a recursive description of what a bit string is, to get us a, a formula for this function. And well, what's a bit string? It's either a zero or a one, followed by another bit string. So [COUGH], for if we're going to have for all bits strings the number of 1's. That's going to be e-, equal to. for all bit strings you can put a zero or a one in the front and that, that'll give you all possible bit strings. In this whole set of bit strings there's one, one bit. plus there's two. How ever many one bits there are in the bit string b prime. And that's summed over all b, b prime. And we had, length of b. But it's the length of b prime, plus one. So again, that recursive description immediately gives that formula for the accumulated cost, GF. And so now, just doing the math. that the first term gives us a ZV of Z, and the second term gives us a 2Z C of Z. So that's an equation for the accumulated cost function and we've got the what B of Z is. And so we can just solve that equation. It's [COUGH] Zb of Z is one over one minus 2Z. And then C of Z, bring 2Z Cof C over to the left hand side. And so we divide by another factor of 1 - 2Z. and so that's a proof that C of Z is equal to Z over 1 - 2C squared. So now we have explicit expressions for both the enumerating GF and the accumulated cost GF. And all we need to do is extract coefficients from those two functions in order to get the average cost. 2zz/(1-2*z^2) / 1 - 2z^2 is, that's a standard generating function that we've seen several times before. And so the bottom line is, the coefficient of z^n and the accumulated cost is n*2^n. N minus one, coefficient is z to the n, in [COUGH] and the numeration is to the N, and that gives us the result that the expected cost is N over two. Again that's a warm up we kind of knew its n over two. now lets do a problem that you might have a lot of difficulty solving some other way and that's these in binary trees. So again a leaf in a binary tree is an internal node whose children of both are external. so [COUGH] for example, in the trees of size two, each one of them has one, leaf. So the if we wanted to go down to do all the counts, So, t of n's the number of binary trees with n nodes. TNK is the number of n known binary trees with k leaves. So, T2-1 =two. There's two of them that have one leaf. and CN is the average number of leaves in a random n node binary tree. So that's the ratio of those two. So C21. = 1 so for five there's four of them that have one leaf, so I'm sorry for three, there's four of them that have one leaf, so T31 equals four. There's one, the balanced one, has two leaves, so T31 equals one. And so, [COUGH] if you want to compute the average number of leaves in a binary tree with three nodes, it's four plus two times one divided by five, or 1.2. And similarly for fourteen, we find that eight of them have one leaf and six of them will have two leaves and that gives a solution. So what's the average number of leaves in a, in, in a random binary tree? If we treat them all likely how do we get that counting sequence? and again, that's actually a practical problem that plagued programmers when binary trees first came into use because these things were a wasteful of space and people want to know how much space they could save. O.k. So let's [COUGH] use cumulative counting to solve that problem. So. set of all binary trees internal nodes is a size function leaves is our cost function number of leaves in the tree. the number of binary trees is size n of catalyn numbers as c analyses that we did and now what we want to do is develop a generating function for the cumulative cause which is the total number of leaves in all binary trees of size n. so the counting GF, we already did that one. so that's just summarizing that. and the accumulated cost GF, that's for the [COUGH] at c of z sum over all trees, number of leaves times z to the size. [COUGH], and then, the average number of leaves in a random endo binary tree is just the ratio of the coefficients of those. We already know the coefficient of the number of trees, that's the catalyn number. So, we're looking for coefficient of Z to be in in C of C. So, that's the next thing we're going to do is try to find that cumulative cost function. so [COUGH] that's the, that's the function. Now again we're going to use the same. kind of decomposition that we did when we're enumerating trees. And, and actually usually, that's what happens. The, same formal description, that gave us the number of trees, it's going to give us, the cost. And that's why this methods so powerful. we, we, do the work to figure out the composition, of the way to describe it. and then we get to apply it twice. Once to find out the total number, and the other to find out the total cost. and so. [COUGH] that the composition immediately leads to this equation for the cumulated generating function. So there's the tree that is just a leaf. so that's accounts for the Z term. and then for all the other trees there's a left and a right. So that's T sub L nodes on the left and T sub L nodes on the right. and however many leaves they are on the left [COUGH] they're. In the tree, is the total number of leaves on the left, plus the total number of leaves on the right. The leaf can't, if it's got more than one node in it, the leaf can't cross between the two trees. so, this equation here holds exactly. Again, the plus one for the root but that same decomposition gives us this same equation. And again, T sub L and T sub R are independent so that immediately [COUGH]. allows us to break these break these sums up into independent sums. So we have leaves of t sub l, z to the t c l, t sub l. times the sum on T sub R there's, two terms that are similar, one where we, for the T sub L and one for the, T sub R and then, and then we have the double sum, so we distribute over those, so that's just elementary distribution, and then those things, we have expressions for every one of them, leafs of T sub L Z, to the T sub L that's Z of Z. And sum t sub r t z to the t sub r, that's t of z and we have two of those. So that gives us [COUGH] a simple equation for. Z of Z, the in-cumulative generated function. All that's left is to extract co-coefficient's from that generating function. So this is the summary of the deriration so far. we're looking for that CGF. we did that decomposition, and we have an equation for it. we know the number of tree's is the catalan numbers. and so our accumulated generating function is c of z, z plus 2z, t of z c of z, and we solve for z of c. We get z over one minus 2c t of z. catalan numbers multiplied by 2z then subtract one, you get z over square root of one minus 4z. So that's an explicit expression for the cumulated cost function. C of Z equals Z / 1 - 4Z. And we can extract coefficients from that the same way that we did for the Catalan numbers using the binomial theorem. the end result is, 2N minus two choose N minus one. And very much the same calculations. Just with a slightly different result. that's accumulated cost and then the final thing is to divide those two and if you divide those two almost everything cancels. except for an N plus one times NN on the top. 2N times 2N minus one on the bottom. so that's three N's and two N's and that's an asymptotic 2N over four. So about a quarter of the. [COUGH] Nodes in a random binary tree are leaves. [COUGH], and that's an example of the use of Accumulated generating functions to discover the average cost of a, of a quantity. And we'll be seeing lots and lots of derivations like this and actually many of'em will be much simpler because we have coherent ways to deal with explaining the decomposition in terms of the generating function. and we'll talk about that when we introduce analytic combinatorics. So that's counting with generating functions. And so now I want to finish by Giving you a few exercises that you might do before the next lecture. so the first one is to just practice solving an recurrence, and this is an example that shows that the initial conditions really matter. So solve recurrence with one set of initial conditions, and then solve the same recurrence with just that one initial condition changed. and you can see the impact of just that one change. on the recurrence and also get some practice on sovereign recurrences. another thing that You might do is practice expanding some unknown generating functions. And there's many exercises like this in the book. these are just a couple that you might try. So what about one over square root of 1-Z? natural log of one over one - z. There's a bunch of ways to do that there's a hint or one way that might work. that'll give you some practice in how we extract coefficients from generating functions. And you can try some other exercises there as well. So, what I, think would be useful, for people to do to, make sure they understood the material in this lecture. one thing is to, if you've got access to a symbolic math, math system, [COUGH] or if you're used to using one. do things like, check the initial values on that Equation that we got for accumulated costs for leaves in binary trees. Similar to the check that I did just that the regular Catalan recurrence holds. And if you don't have a symbolic math system, look around to see if you can do that in some way. Some are freely available. I think again the best way to learn this material is after you've listened to the lecture and got some idea of the overview of the material is to read the text carefully.'Cause the text really does tell the story in, in some detail. and then it's certainly worth while to check your understanding by really try to write up full solutions to those exercises. maybe using TEC. Or, or maybe using HTML plus Jack and really seeing that you can create the math that is the solution to those exercises. That's an introduction to generating functions.