[MUSIC] Binary search trees can take on many different forms and structures, even if they contain the exact same data. Consider these two binary trees. Both of these binary trees are correct binary trees in the sense that the left side always contains nodes less than the root and the right side contains nodes only greater than the root. Both of these containing numbers 1 through 7, they're both correct binary trees, but they have a very, very different structure. So these two trees, as I mentioned, contains all the same values. And the way we built these trees was by varying the insertion order of the trees. Here I built a tree by inserting 4, then 2, then 3, 6, 7, 1, and 5. By doing that we create a tree that is perfectly balanced and the number of nodes on the left side is equal to the number of nodes on the right side. On the other hand, I created a tree that we have nodes that are entirely balanced to the right. The root node in that being 1, so everything is going to be greater than 1, and everything's going to be on the right side of the tree. So we want to think about how can we build trees that are both efficient, and what happens when we build a tree that's not so great? What is the average case, what is the worst case, and what's the very best we can do? So let's think about how many possible ways there are to insert data into the same binary search tree. So if we're having numbers 1, 2, 3, 4, 5, 6 and 7 we can think about how many possible ways we can insert. So this is a pretty simple puzzle in the sense that we can just choose one of them and say that's going to be the first number we insert. So let's say we insert 5. By inserting 5 first we know that 5 is going to become the root. At that point we have the choice of six other nodes to insert second, and we could make any choice here. We could choose 2. Because the first choice we made was the choice of 7, the second choice we made was the choice of 6 nodes. Next choice we'd make is the choice of 5 nodes, then we have a choice of 4 nodes. So the number of possible ways we can insert the exact same data is going to be equal to 7 factorial, or the number of nodes in our data n factorial. So there are n factorial different ways to insert the exact same data into a binary search tree. So what that means is we know that there's a whole slew of different possibilities of how our binary search tree might be structured. In the very worst case, we know our binary search tree may just be a linked list with the smallest node at the root, and we increase it in increasing order. So if we insert it by 1, 2, 3, 4, 5, we can see that that tree is going to look like a linked list that's going to move to the right at every single position. So if we need to find a node in that worst case binary search tree, we're going to see it's going to take order n searches through that data. So every single element of that data's going to be searched. This is the exact same thing as searching through a linked list, because in a linked list we can't skip any elements. So we're stuck with this order n runtime. On the other hand, a sorted array has the ability that we can jump directly to the middle. So as we discussed in arrays, we can access any arbitrary element. In a sorted array we can say, if all the elements are in this array, and they're in a sorted order, then we can just simply jump to the middle of the array and say, okay, three, is that data acceptable? Is that data I want? Maybe. If not, I can move left, if it's less than that, or right, if it's greater than that. So we see that this is going to be a logarithmic running time. So this is going to be O(lg(n)). And we find that the average case of a binary search tree is going to be a binary search tree that is pretty well balanced. If we think about all the possible ways we can insert in binary search tree, only one of these ways is going to result in a tree that leans directly right. Only one way results in a tree that leans directly left. On average, half our data is going to be to our left, half our data is going to be to our right. So an average case binary search tree is just like a sorted array, O(lg(n)) time. Remember that all of our operations depend upon find. So when we're finding an element in a binary search tree, we insert it by just adding one more line of code. We remove it by just considering four particular cases that in itself only call find. So we find for a binary search tree the running time of find is going to be the running time of both insert and remove, because insert and remove only do constant time operations after finding the element. On the other hand, a sorted array, we've discussed the idea that an array has to be contiguous memory. So if we insert and we move from an array, this is going to be O(n) time. And from a sorted list, we have to find the element, so we have to spend the time to find it and then we can insert quickly. The best outcome, the only algorithm that runs in log n time for both find, insert, and remove is this average case binary search tree. We can't do that with an array. We can't do that with a sorted list. Both of these have some run time that's going to run in O(n) time. Because of that we really would love to be able to get to this average case binary search tree. That's going to be the gold standard we're aiming to. But this worse case is going to be the very worst case. Notice that this O(n) is true for every single operation, and will never help us outperform an array. To help us start to quantify what it means to have a balanced binary search tree let's define one new term. This term is going to be the height balance factor, or b, of a node. This height balance factor is going to be the difference in height between the left subtree and the right subtree. So b is going to be equal to the height, Of our right subtree minus the height of our left subtree. So here the height of the right subtree is going to be 1. Here the height of the left subtree's going to be 1. 1- 1 = 0, so we say the balance factor of this root node is 0, that this node is perfectly in balance. On the other hand, if we look at this tree, we can see the height of our left subtree is the empty tree, so we define that height to be -1. And the height of the right subtree is going to be one, two, three, four. So 4 minus -1, is going to be equal to 5. So we say the balance factor of this tree is going to be positive 5. This means that this tree is heavily swayed to the right, that the right has a height difference of five compared to the left side. Ideally we want to keep that balance factor small. And let's think about how we can actually do that. We're going to define that a balanced binary search tree is going to be a binary search tree where every single node in this tree has a balance factor with a magnitude of either 0 or 1. By having a magnitude of 0 or 1, that means the balance factor can either be -1, 0 or 1. Only if these three are the only balance factors that appear in every single node in the tree do we say this binary tree is balanced. So here this left tree, every node's going to have a balance factor of 0, so we're in great shape. Here, this node 6, this has a balance factor of 0, left and right subtrees are balanced. This node 7 is going to have a balance factor of -1. So this subtree right here is in balance. This subtree is fine. The subtree becomes out of balance once we get here, when the node 5 has a balance factor of 2. Notice that the right side of the tree has a height difference of 2, compared to the left side of the tree. So here, this tree's out of balance. Obviously, it gets more and more out of balance the farther we go up. Balance factor here is 3. Balance factor here is also 3. And then the balance factor of this root node is 5. So there's n factorial different ways to create a binary search tree with the same data. The worst case binary search tree will be a binary search tree that mimics a linked list, has a long string of notes. The average case, though, will be a tree that's height is proportional to a logarithm of the number of nodes. So it's going to be a relatively balanced tree. We're able to quantify the idea of balance by defining a balance factor which determines the height difference the left subtree and the right subtree. And we defined a balanced binary tree to be a binary tree where every single node has a balance factor of either -1, 0 or 1, where we say that the magnitude of the balance factor is no greater than 1. In the next video we're going to dive into understanding how we can create an algorithm that maintains the balance of a binary search tree, so I'll see you then.