Next, let's take a look at the tree data structure. Trees are used everywhere in computer science. As a way to motivate a tree, think about a corporate hierarchy. You can have an employee and you can have a boss. A boss can have many employees that report to them. The way these relationships work out is: the boss could be considered the parent node, and each of the boss's reports could be considered a child node. A child has a single parent; a parent can have many children. The root node is the topmost parent of the hierarchy. In a corporate hierarchy, the CEO would be the root because the CEO does not report to anyone; everyone reports to the CEO. The CEO's directs would be the immediate child nodes of the CEO root node. In turn, each direct's reports would be that direct's children, and the direct would be their parent. When you get to the very bottom of the hierarchy, that is, nodes that don't have any children of their own, those nodes are called "leaf nodes." Let's think about a couple of common operations on trees. One is to traverse the tree. Broadly speaking, there are two options: depth-first search and breadth-first search. Both will visit every single node in the tree and will take O(n) time, because you're visiting every node in the tree. Remember: when you're visiting everything, by definition you have to look at everything, so it is O(n), where n is the number of things you're looking at; in this case, the number of nodes in the tree. In a depth-first search, you search in one direction, in this example, left and down from the root until there are no more child nodes, then you go up to the parent and attempt to go left again until you can't find any other child nodes. In this way, you continue until you've explored the entire tree.
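The traversal described above can be sketched in a few lines of Python (the transcript doesn't specify a language, so this is an illustrative sketch; the `Node` class and the example tree are assumptions, not from the original):

```python
# Minimal binary-tree node for illustration.
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def dfs(node, visited=None):
    """Depth-first traversal: go left as deep as possible, then back up and try right."""
    if visited is None:
        visited = []
    if node is None:
        return visited
    visited.append(node.value)   # visit the current node
    dfs(node.left, visited)      # exhaust the left side first
    dfs(node.right, visited)     # then the right side
    return visited

# Example tree:
#       A
#      / \
#     B   C
#    / \
#   D   E
root = Node("A", Node("B", Node("D"), Node("E")), Node("C"))
print(dfs(root))  # ['A', 'B', 'D', 'E', 'C']
```

Notice how D and E (deep on the left) are visited before C ever is; that's the "all the way down first" behavior.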
Contrast this with a breadth-first search where, instead of going all the way down first, you start at the root and look at all the immediate children. Then, for each of the immediate children, you look at their immediate children and add them to the end. In this way, you're exploring layer by layer. Compare this with depth-first search where, instead of searching level by level, you go all the way down to the leaf nodes first and then work your way back up. Let's take a look at an actual interview question that was asked at Microsoft: "Implement breadth-first search and explain it to someone who is not familiar with the algorithm." In a breadth-first search, you traverse in one direction, in this example, from left to right, across each level of the tree, by examining every child of a parent. For parent node A, you would search children B and C; then B becomes the parent and you search B's children; next, C becomes the parent and you search C's children. As you collect children, you continue to examine the children of each successive layer until you've exhausted the entire tree by visiting every node; breadth-first search takes O(n). As a fun aside, we actually made use of queues in order to do the breadth-first search. What we did was start with A and add all of A's children to the end of the queue; when we were done with A, A was removed from the head of the queue. Then we looked at B, added all of its children to the end of the queue, and removed B; then we looked at C, added all of its children to the end of the queue, and removed C. In this way, a breadth-first search is actually implemented by appending the children of the first element in the queue to the end of the queue, then removing the first element and progressing to the next head of the queue.
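Here's one way the queue-based walkthrough above could look in Python, as a minimal sketch (the `TreeNode` class and the A/B/C example tree are assumptions for illustration):

```python
from collections import deque

# Simple node with an arbitrary number of children, for illustration.
class TreeNode:
    def __init__(self, value, children=None):
        self.value = value
        self.children = children or []

def bfs(root):
    """Visit nodes level by level using a FIFO queue."""
    order = []
    queue = deque([root])
    while queue:
        node = queue.popleft()       # remove the head of the queue
        order.append(node.value)     # visit it
        queue.extend(node.children)  # append its children to the end
    return order

# A has children B and C, as in the walkthrough; B and C each have one child.
root = TreeNode("A", [TreeNode("B", [TreeNode("D")]),
                      TreeNode("C", [TreeNode("E")])])
print(bfs(root))  # ['A', 'B', 'C', 'D', 'E']
```

B and C (one level down) both come out before D and E (two levels down): layer by layer.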
DFS can actually be written the exact same way, but instead of using a queue, you use a stack. I will leave it as an exercise for you to prove to yourself that the DFS and BFS implementations are otherwise identical; the only difference is that for keeping track of nodes, you're using a stack instead of a queue. Let's take a look at several special cases of tree-like data structures. A binary search tree, often abbreviated "BST," is a tree where every left child is smaller than its parent node, and every right child is larger than its parent node. There can be at most one left and one right child per parent. If a node is deleted, the BST rearranges itself to stay compliant with the rule that the left child must be less than the parent and the right child must be larger than the parent. This lends itself nicely to binary search, which is an example of a search that takes O(log n) time. Remember when we talked earlier: O(log n) searches are usually exploiting some underlying structure of the problem. Here, we know that for a given parent, the left side is always going to be less and the right side is always going to be larger. Let's assume you're searching for the number 7. We know that 7 is less than 8, which is the root. That means we can eliminate the entire right half of the tree completely. Next, we go to 3, because 3 is the root of the left sub-tree. Is 7 greater than or less than 3? It's greater, so we can now eliminate the left side of that sub-tree. We're now left with the sub-tree rooted at 6. We compare 7 to 6 and determine that 7 is greater, therefore we go to the right, and lo and behold, we found 7 in the tree. Here you can see the halving in action, where every single step of the way we're halving the problem. In this particular case, for a binary search tree, searching for an element takes O(log n) time. Another tree-like data structure is a trie, which is usually used as an alphabetical tree.
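The 8 → 3 → 6 → 7 walkthrough above can be sketched as follows (a minimal Python illustration; the exact shape of the tree beyond the values named in the walkthrough is an assumption):

```python
class BSTNode:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def bst_search(node, target):
    """Each comparison discards one half of the remaining tree: O(log n) when balanced."""
    while node is not None:
        if target == node.value:
            return True
        # Smaller values live on the left, larger on the right.
        node = node.left if target < node.value else node.right
    return False

# The tree from the walkthrough: 8 at the root, 3 heading the left sub-tree,
# 6 under 3, and 7 to the right of 6. (The other values are illustrative filler.)
root = BSTNode(8,
               BSTNode(3, BSTNode(1), BSTNode(6, BSTNode(4), BSTNode(7))),
               BSTNode(10, None, BSTNode(14)))
print(bst_search(root, 7))  # True
print(bst_search(root, 5))  # False
```

Searching for 7 touches only 8, 3, 6, and 7, never the right half of the tree.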
It predicts words that a user intends to type based on letters that are already typed. Let's take a look at this example. We start with nothing, which is our root; based on the first letter that you type, you can start looking at what words this trie could autocomplete to. If you type in C, the trie only has "cat," so it would try to autocomplete to "cat." But let's assume we typed in B; there could be a set of suggestions shown to you, say "ball," "bat," and "be." With each additional letter that you type, you can go further down the trie and narrow down the possibilities, and in this way you can give a very quick and personalized experience to your user. Let's talk about combining data structures. Most of computing is just different combinations of data structures. For example, JSON stands for JavaScript Object Notation, and you've already seen it. It's how the Twitter API represents tweets, and how much of the information on the web is structured and communicated. A JSON document is a nested combination of objects and arrays. Let's take a look here: you've got the main JSON object, and you've got an array over here. There are some primitives, which are the numbers, and there are some maps, which have keys and values. Next, let's take a look at a Google interview question: "When designing a simple load balancer for Google.com, explain which data structures you would use." What is a load balancer? This could be the first clarifying question that you ask. A load balancer takes a request and distributes it to a server. For example, Google serves billions of requests on a given day. If every single request that came in from users went to a single machine or server, that server would die instantly. The goal is to distribute the search requests that are coming in across different servers; that way, as Google scales up the number of servers that it has, it can also handle more and more user requests.
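The ball/bat/be example could be sketched as a small trie like this (a minimal illustration; the word list and function names are assumptions, not from the original):

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # next letter -> TrieNode
        self.is_word = False  # True if a complete word ends at this node

def insert(root, word):
    """Walk/create one node per letter; mark the last node as a word end."""
    node = root
    for ch in word:
        node = node.children.setdefault(ch, TrieNode())
    node.is_word = True

def suggestions(root, prefix):
    """Follow the typed prefix down the trie, then collect every word below it."""
    node = root
    for ch in prefix:
        if ch not in node.children:
            return []          # nothing in the trie starts with this prefix
        node = node.children[ch]
    words = []
    def collect(n, so_far):
        if n.is_word:
            words.append(so_far)
        for ch, child in sorted(n.children.items()):
            collect(child, so_far + ch)
    collect(node, prefix)
    return words

root = TrieNode()
for w in ["ball", "bat", "be", "cat"]:
    insert(root, w)
print(suggestions(root, "c"))  # ['cat']
print(suggestions(root, "b"))  # ['ball', 'bat', 'be']
```

Typing "ba" would narrow the suggestions down to just "ball" and "bat", exactly the letter-by-letter narrowing described above.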
The goal of a load balancer is to make sure that incoming requests get routed fairly across servers, that is, no one server takes all the requests; they get evenly distributed. How would we go about visualizing this problem? Let's think about the fact that we might have n servers, and drawing this out would give us a clue about which data structure to use. We can think about using an array as follows. Indices zero through n-1 represent which server we would be using, and each element is the address of one of these servers. Upon receipt of each request, the load balancer assigns it a random number between zero and n-1, where n is the number of servers; generating a random number is very fast, constant time [O(1)]. Once you've assigned this number, you can go to that particular element of the array, figure out the address of that server, and route the request to that server. However, removing or adding a server would actually take O(n) time if we used an array data structure, because you would have to create a new array. Remember: an array has to be contiguous; it has to be one block. On the other hand, we can use a HashMap as follows. We can key the HashMap using indices from zero to n-1, where n is again the number of available servers, and the value in the HashMap would be the address of the server. Upon receipt of each request, the load balancer assigns it a random number and looks up the address by using that number as the key. This would achieve the same lookup runtime of O(1), but would also have the benefit of leveraging most HashMap implementations' constant-time addition. Most implementations of HashMaps also have constant-time removal and updates, making a hash table faster all-around for this particular exercise.
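A minimal sketch of the HashMap approach might look like this in Python (the class, method names, and server addresses are all assumptions for illustration; one extra trick, swapping the last server into a removed slot, keeps the keys contiguous so random indexing still works after removals):

```python
import random

class LoadBalancer:
    """Indices 0..n-1 are keys; server addresses are the values."""

    def __init__(self):
        self.servers = {}  # index -> server address
        self.count = 0

    def add_server(self, address):
        # O(1): insert at the next free index.
        self.servers[self.count] = address
        self.count += 1

    def remove_server(self, index):
        # O(1): move the last server into the gap so keys stay 0..count-1.
        self.count -= 1
        self.servers[index] = self.servers[self.count]
        del self.servers[self.count]

    def route(self):
        # O(1): pick a random index between 0 and n-1, look up its address.
        return self.servers[random.randrange(self.count)]

lb = LoadBalancer()
for addr in ["10.0.0.1", "10.0.0.2", "10.0.0.3"]:
    lb.add_server(addr)
print(lb.route())  # one of the three addresses, chosen uniformly at random
```

Every operation, routing, adding a server, removing a server, is constant time, which is exactly the advantage over the array version.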
We can also improve response speeds by having the load balancer periodically query all of the servers for latency and bias our random number assignment towards lower-latency servers. That is, if there's a certain server that's doing really well, we can send it more traffic.
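One simple way to bias the assignment, sketched here as an assumption rather than a prescribed design, is to weight each server by the inverse of its measured latency, so faster servers are picked more often (the addresses and latency numbers are made up for illustration):

```python
import random

# Hypothetical measured latencies in milliseconds (lower is better).
latencies = {"10.0.0.1": 20.0, "10.0.0.2": 80.0, "10.0.0.3": 40.0}

def pick_server(latencies):
    """Weighted random choice: weight = 1/latency, so low latency => more traffic."""
    servers = list(latencies)
    weights = [1.0 / latencies[s] for s in servers]
    return random.choices(servers, weights=weights, k=1)[0]

# Over many picks, the 20 ms server should receive the largest share of traffic.
counts = {s: 0 for s in latencies}
for _ in range(10_000):
    counts[pick_server(latencies)] += 1
print(counts)
```

The routing step stays cheap; the only added cost is the periodic latency probe that refreshes the weights.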