Hey, I'll be your interviewer today. >> Hi. >> Let's get going. Okay, so your job is to write a data structure that will have three public functions, insert, remove, and get random. >> Okay. Insert. Remove. Random. >> Okay, so before you start coding, can we kind of pause for a second, and can you just go over your high-level approach? >> So what I was thinking is have an array where I can start putting the values. And then sort of getRandom would be just returning a random element from the array. >> Okay. >> And so, yeah, I guess I need an array. >> Okay, so. Okay, so you're allocating 10,000 elements. But what happens if you have more than 10,000 elements in this array? >> Are they going to fit in memory? >> Yeah, you can assume they fit in memory but you don't know how many elements you're going to have. >> So I can't just like pre-allocate, like? >> Okay, well, is there a different data structure? Maybe something similar to this where you don't have to worry about kind of pre-allocating the amount of memory? >> Right, right. So yeah, I can use like a list, an array list. >> Okay. >> Yeah, how about an array list. >> Okay, yeah, that's much better. So. So, okay. >> Okay, so before you start jumping into coding again, can you kind of explain how each function is going to work? >> Right, so if I have this array, then, I mean, yeah, so. The array list implements a collection, then I can. Like I've done the array list, implements the collections. >> Mm-hm. >> So that means it already has an insert and remove, so I can just pass that directly here. And then the getRandom would get, and then I can just do like a random. Like I was explaining, I can just get like a random element. >> Okay. >> And that would be what returns, and these two would be just calling the insert and remove for the rest. >> Okay. So for insert and remove, what would the run times of those functions be? >> So inserting an array list, it's like, I don't know, it takes like four or five operations. >> So is this going to be constant time or linear time, linear squared? >> Yeah, constant, yeah, it's constant. >> Okay, can you kind of explain that a little bit more? >> So because it's an array list, it's constant. >> Okay, what about remove? >> It should also be constant. It has like. I don't really know how it does it really well, but I think maybe it just sort of swaps it to the end and then you can remove from the end very easily. Because the array has like a size internally, the array list and then- >> So keep in mind it also might help if you kind of clarify what RUSC functions you're taking. >> Right, yeah, so. >> And what's it returning? So what exactly, insert and remove, taking this argument. Yeah, so they would just take the value. >> Yeah, so for with remove, right, like if you have, if it's throwing ten elements and you want to remove five, how would you do that? >> So it just takes the index, right, to remove it and I can just. >> So which index are you removing from? >> Which index? So you're asking me how the array list does it? >> Yeah. >> Because I mean, okay. >> So in the context of run time, because you were saying you weren't sure whether or not this is constant or linear. So when we do the remove, what will the run time of that be? >> So I think that the array list, so. So I guess it has to find the value to remove. >> Yeah, yeah, it's a little tricky because the implementation of an array list is kind of under the hood. But as you can imagine, if you want to remove element five, it kind of has to like search for that element inside the array list. >> Okay, yeah so. I guess the worst case would be like the size. >> Yeah, so that would be linear. >> Right, right. So linear, yeah. >> So insert would be constant, remove would be linear, and let's talk about getRandom. Can you explain how getRandom is going to work? >> Well, I have the array so I can just get a random element. >> Yeah, but how does the random, this random helper function, how exactly is that helping you get the index? You want to give the value up front? >> So it's just a random number. >> Okay, so like what exactly is random returning? I'm a little confused about that function. >> So. I mean, it's just a random number generator, it just returns a random number. >> So is it like a double? Or is it Returning any random number. >> No, an integer. >> Okay, so how would that work if you have five elements? >> So I guess this doesn't work, right? So I need to. >> Okay, so let's say to kind of clarify this a bit that for the sake of this problem random is just returning a random positive integer. >> Right. >> Just some random positive integer. >> Yeah, I think between zero and some constant. >> Okay, so in that case, then you're taking that random value, and then you're modding it by the size. Yeah. >> Okay, and what would the run time of this be? >> So, I don't know how the implementation of random is, but I think maybe constant. >> Yeah, yeah, so we can assume that random runs in constant time in which case then yeah, that will be constant. So this is one approach. So you have this array list, insert is constant, remove is linear, and getrandom is constant. Is there another approach you can do that will be better than that? >> Better how? So this is the worst case here is linear. Can we do approach that is better than linear, or have different trade offs, perhaps? >> Okay. Right, right. So, yes. HashMap. And then, so insert. It's fast for a HashMap, but also remove. So remove. >> Okay, let's take a break for a second. So before again, you start coding this can you kind of explain how each of these functions will work? >> Right, so it's very similar. So the HashMap also implements a collections, so I can also do insert and remove like I was saying, but for HashMap the remove is fast. It should run in constant. >> Okay, and how would getrandom work? I could do the same thing. I could just return one of the values at random, no? >> So, does the HashMap have this skitter function where you can kind of give an index? >> I guess they're not indexed. I can just get the first one. So would that be random, though? >> It's a hash, so it should be random. >> So can you kind of clarify what you mean by just return the first one? >> So I think the HashMap has a first and a last method, so I could just return first. >> Okay, and then how is this random? So let's maybe take a break and kind of explain how getrandom should work. >> Okay. >> So think of it this way. If you have this data set and it has five elements and you call it getrandom it should return one of those value with equal probability. >> Okay. >> So if there are five elements there should be a 20% chance that any of those elements are returned. >> So. >> So in this case if you just return the first element aren't you just always returning, right? >> Right, right. So this doesn't change, right. So if I insert something, yeah. Okay, okay, but it should work always. Okay, so I could instead get out a random index. Right, and then I can just like iterate until I find that number incrementing a value. >> Mm-hm. >> And then when I get, I can iterate in a HashMap right? >> Yeah. >> So I can just do that, and so if I have like, then. And okay, I can get like, I don't remember how to iterate in a HashMap. >> Before we jump into that let's kind of take a step back again and think about this is the best way we kind of want to approach it. So let's kind of summarize. So, we've gone over a couple of different approaches. We've talked about using an array list, and for that the insert will be constant, the remove will be linear, and the getrandom will be constant. So, with this one we established insert and remove will be constant time, and what will the run time of getrandom be, using this approach. >> Well, I have four, so I guess it will be linear. >> Okay, so in this case random will be linear. So, in both cases kind of the worst run time for one of the functions is linear. Is there anything else we can do, another approach we can take to maybe have a better worst case in linear? How about where every function is better than linear? So how do you mean better than linear? >> So, for getrandom, right? You have to kind of iterate over every single input element, right? And likewise with the array list approach for remove we had to iterate over every element to identify the one to remove. >> Okay. >> So is there another approach where we don't have to iterate over every single element for one of the functions? So if I could have the values sorted, then that would be faster? >> Okay, how will sorting help you? >> What I was thinking is that it would be better than linear. >> Okay, but how exactly? >> So, I'm thinking that for instance I had an array and the values were sorted finding out a value would be better than linear, right? That would be with a minor surge. Something with that. >> Okay, so do you know the run time of binary search? >> I don't know, log? I mean, it's better than linear, right? >> Yeah, so if you remember, binary search you have your input, right? And every iteration you cut it in half, so that's a logarithmic function, so yeah, that would be log in. So which is yes, faster than linear. So, if we were to kind of keep things sorted, How would you do that? What data structure would you actually start? >> [CROSSTALK] tree. >> Okay, what kind of tree? >> Like a balanced tree? >> okay, so what do you mean by a balanced tree? >> Like, if I. One of the trees, so like a red black tree or something? That it keeps things sorted? >> Okay, well, let's think about it like this simplest situation that you want now, right, is to have some tree kind of keep things in sorted order, right? >> Right. >> So is there a specific tree where maybe it's good for keeping things very sorted? >> So like the red light tree, it does that, no? >> Okay, is there a data structure you could use that's a little simpler than that, that still keeps it sorted? >> Like a search tree, yeah like a binary search tree. >> Okay, do you know how a binary search tree works? >> Right, so I think that what I have is that all of the values in one side are smaller. And all of the values on the other side are bigger. >> Mm-hm. >> So if I do that, then finding an element for removal should be however many levels the tree has. >> Mm-hm. [CROSSTALK] So yeah, as you were saying earlier this would be a logarithmic, right? >> Right. And so the insert would be the same, it would be log. >> Okay, that looks good. So in this case, the insert would be login, the removal would be login and what would Get Random be? >> I could do it probably in I guess. So I can, like the trees I can do like recursion. >> Okay. >> And that would also give me a log. >> Okay, let's go that up. >> Okay >> Okay, so, if I'm understanding correctly, you're trying to write the node class for the tree. >> Right. >> Okay, you don't worry about that you can assume that there is some binary search tree, data search tree already defined that already has a node class within it. >> Okay, so like, I don't know binary search tree? >> Yeah, you can just assume that exists. >> Root. I'm just going to neutralize it. >> Yeah, that is an empty tree. And so this would just be the root, insert, then root, remove and so the get random. I can. So I can still use the random. >> Yeah. Can I have a. Something that return something between zero and one? >> Yeah, sure, so what are you trying to do? >> So I want the probability of returning, the root like this, To be like one over the root size. >> Okay, so you're basically saying that in this case the root size is the tree size? >> Right. >> So if you have ten elements, then you want the probability of it returning the value at the root to be one over ten. >> Yes. >> Okay, you can assume that in this situation we can just assume that random returns a value between a double between zero and one. Is that okay? >> Okay, so I need to change that? >> Yeah, you can just do that. >> Okay, so that would work, then so. So that would, that does it? >> Okay, one issue I'm seeing here is that what would happen if this tree weren't perfectly balanced? Okay, so let's assume that there's an even number of elements in this tree and you have your, let's say there's four elements in the tree. >> Okay. >> And you have your root, and then you have say two elements in the left sub tree and then one element in the right sub tree. So here you're saying return the value at the root, with a one over four probability. Which is fine, that's 25%. But then you're saying, essentially return all of it from the left sub tree with a 50% probability, and return an element from the right sub tree with a 50% probability. >> Right. >> So since there's only one element in the right sub tree, you're basically making a 50% chance that we returned that one element on right sub tree. And remember it needs to be uniformly random so it should be 25% for everything. >> But if it's balanced, like how can it have two on this side? >> So remember like balance just means that kind of like the minimum dev and the maximum dev, the difference between the two is no greater than one. So for example, if you have four elements in the tree, you can have one at the root and two in the left sub tree and then one in the right sub tree. So obviously if you have a tree or if it's like an even number of elements in it you can't possibly have you know perfectly symmetric tree. So it's definitely possible that one left sub tree has a different number of elements than the right sub tree. >> So if you were complete this would work, right? Yeah, that would be fine, because if it were completed, then you're right. Whether or not you go left or right, you want 50/50 chance so, but you can't assume that. >> So. >> So is there anything, any way you can think of of handling that? So if I, If the tree has its size, so if I just have like the size of all of them. Then. What I can do is, this is 50, 50, in this case for instance I can just say proportional to the sizes. >> So how this would work. >> Trying to make sure that this is correct. But, so the problem here is that I don't have the same number so I have two here and I have one here. >> Mm-hm. >> So I want to go there with double the probability I go here? >> Mm-hm. Okay, and how does that adjust that? >> So, if I have the left size being twice as big, and this is not correct. >> Okay, we're actually running out of time. Do you have any questions before we wrap up? >> No, I think I don't have any questions, no. >> Okay, that's it then. The next interviewer should be here soon. It was good talking to you.