We conclude this week with a case study in which we solve another well-known programming problem with the help of lazy data structures: the water pouring problem. The problem is formulated like this. You are given some glasses of different sizes. For instance, you might have a glass of size 4 units and a bigger glass of size 7. Your task is to produce a glass with a given amount of water in it. For instance, you could say we want six units of water in one of the glasses, so clearly it has to be the second one; we want six units of water in there. However, you don't have a measure or a balance. All you can do is fill a glass completely, empty a glass, or pour from one glass to another until the first glass is empty or the second glass is full. What we're after is an algorithm that produces a sequence of moves, where a move is one of these three, that ends in a situation where one of the glasses holds the required amount of liquid.

Before we jump into the code, let's develop a strategy. We are given these glasses; I keep the same example as before, so sizes 4 and 7. We need to produce a target amount in one of the glasses. We would start in the empty state, and from that state we can do moves. For instance, we can do the move that says fill the first glass. I'm going to start the indices at zero, so the first glass has index zero and the second glass index one. I could have a move fill 0, which would give us a new state where the small glass is filled and the big glass is still empty. Or I could have another move, fill 1, which would give us a state where the small glass is empty and the big glass is full. Or I could have a move, say, empty glass number 0, which would of course do nothing because glass zero is already empty, so that would lead back to the same state. Those are possible results of an initial move. I can then follow that with further moves. For instance, here I could have a move pour from 0 to 1. That would give me a state where the first glass holds zero and the second glass holds four. I could go on: from there, fill 0 would give me 4 in glass zero and 4 in glass one, so that's 4, 4. Then I could say again pour from 0 to 1, which would give me the state 1, 7: the second glass is now full with 7, and the first glass has one unit left, because I pour three units from the first glass into the second, and so on. Similarly, I can expand the other paths as well. The idea is that I do this exploration until I find a path that ends in the required target state.

How do I do the exploration? One idea is, using a lazy data structure, I could simply compute all possible paths and then pick one that ends in the correct target state. But of course, there is an infinite number of such paths. I have to answer the question: how do I avoid getting lost exploring an infinite part of the search space that contains no solution, when the solution lies somewhere else? The idea is that we produce shorter paths before longer paths. We first produce essentially all paths of length 1, then all paths of length 2, then length 3, length 4, and so on. At each step, we produce all possible paths that originate in the empty state and that have a length up to the length of that step. That avoids falling into a hole of infinite paths, because we go essentially from shorter to longer. The second thing to watch out for is cycles.
Another thing I could do from this state is a move pour from 1 to 0, which would lead back to an earlier state. That's essentially useless: I have just done needless work, and it will never be on the shortest solution to the target state. So I want to avoid such cycles, and the way I do that is that I simply keep track of which states have already been reached. If a path leads to a state that has already been reached, then it's discarded; that edge would be discarded by this step of the algorithm.

Before we put that into code, let's spend some thought on how we're going to represent things. I've already said glasses are represented by indices, numbered from 0 for the first glass, 1 for the second, and so on. The type Glass is just an alias for the Int type. Then we have to represent states, or configurations, that is, how much water is in which glass. A state can be represented as a vector of Int with one entry per glass. For instance, the vector 2, 3 would be a state with two glasses, where the first glass has two units of water in it and the second glass has three units of water in it. Those were the states; what about the moves between states? A natural representation for them is an enum, where one case would be empty a glass, another would be fill a glass, and the third would be pour the content of glass from into glass to, until either from is empty or to is full.

Let's set things up in the worksheet. We have the types Glass and State as discussed, and then we have a class Pouring, which takes as its parameter the state with all glasses full. That means for every glass we know what its capacity is, because that's in the full state. That's the parameter of my Pouring class. Then we have our enum of moves, which is Empty, Fill, or Pour. What I'll do is describe right away what each of these moves does to the current state, and I do that in an apply function: I can apply a move to a state, and that gives me a new state. What does the application do? If the move is Empty a glass, then the new state is the old state updated at the index glass, where the content is now zero. If the move is Fill a glass, then the new state is the old state updated so that at the index glass the glass is now full, that is, at its capacity. The third move is Pour from one glass to the other. Here we first compute the amount that is going to be poured. We can only pour what we have in from, so that's state of from, and we can also only pour what still fits into to, so that's full of to minus state of to, which is what's still free in glass to. Those are two upper bounds; the amount that we will pour is the minimum of those two amounts. Once we have computed that, the new state is the state where the from glass has its contents diminished by amount, so it's state of from minus amount, and the to glass has its contents augmented by amount, so it's state of to plus amount. That's the move. The next thing we could ask is: what moves do we have? Can we enumerate all the moves? That will be useful if we want to enumerate all the solutions. Yes, that's an easy exercise. To enumerate all the moves, let's first enumerate all the glasses.
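As an aside, here is a sketch in Scala 3 of the domain model described so far: the type aliases, the Pouring class with its full parameter, and the Move enum with its apply method. The names follow the lecture's description, but the actual worksheet code may differ in details.

```scala
type Glass = Int
type State = Vector[Int]

class Pouring(full: State):

  enum Move:
    case Empty(glass: Glass)
    case Fill(glass: Glass)
    case Pour(from: Glass, to: Glass)

    // Applying a move to a state gives the resulting state.
    def apply(state: State): State = this match
      case Empty(glass) => state.updated(glass, 0)
      case Fill(glass)  => state.updated(glass, full(glass))
      case Pour(from, to) =>
        // We can pour at most what `from` holds, and at most what `to` can still take.
        val amount = state(from) min (full(to) - state(to))
        state.updated(from, state(from) - amount).updated(to, state(to) + amount)
  end Move

  // ... moves, Path, pathsFrom and solutions are sketched further below
end Pouring
```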
The glasses are just the range from zero until full.length; those are the possible indices we have. Then, what moves could we do? Well, we could empty one of these glasses: that's the expression for g taken from glasses, yield Move.Empty of g. Or we could fill one of these glasses: that's the second alternative, for g taken from glasses, yield Move.Fill of g. Or we could pour from one glass to another: for g1 taken from glasses and g2 taken from glasses, if g1 is different from g2, since we can't pour from a glass into itself, yield Move.Pour from g1 to g2.

The next thing to model is a path. How do we do that? We want to record paths essentially in reverse order. A path is a list of moves, but what we want is essentially the history: the last move in the path comes first, and the first move from the initial empty state comes last in that history. So a path has a history, which is a list of moves. It's also useful to record what the end state of that path is. Those are the two elements of a path. If we want to print a path, we print the history in reverse, so that we read it going forward from the empty state, and then we print an arrow and the end state, the configuration that the path leads to. We also have a utility method, extend, which tells us what path results if we add a move to the existing path. The path that results is given on the right: its history is the move that we extended with, prepended to the existing history, and its end state is simply the move applied to the end state of the current path. That's the apply function that we defined up here, which tells us precisely what a move does to a state: the new end state is the move applied to the old end state. Then we start from the empty state. That empty state is simply the state where we map all contents to zero. Our start path is the path that has no history, because we just started, and has the empty state as its current state.

One interesting observation is that so far we haven't actually written an algorithm to compute the solution. We have just modelled the domain. We said: how do we represent glasses, states, and moves, what moves do we have, what are paths? That was it, basically. Good domain modelling helps to keep the actual search algorithm very clear and very short. Let's go to the algorithm next. There are two parts to it. One part builds the infinite list of possible paths, and the second part picks out the right solution from that very, very large list. The first part is done with a method pathsFrom, which is similar to the from method on natural numbers that we've seen. It takes an initial list of paths and an initial set of states that were already encountered, and it gives us essentially all paths that extend the given ones, as a LazyList of lists of paths. The idea is that the LazyList that pathsFrom gives us is ordered by length. Each element in that LazyList is a list of paths. The first element, if we start from the empty state, contains the shortest paths; the next element contains all the paths that are one move longer, and so on. So the lists in the outer LazyList are ordered by path length. How would we produce that right-hand side of pathsFrom? If we go back to our strategy: we're given a list of paths, all of the same length, and we want to extend each of those paths by essentially one move.
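Continuing the sketch inside the Pouring class from above, the enumeration of moves, the Path class, and the start path might look as follows; again, the details of the actual worksheet code may differ. The extend method defined here is what the frontier computation discussed next builds on.

```scala
  // inside class Pouring, continuing the sketch above

  // All moves that are possible with the given glasses.
  val moves =
    val glasses: Range = 0 until full.length
    (for g <- glasses yield Move.Empty(g))
    ++ (for g <- glasses yield Move.Fill(g))
    ++ (for g1 <- glasses; g2 <- glasses if g1 != g2 yield Move.Pour(g1, g2))

  // A path records its moves in reverse order, together with the state it ends in.
  class Path(history: List[Move], val endState: State):
    def extend(move: Move) = Path(move :: history, move(endState))
    override def toString = s"${history.reverse.mkString(" ")} --> $endState"
  end Path

  val empty: State = full.map(_ => 0)
  val start = Path(Nil, empty)
```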
What we're after is essentially the next circle around here; call that the frontier. The frontier is essentially the set of paths extended by one move, and it is computed here. We say: let path range over all the paths that we have now, and let move range over all the possible moves. We extend the path with the move; that gives us essentially the next path, and we keep that path only if its end state is not already in explored. If it's already in explored, then that would be a cycle, and we can drop the path. The paths that survive that criterion form the frontier, and they are returned. Now we want all the possible paths that start from the initial set: that is the initial paths that we were given, followed by pathsFrom applied to the new frontier, where the explored set is the previous explored set plus all the end states of the frontier paths, so frontier mapped to end states. We combine the two, and that gives us the set of states that has been explored now. Of course, that gives us an infinite list of paths, but that's not a problem, because this thing here is a lazy cons, and we produce a LazyList. We will expand that list only to the degree that somebody wants to find an element in it.

Now that we can span the infinite list of paths using pathsFrom, let's find the paths that lead to a solution in that infinite list. That's the solutions method. It gives us again a lazy list of paths, and it contains all the possible paths constructed by pathsFrom that end in the target state, that is, that have a glass containing the required amount, target, of units of liquid. What we do is construct the search space with pathsFrom and pick those paths that end in a state where some glass has target units in it. The initial list of paths is just the start path. The initial set of explored states contains just the empty state, because initially all glasses are empty; that's the state we know, the original state, and it's the only state in the set. Then we take an arbitrary path from the search space and ask whether the end state of the path contains the target. If that's the case, we yield the path. Now we have a lazy list of paths where every path is a solution to the original problem. Furthermore, the LazyList is ordered so that shorter paths come before longer paths. That's the whole algorithm; a sketch of both pathsFrom and solutions is given at the end of this section.

To use it, let's define an example problem. Let's say it's a Pouring with the vector of two glasses of sizes 4 and 7, as discussed. Now, to find a solution that reaches a certain target state, we write problem.solutions with the target; let's say we want a glass with 6 units in it. That gives us a LazyList of paths, and what we want is the shortest one. To get one solution, we just take the head of that list. That gives us the shortest possible solution to the target state, if one exists. Indeed, it has found one. Here it is: it ends in the vector 4, 6, so one of the glasses has six units in it. Here's the sequence of moves. We fill the second glass, the one that holds seven units. We pour from the second glass into the first.
We empty the first, and so on. I invite you to try it out and verify that we do indeed get the vector that is claimed here at the end. That was a solution to the water pouring problem that is quite elegant and short, and that makes crucial use of lazy, infinite data structures. Of course, it's not the only possible solution; many variants are possible. We could, for instance, make a different choice of representations. Here we defined specific classes and enums for moves and paths; one could have also encoded them differently. We defined object-oriented methods where that was natural; one could have also used naked data structures with functions. The present elaboration is just one solution, not necessarily the shortest or most efficient one, but it is reasonably short and reasonably efficient.

Here are some guiding principles for good design that I employed implicitly throughout. The first is that naming is really important: name everything you can, and avoid long expressions with lots of anonymous functions and the like. It's true that sometimes it's hard to come up with a good name, but it's definitely always worth the effort. Try to name everything you can, and try to find good names for the definitions that you have. The second observation is that scoping is important: one should put operations in the scope where they fit naturally. The third observation, coming back to data abstraction, is that one should keep degrees of freedom for future refinements. The interface of a module should not be concerned with the implementation details inside that module.
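As promised above, here is a sketch of how pathsFrom and solutions, as described in this section, might be written inside the Pouring class; the exact worksheet code may differ slightly.

```scala
  // inside class Pouring, continuing the sketch above

  // Element n of the result holds the paths that are n moves longer than the
  // initial ones and whose end states were not reached by any shorter path.
  def pathsFrom(paths: List[Path], explored: Set[State]): LazyList[List[Path]] =
    val frontier =
      for
        path <- paths
        move <- moves
        next = path.extend(move)
        if !explored.contains(next.endState)
      yield next
    paths #:: pathsFrom(frontier, explored ++ frontier.map(_.endState))

  // All solution paths, shortest first: paths whose end state has a glass
  // holding exactly `target` units.
  def solutions(target: Int): LazyList[Path] =
    for
      paths <- pathsFrom(List(start), Set(empty))
      path <- paths
      if path.endState.contains(target)
    yield path
```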
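And a worksheet-style usage example for the 4-and-7 problem from the lecture. The exact moves printed may vary with the search order, but the shortest solution has six moves and ends in a state containing 6:

```scala
val problem = Pouring(Vector(4, 7))
problem.solutions(6).head
// one shortest solution, e.g.:
// Fill(1) Pour(1,0) Empty(0) Pour(1,0) Fill(1) Pour(1,0) --> Vector(4, 6)
```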