In the last unit, you saw the concept of a for-expression. What we're going to do now is use for-expressions in a novel and interesting application, namely the generation of random values. So far we have seen for-expressions operating on lists and on sets. In the previous course, we also saw them on databases and options. All these things are collections of data items in some sense. So the question is, are for-expressions tied to collections? And interestingly the answer is no. All that's required is some interpretation of the map, flatMap and withFilter functions that makes sense for the type in question. There are many domains outside collections that have such an interpretation, and an example that we are going to see now is random value generators.

So what are random value generators? You know about random numbers from a language like Java. There you would import java.util.Random, create a new random number generator, and take the next random number with nextInt(). The question now is, is there a systematic way to get random values for other domains? For instance, we might want random booleans, random strings, random pairs, lists, sets, or trees.

So here's a way to do that. We define a trait Generator with some element type T, and that would generate random values of type T. Here's the outline of the trait. So it's Generator, here's the type of values that get generated, and here's the function generate that does that. Let's look at some instances of this type. The first instance generates random integers, so that would now be packaged as follows. The integers generator is a new Generator[Int]. It defers to the Java random number generator, and its generate method just picks off the next random number from rand, the Java random number generator. What about booleans? Well, once we have a generator for integers, booleans are easy: we create a new Generator[Boolean].
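
The slide code is not part of this transcript, so here is a minimal sketch of the trait and the integers generator just described. The wrapper object BasicGenerators is a hypothetical name used only to keep the snippet self-contained, and the covariance annotation on T is an assumption rather than something stated in the talk.

```scala
object BasicGenerators {
  // A generator of random values of type T.
  // (+T, i.e. covariance, is an assumption of this sketch.)
  trait Generator[+T] {
    def generate: T
  }

  // Random integers: an anonymous Generator[Int] that delegates
  // to the Java random number generator.
  val integers: Generator[Int] = new Generator[Int] {
    val rand = new java.util.Random
    def generate: Int = rand.nextInt()
  }
}
```

Calling BasicGenerators.integers.generate repeatedly gives a fresh random Int each time.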
The generate method of the booleans generator just generates a random number using the integers generator and asks whether that number is greater than 0. If it is, we return true. If not, we return false. What about pairs? Well, here's a generator that gives you pairs of integers. Its generate method calls generate twice on the integers generator and packs the results in a pair.

Now all this works, but it's also a bit cumbersome. Each time we have to set up a new anonymous class of type Generator, define a generate method, and so on. The question is, can we do without all that boilerplate? Ideally, what we would like to write is something like this. To get booleans, we just say: for x taken from the integers generator, yield x > 0. For pairs, we would like to say: let's have two generators of two arbitrary types T and U, call them lowercase t and u; let x be taken from t, y be taken from u, and return the pair of x and y. Now if you want to do that, then a good question is, what does the compiler expand this to? Here's what it would do. booleans would be expanded to a call to map, and pairs would be expanded to a call to flatMap with a call to map nested inside it, according to the expansions of for-expressions that you have seen in the last unit. So you see that as long as we have map and flatMap defined on the Generator class, we can actually use the convenient syntax with for-expressions.

So let's see how we would define map and flatMap on the Generator class. Let's start with map. So here's the type of map: it takes a function from the random value type T to a new random value type S, and it gives you a Generator of S. And the way it would do that is, it would generate random values of type T using its own generate method, then apply f to those random values, and that gives you the random values of type S. There's a twist here in the call to self.generate. If we had written just generate, that would be, according to the expansion rules in Scala, this.generate.
But the this in this new anonymous class would refer to the anonymous generator object that we are defining here. So it would be a recursive call to the very generate method we are defining, which would not terminate. What we need instead is to call the generate method of the object one further out. That's this generate method here. And the way we can achieve that is that we define an alias for the this value of this object out here, using the syntax self =>. So that defines an alias name for the this of the outer class. Writing self.generate then refers to this method over here. Another way that could be achieved, in both Java and Scala, would be to prefix the this with the name of the class, so we could have written Generator.this.generate, and that would have done the same thing.

So the second thing we have to do is define flatMap on generators. Here it is. Again, it's useful to compare with map. With flatMap, the function f now gives us back a Generator of S. So it takes a random value of type T not to a single value but to a whole generator of random values, and the result type is again Generator of S. Its generate method is implemented as you see here. So what you would do is first generate a random value of type T using self.generate as before, then apply the function f to it, so that it now gives us a complete generator on the new domain S. And to pick a random value in that domain we invoke generate here again.

Now that we have defined the general machinery that we need in class Generator, we can look at some specific generators again. So booleans, here's the syntax we wished for. What does that expand to? Well, the compiler would expand it to a map. And if we look at the map operation on Generator, then that's what it comes down to. It would say, well, the function f, that's this closure here, and you see it here gets applied to the generate method of the random number generator on the left-hand side of the map, the receiver of the map.
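
Here is a sketch of the map and flatMap definitions described above, together with the for-expression versions of booleans and pairs that they enable. It is an illustration under assumptions, not the slide code: the wrapper object MonadicGenerators is a hypothetical name, and the covariance annotation on T is assumed.

```scala
object MonadicGenerators {
  trait Generator[+T] { self => // `self` is an alias for this Generator's `this`
    def generate: T

    def map[S](f: T => S): Generator[S] = new Generator[S] {
      // A plain `generate` here would mean this.generate of the new
      // anonymous object, i.e. a non-terminating recursive call.
      def generate: S = f(self.generate)
    }

    def flatMap[S](f: T => Generator[S]): Generator[S] = new Generator[S] {
      // f yields a whole generator on S; pick one value from it.
      def generate: S = f(self.generate).generate
    }
  }

  val integers: Generator[Int] = new Generator[Int] {
    val rand = new java.util.Random
    def generate: Int = rand.nextInt()
  }

  // Expands to integers.map(x => x > 0).
  val booleans: Generator[Boolean] = for (x <- integers) yield x > 0

  // Expands to t.flatMap(x => u.map(y => (x, y))).
  def pairs[T, U](t: Generator[T], u: Generator[U]): Generator[(T, U)] =
    for {
      x <- t
      y <- u
    } yield (x, y)
}
```

Replacing self.generate with Generator.this.generate inside map and flatMap should behave the same way, as mentioned in the talk.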
The expanded booleans expression can of course be simplified by doing a so-called beta reduction: we apply the function to the argument and we reduce it in one step to what you see here. So that's exactly the generator for booleans that we started with. Let's try the same thing with pairs. So here's the expanded syntax of the for-expression for pairs. If we expand the map, then that's what we get here. If we expand the flatMap, we get this expression here. So let's look at this in a bit more detail. You get a generator for pairs of T and U. Its generate method gives you the generator here, which does the generation of the pairs like that. And then we call generate again on this nested generator. If we simplify that expression, then we get, again, the expression we started with: a pair where we generate a value on the t generator, then generate a value on the u generator, and collect the two in a pair.

Let's have a look at some other building blocks for generators. A useful, even though very simple, building block is the single generator. That is in a sense a borderline case, in that it always gives you back the same value of type T, so the value is not that random after all. So you pass the value that you want to return, and generate returns that value each time. Another building block is the choose generator, which would give you an integer in the interval between low and high. The way it would do that is, it would take an arbitrary random value from integers and normalize it to be in the interval between low and high using this modulo expression here. The last generator, oneOf, can pick an arbitrary value from a list of choices. So you can call it for instance like this: oneOf with three colors, red, blue, yellow, would give you a random color that can be red, blue or yellow. oneOf takes an argument of type T*, which means that you can give it as many choices as you want.
What oneOf then does is choose an integer between 0 and the number of choices that you have passed, and pick the choice at that index in the list of choices passed as T*. So with these building blocks, we can now set out to write random value generators for some more structured types. Let's start with lists. How would you generate a random list? Well, one way to do it is first to flip a coin on whether the list should be empty or non-empty. That's done with this generator here, which we get from booleans. We record the result in isEmpty. Then, if the coin says that the list should be empty, we return the empty list. Otherwise, we return a non-empty list. So how would we always return the empty list? Well, that's just the generator single that always returns Nil. How would we get a generator that generates a non-empty list? Well, that's another for-expression, where we say: to get a non-empty list of integers, we have to pick a random integer, here head taken from the integers generator, and then we have to follow that with a random list. So we have here a recursive call to the lists generator. The random value that comes out of that is called tail, and all that is left is to compose the head random integer with the tail random list.

So here's an exercise for you. Can you implement a generator that creates random Tree objects? Such objects would be of type Tree, and the Tree trait would have two case classes. A Tree could either be an inner node consisting of two subtrees, or it could be a leaf consisting of an integer. So, let's open a worksheet to see how we would do this. I've opened a worksheet, generators, which contains the basic generators that we need, integers and booleans. What we would do now is go bottom up. So how do we do a generator for leaves? Well, that one is simple. We just say, well, we need a random number, and for each random number that we get we produce a leaf with that random number.
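
The building blocks single, choose and oneOf and the lists generator described above can be sketched as follows. Several details are assumptions of this sketch rather than statements from the talk: the wrapper object BuildingBlocks, the parameter names lo and hi, treating the upper bound of choose as exclusive, and the use of java.lang.Math.floorMod instead of a plain % (the remainder of a negative random integer would otherwise be negative and fall below lo).

```scala
object BuildingBlocks {
  trait Generator[+T] { self =>
    def generate: T
    def map[S](f: T => S): Generator[S] = new Generator[S] {
      def generate: S = f(self.generate)
    }
    def flatMap[S](f: T => Generator[S]): Generator[S] = new Generator[S] {
      def generate: S = f(self.generate).generate
    }
  }

  val integers: Generator[Int] = new Generator[Int] {
    val rand = new java.util.Random
    def generate: Int = rand.nextInt()
  }

  val booleans: Generator[Boolean] = for (x <- integers) yield x > 0

  // Borderline case: always yields the same value x.
  def single[T](x: T): Generator[T] = new Generator[T] {
    def generate: T = x
  }

  // Integer in [lo, hi), assuming lo < hi. floorMod keeps the
  // remainder non-negative even for negative random integers.
  def choose(lo: Int, hi: Int): Generator[Int] =
    for (x <- integers) yield lo + java.lang.Math.floorMod(x, hi - lo)

  // Picks one of the given choices (at least one must be passed).
  def oneOf[T](xs: T*): Generator[T] =
    for (idx <- choose(0, xs.length)) yield xs(idx)

  // Flip a coin, then delegate to the empty or non-empty generator.
  def lists: Generator[List[Int]] = for {
    isEmpty <- booleans
    list <- if (isEmpty) emptyLists else nonEmptyLists
  } yield list

  def emptyLists: Generator[List[Int]] = single(Nil)

  def nonEmptyLists: Generator[List[Int]] = for {
    head <- integers
    tail <- lists // recursive call to the lists generator
  } yield head :: tail
}
```

For example, BuildingBlocks.oneOf("red", "blue", "yellow").generate yields one of the three colors, and BuildingBlocks.lists.generate yields a list whose length varies from call to call.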
So that covers leaves. On to inner nodes. Here's a generator for inner nodes. What we do is we generate a random tree, call it l, generate a random tree, call it r, and produce a new inner node, Inner of l and r. Now finally, the trees generator, as you see here. So as in the case of lists, we flip a coin on whether we want a leaf or an inner node. That's done here. If we want a leaf, then we turn to the leafs generator to produce a random value; otherwise we turn to the inners generator. Let's see how this would work in action. So I'll take the trees generator and generate a random value. So what did we get here? Well, we got a tree that consists of two leaves in an inner node, and they themselves are the left tree of another inner node with a leaf on the right. That's just one possible random tree. The next time I run this operation, of course, I would get a different tree. This time the tree I got was much smaller. Let's try a third time. Now it was much bigger. So you see, what you get are really random values in the tree domain.

An important application of random value generators is random testing. So you know about tests, in particular unit tests. There the idea is that you come up with inputs to program functions, or a set of program functions, and then you have an assertion or a postcondition that should hold when these functions are run on the inputs. So the postcondition is a property of the expected result. And then you run the tests to verify that the program satisfies the postcondition. The point is, after all the tests pass, hopefully, you then only know that the program satisfies the postconditions on these test inputs. There might be others where the program could still fail. So typically you would need to be smart about finding a lot of good test inputs that exercise the program in all possible program parts. So a question is, can we do it without all the hassle of finding these test inputs? Can we do completely without the test inputs?
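
For reference, the tree generators from the worksheet walk-through above can be sketched as follows. The worksheet itself is not part of this transcript, so the wrapper object TreeGenerators and the field names left, right and x are assumptions; the generator names leafs, inners and trees and the case classes Inner and Leaf follow the talk.

```scala
object TreeGenerators {
  trait Generator[+T] { self =>
    def generate: T
    def map[S](f: T => S): Generator[S] = new Generator[S] {
      def generate: S = f(self.generate)
    }
    def flatMap[S](f: T => Generator[S]): Generator[S] = new Generator[S] {
      def generate: S = f(self.generate).generate
    }
  }

  val integers: Generator[Int] = new Generator[Int] {
    val rand = new java.util.Random
    def generate: Int = rand.nextInt()
  }

  val booleans: Generator[Boolean] = for (x <- integers) yield x > 0

  // A tree is either an inner node with two subtrees or a leaf with an Int.
  sealed trait Tree
  case class Inner(left: Tree, right: Tree) extends Tree
  case class Leaf(x: Int) extends Tree

  // Bottom up: a leaf wraps a random integer.
  def leafs: Generator[Leaf] = for (x <- integers) yield Leaf(x)

  // An inner node combines two random subtrees.
  def inners: Generator[Inner] = for {
    l <- trees
    r <- trees
  } yield Inner(l, r)

  // Flip a coin between a leaf and an inner node.
  def trees: Generator[Tree] = for {
    isLeaf <- booleans
    tree <- if (isLeaf) leafs else inners
  } yield tree
}
```

Because leaf and inner node are about equally likely, the size of the generated trees varies a lot from call to call, which matches the small and large results seen in the worksheet; occasionally a generated tree can be very large.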
The answer to that question is yes, in some cases at least, if we simply generate random test inputs. So all the hard work of coming up with input values, we leave to the random value generator. So here's an example of how we can do that with the generators that we have defined so far. Here's a test function. It takes a generator that gives us input values in some domain T, and it takes the number of times we want to run the test on different randomly generated values. And then it takes a test function that returns true or false. So the idea is that it returns true if the test passes and false if it fails. And what it would do then is, for i from 0 until numTimes, generate a random value and then assert that the test function returns true; if not, it gives you this assertion error. If everything goes well, then it would print that we have passed numTimes tests.

So here's a use case of this test function. We would apply it to the generator that gives us pairs of lists. And then the function would say, well, given two arbitrary random lists xs and ys, is the length of the concatenation xs ++ ys always greater than the length of the list xs itself? Question to you: does the property always hold? Well, one way to answer this question is to simply try it out. We have all the random value generators. We have the test function. I have assembled them in another worksheet, where you see all the generators here, and finally we have the test function that you see here at the end. Let me just run this test on pairs of lists, and the function would be that, given two arbitrary lists xs and ys, we want to postulate that the length of the concatenation is greater than the length of just xs. And what do we get? Well, we get a counterexample, an AssertionError which says "test failed for", and here you have the counterexample. The first list consists of a single element. You see that's a random value that we have in here.
The second list is empty, and in that case, of course, the length of the concatenation of these two lists is one, and the length of xs is also one. The two lengths are equal and the assertion fails. So the answer to this question here is obviously that the property does not always hold, since we've just found a counterexample.

Now, the idea of these random tests and random value generators is embodied in a very useful tool you can use for your Scala programs. The tool is called ScalaCheck, and it is modeled to a large degree after a tool called QuickCheck, which exists for Haskell programs. Variants of QuickCheck have been developed for quite a few other languages. Erlang is one example, and, in the form of ScalaCheck, also for Scala. So the idea is, just as you have seen, QuickCheck would come up with random value generators. It's a bit smarter than what you have seen, in that it can sometimes actually generate the random values if you just give it a type. So it knows how to generate random values for types that have a certain form. And then you would run tests similarly to what you have seen with the test function. Only in ScalaCheck it's called forAll. So you would just write forAll, and then you would have the predicate, which says on which domains you want to run your tests, in this case two lists of Int, and what property should hold. So that's the property that you want to test here. And the test would, similarly to what you've seen, run a prescribed number of times, which you can configure. ScalaCheck is quite a bit smarter than what we've seen also in the sense that, if a test fails, it can minimize the counterexample. The counterexample we saw was a list with a rather large random number; that was the first one found that falsified the test. Whereas what QuickCheck and ScalaCheck would do in this case is try repeatedly to find smaller and smaller examples, until the example is essentially a local optimum.
Such a small example would typically be easier to understand as a counterexample than just an arbitrary random value. If you want to find out more about ScalaCheck, there's a tutorial on the course page, and ScalaCheck will also be used in the first assignment that you are going to do in this course.
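
To make the hand-written random testing from earlier concrete, here is a sketch of the test function and the list-length property. It is an illustration under assumptions, not the worksheet code: the wrapper object RandomTesting, the parameter name property, the default of 100 runs, and the exact message strings are choices made for this sketch.

```scala
object RandomTesting {
  trait Generator[+T] { self =>
    def generate: T
    def map[S](f: T => S): Generator[S] = new Generator[S] {
      def generate: S = f(self.generate)
    }
    def flatMap[S](f: T => Generator[S]): Generator[S] = new Generator[S] {
      def generate: S = f(self.generate).generate
    }
  }

  val integers: Generator[Int] = new Generator[Int] {
    val rand = new java.util.Random
    def generate: Int = rand.nextInt()
  }
  val booleans: Generator[Boolean] = for (x <- integers) yield x > 0

  def single[T](x: T): Generator[T] = new Generator[T] {
    def generate: T = x
  }

  def pairs[T, U](t: Generator[T], u: Generator[U]): Generator[(T, U)] =
    for { x <- t; y <- u } yield (x, y)

  def lists: Generator[List[Int]] = for {
    isEmpty <- booleans
    list <- if (isEmpty) single(Nil) else nonEmptyLists
  } yield list

  def nonEmptyLists: Generator[List[Int]] =
    for { head <- integers; tail <- lists } yield head :: tail

  // Runs `property` on numTimes random values; the first failing value
  // is reported through an AssertionError.
  def test[T](g: Generator[T], numTimes: Int = 100)(property: T => Boolean): Unit = {
    for (_ <- 0 until numTimes) {
      val value = g.generate
      assert(property(value), "test failed for " + value)
    }
    println("passed " + numTimes + " tests")
  }

  // The property from the talk: is (xs ++ ys).length > xs.length?
  // It fails as soon as ys happens to be empty.
  def lengthProperty(): Unit =
    test(pairs(lists, lists)) { case (xs, ys) => (xs ++ ys).length > xs.length }
}
```

Calling RandomTesting.lengthProperty() is expected to end in an AssertionError almost every time, because roughly every second generated ys is empty. Weakening the comparison to >= gives a property that holds for all lists.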