In this unit, we are going to do a recap of Scala's collections. You've already seen them at length if you followed the previous course on functional programming, so what we do now is just review some of the core principles and highlights. If you have seen the material before, you might want to fast-forward through parts of this course. First, Scala has quite a rich hierarchy of collection classes. One of the base classes is called Iterable; it gives you collections you can iterate through, and then there are subcategories: the sequences, the sets, and the maps. If you look at sequences in turn, they could be indexed, which means you have fast random access, or they could be linear, which means they are optimized for sequential scans. And then you would have concrete implementations. A standard one for an indexed sequence would be Vector. A standard one for a linear sequence would be List; that's the list you have already seen. And then there are also types that, in a sense, get imported into those hierarchies from Java. There would be the Java arrays, which are a kind of indexed sequence, and the strings, the Java strings, which are also a kind of indexed sequence. I've drawn the lines here dotted, because these cannot, by their nature, be subtypes of indexed sequence; we can't really add new supertypes to a Java class. What we do instead is we make them behave like sequences through the use of implicit conversions. Similar to sequences, sets and maps also have several subcategories and several implementations. Having so many collection types could be confusing, except for the fact that they all share a common set of general methods, so it's very easy to apply a usage pattern that you know from one collection to another.
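As a small illustration of that uniformity, here is a sketch (the values are just made-up examples) of the same general methods applied across several collection types:

```scala
object SharedOps {
  // The same general methods work on lists, vectors, sets and maps alike:
  val doubled = List(1, 2, 3, 4).map(_ * 2)            // List(2, 4, 6, 8)
  val evens   = Vector(1, 2, 3, 4).filter(_ % 2 == 0)  // Vector(2, 4)
  val set     = Set(1, 2, 3).map(_ * 2)                // Set(2, 4, 6)
  val counts  = Map("a" -> 1, "b" -> 2).map { case (k, n) => (k, n * 10) }
  val first   = List(1, 2, 3).head                     // picking the first element
  val total   = List(1, 2, 3, 4).foldLeft(0)(_ + _)    // 10
}
```

Once you know this usage pattern on one collection, it carries over to the others unchanged.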
The methods that collections share are things like picking the first element or picking some arbitrary element, but in our case more importantly, core methods such as map, flatMap and filter are shared by all collections. Other important methods would be the fold operations, foldLeft and foldRight, which combine the elements of a collection with a binary operation and reduce it to a single value. Let's have a look at map, flatMap and filter in more detail. What we're going to do is study how they're implemented on the List class that you've already seen. Let's do map first. So, idealized map takes a function from the list element type T to some arbitrary type U. So, U is a type parameter of map. And it would give you back a list of U. What it would do is a pattern match over the list itself. If the list is a cons cell consisting of a head element x and a tail xs, then we would apply f to the head, map f over the tail, and cons the two results together with ::. If instead the current list is Nil, then the result list is also Nil. So that was map. If you want to learn about flatMap, the easiest way to do that is to study the differences with respect to map. The first difference you see here is that the function we pass to flatMap doesn't return a single element but a list of elements of an arbitrary type U, and flatMap again returns a list of U's. Its implementation is again a pattern match. The function f gets applied to the first element, the head, but since its result is a list now, it gets concatenated with the recursive call to flatMap rather than just prepended with cons. So instead of the cons operator :: you have the concatenation operation, which is written ++. And again, for Nil, flatMap gives you back Nil. The last operation we are going to look at is filter. So filter takes a predicate, that's a function from the list element type T to Boolean, and it returns the List[T] that keeps all the elements of the list that satisfy the predicate.
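The idealized map and flatMap just described can be sketched as standalone functions over List (a sketch only; in the real library they are methods on the collection classes with more general types):

```scala
object IdealizedOps {
  // Idealized map: apply f to the head, recurse on the tail,
  // and cons the two results together with ::.
  def map[T, U](xs: List[T])(f: T => U): List[U] = xs match {
    case x :: rest => f(x) :: map(rest)(f)
    case Nil       => Nil
  }

  // Idealized flatMap: f now returns a whole list, so the result for the
  // head is concatenated (++) with the recursive call rather than
  // prepended with ::.
  def flatMap[T, U](xs: List[T])(f: T => List[U]): List[U] = xs match {
    case x :: rest => f(x) ++ flatMap(rest)(f)
    case Nil       => Nil
  }
}
```

For example, `IdealizedOps.map(List(1, 2, 3))(x => x * 2)` gives `List(2, 4, 6)`, and `IdealizedOps.flatMap(List(1, 2))(x => List(x, x))` gives `List(1, 1, 2, 2)`.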
The way it does that is again with a pattern match. If the list is non-empty with a head x, then if the predicate holds for x, you get back x followed by filter on the tail with the predicate. Otherwise you just filter on the tail, and you drop the first element. And in the case of Nil, you get back Nil as before. In practice the types and the implementations of these methods are quite different in the Scala collection library, and that's for two reasons. The first one is that in the Scala collection library, map, filter and flatMap are not just defined for lists; they are defined for arbitrary collections and iterables. And that means that their types need to be more general than what you see here. The second difference is that in the implementations of map, flatMap and filter on lists, we want these methods to be tail-recursive. If they weren't tail-recursive, then applying them to lists over a certain length, probably over a couple thousand elements, would give you a stack overflow. So to make map, flatMap and filter work even for very, very long lists we need a tail-recursive implementation. This one is not, because you see, for instance, that the recursive call to filter is nested inside the call to ::. So we need to refactor this implementation to give us something that works in constant stack space. The second thing I want to cover, which is related to collections, is for-expressions. For-expressions are useful because they give you a simpler notation for something that comes down to combinations of these core methods, map, flatMap, and filter. Here's an example that you've seen already in the first course on functional programming. It's a combination of flatMap, filter and map. If you look at it for a sufficiently long time, you'll probably figure out what it does, but why go through all the trouble if there's a simple notation like this one here.
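Following the same scheme, here is the idealized filter, together with one possible tail-recursive refactoring (an accumulator-based sketch, not the actual library implementation) that works in constant stack space:

```scala
object FilterOps {
  // Idealized filter: not tail-recursive, since the recursive call
  // is nested inside the call to ::.
  def filter[T](xs: List[T])(p: T => Boolean): List[T] = xs match {
    case x :: rest =>
      if (p(x)) x :: filter(rest)(p) else filter(rest)(p)
    case Nil => Nil
  }

  // One possible tail-recursive refactoring: accumulate the kept
  // elements in reverse order, then reverse once at the end.
  def filterTR[T](xs: List[T])(p: T => Boolean): List[T] = {
    @annotation.tailrec
    def loop(rest: List[T], acc: List[T]): List[T] = rest match {
      case x :: tail => loop(tail, if (p(x)) x :: acc else acc)
      case Nil       => acc.reverse
    }
    loop(xs, Nil)
  }
}
```

The tail-recursive variant handles very long lists: `FilterOps.filterTR((1 to 100000).toList)(_ % 2 == 0)` runs fine where the naive version risks a stack overflow.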
So this one here, I would argue, is immediately clear in what it does: you let i range over the integers between 1 and n, and j over the integers between 1 and i. You ask whether the sum i plus j is prime, and then you yield the pairs of i and j. So, you yield the pairs of all those integers in these ranges whose sum is prime. And it actually turns out that if you have this for-expression, then what the Scala compiler would do is give you something that's equivalent to precisely this combination of map, filter, and flatMap. The way this is done we are going to show next. Here's how the Scala compiler translates for-expressions in terms of map, flatMap and a lazy variant of filter. There are three rules that apply to the different shapes of for-expressions. The first one applies to for-expressions that consist of a simple generator, written with a left arrow here. So you have: for x taken from some expression e1, yield some expression e2. And that's translated into a simple call of map, so it will be translated to e1.map, where you have a closure that takes x as an argument and gives you back the expression e2. The second rule applies to for-expressions that follow the initial generator with a filter, here with an if f, and there could also be further elements, written s here, that follow it. Such a for-expression is translated to the one that you see here. The idea is that the filter f would appear in an anonymous function x => f that gets passed to .withFilter of the first expression. So essentially what you do is you take the generator expression e1 and you immediately prune down all the elements to those that satisfy the filter f, and then you continue with the rest of the for-expression. You can think of withFilter as a lazy variant of filter that does not allocate a new list consisting of the elements that are kept by the filter, but instead passes on only those elements to any following calls to map, flatMap or withFilter.
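Concretely, the prime-pairs example can be written both ways; the two values below are equal (isPrime is a simple trial-division helper added here just for illustration):

```scala
object PrimePairs {
  // A naive primality test, just for the example.
  def isPrime(k: Int): Boolean =
    k >= 2 && (2 until k).forall(k % _ != 0)

  val n = 7

  // The for-expression notation:
  val viaFor =
    for {
      i <- 1 until n
      j <- 1 until i
      if isPrime(i + j)
    } yield (i, j)

  // The combination of flatMap, withFilter and map it translates to:
  val viaMethods =
    (1 until n).flatMap(i =>
      (1 until i).withFilter(j => isPrime(i + j)).map(j => (i, j)))
}
```

With n = 7, both give `Vector((2,1), (3,2), (4,1), (4,3), (5,2), (6,1), (6,5))`.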
The third rule applies to for-expressions that start with two generators in sequence. These for-expressions are translated to a call of flatMap, so the flatMap would apply to the first generator here, and it would contain in its result the for-expression that is essentially all the rest of the original for-expression. So it starts with the second generator and keeps any further elements and the result expression e3. So you see that each of these rules translates a for-expression into one that has one fewer element between the parentheses. The first rule eliminates a for-expression with a single generator. The second one eliminates a filter element. And the third one eliminates the leading generator in the for-expression that is nested here. So if we apply these rules repeatedly, at some point we will be left with no generators or filters in the for-expression, and that is then the final result of the translation. One thing that we haven't seen yet is that the left-hand side of a generator in a for-expression can also be a pattern and not just a simple variable. An example of that you see here. It's a query over JSON data: we are given a list of JSON objects in this data value, and what we do then is step through the data and match on all JObjs that contain some bindings. So we would in that case discard any other JSON values, like sequences or strings, that appear at top level here. We then look up the phone numbers binding in bindings; that's just a simple map application. We pattern match to say, well, that should be a JSeq of phones. We step through all the phones; each one gives us a phone. The phone itself is an object; we pick its number field. That should give us a JStr of digits. We ask whether the digits start with the number 212. And if that's the case, then we yield the pair of bindings of firstName and bindings of lastName.
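To make the query concrete, here is a minimal sketch: a tiny JSON model and some made-up sample data (the case-class names follow the lecture; the data and everything else are assumptions), with the query written out by hand in the desugared filter/flatMap/map style so that the pattern matches are explicit and total:

```scala
sealed trait JSON
case class JObj(bindings: Map[String, JSON]) extends JSON
case class JSeq(elems: List[JSON])           extends JSON
case class JStr(str: String)                 extends JSON

object JsonQuery {
  // Made-up sample data: two person objects and one stray string.
  val data: List[JSON] = List(
    JObj(Map(
      "firstName"    -> JStr("John"),
      "lastName"     -> JStr("Smith"),
      "phoneNumbers" -> JSeq(List(
        JObj(Map("number" -> JStr("212 555-1234"))),
        JObj(Map("number" -> JStr("646 555-4567"))))))),
    JStr("not an object"), // discarded by the JObj match
    JObj(Map(
      "firstName"    -> JStr("Ann"),
      "lastName"     -> JStr("Jones"),
      "phoneNumbers" -> JSeq(List(
        JObj(Map("number" -> JStr("415 555-0000"))))))))

  // The query, desugared by hand: keep the JObjs, look up their phone
  // numbers, keep the numbers starting with "212", and yield the
  // (firstName, lastName) pair for each match.
  val newYorkers: List[(JSON, JSON)] =
    data.flatMap {
      case JObj(bindings) =>
        val phones = bindings("phoneNumbers") match {
          case JSeq(ps) => ps
          case _        => Nil
        }
        phones
          .filter {
            case JObj(phone) => phone("number") match {
              case JStr(digits) => digits.startsWith("212")
              case _            => false
            }
            case _ => false
          }
          .map(_ => (bindings("firstName"), bindings("lastName")))
      case _ => Nil
    }
}
```

On this sample data, only John Smith has a 212 number, so the query yields the single pair of his first and last name.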
So what this would do, in a nutshell, is give us the first and last names of all persons that have a telephone number that starts with the area code 212. Patterns in generators act as implicit filters. Essentially what happens when you have a generator like this is that a match against JObj(bindings) gets added as an implicit filter on the data. We'll see how in the translation scheme. So we look generally at pat left-arrow expr, where pat is a pattern. For simplicity's sake we assume it has only a single variable x in the pattern. That would then be translated to this expression here. So, we take the generator expression, run it through withFilter, and we say: keep only those elements that match the pattern; otherwise return false, so the element will be dropped. And then that result in turn gets mapped with a function that takes the pattern and returns x, the variable that is in the pattern. Here's an exercise. According to the rules that you've seen for the translation of for-expressions, what would this expression translate to? This one or this one? I guess the answer is quite easy to come by. One indication already is that the for-expression starts with two generators. We have seen that that should map into something that starts with a flatMap, so that would rule out the second one. And if we check all the other details, we see that indeed this for-expression maps into this combination of map, flatMap and withFilter.
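The translation of a pattern on the left of a generator can be seen in miniature with Option values (a small sketch; the withFilter/map form below is the translation written out by hand, corresponding to the generator `Some(x) <- opts` in Scala 2 syntax, or `case Some(x) <- opts` in Scala 3):

```scala
object PatternFilter {
  val opts: List[Option[Int]] = List(Some(1), None, Some(3))

  // A for-expression with a pattern on the left of the generator, such as
  //   for (Some(x) <- opts) yield x
  // is translated to: withFilter keeps only the elements that match the
  // pattern, then map binds the pattern's variable x.
  val result: List[Int] =
    opts
      .withFilter { case Some(_) => true; case _ => false }
      .map { case Some(x) => x } // only matching elements reach this map
}
```

The None element is dropped by the implicit filter, so result is `List(1, 3)`.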