In the last units, you have seen a data type, Rational, described by a single class. But there are also situations where it's natural to describe what's conceptually one data type by a hierarchy of several classes. For instance, consider the task of writing a class for sets of integers with two operations, incl and contains. The incl operation (short for include) takes an integer and returns a set that contains the elements of the previous set plus that integer, and contains asks whether a given integer is in the set and returns a Boolean.

IntSet is an abstract class, which is made clear here with the abstract modifier. Abstract classes can contain members that are missing an implementation. In our case, neither incl nor contains is implemented. Members that are not implemented are called abstract members. Because abstract classes can contain abstract members, it follows that we cannot create instances of an abstract class, because such an instance would be missing implementations. So a call like IntSet() would be illegal.

Let's consider implementing sets as binary trees. There are two types of possible trees: a tree for the empty set, and a tree consisting of an integer and two sub-trees. For instance, the set 1, 2, 4, 5 could lead to the following tree: we could have, let's say, a 4 at the top, with a 1 to its left and a 5 to its right. The 1 has an empty left sub-tree and a 2 as its right sub-tree, and both the 2 and the 5 have two empty sub-trees. That's one possible representation. We start with one of the numbers, 4, as the root, and we notice that the left sub-tree holds all the numbers that are smaller than the root, and the right sub-tree holds all the numbers that are larger. That's an invariant that we want to keep for the sets, because it makes it efficient both to query with the contains method and to add new members with the incl method. Let's look at the implementations of the empty set and the non-empty set.
The empty set, you see here. For it, the contains method is always false, since the empty set doesn't contain any element, and the incl method creates a non-empty set with the included element and two empty sub-trees. If we start with the empty set and we include 2, then we get the tree that you see here: a 2 with two empty sub-trees. Empty sets are a special case of an IntSet, they're one of the possible implementations, and we make that clear with this extends clause. We say that the class Empty extends IntSet, which means that it conforms to the interface of IntSet and implements all the methods that are defined in IntSet.

Now let's look at a non-empty set. It's another extension of IntSet. It takes an element as the first parameter, and two IntSets as the left and the right sub-tree. Now, let's look at the method implementations. Let's take contains first. To ask whether the set contains an element x, we ask whether x is smaller than, larger than, or equal to the root element. If it's smaller, then we know that we have to look in the left sub-tree. If it's larger, we look in the right sub-tree, and if it's equal, then the set contains the element, and we return true.

What about incl? It is similar to contains in that we also ask whether the included element is smaller than, larger than, or equal to the root element, elem. If it's smaller, then we insert it in the left sub-tree. But remember we are in a functional language, so this insertion returns a new tree. It doesn't change anything, and that means that the new sub-tree in turn has to be wrapped in a new tree that adds elem and the right sub-tree. If x is greater than elem, we proceed analogously on the right: we include x in the right sub-tree, and we wrap the result with elem and left. If x is the same as elem, then there's nothing to do, the element is already in the set, and we can return the set itself.
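To make the description concrete, here is a minimal sketch of the three classes as narrated above, written in Scala 3 syntax. The parameter names x, elem, left and right follow the narration; the two sample values at the end (sample and withThree) are made up for illustration and are not part of the lecture.

```scala
// Abstract base class: both members lack an implementation.
abstract class IntSet:
  def incl(x: Int): IntSet        // a new set that also contains x
  def contains(x: Int): Boolean   // is x a member of this set?

// The empty set: contains nothing; including x yields a one-node tree.
class Empty() extends IntSet:
  def contains(x: Int): Boolean = false
  def incl(x: Int): IntSet = NonEmpty(x, Empty(), Empty())

// A non-empty set: a root element, a left sub-tree with the smaller
// elements and a right sub-tree with the larger ones.
class NonEmpty(elem: Int, left: IntSet, right: IntSet) extends IntSet:
  def contains(x: Int): Boolean =
    if x < elem then left.contains(x)
    else if x > elem then right.contains(x)
    else true

  def incl(x: Int): IntSet =
    if x < elem then NonEmpty(elem, left.incl(x), right)       // rebuild the left side
    else if x > elem then NonEmpty(elem, left, right.incl(x))  // rebuild the right side
    else this                                                  // x is already present

// Hypothetical sample values (insertion order chosen for illustration).
val sample: IntSet = Empty().incl(4).incl(1).incl(5).incl(2)
val withThree: IntSet = sample.incl(3)
```

Because incl never modifies an existing tree, sample still does not contain 3 after withThree has been built, while withThree contains 3 together with all elements of sample.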
Let's look at the incl operation on the set that we have formed here. Let's say we want to include the element 3 in the set. How would we go about it? We would ask: is 3 greater than, smaller than, or equal to 4? Well, it is smaller, so we go to the left sub-tree. It's greater than 1, so we go to the right sub-tree. It's greater than 2, so we go to the right sub-tree again. Now we have an empty tree. Including 3 in the empty tree gives us a 3 with two empty sub-trees as the new sub-tree. Now we wrap that with 2 and its empty left sub-tree. Then we wrap with 1 and its empty left sub-tree, and finally we wrap with 4 and its right sub-tree.

We formed a new tree. We left the original tree, the blue one, alone. We formed a new overlay tree, the red one, which now contains 3 as well as the elements that are reachable through the blue tree. At the same time, the red tree doesn't completely copy the blue tree: it contains some parts that are left untouched. For instance, the sub-tree containing 5 is a part of the blue tree and also a part of the red tree. This idea, that we can create new data structures from old ones incrementally without changing the old ones, is called persistent data structures, because the old ones persist. You just create new ones out of the old ones. It's one of the standard techniques in functional programming.

Some more terminology. IntSet is called a superclass of Empty and NonEmpty. Empty and NonEmpty are subclasses of IntSet. In Scala, every user-defined class extends another class. If no superclass is given, the standard class Object that's defined in the Java package java.lang is assumed. The direct or indirect superclasses of a class C are called the base classes of C. The base classes of NonEmpty would be first IntSet and then Object.

We've seen that there are overall three definitions each of contains and incl. One pair was abstract. That was in the class IntSet, where we have left out the implementations.
Then there were two implementations, one in class Empty and the other in class NonEmpty. In that case, we say that the definitions of contains and incl in Empty and NonEmpty implement the abstract functions in the base class IntSet.

It's also possible to redefine an existing non-abstract definition in a subclass by using override. If you have, say, a class Base with a concrete method foo and an abstract method bar, then you can define a subclass that extends Base and simply implement bar. But if you want to re-implement foo, say to return 2, then you need the override modifier. You can't just write def foo; that would give an error. The reason why you're forced to do that is that the compiler wants to make sure you don't have an accidental collision, where you define a method thinking it's a new method, but that method accidentally replaces a method inherited from a superclass. So override is essentially an opt-in marker that says: that's what I intend.

In the IntSet example, one could argue that there's really only a single empty IntSet. It makes sense to have many instances of NonEmpty, but all empty IntSets really are alike. It seems overkill that the user needs to create many instances of the empty IntSet. In fact, we can express this case better with an object definition. An object definition would look like this: object Empty extends IntSet, then come the implementations of the two methods, and then comes the optional end marker. This defines a singleton object named Empty. The structure is exactly like that of a class, but instead of a class that needs to be instantiated, an object exists already, and there's exactly one object that exists, which is called Empty here. No other Empty instance can be or needs to be created. Singleton objects are values, so Empty evaluates to itself.

An object and a class can have the same name. This is possible since Scala has two global namespaces, one for types and one for values.
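The two ideas just described, override and singleton objects, can be sketched as follows. This is an illustration under assumptions rather than the lecture's own code: the return values 1 and 3 in Base and Sub are made up (the narration only mentions that foo is redefined to 2), and IntSet and NonEmpty are repeated only so the block stands on its own.

```scala
abstract class Base:
  def foo = 1      // concrete member (value 1 is an assumption)
  def bar: Int     // abstract member

class Sub extends Base:
  override def foo = 2   // redefining a concrete member requires override
  def bar = 3            // implementing an abstract member does not

abstract class IntSet:
  def incl(x: Int): IntSet
  def contains(x: Int): Boolean

// A singleton object: exactly one Empty exists, and it is never instantiated.
object Empty extends IntSet:
  def contains(x: Int): Boolean = false
  def incl(x: Int): IntSet = NonEmpty(x, Empty, Empty)
end Empty   // optional end marker

class NonEmpty(elem: Int, left: IntSet, right: IntSet) extends IntSet:
  def contains(x: Int): Boolean =
    if x < elem then left.contains(x)
    else if x > elem then right.contains(x)
    else true

  def incl(x: Int): IntSet =
    if x < elem then NonEmpty(elem, left.incl(x), right)
    else if x > elem then NonEmpty(elem, left, right.incl(x))
    else this
```

Note that Empty is now written without parentheses wherever it is used, since it is a value rather than a class to be instantiated. Leaving out the override modifier in Sub should be rejected by the compiler.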
Classes live in the type namespace, whereas objects live in the term namespace. If a class and an object with the same name are given in the same source file, we call them companions. For instance, you could have a class IntSet with the usual methods, and then you could have in the same source file an object IntSet that adds a method, let's say singleton, that creates a set consisting of exactly one given element. A companion object of a class plays a role similar to what static class definitions are in Java. That means that in the companion object you would typically put methods that exist only once per class and not once per class instance. Static methods are absent in Scala, because Scala is a purer object-oriented language than Java. In order to emulate what you would do with a static method, you define a singleton object with the name of the class, and you put the methods in that singleton object.

So far we've mostly executed Scala code from the REPL or the worksheet. But it's also possible to create standalone applications in Scala. Typically, such an application takes the form of an object that has a main method. Main methods have to follow a particular convention, which is inherited from Java. They have to take a single argument of type Array[String] and return Unit. A main method then has a body that is executed when the program is called. You call the program by typing scala on the command line, followed by the name of the object. Once this program here is compiled, you can start it with scala Hello, and you would see Hello World.

Writing main methods like this is similar to what Java does for programs, but Scala also has a more convenient way to do it, and you've seen it already: you can have a Scala source file and just put in it a single method that's annotated with @main. That method then also gives you a main method for the program. You can have a method here.
Let's say birthday, which takes a name and an age and prints a message along the lines of "Happy birthday, name, age years old already". Once this function is compiled, you can call it from the command line with scala birthday, and then you just pass arguments for the two parameters: Peter for the name and 11 for the age. That's what you would get. Typically, using @main as an annotation is a more convenient way to start, or to define, a whole program. Under the covers it will then be translated into an object with a main method as the JVM prescribes, but you won't see that; essentially, the compiler wraps this birthday method for you in a synthetic object that has the correct main method.

Here's an exercise. Let's write a method union for forming the union of two sets. You should implement the following abstract class: class IntSet as before, but now with a third method, union, that takes another IntSet and returns an IntSet. I've prepared a worksheet here where you will see the union method up here and templates for the union method in Empty and NonEmpty. So far the implementation is missing, which is indicated by this triple question mark. The task is to replace the two triple question marks with something that implements union.

What's the union of the Empty set and another set s? Well, that's obviously s, so that was simple. What's the union of a NonEmpty set, consisting of an element and a left and a right sub-set, and some other set s? That's actually harder. Here's one way to do it. The idea is that we need to reduce it somehow to incl. We need to combine incl with the union of something that is smaller. One way to do it would be to say: we take the left set and form the union with the right set, then the union with s, and then we finally include elem in that. This is very recursive: one union call gets replaced by two more and a call to incl. How can we be sure that this terminates?
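The recursive definition just described can be sketched as follows. It is one possible reading of the narration, not necessarily the exact worksheet code: whether Empty is an object or a class in the worksheet is an assumption here, and the sample values a, b and u are made up for illustration.

```scala
abstract class IntSet:
  def incl(x: Int): IntSet
  def contains(x: Int): Boolean
  def union(s: IntSet): IntSet

object Empty extends IntSet:
  def contains(x: Int): Boolean = false
  def incl(x: Int): IntSet = NonEmpty(x, Empty, Empty)
  def union(s: IntSet): IntSet = s   // the union of nothing and s is s

class NonEmpty(elem: Int, left: IntSet, right: IntSet) extends IntSet:
  def contains(x: Int): Boolean =
    if x < elem then left.contains(x)
    else if x > elem then right.contains(x)
    else true

  def incl(x: Int): IntSet =
    if x < elem then NonEmpty(elem, left.incl(x), right)
    else if x > elem then NonEmpty(elem, left, right.incl(x))
    else this

  // Merge the two sub-trees, merge the result with s, then add the root back.
  def union(s: IntSet): IntSet =
    left.union(right).union(s).incl(elem)

// Hypothetical sample values.
val a: IntSet = Empty.incl(1).incl(3)
val b: IntSet = Empty.incl(2).incl(3)
val u: IntSet = a.union(b)
```

Here u contains 1, 2 and 3, while a and b are left unchanged.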
The termination argument goes like this: each of the unions that follow works on sets that are smaller than the set we started with. We started with the current set, and that set is clearly bigger than left or right, so the first union will essentially work on something which is smaller. Then finally, we take the union with s, and that union has to work on the same right-hand side, but the left-hand side is one element smaller. Then we compensate, finally, by including the element at the end. If you do that, then the left-hand side set will get smaller and smaller until finally it is the empty set, and when it's the empty set, then we know what the result is. It's just s. Then the recursion terminates.

That's a termination argument, and it's correct, but nevertheless a call to union like that is pretty inefficient, because essentially we decompose and reconstitute that set multiple times. It would be much nicer if we could simply go through the second set and say: pick all the elements of that set and just include them one by one into the current set. To be able to do that, we'd have to look inside the second set and ask what its elements are. So far the interface doesn't give that to us. We'll learn techniques to decompose data structures, to find out what's inside them, in the units that follow this one.

One interesting consequence of class hierarchies is that they lead to dynamic method dispatch. That means that the code that's invoked by a method call depends on the runtime type of the object that contains the method. In fact, that falls out directly from our substitution model. We can put that to the test and just ask, let's say, Empty.contains(1). That would select the contains method in the empty set, whose body is simply false, so after the substitutions that gives us false.
If we use the same method call, contains(7), with a NonEmpty set, then what that gives us is the implementation of contains in the NonEmpty class. That's this one here, subject to the substitutions that replace the formal parameters with the actual arguments and that replace this with the NonEmpty set. We can trace this substitution, and it will eventually give true. What we've seen here is that the sequence of reductions that gets performed depends on the value on the left-hand side of the method call, and that's what we call dynamic binding or dynamic dispatch.

Here's something for you to ponder. Dynamic dispatch of methods looks quite analogous to calls to higher-order functions. In each case, the target where the call goes is not obvious from just looking at a single expression: for higher-order functions, it depends on what got passed as the actual argument to the parameter, and for dynamic dispatch, it depends on the value to the left of the dot. The question is, can we implement one concept in terms of the other? Can we implement objects in terms of higher-order functions? Or can we implement higher-order functions in terms of objects? Or maybe, can we do both: implement one concept in terms of the other in either direction?
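As a small illustration of the dynamic dispatch discussed above (and leaving the closing question open), the following sketch calls contains through a variable whose static type is just IntSet. The helper hasSeven and the values e and n are hypothetical names introduced for this example; the class definitions are repeated so that the block stands on its own.

```scala
abstract class IntSet:
  def incl(x: Int): IntSet
  def contains(x: Int): Boolean

object Empty extends IntSet:
  def contains(x: Int): Boolean = false
  def incl(x: Int): IntSet = NonEmpty(x, Empty, Empty)

class NonEmpty(elem: Int, left: IntSet, right: IntSet) extends IntSet:
  def contains(x: Int): Boolean =
    if x < elem then left.contains(x)
    else if x > elem then right.contains(x)
    else true

  def incl(x: Int): IntSet =
    if x < elem then NonEmpty(elem, left.incl(x), right)
    else if x > elem then NonEmpty(elem, left, right.incl(x))
    else this

// The static type of s is IntSet; which contains body runs is decided
// by the runtime value to the left of the dot.
def hasSeven(s: IntSet): Boolean = s.contains(7)

val e: IntSet = Empty
val n: IntSet = NonEmpty(7, Empty, Empty)
// hasSeven(e) selects the contains of Empty, whose body is false.
// hasSeven(n) selects the contains of NonEmpty:
//   if 7 < 7 ... else if 7 > 7 ... else true, which reduces to true.
```

The same expression, s.contains(7), thus reduces along two different paths depending on the value bound to s.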