0:00

So here's a simple example of an S3 class and method. At the top I'm generating some random normal data and calculating the mean. Here the mean is -0.0307. That's not particularly complicated, but there's a lot going on behind the scenes. First of all, the class of x, which is this random normal vector, is numeric. When I call mean, which is a generic function, on this numeric vector, it searches to see if there's a specific mean method for numeric objects. As it turns out, there is no method for numeric objects, so it calls the default method for mean, which calculates the sum of all the elements and divides by the length of the vector.
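
The dispatch described here can be sketched roughly as follows. The seed and the variable name has_numeric_method are assumptions added for illustration, so the mean you get will not match the value on the slide.

```r
set.seed(1)          # assumed seed, only so the run is repeatable
x <- rnorm(100)      # random normal data
mean(x)              # call the generic; dispatch happens behind the scenes
class(x)             # "numeric"

# Look for a mean method registered for class "numeric".
# With optional = TRUE, getS3method returns NULL instead of failing.
has_numeric_method <- !is.null(getS3method("mean", "numeric", optional = TRUE))
has_numeric_method   # FALSE in a plain R session, so mean.default is used
```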

0:42

So you can look at the default method for the mean function by using the getS3method function, and you can see that there are other arguments. There's a trim argument and an na.rm argument, and as you go through the function you can see that it checks a number of things. If you're doing a trimmed mean, it takes care of that first.

0:59

And then ultimately, at the very bottom, it calls some internal C code that actually calculates the mean, for efficiency purposes.
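
A small sketch of inspecting the default method, for anyone following along at the console. The name mean_default and the example vectors are assumptions for illustration.

```r
# Retrieve the default S3 method for the mean generic
mean_default <- getS3method("mean", "default")

# Its arguments include trim and na.rm in addition to x
names(formals(mean_default))        # "x" "trim" "na.rm" "..."

# Printing the function shows the checks it performs before
# handing off to internal code
mean_default

# The extra arguments are passed through the generic as usual
mean(c(1, 2, 3, 100), trim = 0.25)  # drops one value from each end: 2.5
mean(c(1, NA, 3), na.rm = TRUE)     # ignores the missing value: 2
```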

So the mean is a simple example. As a slightly more complicated example, I've generated some random data in a data frame, and I'm sapply-ing the mean function over the data frame. One of the important things to know about data frames is that each column can potentially be of a different class. So you could have the first column be numeric, like we have here, and the second column be integer. When we sapply over the columns and call the mean function on each column, mean will check the class of that column and see if there is an appropriate method. The first column is numeric, so it checks for a numeric method; there isn't one, so it calls the default. The second column is integer, so it checks for an integer method; again there's no integer method, so it calls the default for that column too. So if you have a large data frame and every column is a different class, the mean function will check each column to see if there's an appropriate method for that class.
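
Here is a minimal sketch of that data frame example. The column names, the seed, and the 1:100 integer column are assumptions; any numeric and integer columns would behave the same way.

```r
set.seed(2)                                  # assumed seed for repeatability
df <- data.frame(x = rnorm(100), y = 1:100)  # one numeric column, one integer column

sapply(df, class)  # x is "numeric", y is "integer"
sapply(df, mean)   # mean is dispatched separately for each column
```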

2:10

In some cases it's possible to call a method directly, because some S3 methods are visible to the user. For example, you can call the mean.default function directly without calling the generic function. But in general the rule is that you should never call methods directly. You should always call the generic function and let the appropriate method be dispatched automatically. This results in cleaner code, and it's slightly more robust: if the name of the method changes, you won't have to worry about the underlying details. With the S4 system this isn't a problem, because for the most part you cannot call the methods directly at all.
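
To make the rule concrete, here is a short sketch comparing the two styles. The vectors x and d are made-up examples, not from the slides.

```r
x <- c(2, 4, 6)

mean(x)          # preferred: call the generic and let R pick the method
mean.default(x)  # works, but hard-codes one particular method

# The generic keeps working when the class changes. For dates it
# dispatches to the Date method and returns a Date.
d <- as.Date(c("2020-01-01", "2020-01-03"))
mean(d)
```

Calling the generic means the same line of code stays correct for numeric vectors, dates, and any class that later gains its own mean method.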

2:55

So one last example of an S3 class and method: here is the plot function. Again I'm generating some random normal data, and when I plot it, it looks for a numeric method for plot. Since there isn't one, it just calls the default method for plot and makes a little scatter plot, as you would probably expect.

However, here is a slight variation. I'm generating some random normal data and then converting it into a time series object, using the as.ts function. Notice that I call plot in exactly the same way as I called it in the previous slide, but now the plot is totally different. Instead of a plot with a bunch of circles on it, it creates a time series kind of plot where the points are connected by lines. You'll also notice that the x axis is labeled Time, which is different from the x axis label produced by the default method for plot. So here you can see that there is a special plotting method for ts, or time series, objects, and that plotting method is being called here rather than the default method.
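
The two plots can be reproduced with something like the following sketch. The seed and the name x_ts are assumptions for illustration.

```r
set.seed(10)       # assumed seed for repeatability
x <- rnorm(100)
plot(x)            # no numeric method, so the default method draws a scatter plot

x_ts <- as.ts(x)   # same data, now a time series object
class(x_ts)        # "ts"
plot(x_ts)         # same call, but the ts method draws a connected line

# The ts method is a registered S3 method for plot
is.function(getS3method("plot", "ts"))
```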

4:02

So you may want to write new methods for new classes. If you create a new class to represent a new data type, you'll probably end up writing methods for printing or showing. You may write a method for summary. You'll often write a method for plotting, because there won't be a plot method for a new type of data. There are two basic ways that you can extend R with the classes and methods system. You can create a new class and then write a method for an existing generic function, like print or plot. Or you can write new generic functions, and new methods for those generics, for your new class.
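
Both routes can be sketched in the S3 style from the first half of the lecture. Everything here is hypothetical: the temperature class, the constructor new_temperature, and the generic to_fahrenheit are invented for illustration.

```r
# Hypothetical constructor for a new S3 class called "temperature"
new_temperature <- function(celsius) {
  structure(list(celsius = celsius), class = "temperature")
}

# Route 1: a method for an existing generic (print)
print.temperature <- function(x, ...) {
  cat("Temperature:", format(x$celsius), "degrees Celsius\n")
  invisible(x)
}

# Route 2: a new generic function, plus a method for the new class
to_fahrenheit <- function(x, ...) UseMethod("to_fahrenheit")
to_fahrenheit.temperature <- function(x, ...) x$celsius * 9 / 5 + 32

t1 <- new_temperature(100)
print(t1)          # uses print.temperature
to_fahrenheit(t1)  # 212
```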

4:42

So, for the rest of this lecture, I'm going to talk about S4 classes. One question is, why would you want to create a new class? What's wrong with vectors and lists, numerics, integers, logicals, et cetera? While it's possible to survive on just the basic data types, it's often easier and more compact to think at a higher level about more complex data types. For example, you might want to represent new types of data like gene expression data, spatio-temporal data, or hierarchical data, or you might want to represent a sparse matrix. These are all data types that don't exist in R, so you have to create a new class to represent them.

5:19

There are often new concepts that haven't been thought of yet. For example, there's a linear model class, but a point process model is a different kind of class, and a mixed effects model is another. These are things that haven't been thought of or haven't been implemented yet in R.

5:36

You might also want to hide certain implementation details from the user. You may be able to represent a new type of data using lists and vectors, but then there may be a lot of ugly details exposed to the user that you would rather the user not know about. One thing to emphasize: when I say creating a new data type, I don't necessarily mean that the data type has never been seen before. All I mean is that it's not known to R, and R doesn't have any special handling for those data types.

6:09

Creating a new class is done with the setClass function. At the very minimum you can get away with just specifying the name of the new class, but often when you specify a new class there will be data elements associated with it. Those are called slots.

6:25

You can define methods for this class using the setMethod function. At a loose level, you can think of a class as being like a list. Every class has a bunch of slots, which are kind of like the elements of a list. But slots are a little more specific, because each slot is an object of a certain class. So you can't just put arbitrary data into any slot of a class. You have to put the specified type of data into each slot.

6:52

For example, here I have a simple example of creating a polygon class, since there's no polygon data object in R. You can think of a variety of ways in which you might represent polygon data. Here I think of a polygon as just a set of vertices, with lines connecting the vertices. So I'm creating a polygon class, and in the representation I create two slots. One is called x, which contains the x coordinates of all of the vertices, and the other is called y, which contains the y coordinates.

7:31

So if I create a polygon object, the x and y slots have to hold numeric data. They can't hold character data, for example.
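
A minimal sketch of the class definition described here, using setClass with a representation of two numeric slots. The vertex values and the names p and result are assumptions for illustration.

```r
library(methods)

# Define the class with two numeric slots, x and y
setClass("polygon", representation(x = "numeric", y = "numeric"))

# A valid object: both slots hold numeric vectors
p <- new("polygon", x = c(1, 2, 3), y = c(1, 3, 1))
slotNames(p)  # "x" "y"

# Putting character data in a slot is rejected when the object is created
result <- tryCatch(
  new("polygon", x = c("a", "b", "c"), y = c(1, 3, 1)),
  error = function(e) e
)
inherits(result, "error")  # TRUE
```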

7:41

Now that I've created a polygon class, the next thing I probably want to do is create a method, for example for plotting a polygon. When you call the setMethod function you have to specify a generic function, such as plot, and then you have to specify something called a signature. The signature is basically the set of classes on which the method will operate. In this case I want to create a method for the plot generic function, and the signature is going to be the polygon class that I just created.

8:15

So here's my method for the plot generic, created by calling the setMethod function. The first argument is the name of the generic, and the second argument is the signature, which in this case is polygon. You can see that the function takes an x and a y argument, and then the dot-dot-dot is for other arguments. The x argument is just the polygon object. The y argument is missing; there's no y argument here.

8:41

And you can see that within the function I actually call plot again. So here I'm creating a plot method for polygon objects, and within the method I call plot again, but that's going to be the default method for plot, because in that case I'm just plotting some numeric vectors. What this function does is plot the vertices in order to set up the plotting window. Then it wraps the vertices around on themselves, so that the last one connects back to the first, and uses the lines function to connect the dots. That makes my little polygon here. So that's the plotting method for polygon objects.

9:17

Notice that since plot already existed as a generic function, I didn't have to create a new generic function.

9:24

And notice that when I want to access the slots of an object of an S4 class, I use the at symbol. So it's kind of like a list, but instead of using the dollar sign I use the at symbol.
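
The plot method described over the last few slides might look roughly like this. It is a sketch reconstructed from the description, so the helper names xp and yp and the use of type = "n" to set up an empty plotting window are assumptions.

```r
library(methods)

# The class from earlier, repeated so this block runs on its own
setClass("polygon", representation(x = "numeric", y = "numeric"))

# First argument: the generic. Second argument: the signature.
setMethod("plot", "polygon",
          function(x, y, ...) {
            # Calls plot again on plain numeric vectors, which reaches the
            # default method; type = "n" only sets up the plotting window
            plot(x@x, x@y, type = "n", ...)
            # Wrap around: append the first vertex so the outline closes
            xp <- c(x@x, x@x[1])
            yp <- c(x@y, x@y[1])
            # Connect the dots
            lines(xp, yp)
          })
```

Inside the method, x@x and x@y read the slots of the polygon object with the at symbol, where a list would use the dollar sign.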

9:36

After I call the setMethod function, if I call showMethods on plot, you can see that the polygon method has been added to the list here. In addition to the polygon method there is an ANY method, which is the default method.

9:52

So here I'm using the new function to create an object from the polygon class. I create the x vertices and the y vertices, and now when I call plot on it you can see that it doesn't just call the default plot method. It actually calls the method that I just created, and you can see it draws this little triangle by connecting the dots.
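
Putting the pieces together, a session along these lines should list the new method and draw a triangle. The class and method are restated in compact form so the block runs on its own, and the vertex coordinates are made up for illustration.

```r
library(methods)

# Class and plot method as described in the lecture
setClass("polygon", representation(x = "numeric", y = "numeric"))
setMethod("plot", "polygon", function(x, y, ...) {
  plot(x@x, x@y, type = "n", ...)           # set up the plotting window
  lines(c(x@x, x@x[1]), c(x@y, x@y[1]))     # close the outline
})

# Lists the polygon method alongside the ANY (default) method
showMethods("plot")

# Create an object from the class with new, then plot it
p <- new("polygon", x = c(1, 2, 3), y = c(1, 3, 1))  # assumed vertices
plot(p)  # dispatches to the polygon method, not the default
```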

10:14

So that's a very simple example of how to create a new class and develop some new methods. In general, developing classes and methods is a very powerful way to extend the functionality of R. Just to summarize: classes define new data types, so they allow R to represent new types of data. Methods extend generic functions by specifying the behavior of those generic functions on the new classes that you've developed.

10:42

As you develop these new data types and new concepts and make them familiar to R, classes and methods give you a way to develop an easier interface for users to interact with new types of data. So it's really a handy way to allow users to interact with new kinds of data without having to get bogged down in a lot of the implementation details. One of the most popular ways to make new classes and methods available to users is through R packages, so most commonly you'll see these kinds of things embedded within an R package.
