I'd like to talk a bit about focal raster operators. These are also referred to as "neighborhood operations" or sometimes "moving window operations." The name "focal operator" comes from Dana Tomlin's classification system for raster operations: he came up with local, focal, zonal, and global operations. Focal is based on the idea that you're making a calculation for one cell at a time, but that cell's calculation is based on not just that cell but also the cells around it, otherwise known as its neighborhood. That's what I'm trying to show with this diagram: we have an input cell here, but we also have the neighborhood around that cell, and all of those cells are going to be used to calculate a value for one output cell. Sometimes I feel like it's better to refer to this as a neighborhood operation, but I like to stick with the consistency of Dana Tomlin's system: local, focal, zonal, global. It's great that it all rhymes, but I have to admit, "focal" never quite made sense to me. I think it's meant to be the focus, so let's roll with it. We're going to keep focal as it is, but like I said, I think it often makes more sense to think of it as the neighborhood around a particular cell. Now, the neighborhood that's defined for any kind of focal operation does not have to be a square; it can be a lot of different shapes. That's what I'm trying to show here. These are some examples I borrowed or adapted from the Esri documentation. We have a yellow cell, and that's going to be the cell that gets the result of whatever calculation we do. The rest of these red cells are the cells that we're going to use as the basis for that calculation. The rectangle here is the same one that I just showed you. Yes, I know it is actually a square, but rectangle is the term that's used in the software. You can also have things like a circle.
Again, it doesn't really look like a circle, but from a raster point of view, it's as close to a circle as you're going to get. All of the cells that are in red are being used as part of a calculation to get a value for the center. You can do an annulus, which is basically the same thing as a circle, only the value in the middle is not included in the calculation. Or you can use things like a wedge shape, where these cells are all used in the calculation to get that one output. Why would you want to use something like a wedge or these other ones? Rasters are funny that way. There are different ways of trying to draw information out of a dataset. Especially if you think of things like satellite imagery or land cover classifications, sometimes you're trying to find ways to enhance things like lines, roads, north-south features, or differences between classes. There are different ways to enhance different things or to smooth things out. So these are just options that you have available to play with, to see what works best for a particular raster dataset. I do think it's a different way of thinking than vector. Vector is very cut and dried. It's very discrete. Something's either inside or outside. It overlaps or it doesn't. But raster can be a little messier that way: a little more subtle, a little more complicated. You have to try and draw things out, see if you can filter things in certain ways. Think of it almost like a filter in Instagram or something like that, where depending on which one you use, you might enhance certain things or de-emphasize others. It's the same thing here. In fact, the filters that are used in Photoshop, Instagram, things like that, are basically raster operations, and some of them are focal like this.
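To make the neighborhood shapes a little more concrete, here is a minimal Python sketch that builds the rectangle, circle, and annulus as true/false masks. It is not the Esri implementation: the function names are made up for illustration, and the rule that a cell counts when its center lies within the radius is an assumption of this sketch. The wedge is left out to keep it short.

```python
def rectangle_mask(size):
    """Every cell in a size-by-size window is part of the neighborhood."""
    return [[True] * size for _ in range(size)]


def circle_mask(radius):
    """Cells whose centers lie within `radius` cells of the middle cell."""
    span = range(-radius, radius + 1)
    return [[dx * dx + dy * dy <= radius * radius for dx in span] for dy in span]


def annulus_mask(inner, outer):
    """Like a circle, but cells closer than `inner` to the middle are left out."""
    span = range(-outer, outer + 1)
    return [
        [inner * inner <= dx * dx + dy * dy <= outer * outer for dx in span]
        for dy in span
    ]


def show(mask):
    """Draw a mask as text: '#' is used in the calculation, '.' is ignored."""
    return "\n".join("".join("#" if cell else "." for cell in row) for row in mask)


print(show(circle_mask(2)))      # a blocky "circle" in a 5x5 window
print()
print(show(annulus_mask(1, 2)))  # same shape with the middle cell dropped
```

Printed out, the radius-2 circle is the blocky diamond-like shape you would expect on a grid, and the annulus with an inner radius of 1 is the same shape with only the middle cell removed.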
So, not to get too far off on a tangent here, but it really is a similar thing: you're trying to find ways to use operators like this to work with the data, to get something that will help you get where you want to go or answer the question that you're interested in. So, what kinds of operations can you do that are focal? Well, for example, you can do one like variety. We have an input dataset here, and notice that I've got a couple of cells that are NoData, just to keep things interesting. If I want to generate an output dataset, I can use my focal operations. Remember, this is also known as a moving window. If I want to use the variety operation, what that's looking for is how many different values there are in my moving window. In this case, we have four different values, so my variety value is four. Then the whole window shifts over one cell and the calculation is done again. Now we have three different values, so my variety value is three. Why would you want to calculate variety? Well, it's a way of being able to say how similar or different the values are that are near each other. I haven't filled in the rest of the values here, but it's essentially the same idea as it goes through and fills in the rest of them. As for the NoData values, you can either have it set so that it fills in a value based on the numbers that are present, or you can set it so that if there's a NoData cell as an input, then it keeps NoData as the output for that cell, to honor the fact that if you don't know what's there to begin with, maybe you should just keep NoData as the output. It's up to you to decide which is more useful. So, here's the Focal Statistics tool that you can access through the Spatial Analyst toolbox. I've shown you a screenshot of the different drop-down options that are available. These are all the statistics that are possible with our focal operations.
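The moving-window variety calculation, including the two ways of treating NoData, can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not how the Focal Statistics tool is implemented: `focal_variety` and its `ignore_nodata` switch are hypothetical names, `None` stands in for NoData, the grid values are made up, and windows are simply clipped at the raster edge.

```python
NODATA = None  # stand-in for a NoData cell (an assumption of this sketch)


def focal_variety(grid, size=3, ignore_nodata=True):
    """Count the distinct values in the size-by-size window around each cell."""
    half = size // 2
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            # Gather the window, clipped where it runs off the raster edge.
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - half), min(rows, r + half + 1))
                for cc in range(max(0, c - half), min(cols, c + half + 1))
            ]
            values = {v for v in window if v is not NODATA}
            has_nodata = any(v is NODATA for v in window)
            if not values or (has_nodata and not ignore_nodata):
                # Nothing known here, or we chose to let NoData propagate.
                out_row.append(NODATA)
            else:
                out_row.append(len(values))
        out.append(out_row)
    return out


grid = [
    [1, 2, 2, 3],
    [1, 4, 2, 3],
    [3, NODATA, 4, 4],
]

lenient = focal_variety(grid, ignore_nodata=True)
strict = focal_variety(grid, ignore_nodata=False)

print(lenient[1][1])  # 4: the window holds the values 1, 2, 3 and 4
print(lenient[1][2])  # 3: one cell over, the window holds 2, 3 and 4
print(strict[1][1])   # None: a NoData input keeps the output as NoData
```

Shifting the window one cell to the right changes the count from four to three, which mirrors the walk-through above. With `ignore_nodata=False`, any window that touches the NoData cell stays NoData in the output.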
I won't go through every single definition, but I think some of them are fairly straightforward: the maximum, the minimum, the mean. Majority is a good one: that's the value that occurs most often in the window, and that's a useful one as well. So, I encourage you to try these out, play with them, and keep them in mind as options that are available to you depending on what you're after. You'll see here that I'm going to do this on my NDVI reclassification. I took my original NDVI values and reclassified them into five values from one to five, five being the most vegetation. Here, I've got my three-by-three. That's the size of the rectangle that I'm going to use for my focal operation. I'm going to select Variety as the type of operation. So, here's my input. This is before I've done the focal operation. These are the original five NDVI classes. This is zoomed in so you can see the individual cells a little bit better. I think it gives you a better appreciation of what's going on. Here's the result of the focal variety. I've purposely given it the dreaded rainbow color ramp, which some people roll their eyes at, especially cartographers. I find that every once in a while it comes in handy, because all I really want to show here is the difference between values. This is never meant to be a finished map that you would necessarily show somebody; it's just a way for you to be able to see what's going on, where you have high values, low values, and that kind of thing. Having said that, let's have a look at it. If we zoom in a little bit here and look at the variety results, we can ask ourselves, "What is this really showing?" Remember, this is the number of different values that the software found inside my three-by-three moving window as it went across the dataset.
So, where we have values of five, that means that there were five different values inside that three-by-three window. We only have five original classes, so that's the maximum amount of variety that we could have in this dataset. That means that within that three-by-three moving window, we have a lot of different NDVI values. Compare that to an area that has a variety value of one, so these dark blue areas, as opposed to the red areas up here. That means there was only one value of cell inside that moving window. It doesn't mean that the NDVI values were one; it just means that they were all the same. They could have all been fours, they could have all been fives, but they were all the same, so the variety value was one. The same goes for a variety value of five; it's easy to get these confused. That doesn't mean that the class value was five, meaning there's lots of vegetation; it means there were five different values in that small area. So, what this is getting at is: how homogeneous or heterogeneous are different areas? Are there areas where we have a lot of different amounts of vegetation, or do we have areas where they're all the same? For example, if we have a large area like this that's all a variety value of one, that could be, say, a field that's all the same NDVI value. Or it could be a parking lot, because that would all be the same NDVI value. As opposed to here: this is a pond, and what I'm seeing is that we have a lot of variety around the shoreline of that pond. That makes sense. You've got some NDVI that's probably over the water, so that would be low, then you have some nearby that could be, say, fours, so they could be high, and then you have ones that are in between, which could be shrubby areas. So that's useful information for you to try and interpret your data and interpret the landscape. What am I seeing? Why am I seeing that?
What areas are homogeneous, what areas are heterogeneous? Then you interpret that in terms of your own study and your own work. Let's go back to the original NDVI values before I classified them, and you'll notice that there's a lot of subtlety here in terms of the values. These are floating-point, or decimal, values from roughly negative 0.25 to 0.65 on the positive side. There are a lot of mottled areas, mixed areas, things like that. Even within the park, there are lighter greens and darker greens. So, we can use a focal statistic to smooth out that image. Here, what I'm going to try is calculating the mean, or the average, for that moving window, with a three-by-three window size. Here's the original NDVI values, and here's what it looks like after you've done a mean of a three-by-three moving window. What that's doing is taking the average of the nine cell values that are in that window and applying it to the one cell in the middle, then moving over and taking the average of the next one. What you end up with is a smoother-looking image than you had originally. Now, you may look at that and say, "Well, why would you want to do that? Why would you want to make it blurrier, essentially, than it was before?" As you will see, sometimes what you're trying to do is look for broader patterns. What is the overall trend that's going on? Or what's the thing that we see most often in an area? So it can actually be to our advantage to simplify the data, or generalize it a bit, in order to make it easier to interpret. That's what we're doing here. I can try a five-by-five instead of a three-by-three. Here's my original NDVI data again. I'm going back to my same Focal Statistics tool, but now I'm going to do a five-by-five, and I'm still going to calculate a mean. Here's my before, and here's my after. This is even blurrier than the three-by-three version. That makes sense, if you think about it.
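As a rough illustration of why the larger window looks blurrier, here is a small Python sketch of a focal mean. It is a simplified stand-in, not the Focal Statistics tool: `focal_mean` is a hypothetical helper, the raster is a made-up grid of zeros with one bright cell in the middle, and windows are clipped at the raster edge.

```python
def focal_mean(grid, size=3):
    """Average of the size-by-size window around each cell (clipped at edges)."""
    half = size // 2
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - half), min(rows, r + half + 1))
                for cc in range(max(0, c - half), min(cols, c + half + 1))
            ]
            out_row.append(sum(window) / len(window))
        out.append(out_row)
    return out


def peak(grid):
    """Largest value anywhere in the grid."""
    return max(max(row) for row in grid)


def touched(grid):
    """Number of cells with a value above zero."""
    return sum(1 for row in grid for v in row if v > 0)


# Made-up raster: all zeros except one bright cell in the middle.
raster = [[0.0] * 9 for _ in range(9)]
raster[4][4] = 0.9

mean3 = focal_mean(raster, size=3)
mean5 = focal_mean(raster, size=5)

# The bright cell is spread over more cells, and its peak drops, as the window grows.
print(peak(raster), touched(raster))
print(peak(mean3), touched(mean3))
print(peak(mean5), touched(mean5))
```

In this toy case the single bright cell gets shared across 9 cells with the three-by-three window and across 25 cells with the five-by-five, and the peak falls from 0.9 to roughly 0.1 and then to roughly 0.036. That flattening and spreading is the smoothing you see in the images.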
So, now we have a five-by-five window, so we're taking an average of a much larger area, 25 cells instead of nine, and then applying it to that one cell in the middle. You can imagine that the larger those windows get, the smoother it's going to be. If you took that to an extreme and took one average for the entire raster dataset, you would get one value as the result, and everything would basically be the same. What we're doing here of course is not quite that extreme, but I want you to get the idea: with a three-by-three, you're averaging over a small area; with a five-by-five, you're averaging over a bigger area; and the bigger the window gets, the smoother and more generalized your image is going to be. Whether that's useful to you depends entirely on the work that you're doing and whether it's helping you get to where you want to go. So, here's our original NDVI values, here's the three-by-three, and here's the five-by-five. Here's a comparison of the three next to each other, just so you can see the effect that they have. Like I said, sometimes there's an instinct, and I do this too, where I want to have all the original data at all times. But you'll notice on the left here, for example, that you have areas that are fairly heterogeneous, you've got all this mottling taking place, and when you smooth it out, especially if you go to the five-by-five, you get a much smoother image, which can be easier to interpret or easier to map. Of course, it also depends on the scale that you plan on using it at. If we smooth out this data and then reclassify it, this is what we get. So I'm hoping that now you'll appreciate the idea that sometimes smoothing it out can actually make it easier to work with. Here we have a focal variety of that reclassified version, and you can see that it looks even more generalized than the version we did previously. We have to be careful.
We don't want to overstate things, or mislead people into thinking that areas are more homogeneous than they really are. So I don't want to overdo this, but I do think that sometimes it can be useful to simplify this way. Here's the NDVI reclassified on the left, and on the right we have the NDVI after we've done the five-by-five mean, then reclassified, and then done our focal variety. You can see the two side by side. So, what if we took our classified version of the NDVI, which has classes from one to five, and then did a focal mean on that? We can take our NDVI class raster here, the one that's been reclassified, use our three-by-three window, and take a mean of that. Here's our original reclassified version on the left, with classes of one to five, and here's the average of those classes on the right. You can see that we have a smoother version of this. Hold on a minute, though. Does this make any sense at all? This is something that I think is easy to fall into without really thinking about it. You think, "Oh, I want to smooth out my image, so why don't I just do a mean of the reclassified version?" But it's important to always remember that these class numbers, one to five, are completely arbitrary, so taking an average of those arbitrary numbers is meaningless. In other words, I could have just as easily made those classes 1, 50, 500, 1,000; they're just ways of saying this has less vegetation, this has more. The actual numbers themselves don't have any meaning beyond that. They're just a way of ordering the data. If you take an average of ordered data like that, you'll end up with a cell that has a value of 4.3. Well, 4.3 is somewhere between class four and class five, and that's not very meaningful. So, that's why I stuck this in at the end. I just wanted to give you a little reminder that it doesn't always make sense to take the average of something.
If we did an operation where we asked for the majority or the variety of classed data like this, which is ranked, great. But otherwise, you shouldn't take a mean of a classed dataset like this.
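One way to see why the mean is a problem for classed data is to relabel the classes and check which statistics survive. The sketch below is only an illustration: the window values and the alternative labels (1, 50, 500, 1000, 5000) are made up, and `window_mean` and `window_majority` are hypothetical helper names.

```python
from collections import Counter


def window_mean(values):
    """Arithmetic mean of the values in one window."""
    return sum(values) / len(values)


def window_majority(values):
    """Value that occurs most often in one window."""
    return Counter(values).most_common(1)[0][0]


# One 3x3 window of NDVI class codes, flattened: six 4s and three 5s.
window = [4, 4, 4, 4, 4, 4, 5, 5, 5]

# The same classes under a different, equally arbitrary set of labels.
relabel = {1: 1, 2: 50, 3: 500, 4: 1000, 5: 5000}
relabeled = [relabel[v] for v in window]

# Majority picks out the same class whichever labels are used.
print(window_majority(window), window_majority(relabeled))

# The mean lands between labels and changes with the labeling scheme.
print(window_mean(window), window_mean(relabeled))
```

With the one-to-five labels the mean is about 4.33, and with the alternative labels it is about 2333, which is not a class in either scheme. The majority is class four both times, which is why majority and variety are safe on ranked classes and the mean is not.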