So we've gone through some one-dimensional summaries, and now I want to talk a little bit about two-dimensional summaries. These include things like scatterplots, but you can also group together lots of one-dimensional plots to look at two-dimensional data; lattice and ggplot2 in particular are good at this. So we're going to look at scatterplots and smooth scatterplots.

When you're looking at more than two dimensions, you have to get a little more clever about how you present the data. One possibility, of course, is an actual three-dimensional plot, where you might need a set of 3D glasses to view it. These are usually not that useful, but I have seen them. You can have spinning plots, which let you move the data around in three dimensions, and these can be useful sometimes. But often the most useful technique is to take two-dimensional plots and arrange them in a special way so that you can look at a third or even a fourth variable, and we'll talk a little bit about these methods. You can also use the color, size, and shape of the plotting symbols to add dimensions to your data.

So here's a way to look at two-dimensional data. In one dimension I've got the east-west variable, which is categorical, and in the other dimension I've got the PM2.5 variable. You can see that I've got two boxplots, one for east and one for west, and that in the east PM2.5 tends on average to be quite a bit higher than it is in the west. So that's a nice summary of these two dimensions. In the west, the average tends to be a little lower, but all of the extreme values of the distribution, so to speak, are in the western United States. So the west has a lower average but generally a larger spread, which is kind of interesting.
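
The side-by-side boxplots described above can be sketched in base R roughly as follows. This is a minimal illustration, not the code behind the slides: the data frame name `pollution`, its columns `region` and `pm25`, and all of the values are invented stand-ins, simulated so that the east is higher on average and the west has a longer upper tail.

```r
# Synthetic stand-in for the county data (names and values are assumptions).
set.seed(1)
n_east <- 400
n_west <- 150
pollution <- data.frame(
  region = factor(rep(c("east", "west"), times = c(n_east, n_west))),
  pm25   = c(rnorm(n_east, mean = 11, sd = 2),                 # east: higher, tighter
             rlnorm(n_west, meanlog = 2, sdlog = 0.45))        # west: lower, long right tail
)

# One boxplot per level of the categorical variable, using the formula interface.
boxplot(pm25 ~ region, data = pollution, col = "red")

# Numeric counterpart of the picture: centre and spread within each region.
med    <- tapply(pollution$pm25, pollution$region, median)
spread <- tapply(pollution$pm25, pollution$region, function(x) diff(range(x)))
print(med)
print(spread)
```

The formula `pm25 ~ region` is what turns a one-dimensional summary into a two-dimensional one: the left-hand side is the variable being summarized and the right-hand side is the grouping variable.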
You can do a similar type of plot with histograms instead. So now I've made a histogram on top for the eastern counties and a histogram on the bottom for the western counties, and you see the same basic features. The western counties tend to be lower on average, but they have more extreme high values. The eastern counties are generally higher on average, but they don't have those very high, extreme values.

Of course, one of the easiest and most obvious things to do is a scatterplot. Here I'm plotting the latitude of the monitor in the county against PM2.5, just to see if there are any north-south trends. As the latitude increases you're going north. You can see that there isn't a particularly strong north-south trend, although PM2.5 does seem to be higher in the middle latitudes and not quite as high in the upper and lower latitudes. I've added a horizontal dashed line here at 12 so you can see roughly where the national ambient air quality standard is, and you can see that there are a number of dots above it, which technically speaking would be out of compliance.

You can use color to add another dimension. If you remember, this dataset had the latitude, the longitude, and the east-west variable. So what I've got on the x-axis here is the latitude and on the y-axis the PM2.5, and then I've added color to indicate the east-west variable: the black circles are the eastern counties and the red circles are the western counties. So now you can see we're looking at three different variables at once.

Another way to look at these variables is just to make two plots. So now I've got a panel of plots.
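
Before getting to the panel layout, the stacked histograms and the colored scatterplot described so far can be sketched in base R along these lines. Again this is only an illustration under stated assumptions: `pollution`, `region`, `latitude`, and `pm25` are hypothetical names, and the data are simulated with an invented peak in the middle latitudes.

```r
# Synthetic stand-in for the county data (names and values are assumptions).
set.seed(2)
n <- 500
pollution <- data.frame(
  region   = factor(sample(c("east", "west"), n, replace = TRUE, prob = c(0.7, 0.3))),
  latitude = runif(n, min = 25, max = 49)
)
# Invented relationship: PM2.5 peaks near the middle latitudes, west shifted lower.
pollution$pm25 <- pmax(0.5,
  11 - 0.03 * (pollution$latitude - 38)^2 -
    ifelse(pollution$region == "west", 2, 0) + rnorm(n, sd = 2.5))

# Two histograms stacked vertically, sharing one x range so they are comparable.
op <- par(mfrow = c(2, 1), mar = c(4, 4, 2, 1))
hist(subset(pollution, region == "east")$pm25, col = "green",
     xlim = range(pollution$pm25), main = "east", xlab = "pm25")
hist(subset(pollution, region == "west")$pm25, col = "green",
     xlim = range(pollution$pm25), main = "west", xlab = "pm25")
par(op)  # restore the previous layout

# Scatterplot with a third variable mapped to color: a factor passed to `col`
# is used as integer codes, so the first level is drawn in palette color 1
# (black) and the second in palette color 2 (red).
with(pollution, plot(latitude, pm25, col = region))
abline(h = 12, lwd = 2, lty = 2)  # horizontal dashed reference line at 12

# How many points sit above the reference line?
n_above <- sum(pollution$pm25 > 12)
print(n_above)
```

Because `east` sorts before `west`, the eastern points come out black and the western points red, which matches the description above; with different level names the color assignment would change.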
On the left-hand side I've got all the western counties and on the right-hand side all the eastern counties, and I'm plotting latitude against PM2.5 in each. You can see that in both cases the higher-pollution counties tend to be in the middle latitudes, and the northern and southern latitudes tend to have generally lower pollution. This is the case in both the eastern and western United States, so that's kind of interesting.

So that's a quick summary of plotting one-dimensional and two-dimensional data in R, and of using color and multiple panels to look at more than two dimensions. In general these exploratory plots are meant to be quick and dirty. Notice I didn't spend any time working with the axes or setting the labels; for the most part I just used the defaults in R. But the nice thing about exploratory plots is that they let you summarize the data and highlight any broad features that may be of interest. You can explore basic questions and hypotheses. Remember, the basic question we were looking at was: are there any counties that may be out of compliance? And these plots may suggest useful modeling strategies for the next step, which may be a more detailed analysis or model fitting.

Lastly, here are a couple of resources. There's a nice website called the R Graph Gallery, which has lots of examples of graphics and plots made in R. And of course the R-bloggers website will often highlight plots that people have made and published on their blogs. So take a look at those to get a sense of what's possible.
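
As a quick-and-dirty illustration of the two-panel layout described earlier (west on the left, east on the right), here is one way it might be done in base R. The data frame `pollution` and its columns `region`, `latitude`, and `pm25` are hypothetical names, and the values are simulated rather than taken from the real county data.

```r
# Synthetic stand-in for the county data (names and values are assumptions).
set.seed(3)
n <- 500
pollution <- data.frame(
  region   = factor(sample(c("east", "west"), n, replace = TRUE, prob = c(0.7, 0.3))),
  latitude = runif(n, min = 25, max = 49)
)
pollution$pm25 <- pmax(0.5,
  11 - 0.03 * (pollution$latitude - 38)^2 + rnorm(n, sd = 2.5))

west <- subset(pollution, region == "west")
east <- subset(pollution, region == "east")

# One row, two columns of plots; each panel is an ordinary scatterplot.
op <- par(mfrow = c(1, 2), mar = c(5, 4, 2, 1))
with(west, plot(latitude, pm25, main = "West",
                ylim = range(pollution$pm25)))  # shared y range across panels
with(east, plot(latitude, pm25, main = "East",
                ylim = range(pollution$pm25)))
par(op)  # restore the previous layout
```

Sharing `ylim` across the panels is a choice made here so the two regions can be compared by eye; the defaults would scale each panel separately. The lattice package expresses the same idea with a conditioning formula, something like `xyplot(pm25 ~ latitude | region, data = pollution)`.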