Hi, friends. I'm Cat Truxillo from SAS. As Eric mentioned, I'll be jumping in from time to time to show you how to do things in software. We're going to be doing things with SAS software. It'll give you a chance to see how to apply some of what you learn hands on. If you don't feel like you're quite ready for hands on, don't worry about it. You'll be just fine. Everything in this course is made to work, whether you want to go hands on and try it or not. In addition, I'm going to be showing you how to do some things that are useful for everybody, like basic data visualization. I'll also be showing you some pretty advanced things in the third course, like neural networks or logistic regression models, and how to then apply and interpret those models. So if you want to try it all out, that's fantastic. If you don't want to try any of it, that's OK, too. But keep in mind it's all there for you. In addition, if you want to go deeper and you want to learn more, there are tons of other good SAS classes out there for you to try, and I encourage you to check them out. Thanks.
OK. So I am in Viya for Learners right now. And if you haven't already watched the video on how to access the software using Viya for Learners, please make sure you go back and watch that video. It'll show you how to get into the environment. It'll show you how to access data sets. And the data set that we're going to use in this course is called PVA_Partition. I've already loaded that data set. It'll be in that Viya for Learners environment for you to find.
Once you're in that SAS Viya environment, we're going to go to-- let me show you where that was again. There's a menu right here that says Show List of Applications. And I know it's a little silly, but I'm going to call it a hamburger menu because it kind of looks like a hamburger, like there's the patty in the middle and the two buns above and below. And we're going to click that hamburger, and then click on Explore and Visualize.
I'm going to go ahead and click on Start with Data, and I'm going to locate the PVA_Partition data set, which I've already loaded into memory. It is sitting right here. And when I click on it-- there we go-- it's ready, and I click OK. This data set has responses to a campaign to try to get donors to donate at different frequencies and for different amounts of money. So people, sometimes they donate several times throughout a year, or sometimes they just donate one big lump amount. With this data set, with the PVA data set, it's a veterans organization. And our most recent campaign for donations, we're using that as our target. So did people respond to the most recent campaign? And we have lots of information about how they responded to previous campaigns because everyone in this data set has made donations to us before. So these are all people who have donated to the veterans organization. But now we want them to donate again. So what we have are things like we have a demographic cluster. That's where one of our data scientists did some work to come up with clusters of demographic variables. We have gender. We have home ownership. We also have the status. You know how you get, like, a frequent shoppers card? Well, there are people who have frequent donation cards. And so if they're in one of these special categories, they'll be identified with that status category 96NK. We also have information about how much they've donated in the past and how many times they donated in the past, how many times they received things like greeting card promotions, and how much they donated each time. One thing I want to do first is identify a few of the variables in the data set that should be treated as categories rather than as numbers. It's important in software to consider when a value is stored as numbers, do you really want to do mathematical operations on it? Or do you want to treat it like it's actually different categories? 
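To make the measure-versus-category question concrete outside the point-and-click interface, here is a minimal Python sketch. It is not SAS code and not how Visual Analytics works internally. The list of zeros and ones is made up for illustration and is not taken from PVA_Partition. Treated as a measure, the flag gets averaged; treated as a category, each value is just a label that gets counted.

```python
from collections import Counter
from statistics import mean

# Hypothetical 0/1 response flags (illustrative values, not from PVA_Partition).
target_gift_flag = [0, 1, 0, 0, 1, 0, 1, 0]

# Treated as a measure: arithmetic is allowed, so the values are averaged.
as_measure = mean(target_gift_flag)

# Treated as a category: each value is a label, so we count how often each appears.
as_category = Counter(str(value) for value in target_gift_flag)

print("average when treated as a measure:", as_measure)
print("counts when treated as a category:", dict(as_category))
```

The average happens to be meaningful for a 0/1 flag (it is the response rate), but for a coded variable such as a demographic cluster number an average would not mean anything, which is why the classification matters.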
And we've got a few of those in here that we want to look at. So the first variable that I want to change is called Target Gift Flag. This is a variable that just takes on values of zero and one for whether somebody didn't donate or did donate. And it is numbers. Technically zero and one are numbers, but I want them to be treated as categories because I want to be able to develop a model that treats people's responses as being categorized as zero or one. So to do that, I'm going to expand the Target Gift Flag, and then under Classification, I'm going to change that from Measure to Category. That now moves Target Gift Flag up into the list up at the top. So you see Target Gift Flag is right here.
There's another change that I need to make as well. One thing that I want to do is create a partition variable. This data set actually already has a partition indicator, which means that my observations have been partitioned into training and validation observations. Training observations are used to develop a machine learning model. Validation observations are used to then assess and evaluate and select the champion machine learning model from a whole set of them. And so I need to have that partition variable identified in SAS. To do that, we'll find the partition variable right here. And I'm going to right-click on it and select New Partition. All right. Now there's one thing that might strike you as counterintuitive. I want the validation value to be validation, and I want the training data value to be training. Those were set in reverse order, but it's because of the way that the zeros and ones are coded in the underlying data set. It's just a quirk of the data, in this case. So we get those set correctly, and now we click OK.
There are two more variables in this data set that I don't want to use in this course. And to make life easier for us when we're trying to select variables, I'm going to go ahead and have them hidden. 
To do that, I'm going to right-click on Target Gift Amount and select Hide, and then right-click on Target Gift Amount with Zero and select Hide. Now if you're really feeling ambitious, you could try developing, for example, a two-stage model where you predict whether or not somebody donates, and then if they donate, how much they donate. That's way beyond the scope of this course, but it's something you could try out on your own. All right. So now we're going to play around with this data set a bit. We're not really developing machine learning models yet, although we are doing some automated machine learning in this demonstration. What we're going to do is right-click on our target variable, which is called Target Gift Flag. Right-click on that and select Explain, and Explain on current page. And watch what happens. What SAS does in the background is it takes our target variable, and then it looks at all of the other variables that are remaining in the data set and considers those to be candidate inputs. So we hid two of the variables. It's not going to use those two. And it has one variable that we've identified as a partition. So it won't use that one as an input. But all the other variables are fair game to be used for explaining differences between the ones who did and didn't respond to our campaign, the Target Gift Flag. And we can see that the factor that is most related to Target Gift Flag, in this case, is Gift Amount 36 Months. And the second most related one is Gift Amount Average Card 36 Months. Those are related to each other, so maybe it's not such a big surprise. And as I highlight each of these different variables, we get a plot on the right-hand side of the relationship between Target Gift Flag and the input that's been selected. You can play around with this and try anything you want with it. I would also like to show you how we can do some automatic machine learning by right-clicking on Target Gift Flag, and then select Predict. 
And we'll predict on a new page so that we get a new report. Now what Visual Analytics is doing is it's taking our target variable, and it's using all of those candidate inputs to try out a variety of different machine learning models automatically, and then select the one that does the best job of predicting Target Gift Flag. This is really just a PHD button, a Press Here Dummy button, as Eric would call it. So don't expect, necessarily, that these are going to be your most meaningful models. I always urge caution when you're using a PHD, a Press Here Dummy tool, that the most important element of a good machine learning model is actually the modeler. You've got to make sure that you're putting in good variables, that you're drawing conclusions that make sense, that can actually legitimately be pulled from that data. But this is one way to get just a really quick overview of what are some of the variables that are best predictors of that target, and trying out a variety of models to see what works well. But you can also try setting other values manually, if you want. So you could try filling in different values for these and see what the prediction is for Target Gift Flag. So let's say I'm wondering if somebody donated $1,000 last time-- which that's probably not something that's ever going to happen-- well, we predict that they're going to donate this time. But if they donated zero last time, and their average gift amount is zero, and their gift count in 36 months was zero, what do we get? Well, we still get a prediction of one. Interesting. There are some interactions in this data set that are worth exploring. And you'll see some of those coming up in module number three.
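The idea of developing models on training observations and choosing a champion on validation observations can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not what Visual Analytics does behind the Predict button. The rows are invented, the coding 1 = training and 0 = validation is an assumption, and the two candidate "models" (always predict one, and a simple cutoff on gift amount) are hypothetical stand-ins for real machine learning models.

```python
# Hypothetical rows: (gift_amount_36_months, partition_code, target_gift_flag).
# Values and the 1 = training / 0 = validation coding are assumptions for illustration.
rows = [
    (5.0, 1, 0), (40.0, 1, 1), (12.0, 1, 0), (55.0, 1, 1), (30.0, 1, 1), (8.0, 1, 0),
    (10.0, 0, 0), (45.0, 0, 1), (20.0, 0, 0), (35.0, 0, 1),
]
ROLE_BY_CODE = {1: "training", 0: "validation"}


def split_by_role(rows):
    """Group (amount, flag) pairs by the role their partition code maps to."""
    parts = {"training": [], "validation": []}
    for amount, code, flag in rows:
        parts[ROLE_BY_CODE[code]].append((amount, flag))
    return parts


def accuracy(predict, data):
    """Share of observations where the predicted flag equals the actual flag."""
    return sum(predict(amount) == flag for amount, flag in data) / len(data)


def fit_cutoff(train):
    """Pick the gift-amount cutoff with the best accuracy on the training data."""
    candidates = sorted({amount for amount, _ in train})
    return max(candidates, key=lambda c: accuracy(lambda a: int(a >= c), train))


parts = split_by_role(rows)
cutoff = fit_cutoff(parts["training"])  # developed on training observations only

candidate_models = {
    "always_predict_1": lambda amount: 1,
    "cutoff_rule": lambda amount: int(amount >= cutoff),
}

# Assess every candidate on validation observations and keep the best one.
validation_scores = {
    name: accuracy(model, parts["validation"])
    for name, model in candidate_models.items()
}
champion = max(validation_scores, key=validation_scores.get)

print("validation accuracy:", validation_scores)
print("champion:", champion)
```

The point of scoring on the validation rows is that a model is judged on observations it did not see while it was being developed, and that judgment still depends on the modeler choosing sensible inputs and candidates in the first place.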