All right so, here's the first example that we're going to be working with. We have a direct marketing company that's selling a unique product, the Whatchamacallit. And we observe which customers decide to acquire the product, or which individuals decide to purchase this product. Based on the marketing efforts that were targeted at them, so we know how much we spent on the targeted advertising. We know the price that is was offered to them at, we know whether or not they sent a coupon and whether or not the company sent out an email. And so, what we're going to try to do is look at this set of x variables, that's said of x variables. These four Xs, we're going to try to tie to the acquisition probability and that acquisition probability is going to drive the acquisition decision. So the model that we're going to build is going to give us a baseline, as well as coefficients for each of these for marketing actions. So that we can understand which of these pieces is going to impact that acquisition decision. In this exercise, we are going to look at is connecting marketing variables to an acquisition decision. So you see on our spreadsheet columns A through D this corresponds to different levels of, different marketing variables and. Each row of this spreadsheet corresponds to the marketing activity that was aimed at a prospect, and then column E tells us the result of that acquisition decision. Did the prospect end up becoming a customer, one for yes, zero for no. So as we looked at earlier, we can't use linear regression in this case, because our outcome is binary, it's zero or one. So there's no best fitting line that we can draw given the nature of that outcome. So what we have to do instead is, make the assumption that that 0 1 outcome comes from a Bernoulli distribution. Governed by a particular probability and we're going to use logistic regression to develop this particular model. So we're going to do the majority of our work on the worksheet that's labeled LL. That's for Log Likelihood and let me walk you through the process we're going to use. Same data columns A through D, that contains our x variables, column E contains our outcome. Did the customer or did the prospect become a customer, yes or no? In column F, we're going to calculate the linear combination, and that's going to be comprised of an intercept, and the impact of each of the marketing activities. We're then going to translate that into the likelihood that's going to take into account whether or not the prospect became a customer. We're going to do that in column G, we're going to convert that to the log likelihood just to make estimation easier for ourselves. And once we have done that we're going to use the solver function within Excel to maximize the log likelihood function. And find the values for the intercept price, email, coupon, and advertising coefficients that best correspond to this data. What we're trying to do is make that likelihood, make the joint probability associated with the observed data, as high as possible, so we want to maximize the likelihood function. Other software tools, such as SaaS, SPSS, R, have logistic regression built into them. Excel doesn't have that as at tool built into it, so we are going to create that ourselves. So to start out what we're going to do is create cells that contain the values for the coefficients that we're ultimately going to estimate. So in K 1 through K 5 we're just going to put in placeholder values, and I'm just going to use 0s to get us started. And then we want to write out what's the linear combination corresponding to the marketing activity aimed at a particular prospect? So in cell F2, we're going to specify a regression equation that's going to look very similar to linear regression. We're going to start with our intercept value, we're going to make sure that we use absolute references. Then we're going to incorporate the effective price, so that's K2 multiplied by the price that this prospect received plus the effect of the email. Again, an absolute reference, did this prospect receive an email? The effect of couponing, multiplied by whether or not the prospect received coupons and the impact of advertising. Multiplied by the units of advertising aimed at this individual, so that's our linear combination. And you'll notice since we had assumed values of 0 for all of our coefficients, just as placeholder values, we're getting a 0 here. But we're just going to copy this formula all the way down for all of the prospects in our dataset. All right, so how do we take that linear combination and turn that into a likelihood function? Well, the likelihood's comprised of what's the probability associated with a particular outcome, and did we observe that outcome? And so the way that we can write this out is using an if statement, so if acquisition happened, Well, the probability associated with the acquisition happening, it's going to be the probability associated with the logit model. It's going to be the exponential function to the power of the linear combination divided by one plus the exponential function raise to the power of that linear combination. If we look at the numerator and the denominator always going to be positive, always going to be bounded by zero and one so it's guaranteed to be a probability. That if we observe the prospect being acquired, what if the prospect is not acquired? What's the probability corresponding to 1 minus that logic probability? And we can write that as 1 divided by 1 plus the exponential of our linear combination. And we add another parentheses to close out that if statement and again, we can copy that formula all the way down our spreadsheet. And now our likelihood function, if that gives us the likelihood, what's the log likelihood going to be? This is for mathematical convenience, we're going to use the natural logarithm of the likelihood itself. Ideally, what we'd like to do is maximize the likelihood that's given in column G directly. The problem being we have thousands or hundreds of thousands of individuals in our data set. Multiplying those individual probabilities together is going to create an incredibly small number that a computer's not going to be able to distinguish from 0. So rather than maximizing the likelihood we're going to maximize the log likelihood. So we take the logarithm, the natural logarithm of the likelihood for each individual. And we can specify our sample log likelihood as the sum of the individual log likelihoods. So we've specified the sample log likelihood, all that's left for us to do is use Solver to estimate the appropriate values for the Intercept, Price, Email, Coupon and Advertising. And so we're going to do this using the Solver feature, if you don't have it loaded on the computer that you're using you go under File, Options, click on the Add Ins on the menu. Manage Excel Add Ins, we click on Go and you just want to make sure that Solver add in is checked off. So, on the data ribbon, we're going to click on Solver, and you'll see Solver's going to ask us for three things. The first, what cell contains our objective? That's going to be our log likelihood. Second, it's going to ask us what we want to do with it. In our case, we want to maximize the log likelihood and how do we want to do that, what cells are we allowed to change on the spreadsheet in order to do that? Well, we're going to change K1 through K5 we are not going to impose any constraints, so the intercept and the coefficients can be positive or negative in this case. You do want to make sure that this box make unconstrained variables non-negative you want to make sure that that's not checked off. Now we're going to be dealing with relatively straightforward problems using Excel. So we're not going to worry too much about solving method, you can see a couple of different options that Excel has added in. Again, in this particular example not going to make that much of a difference for us. So we're telling Excel we want to maximize the log likelihood that's contained in K7 by changing the values in cells K1 through K5. And we're going to click on the Solve button. All right, now, Let's try this one more time, we'll just use the original estimation method And there you see it's produced those results for us. One thing you do want to keep in mind when you're using Solver, it is going to be sensitive to these starting values that you use. So you may want to conduct some sensitivity analysis, use different starting values. In some problems, as we'll see In some of our other exercises you may want to impose some constraints to prevent Solver from going toward values that are going to produce an error message. And so there's a little bit of fine tuning that's going to need to be done but for relatively straightforward problems such as this one, Solver is able to get us those coefficients. And so we see the intercept value price has a negative coefficient and that's what we would expect. Higher price means lower likelihood of becoming a customer. Email and couponing have positive effects, as does advertising all positive effects. So using these coefficients, we could then plug in different values for our marketing activity and estimate what is the probability associated with a different set of marketing mix activities. Now this Solver tool is one that we're going to come back to throughout the rest of the course. We're going to use that for a range of problems as far as being able to estimate these maximum likelihood models.