0:06

In this session,

we're going to dig a little bit deeper into the idea of forecasting.

When I say dig deeper, I actually mean we're going to drill down and look at

predicting behavior when it comes to individual customers, and

that's what customer analytics is about.

So, the promise of marketing is I'm going to be able to deliver

the right message to the right customer.

Well, we want to be able to influence the behavior of individual customers:

which customer gets a particular coupon? What's the depth of the coupon?

Which customer do I send which message to?

In order to answer those questions, we really have to have

an understanding of not just the average behavior of customers,

but the behavior of individual customers, and that's what we're going to

be able to look at with some simple models within Excel.

So, where is this idea of customer-centric analytics going to come

into play?

Well, the behavior of existing customers, the behavior of prospective customers.

How are they different from each other?

What's the right message to send them?

When is the right time to send that out?

If we need to know what product we should launch, well, that requires an understanding

of the customer segments that are in the marketplace.

Not just the average customer, but

how many different types of customers are there?

How many customers are there of each type?

If I want to look at effectiveness when it comes to promotional activity,

promotions aren't going to be equally effective for everyone.

On some customers, they're going to have absolutely no impact;

on other customers, they're going to be very effective.

And ultimately, if we're looking to make allocation decisions,

if I have a fixed marketing budget,

how do I allocate resources across all of my customers and prospective customers?

1:50

So if we look at the toolbox that's available to us,

the statistical methods that we have,

the right model is really going to depend on the type of data that we're looking at.

Now, we've got truly four different types here. I have included continuous data in this set,

but what kind of customer-level data might we have?

Well, we're going to have a lot of choice data.

I'm choosing between Coca-Cola and Pepsi.

I'm choosing between AT&T, Verizon and Sprint.

I'm making a choice in that context.

We might have count data.

How much am I going to purchase?

How much quantity am I going to purchase on a particular shopping occasion?

We might look at timing data.

When do I become a customer?

How long do I stay as a customer?

How long is it between visits to my favorite website?

How long is it between purchases?

All of those are timing or duration observations.

And if I were to combine multiple pieces of data, it might be,

let's say, choice and count data.

Well, I choose a brand and then how much of it do I buy?

Well, that would be multivariate data. Or I have interpurchase times and

then how much do I buy, combining count and timing data.

Well, those are the different types of data sources that we might encounter, and

all of those are going to have different methods associated with dealing with them.

In terms of marketing, we might be interested in ROI,

understanding marketing effectiveness.

We might be interested in understanding clickstream behavior of customers online

and targeting advertisements at them.

If you're looking at loyalty programs or social media activity,

all of this is producing individual-level data.

It's not just being produced at the level of how many total purchases we have,

but who are the individuals conducting these purchases?

And as we have access to those individual histories,

that's what's going to allow us to make those individual forecasts and

conduct customer evaluation at the level of the individual customer.

3:57

So, let me give you a brief refresher on what we talked about last class with our

forecasting models and building out regression models.

We came up with a prediction;

our Y variable is our prediction.

So, our prediction Y is based on a set of predictors:

X1 is marketing activity 1, X2 is marketing activity 2, for however many marketing activities we have.

And we said that's our best guess, but we don't always observe our best guess, and

that's where the error term comes in.

Well, the error term, epsilon, is the difference

between what I predicted, Y hat, and what I actually observed.

And we make the assumption when we're running regression models,

specifically when we're running linear regression models, that

the error term follows a normal distribution,

that normal bell-shaped distribution.
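
As a minimal sketch of that recap (this is not from the session itself, which works in Excel), here is the same idea in Python. The data, the single predictor x, and the "true" intercept and slope are all made-up assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: one marketing activity x and an outcome y.
n = 500
x = rng.uniform(0, 10, size=n)
true_intercept, true_slope = 2.0, 0.5    # made-up values for illustration
epsilon = rng.normal(0.0, 1.0, size=n)   # normally distributed error term
y = true_intercept + true_slope * x + epsilon

# Least-squares fit; the design matrix has a column of ones and a column of x.
X = np.column_stack([np.ones(n), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

y_hat = X @ coef         # the prediction, our best guess
residuals = y - y_hat    # what we observed minus what we predicted

print("estimated intercept and slope:", coef)
print("mean residual:", residuals.mean())
```

The residuals play the role of the error term: they are whatever the best guess did not capture.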

Â 4:53

Another way that we could work this out would be to say

my observation Y itself follows a normal distribution

with a particular expectation mu, and my best guess is mu, and

that's what's giving me my regression equation.

That's where my X variables come into play.

Well, that's assuming that everything follows a normal distribution.

When we're dealing with customer-level data,

the normal distribution isn't necessarily going to be appropriate. If I'm

dealing with choice data or count data or duration or timing data,

using that normal distribution just doesn't make sense,

because that's not what the data ultimately looks like.

Â 5:36

But just like when we're conducting linear regression, we're going to start by

building up a model based on the data that we observe and

we're going to refer to this as the likelihood function.

So if I have N different observations, Y1 through YN and

I make the assumption that they all follow a particular distribution.

Now when we're doing linear regression,

we assume that they follow a normal distribution.

Â 6:04

But what we're going to ultimately look at is when we estimate parameters,

the question we're asking is how likely under a given set of values for

those parameters, how likely am I to observe the set of data that I did?

And what we try to do is make that likelihood,

make that probability as big as possible.

So, we're going to maximize the likelihood of observing our data.

That is, we're going to choose the right values for our coefficients or

our parameters that make observing the data as likely as possible.
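
To make "choose the parameter values that make the data as likely as possible" concrete, here is a rough sketch in Python. It assumes a handful of made-up observations from a normal distribution with a known standard deviation of 1, and it simply tries a grid of candidate values for mu. The function names and the grid are illustrative choices, not anything from the session.

```python
import math

def normal_pdf(y, mu, sigma):
    """Density of a normal distribution with mean mu and std dev sigma at y."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def likelihood(data, mu, sigma=1.0):
    """Product of the densities of all observations, assuming independence."""
    result = 1.0
    for y in data:
        result *= normal_pdf(y, mu, sigma)
    return result

data = [4.1, 5.3, 4.8, 6.0, 5.2]                  # made-up observations
candidates = [i / 100 for i in range(300, 701)]   # mu from 3.00 to 7.00

# Keep the candidate under which the observed data is most likely.
best_mu = max(candidates, key=lambda mu: likelihood(data, mu))
print("candidate mu with the highest likelihood:", best_mu)
```

A grid search is only for illustration; in practice a solver does the same job by searching over the parameter values for us.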

Â 6:57

So, let me just give you this example using the normal distribution to give you

a sense that we've been doing this all along.

We just didn't know it.

So, linear regression is actually maximum likelihood estimation.

So suppose we had a set of data, X1 through Xn,

that we said came from a normal distribution.

Â 7:38

And so now, from a computational standpoint, rather than trying to

maximize the likelihood, which is going to, in most cases, end up being a very,

very tiny probability that computer programs can't distinguish from zero,

we're going to employ a mathematical trick here.

Rather than maximizing the likelihood,

we're ultimately going to maximize the logarithm of the likelihood.

Turns out that it's not going to change our results;

it just changes the scale that we're working on.
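
Here is a small sketch of why the log helps, assuming 2,000 simulated observations from a standard normal distribution (the sample size and the seed are arbitrary choices). Multiplying thousands of densities underflows to zero in floating point, while adding their logarithms stays finite. Because the logarithm is an increasing function, the parameter values that maximize one also maximize the other.

```python
import math
import random

random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(2000)]   # simulated observations

def normal_pdf(y, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and std dev sigma at y."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

likelihood = 1.0
log_likelihood = 0.0
for y in data:
    p = normal_pdf(y)
    likelihood *= p                 # product of many numbers below 1
    log_likelihood += math.log(p)   # sum of their logarithms

print("likelihood:", likelihood)          # underflows to 0.0
print("log likelihood:", log_likelihood)  # a finite negative number
```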

Â So if I take the log of the likelihood function,

Â this is the equation that I have.

Â Now, let's take our first derivative and

Â find the value of mu that's going to maximize that log likelihood.

Â Well, taking our first derivative and

Â this is our first derivative that we're setting equal to 0.

Â And now, we're just solving for the value of mu.

Â And what we end up with, really your maximum likelihood estimate for

Â the mean is the sample average.

Â So if you've got data that you believe follows a normal distribution,

Â your maximum likelihood estimate is no different from

Â just taking the sample average.

Â So in a lot of cases,

Â what we think intuitively is going to line up with that maximum likelihood estimate.
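
The equations referred to here are on the slides and are not captured in the transcript. What follows is the standard textbook version of the steps, written under the assumption that X1 through Xn are independent draws from a normal distribution with mean mu and variance sigma squared (the symbol sigma squared is notation added here, not taken from the session).

```latex
\begin{align*}
L(\mu, \sigma^2) &= \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}}
    \exp\!\left(-\frac{(x_i - \mu)^2}{2\sigma^2}\right) \\
\log L(\mu, \sigma^2) &= -\frac{n}{2}\log\!\left(2\pi\sigma^2\right)
    - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2 \\
\frac{\partial \log L}{\partial \mu} &= \frac{1}{\sigma^2}\sum_{i=1}^{n}(x_i - \mu) = 0 \\
\hat{\mu} &= \frac{1}{n}\sum_{i=1}^{n} x_i
\end{align*}
```

The last line is the sample average, which matches the result described above.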

Â 8:51

And we can also then look at what's the variation?

What's the uncertainty around those estimates?

So, let's drill down.

And again, we're going to focus today on two types of data very common within

marketing, choice data and timing data.

Now within Excel, we're going to talk a lot about binary choices.

Yes, no outcomes.

The technique that we're going to be using

is going to generalize to those multinomial options when I have three,

four, five different options and I'm picking one among that set of options.

So, we'll look at the techniques for choice decisions.

We'll also look at models that are specific to duration data.

Â 9:34

So choice data, very common throughout marketing.

I've put together here some examples of choices that customers might face.

Do you buy a particular category on a shopping trip?

Do you buy a particular brand on a shopping trip?

Yes or no?

Did you acquire service, yes or no?

Did you keep service, yes or no?

Did you decide to file a complaint with the company, yes or no?

Any time we're categorizing things into yes or

no, brand A versus brand B, we're talking about a binary choice.

So, it's going to be a very common type of marketing data for us to deal with.
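
As a rough sketch of how the likelihood idea from earlier carries over to yes/no data, suppose (purely as an assumption for illustration) that every customer says yes with the same probability p. The ten outcomes below are made up, and the grid of candidate values is an arbitrary choice. This is not the model built in the session, just the simplest possible case.

```python
import math

# Made-up binary choices: 1 = yes, 0 = no (4 yeses out of 10).
choices = [1, 0, 0, 1, 1, 0, 1, 0, 0, 0]

def log_likelihood(p, outcomes):
    """Log likelihood of yes/no outcomes when each yes has probability p."""
    return sum(math.log(p) if y == 1 else math.log(1 - p) for y in outcomes)

candidates = [i / 100 for i in range(1, 100)]   # p from 0.01 to 0.99

# Keep the candidate p under which the observed choices are most likely.
best_p = max(candidates, key=lambda p: log_likelihood(p, choices))

print("candidate p with the highest log likelihood:", best_p)
print("share of yes answers in the data:", sum(choices) / len(choices))
```

Just as the sample average came out of the normal example, here the best candidate lines up with the share of yes answers in the data.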

Â