0:00

You now know how to create a simulation model in the analytic software platform.

You also know how to run the model and interpret the output.

To create these models we have made a critical assumption

about the source of uncertainty.

We took the probability distribution of the uncertain

input variables as something that was given to us.

Remember the other booking problem that we have been discussing?

In that problem, there are two sources of uncertainty.

The demand for seats in the flight and

the numbers of booked passengers that actually show up.

Or assumptions were that the demand for

seats follows a Poisson distribution with an average of 150 passengers.

We also assume that the show-up process

could be modeled with a Binomial distribution.

0:51

These are probably pretty good assumptions, but

you might be wondering how in general we could come up

with the right functions to model the uncertainty of key input variables.

That's what we're going to learn in this video.

As he have mentioned before, uncertainty comes from variability.

When we say that the average demand for a flight is 150 passengers,

we don't expect that on every single day the demand will be exactly 150 passengers.

This is the difference between an average value and a constant value.

So, what we're saying, is that the demand varies and

that it is possible to calculate an average.

An average value is good information but we want more.

We want to know how the demand varies.

1:40

They key is to estimate the probability of distribution function for

critical inputs to a simulation model.

And the way we are going to do that is by analyzing historical data.

Locate and open the Excel file flight over booking data.

This workbook contains two years of data for

the flight that we are using as an example.

The first column has a date,

the second column the demand, the third column has the number of passenger's vote.

The fourth column has a number of passengers that actually showed up, and

the fifth column has a fraction of the book passengers that showed up.

You can see that the demand varies from day to day.

You can also see that the fraction of passengers that show up for

their flight also varies from one day to another.

Those are the two sources of variability and,

therefore, also the two sources of uncertainty.

In cell H-1 we have calculated the average demand, which is about 150 passengers.

The show up rate of 92% is in cell H-2.

If we look at the formula in H-2 we can see that the show up rate is simply

the average of the fraction of booked passengers that have shown up for

the flight in the two years of data.

Generally speaking, simulation models do not use historical data directly.

The idea is to use the data to create a so-called probability distribution

function for each variable and then use the function for the simulation model.

The process of identifying an appropriate probability distribution function for

a variable for which we have historical data is called Distribution Fitting.

The analytics of a platform includes a distribution fitting tool.

To use this tool we need to start by selecting the cells

where the historical data is.

So let just start with the demand and select cells B2 to B731.

3:40

Know that in this case, I did select the name of the variable which is in cell B1.

Now, we go to the ASP tab and in the tools group we find fit.

Click on fit and the fit options dialogue appears.

There are two general types of this distribution functions,

continuous and discrete.

In this case we have discrete data and we choose the discrete option.

This is the only option that we're going to change.

We will leave the other options under their full values.

Click the fifth button and the results windows is displayed.

The results window shows the comparison chart.

This chart compares the frequency distribution of the historical data in

blue and the frequency of the proposed distribution function in pink.

The idea is to visually inspect how well the proposed function fits their values.

In this case the analysis is suggesting a Poisson distribution with a mean

value of 150.4 passengers.

Other alternatives are shown in the left panel of the window.

These alternatives are shown in order of fitness.

So, the analysis is suggesting that the Poisson is the best fit,

followed by the negative Binomial.

We're going to keep the Poisson distribution as it

looks like a good representation of the demand.

If we close this window,

we are asked whether we want to accept the fitted distribution.

Say yes,

the ASP then allows you to place the side function somewhere on the spreadsheet.

Place it in cell H4.

We can verify that H4 now contains this signed Poisson function

with an average of 150.4.

For the show up rate the processes are slightly different.

In our simulation we model the show up rate as a Binomial distribution.

The Binomial distribution makes a lot of sense because given the number of trials

and the probability of success in each trial.

The distribution represents the frequency of the number of successes.

The number of trials in our context is the number of booked passengers.

Success means that the passenger shows up for the flight.

Or best estimate of the show up rate is not 92% which we've found by

averaging the historical percentage of passengers showing up.

So we have a an estimate of the probability of success.

We also have the number of trials because the number of booked passengers is known.

That means that we just need to verify that the number

passengers that show up for the flight follow a binomial distribution.

We can use the fit tool to do this.

Select the data on column D and click on fit.

6:34

ASB shows that the parameters of this distribution are at 339.

For the number of trials and point three nine for the probability for success.

However, these numbers are not meaningful because we have

already estimated the show up rate and for each flight we know the number of trials.

That is we know the number of booked passengers.

The distribution fitting process has helped us verify

that the Binomial distribution is appropriate for the purpose of modeling

the number of passengers that show up out of a number that are booked.

Distribution fitting is a very important tool to convert data into probability

distribution, for instance, that can be used in a simulation model.

It is possible to run what could also be considered as simulation

using historical data.

And this is sometimes done to test how a policy who have performed in the past.

But distribution fading along with simulation allows us to create

many scenarios based on what happen in the past.

This is the most effective use of data in the context of building simulation models.