Let's now talk about forward selection.

We start with single predictor regressions

of response versus each explanatory variable.

We then pick the model with the highest

adjusted R squared. Next, we add the remaining variables one

at a time to that model and again

pick the model with the highest adjusted R squared.

We repeat until the addition of any of the

remaining variables does not result in a higher adjusted R squared.
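
As a rough sketch of the procedure just described, here is how it might look in Python using only NumPy. The function names adjusted_r2 and forward_select, and the synthetic data at the bottom, are made up for illustration; they are not part of the course materials and this is not the cognitive test scores data.

```python
import numpy as np

def adjusted_r2(X, y):
    """Adjusted R squared for a least squares fit of y on the columns of X plus an intercept."""
    n, k = X.shape
    design = np.column_stack([np.ones(n), X])          # add the intercept column
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)  # ordinary least squares
    resid = y - design @ coef
    sse = resid @ resid                                # unexplained variability
    sst = ((y - y.mean()) ** 2).sum()                  # total variability
    return 1 - (sse / sst) * (n - 1) / (n - k - 1)

def forward_select(X, y, names):
    """Add one predictor at a time, stopping when adjusted R squared no longer increases."""
    selected, best = [], -np.inf
    remaining = list(range(X.shape[1]))
    while remaining:
        # Try each remaining variable on top of the current model.
        scores = {j: adjusted_r2(X[:, selected + [j]], y) for j in remaining}
        j_best = max(scores, key=scores.get)
        if scores[j_best] <= best:
            break                                      # no candidate improves the model
        selected.append(j_best)
        remaining.remove(j_best)
        best = scores[j_best]
    return [names[j] for j in selected], best

# Synthetic data (an assumption for this sketch): only x0 and x1 truly affect y.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 3 * X[:, 0] + 2 * X[:, 1] + rng.normal(size=200)

chosen, score = forward_select(X, y, ["x0", "x1", "x2", "x3"])
print(chosen, round(score, 3))
```

Note that a pure noise variable can still be picked up if it happens to nudge the adjusted R squared upward, so the selected model is not guaranteed to contain only the true predictors.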

Let's illustrate this with an example using the cognitive test scores data.

We start with four simple linear regressions,

one for each of the candidate predictors in

our data set, and then we pick the model with the highest adjusted R squared.

So the first variable that we're going to be

adding to our model is going to be mom's IQ.

In the next step we then try the remaining

3 variables and once again pick the model with the

highest adjusted R squared and that's going to be mom's

IQ with the addition now of mom's high school status.

And in the next step we once again try the two remaining

variables, and since there is an increase in the adjusted R squared,

we move on to the more complicated model.

Lastly, we try the full model, but the adjusted R squared does not

go up. Therefore, we're going to stick with the model from step three.
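
To see why the adjusted R squared can fail to go up when a variable is added, it may help to recall its usual definition. Here n is the number of observations and k is the number of predictors; SSE and SST stand for the unexplained and total variability in the response.

```latex
R^2_{adj} = 1 - \frac{SSE}{SST} \cdot \frac{n - 1}{n - k - 1}
```

Adding a predictor never increases SSE, but it does increase k, which inflates the penalty factor (n - 1)/(n - k - 1). If the drop in SSE is too small to offset that penalty, the adjusted R squared goes down.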

And note that we arrived at the same model whether

we went backwards or forwards, using the adjusted R squared criterion.