Statistical Learning

Instructor: Shahrzad Jamshidi

9 modules

Gain insight into a topic and learn the fundamentals.

Intermediate level

Recommended experience

115 hours to complete

3 weeks at 38 hours a week

Flexible schedule

Learn at your own pace

Build toward a degree

Details to know

Shareable certificate

Add to your LinkedIn profile

Assessments

37 assignments

Taught in English

There are 9 modules in this course

This course offers a deep dive into the world of statistical analysis, equipping learners with cutting-edge techniques to understand and interpret data effectively. We explore a range of methodologies, from regression and classification to advanced approaches like kernel methods and support vector machines, all designed to enhance your data analysis skills.

Our journey is guided by the well-known textbook "The Elements of Statistical Learning" by T. Hastie, R. Tibshirani, and J. Friedman. This course provides examples written in Python. Your system should have Python 3.8 or higher, as well as essential libraries such as NumPy, pandas, matplotlib, seaborn, scikit-learn, SciPy, and PyTorch. These tools not only support the learning process but also prepare you for real-world data analysis challenges. Whether you're aiming to refine your expertise or just starting out in the field of data science, this course provides the knowledge and tools to transform your understanding and application of statistical learning. It's a perfect blend of theory and practice, ideal for anyone looking to enhance their skills in data interpretation and analysis.

Welcome to Statistical Learning! This course covers the following topics: Statistical Learning: Terminology and Ideas, Linear Regression Methods, Linear Classification Methods, Basis Expansion Methods, Kernel Smoothing Methods, Model Assessment and Selection, Maximum Likelihood Inference, and Advanced Topics.

Module 1 introduces statistical learning, beginning with the rationale for choosing a pre-defined family of functions and minimizing the expected prediction error (EPE). It covers the essentials of statistical learning: the loss function, the bias-variance tradeoff in model selection, and the role of model evaluation. The module also distinguishes between supervised and unsupervised learning, discusses the main types of statistical learning models and data representations, and presents the three core elements of a statistical learning problem.
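For reference, the expected prediction error and the bias-variance tradeoff named in the Module 1 description can be written out as follows. This is a sketch only: squared-error loss and the additive-noise model Y = f(X) + ε are assumptions chosen here for illustration, and the module itself may work with other loss functions.

```latex
% Expected prediction error under squared-error loss, and its minimizer:
\mathrm{EPE}(f) = \mathbb{E}\big[(Y - f(X))^2\big],
\qquad
f^{*}(x) = \arg\min_{c}\, \mathbb{E}\big[(Y - c)^2 \mid X = x\big]
         = \mathbb{E}[Y \mid X = x].

% Bias-variance decomposition at a point x_0, assuming Y = f(X) + \varepsilon
% with \mathbb{E}[\varepsilon] = 0 and \operatorname{Var}(\varepsilon) = \sigma^2:
\mathbb{E}\big[(Y - \hat{f}(x_0))^2 \mid X = x_0\big]
  = \sigma^2
  + \big(\mathbb{E}[\hat{f}(x_0)] - f(x_0)\big)^2
  + \operatorname{Var}\big(\hat{f}(x_0)\big).
```

The three terms are the irreducible noise, the squared bias, and the variance of the fitted model at x_0.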

What's included

8 videos, 5 readings, 4 assignments, 1 discussion prompt, 1 ungraded lab

8 videos (total 54 minutes)

Instructor Welcome (2 minutes)
Course Overview (4 minutes)
Module 1 Introduction (1 minute)
What is statistical learning? (5 minutes)
Types of Data (14 minutes)
Models in Statistical Learning (6 minutes)
Model Selection (8 minutes)
Formal Description of Statistical Learning (10 minutes)

5 readings (total 105 minutes)

Syllabus (10 minutes)
What is Statistical Learning Reading (10 minutes)
Terminology and Types of Data Reading (15 minutes)
Formal Description of Statistical Learning Reading (60 minutes)
Module 1 Summary (10 minutes)

4 assignments (total 38 minutes)

Module 1 Summative Assessment (15 minutes)
What is Statistical Learning Quiz (3 minutes)
Terminology and Types of Data Quiz (5 minutes)
Formal Description of Statistical Learning Quiz (15 minutes)

1 discussion prompt (total 10 minutes)

Meet and Greet Discussion (10 minutes)

1 ungraded lab (total 60 minutes)

Coding Exercise (60 minutes)

Welcome to Module 2 of Math 569: Statistical Learning. Here, we explore what is arguably the foundational model of the field: linear regression. This simple yet highly useful model helps us better understand the statistical learning problem discussed in Module 1. In Lesson 1, we'll carefully review what linear regression aims to do, how we estimate the model's parameters from a given dataset, and what kinds of statistical tests we can perform on our estimated coefficients. In Lesson 2, we'll cover a method known as Subset Selection, which aims to improve linear regression by eliminating independent variables that have little impact. In Lesson 3, we explore introducing bias into the linear regression model with two regularization methods: Ridge Regression and LASSO. These methods use a hyperparameter, a key concept in this course, to limit the growth of the coefficients. This is the source of the bias, and it will help us understand why a biased estimator can outperform the unbiased coefficient estimator from Lesson 1. Finally, Lesson 4 introduces the concept of data transformations, which allow one to address complexities within a dataset. Transformations also provide a simple way of converting a linear model to a nonlinear model.
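To make the shrinkage idea from Lesson 3 concrete, here is a minimal sketch of ridge regression with a single centered predictor and no intercept, where the closed-form slope is easy to read. The function names and the toy data are hypothetical and chosen only for illustration; the course labs may well use NumPy or scikit-learn instead of plain Python.

```python
def _sums(x, y):
    """Return (sum of x*y, sum of x*x) for paired lists."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy, sxx


def ols_slope(x, y):
    """Least-squares slope for one centered predictor and no intercept."""
    sxy, sxx = _sums(x, y)
    return sxy / sxx


def ridge_slope(x, y, lam):
    """Slope minimizing sum((y - b*x)**2) + lam * b**2.

    lam >= 0 is the regularization hyperparameter; lam = 0 recovers
    ordinary least squares.
    """
    sxy, sxx = _sums(x, y)
    return sxy / (sxx + lam)


# Toy data, made up for illustration; both x and y are centered.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [-4.1, -1.9, 0.2, 2.1, 3.7]

for lam in (0.0, 1.0, 10.0):
    print(f"lambda={lam:5.1f}  slope={ridge_slope(x, y, lam):.3f}")
```

As lam grows, the denominator grows and the slope is pulled toward zero, which is the bias that the penalty introduces in exchange for lower variance.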

What's included

10 videos, 6 readings, 5 assignments, 6 ungraded labs

10 videos (total 91 minutes)

Module 2 Introduction (1 minute)
What is Linear Regression? - Part 1 (7 minutes)
What is Linear Regression? - Part 2 (4 minutes)
Linear Regression (11 minutes)
Linear Regression Assumptions (9 minutes)
Statistical Tools (20 minutes)
Subset Selection (8 minutes)
Ridge Regression (10 minutes)
LASSO (9 minutes)
Data Transformation Examples and Linear Regressions (7 minutes)

6 readings (total 290 minutes)

Module 2 Introduction Reading (5 minutes)
Linear Regression and Least Squares Reading (30 minutes)
Modification of Linear Regression: Subset Selection Readings (120 minutes)
Coefficient Shrinkage for Linear Regression: Ridge Regression and LASSO Readings (120 minutes)
Data Transformations and Linear Regression Reading (5 minutes)
Module 2 Summary (10 minutes)

5 assignments (total 90 minutes)

Module 2 Summative Assessment (60 minutes)
Linear Regression and Least Squares Quiz (10 minutes)
Modification of Linear Regression: Subset Selection Quiz (5 minutes)
Coefficient Shrinkage for Linear Regression: Ridge Regression and LASSO Quiz (10 minutes)
Data Transformations and Linear Regression Quiz (5 minutes)

6 ungraded labs (total 360 minutes)

Coding Example (60 minutes)
Coding Exercise (60 minutes)
Coding Example (60 minutes)
Coding Exercise (60 minutes)
Coding Example (60 minutes)
Coding Exercise (60 minutes)

Welcome to Module 3 of Math 569: Statistical Learning, where we delve into linear classification. In Lesson 1, we explore how linear regression, typically used for predicting continuous outcomes, can be adapted for classification tasks, that is, for predicting discrete categories. We'll cover the conversion of categorical data into a numerical format suitable for classification and introduce essential classification metrics such as accuracy, precision, and recall. In Lesson 2, we'll explore Linear Discriminant Analysis (LDA) as an alternative method for constructing linear classifiers. This method introduces the notion that classification maximizes the probability of a category given a data point, a framing we will revisit later in the course. Maximizing this probability under some simplifying assumptions leads to a linear model that can also reduce the dimensionality of the problem. Finally, in Lesson 3, we will cover logistic regression, which is constructed by assuming that the log-odds of the class probabilities are linear functions of the inputs. The outcome, similar to LDA, is a linear decision boundary.
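The classification metrics mentioned for Lesson 1 can be computed directly from predicted and true labels. The sketch below assumes binary labels with 1 as the positive class; the function name and the toy labels are hypothetical and used only for illustration.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall) for binary labels.

    precision = TP / (TP + FP), recall = TP / (TP + FN).
    A ratio with an empty denominator is reported as 0.0 by convention here.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Toy labels, made up for illustration.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print(classification_metrics(y_true, y_pred))
```

In this toy case there are 3 true positives, 1 false positive, and 1 false negative, so all three metrics happen to equal 0.75; on imbalanced data they usually differ, which is why accuracy alone can mislead.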

What's included

5 videos, 6 readings, 4 assignments, 6 ungraded labs

5 videos (total 37 minutes)

Module 3 Introduction (1 minute)
Classification with Linear Regression (10 minutes)
Linear Regression and Indicator Matrices (7 minutes)
Linear Discriminant Analysis (LDA) (9 minutes)
Logistic Regression (8 minutes)

6 readings (total 175 minutes)

Module 3 Introduction Reading (15 minutes)
Linear Regression of an Indicator Matrix Readings (20 minutes)
Linear Discriminant Analysis (LDA) Readings (45 minutes)
Logistic Regression Readings (75 minutes)
Module 3 Summary (10 minutes)
Insights from an Industry Leader: Learn More About Our Program (10 minutes)

4 assignments (total 210 minutes)

Module 3 Summative Assessment (180 minutes)
Linear Regression of an Indicator Matrix Quiz (10 minutes)
Linear Discriminant Analysis (LDA) Quiz (10 minutes)
Logistic Regression Quiz (10 minutes)

6 ungraded labs (total 480 minutes)

Coding Example (120 minutes)
Coding Exercise (60 minutes)
Coding Example (120 minutes)
Coding Exercise (60 minutes)
Coding Example (60 minutes)
Coding Exercise (60 minutes)

Welcome to Module 4 of Math 569: Statistical Learning, focusing on advanced methods in statistical modeling. This module starts with an introduction to Basis Expansion Methods, exploring how these techniques enhance linear models by incorporating non-linear relationships. We then delve into Piecewise Polynomials, discussing their utility in capturing varying trends across different segments of data. In Lesson 2, we explore Smoothing Splines, emphasizing their role in effectively balancing model fit and complexity. Lastly, Lesson 3 covers Regularization and Kernel Functions, elaborating on how these concepts contribute to constructing more complex models without significantly increasing computational complexity.
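As a small illustration of a basis expansion, the sketch below maps a scalar input to the truncated-power basis of a piecewise-linear spline: 1, x, and one hinge term (x - knot)+ per knot. A model that is linear in these features is piecewise linear in x. The function name and knot values are hypothetical, and the module may use other bases such as cubic or natural splines.

```python
def basis_expand(x, knots):
    """Map scalar x to [1, x, (x - k1)+, (x - k2)+, ...].

    (u)+ denotes max(0, u). Fitting ordinary linear regression on these
    features yields a continuous piecewise-linear function with a change
    of slope at every knot.
    """
    return [1.0, x] + [max(0.0, x - k) for k in knots]


# Hypothetical knots, chosen only for illustration.
knots = [1.0, 2.0, 5.0]

for x in (0.5, 3.0, 6.0):
    print(x, basis_expand(x, knots))
```

Each hinge term stays at zero until x passes its knot, so a coefficient on that term changes the slope only to the right of the knot.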

What's included

5 videos, 5 readings, 4 assignments, 6 ungraded labs

5 videos (total 25 minutes)

Module 4 Introduction (1 minute)
What are basis expansion methods? (3 minutes)
Piecewise Polynomials, the Method and Theory (5 minutes)
Smoothing Splines (6 minutes)
Regularization and Kernel Functions (8 minutes)

5 readings (total 330 minutes)

Module 4 Introduction Reading (20 minutes)
Piecewise Polynomials Readings (60 minutes)
Smoothing Splines Readings (60 minutes)
Regularization via Reproducing Kernel Hilbert Spaces Readings (180 minutes)
Module 4 Summary (10 minutes)

4 assignments (total 90 minutes)

Module 4 Summative Assessment (60 minutes)
Piecewise Polynomials Quiz (10 minutes)
Smoothing Splines Quiz (10 minutes)
Regularization via Reproducing Kernel Hilbert Spaces Quiz (10 minutes)

6 ungraded labs (total 360 minutes)

Coding Example (60 minutes)
Coding Exercise (60 minutes)
Coding Example (60 minutes)
Coding Exercise (60 minutes)
Coding Example (60 minutes)
Coding Exercise (60 minutes)

Welcome to Module 5 of Math 569: Statistical Learning, dedicated to advanced techniques in non-linear data modeling. In Lesson 1, we delve into Kernel Smoothers, exploring how they make predictions based on local data and their comparison to k-Nearest Neighbors (kNN) models. Lesson 2 focuses on Local Regression, particularly Local Linear Regression (LLR) and Local Polynomial Regression (LPR). We'll examine how LLR overcomes some kernel smoothing limitations and how LPR provides flexibility in capturing local data structure. The module emphasizes the adaptiveness of these techniques for complex data relationships and addresses the challenges in selecting hyperparameters and computational demands, especially for large datasets.
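The contrast between a kernel smoother and kNN from Lesson 1 can be sketched in a few lines. The example below assumes a Nadaraya-Watson estimator with a Gaussian kernel; the kernel choice, function names, bandwidth, and toy data are all assumptions made for illustration rather than the course's own implementation.

```python
import math


def gaussian_kernel(u):
    """Unnormalized Gaussian kernel; the constant cancels in the ratio below."""
    return math.exp(-0.5 * u * u)


def kernel_smooth(x0, xs, ys, bandwidth):
    """Nadaraya-Watson estimate at x0: a kernel-weighted average of ys."""
    weights = [gaussian_kernel((x - x0) / bandwidth) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)


def knn_average(x0, xs, ys, k):
    """kNN estimate at x0: the unweighted mean of the k nearest responses."""
    nearest = sorted(zip(xs, ys), key=lambda pair: abs(pair[0] - x0))[:k]
    return sum(y for _, y in nearest) / k


# Toy data, made up for illustration: y = x**2 on a small grid.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.0, 1.0, 4.0, 9.0, 16.0]

print(kernel_smooth(2.0, xs, ys, bandwidth=1.0))
print(knn_average(2.0, xs, ys, k=3))
```

The kNN estimate gives equal weight to the k nearest points and zero weight to everything else, while the kernel smoother lets weights decay smoothly with distance. The bandwidth plays the role that k plays in kNN, and both are hyperparameters that have to be selected.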

What's included

3 videos, 4 readings, 3 assignments, 4 ungraded labs

3 videos (total 14 minutes)

Module 5 Introduction (1 minute)
Kernel Smoothers and kNN (6 minutes)
Local Regression (6 minutes)

4 readings (total 140 minutes)

Module 5 Introduction Reading (10 minutes)
Kernel Smoothers Readings (60 minutes)
Local Regression Readings (60 minutes)
Module 5 Summary (10 minutes)

3 assignments (total 80 minutes)

Module 5 Summative Assessment (60 minutes)
Kernel Smoothers Quiz (10 minutes)
Local Regression Quiz (10 minutes)

4 ungraded labs (total 240 minutes)

Coding Example (60 minutes)
Coding Exercise (60 minutes)
Coding Example (60 minutes)
Coding Exercise (60 minutes)

Module 6 of Math 569: Statistical Learning delves into model evaluation and model selection via hyperparameter choice. It begins with an understanding of Bias-Variance Decomposition, highlighting the trade-off between model simplicity and accuracy. The module then explores model complexity, offering strategies for balancing this complexity with predictive performance. Building on the importance of balancing model complexity with performance, we move on to cover model selection metrics, namely: AIC, BIC, and MDL. These are information-theoretic metrics that balance error with model complexity, such as the number of parameters. Finally, the module concludes with lessons on estimating test error without a testing set, using concepts like VC Dimension, Cross-Validation, and Bootstrapping. This module is pivotal for mastering model evaluation and selection in statistical learning.
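Estimating test error without a separate test set, as described above, can be sketched with K-fold cross-validation. The example below uses contiguous folds and the simplest possible model (predicting the training mean) so that the mechanics stay visible. The function names are hypothetical, and the sketch assumes 2 <= k <= n so that every training set is non-empty; in practice the data are usually shuffled first.

```python
def k_fold_indices(n, k):
    """Yield (train, validation) index lists for k contiguous folds.

    Fold sizes differ by at most one; every index appears in exactly
    one validation fold.
    """
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        validation = list(range(start, stop))
        train = [i for i in range(n) if i < start or i >= stop]
        yield train, validation
        start = stop


def cv_mse_of_mean(ys, k):
    """Cross-validated squared error of the 'predict the training mean' model."""
    total = 0.0
    for train, validation in k_fold_indices(len(ys), k):
        mean = sum(ys[i] for i in train) / len(train)
        total += sum((ys[i] - mean) ** 2 for i in validation)
    return total / len(ys)


for train, validation in k_fold_indices(10, 3):
    print("train:", train, "validation:", validation)
```

Replacing the mean predictor with any fit-and-predict pair gives the usual recipe: fit on the training indices, score on the validation indices, and average over folds.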

What's included

8 videos, 7 readings, 7 assignments, 9 ungraded labs

8 videos (total 54 minutes)

Module 6 Introduction (1 minute)
Bias, Variance and Model Complexity (10 minutes)
The Bias-Variance Decomposition (8 minutes)
AIC and BIC (4 minutes)
Minimum Description Length (MDL) (7 minutes)
Vapnik-Chervonenkis (VC) Dimension (5 minutes)
K-fold Cross Validation (7 minutes)
Bootstrapping (8 minutes)

7 readings (total 700 minutes)

Module 6 Introduction Readings (15 minutes)
Bias, Variance and Model Complexity Readings (75 minutes)
Bayesian Approach and BIC Readings (360 minutes)
Vapnik-Chervonenkis (VC) Dimension Readings (60 minutes)
Cross Validation Readings (120 minutes)
Bootstrapping Readings (60 minutes)
Module 6 Summary (10 minutes)

7 assignments (total 400 minutes)

Module 6 Summative Assessment (120 minutes)
Bias, Variance and Model Complexity (10 minutes)
Bias, Variance and Model Complexity Quiz 2 (60 minutes)
Bayesian Approach and BIC Quiz (10 minutes)
Vapnik-Chervonenkis (VC) Dimension Quiz (10 minutes)
Cross Validation Quiz (180 minutes)
Bootstrapping Quiz (10 minutes)

9 ungraded labs (total 540 minutes)

Coding Example (60 minutes)
Coding Example (60 minutes)
Coding Exercise (60 minutes)
Coding Example (60 minutes)
Coding Exercise (60 minutes)
Coding Example (60 minutes)
Coding Exercise (60 minutes)
Coding Example (60 minutes)
Coding Exercise (60 minutes)

Module 7 of Math 569: Statistical Learning introduces advanced inferential techniques. Lesson 1 focuses on Maximum Likelihood Inference, explaining how to find optimal model parameters by maximizing the likelihood function. The method selects the parameter values under which the observed dataset is most likely. Lesson 2 dives into Bayesian Inference, contrasting it with frequentist approaches. It covers Bayes' Theorem, which combines prior beliefs with new evidence to produce updated beliefs. The module discusses the process of Bayesian modeling, including the construction and updating of models using prior and posterior distributions. This module is crucial for understanding complex inference methods in statistical learning.
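A coin-flip model gives a compact way to compare the two approaches. The sketch below assumes Bernoulli data with a Beta prior, a standard conjugate pair; the function names, the Beta(2, 2) prior, and the data are hypothetical choices made for illustration.

```python
def bernoulli_mle(data):
    """Maximum likelihood estimate of the success probability: successes / n."""
    return sum(data) / len(data)


def beta_posterior(alpha, beta, data):
    """Update a Beta(alpha, beta) prior with Bernoulli observations.

    Conjugacy gives Beta(alpha + successes, beta + failures).
    """
    successes = sum(data)
    failures = len(data) - successes
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Toy data, made up for illustration: 6 successes in 8 trials.
data = [1, 0, 1, 1, 0, 1, 1, 1]

print("MLE:", bernoulli_mle(data))
post = beta_posterior(2, 2, data)
print("posterior:", post, "posterior mean:", beta_mean(*post))
```

The maximum likelihood estimate uses the data alone, while the posterior mean is pulled toward the prior mean of 0.5. With more data the two estimates move closer together.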

What's included

4 videos, 4 readings, 4 assignments, 2 ungraded labs

4 videos (total 22 minutes)

Module 7 Introduction (0 minutes)
Maximum Likelihood Inference - Part 1 (6 minutes)
Maximum Likelihood Inference - Part 2 (6 minutes)
Bayesian Inference (9 minutes)

4 readings (total 120 minutes)

Module 7 Introduction Reading (5 minutes)
Maximum Likelihood Inference Reading (45 minutes)
Bayesian Inference Readings (60 minutes)
Module 7 Summary (10 minutes)

4 assignments (total 260 minutes)

Module 7 Summative Assessment (180 minutes)
Maximum Likelihood Inference Quiz - Part 1 (10 minutes)
Maximum Likelihood Inference Quiz - Part 2 (60 minutes)
Bayesian Inference Quiz (10 minutes)

2 ungraded labs (total 120 minutes)

Coding Example (60 minutes)
Coding Exercise (60 minutes)

Module 8 of Math 569: Statistical Learning covers diverse advanced machine learning techniques. It begins with Decision Trees, focusing on their structure and application in both classification and regression tasks. Next, it explores Support Vector Machines (SVM), detailing their function in creating optimal decision boundaries. The module then examines k-Means Clustering, an unsupervised learning method for data grouping. Finally, it concludes with Neural Networks, discussing their architecture and role in complex pattern recognition. Each lesson offers a deep dive into these techniques, showcasing their unique advantages and applications in statistical learning.
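Of the techniques listed above, k-means is the easiest to sketch from scratch. The example below runs Lloyd's algorithm on one-dimensional data with fixed initial centers. The function name, the fixed iteration count, and the toy points are assumptions made for illustration; practical implementations add random restarts and a convergence check.

```python
def k_means_1d(points, centers, n_iter=10):
    """Lloyd's algorithm in one dimension.

    Alternates two steps: assign each point to its nearest center,
    then move each center to the mean of its assigned points.
    An empty cluster keeps its previous center.
    """
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(n_iter):
        # Assignment step.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


# Toy data, made up for illustration: two well-separated groups.
points = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]
centers, clusters = k_means_1d(points, centers=[0.0, 5.0])
print(centers, clusters)
```

No labels are used anywhere in the procedure, which is what makes it an unsupervised method. The result can depend on the initial centers when the groups are less clearly separated.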

What's included

6 videos, 5 readings, 5 assignments, 8 ungraded labs

6 videos (total 45 minutes)

Module 8 Introduction (1 minute)
Tree Models - Part 1 (6 minutes)
Tree Models - Part 2 (6 minutes)
Support Vector Machines (9 minutes)
K-means Clustering (6 minutes)
Neural Networks (14 minutes)

5 readings (total 610 minutes)

Additive Models and Trees Readings (120 minutes)
Support Vector Machines Readings (120 minutes)
k-Means Clustering Readings (60 minutes)
Neural Networks Readings (300 minutes)
Module 8 Summary (10 minutes)

5 assignments (total 100 minutes)

Module 8 Summative Assessment (60 minutes)
Additive Models and Trees Quiz (10 minutes)
Support Vector Machines Quiz (10 minutes)
k-Means Clustering Quiz (10 minutes)
Neural Networks Quiz (10 minutes)

8 ungraded labs (total 480 minutes)

Coding Example (60 minutes)
Coding Exercise (60 minutes)
Coding Example (60 minutes)
Coding Exercise (60 minutes)
Coding Example (60 minutes)
Coding Exercise (60 minutes)
Coding Example (60 minutes)
Coding Exercise (60 minutes)

This module contains the summative course assessment that has been designed to evaluate your understanding of the course material and assess your ability to apply the knowledge you have acquired throughout the course. Be sure to review the course material thoroughly before taking the assessment.

What's included

1 assignment

Instructor

Shahrzad Jamshidi

Illinois Tech

2 courses, 919 learners

Offered by

Illinois Tech

Build toward a degree

This course is part of the Master of Data Science degree program offered by Illinois Tech. If you are admitted and enroll, your completed coursework may count toward your degree, and your progress can transfer with you.

Frequently asked questions

When will I have access to the lectures and assignments?

Access to lectures and assignments depends on your type of enrollment. If you take a course in audit mode, you will be able to see most course materials for free. To access graded assignments and to earn a Certificate, you will need to purchase the Certificate experience, during or after your audit. If you don't see the audit option:

The course may not offer an audit option. You can try a Free Trial instead, or apply for Financial Aid.
The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. This also means that you will not be able to purchase a Certificate experience.

What will I get if I purchase the Certificate?

When you purchase a Certificate you get access to all course materials, including graded assignments. Upon completing the course, your electronic Certificate will be added to your Accomplishments page. From there, you can print your Certificate or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.

What is the refund policy?

You will be eligible for a full refund until two weeks after your payment date, or (for courses that have just launched) until two weeks after the first session of the course begins, whichever is later. You cannot receive a refund once you’ve earned a Course Certificate, even if you complete the course within the two-week refund period. See our full refund policy.
