Machine Learning: Classification

This course is part of the Machine Learning Specialization

Taught in English

Some content may not be translated

Instructors: Emily Fox, Carlos Guestrin

125,054 already enrolled

Course

Gain insight into a topic and learn the fundamentals

4.7

(3,712 reviews)

94%

21 hours (approximately)

Flexible schedule

Learn at your own pace

Details to know

Shareable certificate

Add to your LinkedIn profile

Assessments

19 quizzes

When you enroll in this course, you'll also be enrolled in this Specialization.

There are 10 modules in this course

Case Studies: Analyzing Sentiment & Loan Default Prediction

In our case study on analyzing sentiment, you will create models that predict a class (positive/negative sentiment) from input features (text of the reviews, user profile information, ...). In our second case study for this course, loan default prediction, you will tackle financial data and predict when a loan is likely to be risky or safe for the bank. These tasks are examples of classification, one of the most widely used areas of machine learning, with a broad array of applications, including ad targeting, spam detection, medical diagnosis and image classification.

In this course, you will create classifiers that provide state-of-the-art performance on a variety of tasks. You will become familiar with the most successful techniques, which are most widely used in practice, including logistic regression, decision trees and boosting. In addition, you will be able to design and implement the underlying algorithms that can learn these models at scale, using stochastic gradient ascent. You will implement these techniques on real-world, large-scale machine learning tasks. You will also address significant tasks you will face in real-world applications of ML, including handling missing data and measuring precision and recall to evaluate a classifier. This course is hands-on, action-packed, and full of visualizations and illustrations of how these techniques will behave on real data. We've also included optional content in every module, covering advanced topics for those who want to go even deeper!

Learning Objectives: By the end of this course, you will be able to:
- Describe the input and output of a classification model.
- Tackle both binary and multiclass classification problems.
- Implement a logistic regression model for large-scale classification.
- Create a non-linear model using decision trees.
- Improve the performance of any model using boosting.
- Scale your methods with stochastic gradient ascent.
- Describe the underlying decision boundaries.
- Build a classification model to predict sentiment in a product review dataset.
- Analyze financial data to predict loan defaults.
- Use techniques for handling missing data.
- Evaluate your models using precision-recall metrics.
- Implement these techniques in Python (or in the language of your choice, though Python is highly recommended).

Welcome!

Classification is one of the most widely used techniques in machine learning, with a broad array of applications, including sentiment analysis, ad targeting, spam detection, risk assessment, medical diagnosis and image classification. The core goal of classification is to predict a category or class y from some inputs x. Through this course, you will become familiar with the fundamental models and algorithms used in classification, as well as a number of core machine learning concepts. Rather than covering all aspects of classification, you will focus on a few core techniques, which are widely used in the real world to get state-of-the-art performance. By following our hands-on approach, you will implement your own algorithms on multiple real-world tasks, and deeply grasp the core techniques needed to be successful with these approaches in practice. This introduction to the course provides you with an overview of the topics we will cover and the background knowledge and resources we assume you have.

What's included

8 videos, 4 readings

8 videos (total 27 minutes)

Welcome to the classification course, a part of the Machine Learning Specialization (1 minute)
What is this course about? (6 minutes)
Impact of classification (1 minute)
Course overview (3 minutes)
Outline of first half of course (5 minutes)
Outline of second half of course (5 minutes)
Assumed background (3 minutes)
Let's get started! (0 minutes)

4 readings (total 35 minutes)

Important Update regarding the Machine Learning Specialization (10 minutes)
Slides presented in this module (10 minutes)
Get help and meet other learners. Join your Community! (5 minutes)
Reading: Software tools you'll need (10 minutes)

Linear Classifiers & Logistic Regression

Linear classifiers are amongst the most practical classification methods. For example, in our sentiment analysis case study, a linear classifier associates a coefficient with the count of each word in the sentence. In this module, you will become proficient in this type of representation. You will focus on a particularly useful type of linear classifier called logistic regression, which, in addition to allowing you to predict a class, provides a probability associated with the prediction. These probabilities are extremely useful, since they provide a degree of confidence in the predictions. In this module, you will also be able to construct features from categorical inputs, and to tackle classification problems with more than two classes (multiclass problems). You will examine the results of these techniques on a real-world product sentiment analysis task.
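
To make this representation concrete, here is a minimal Python sketch of how a logistic regression classifier could turn word counts into a probability. The coefficient values, the example words and the helper name `predict_probability` are made up for illustration; they are not taken from the course materials.

```python
import math

def sigmoid(score):
    """Logistic link: maps a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-score))

def predict_probability(coefficients, word_counts):
    """Estimate P(y = +1 | x) for a linear classifier over word counts.

    coefficients: dict mapping word -> weight (hypothetical values);
    the key "(intercept)" holds the constant term.
    """
    score = coefficients.get("(intercept)", 0.0)
    for word, count in word_counts.items():
        # Words without a coefficient contribute nothing to the score.
        score += coefficients.get(word, 0.0) * count
    return sigmoid(score)

# Hypothetical coefficients, chosen only for illustration.
coefficients = {"(intercept)": 0.0, "great": 1.5, "awful": -2.0}
review = {"great": 2, "awful": 1, "sushi": 1}  # word counts for one sentence

p = predict_probability(coefficients, review)
label = +1 if p > 0.5 else -1
```

With these assumed coefficients the score is 1.5*2 - 2.0*1 = 1.0, so the predicted probability of positive sentiment is sigmoid(1.0), roughly 0.73, and the predicted class is +1.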

What's included

18 videos, 2 readings, 2 quizzes

18 videos (total 77 minutes)

Linear classifiers: A motivating example (2 minutes)
Intuition behind linear classifiers (3 minutes)
Decision boundaries (3 minutes)
Linear classifier model (5 minutes)
Effect of coefficient values on decision boundary (2 minutes)
Using features of the inputs (2 minutes)
Predicting class probabilities (1 minute)
Review of basics of probabilities (6 minutes)
Review of basics of conditional probabilities (8 minutes)
Using probabilities in classification (2 minutes)
Predicting class probabilities with (generalized) linear models (5 minutes)
The sigmoid (or logistic) link function (4 minutes)
Logistic regression model (5 minutes)
Effect of coefficient values on predicted probabilities (7 minutes)
Overview of learning logistic regression models (2 minutes)
Encoding categorical inputs (4 minutes)
Multiclass classification with 1 versus all (7 minutes)
Recap of logistic regression classifier (1 minute)

2 readings (total 20 minutes)

Slides presented in this module (10 minutes)
Predicting sentiment from product reviews (10 minutes)

2 quizzes (total 60 minutes)

Linear Classifiers & Logistic Regression (30 minutes)
Predicting sentiment from product reviews (30 minutes)

Learning Linear Classifiers

Once you are familiar with linear classifiers and logistic regression, you can dive in and write your first learning algorithm for classification. In particular, you will use gradient ascent to learn the coefficients of your classifier from data. You will first need to define the quality metric for these tasks using an approach called maximum likelihood estimation (MLE). You will also become familiar with a simple technique for selecting the step size for gradient ascent. An optional, advanced part of this module will cover the derivation of the gradient for logistic regression. You will implement your own learning algorithm for logistic regression from scratch, and use it to learn a sentiment analysis classifier.
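
As a rough illustration of learning by maximum likelihood, the sketch below runs batch gradient ascent on the log-likelihood of a logistic regression model. It assumes labels in {+1, -1} and a first feature fixed at 1 for the intercept; the toy data, the function names and the default step size are illustrative assumptions, not the course's assignment code.

```python
import math

def sigmoid(score):
    return 1.0 / (1.0 + math.exp(-score))

def log_likelihood(features, labels, w):
    """Quality metric: sum over data points of log P(y_i | x_i, w)."""
    total = 0.0
    for x, y in zip(features, labels):
        score = sum(wj * xj for wj, xj in zip(w, x))
        # For y in {+1, -1}: log P(y | x, w) = -log(1 + exp(-y * score)).
        total -= math.log(1.0 + math.exp(-y * score))
    return total

def gradient_ascent(features, labels, step_size=0.1, iterations=200):
    """Repeatedly move the coefficients in the direction of the gradient."""
    w = [0.0] * len(features[0])
    for _ in range(iterations):
        gradient = [0.0] * len(w)
        for x, y in zip(features, labels):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, x)))
            # Contribution of one point: x_j * (1[y = +1] - P(y = +1 | x, w)).
            error = (1.0 if y == +1 else 0.0) - p
            for j, xj in enumerate(x):
                gradient[j] += xj * error
        w = [wj + step_size * gj for wj, gj in zip(w, gradient)]
    return w

# Toy, non-separable data (made up): first column is the constant feature.
features = [[1.0, 2.0], [1.0, 1.0], [1.0, -1.0],
            [1.0, -2.0], [1.0, 0.5], [1.0, -0.5]]
labels = [+1, +1, -1, -1, -1, +1]

w = gradient_ascent(features, labels)
```

If the step size is too large the log-likelihood can oscillate or diverge instead of increasing, which is why a step-size rule of thumb matters in practice.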

What's included

18 videos, 2 readings, 2 quizzes

18 videos (total 82 minutes)

Goal: Learning parameters of logistic regression (2 minutes)
Intuition behind maximum likelihood estimation (4 minutes)
Data likelihood (8 minutes)
Finding best linear classifier with gradient ascent (3 minutes)
Review of gradient ascent (6 minutes)
Learning algorithm for logistic regression (3 minutes)
Example of computing derivative for logistic regression (5 minutes)
Interpreting derivative for logistic regression (5 minutes)
Summary of gradient ascent for logistic regression (2 minutes)
Choosing step size (5 minutes)
Careful with step sizes that are too large (4 minutes)
Rule of thumb for choosing step size (3 minutes)
(VERY OPTIONAL) Deriving gradient of logistic regression: Log trick (4 minutes)
(VERY OPTIONAL) Expressing the log-likelihood (3 minutes)
(VERY OPTIONAL) Deriving probability y=-1 given x (2 minutes)
(VERY OPTIONAL) Rewriting the log likelihood into a simpler form (8 minutes)
(VERY OPTIONAL) Deriving gradient of log likelihood (8 minutes)
Recap of learning logistic regression classifiers (1 minute)

2 readings (total 20 minutes)

Slides presented in this module (10 minutes)
Implementing logistic regression from scratch (10 minutes)

2 quizzes (total 60 minutes)

Learning Linear Classifiers (30 minutes)
Implementing logistic regression from scratch (30 minutes)

Overfitting & Regularization in Logistic Regression

As we saw in the regression course, overfitting is perhaps the most significant challenge you will face as you apply machine learning approaches in practice. This challenge can be particularly significant for logistic regression, as you will discover in this module: not only do you risk getting an overly complex decision boundary, but your classifier can also become overly confident about the probabilities it predicts. In this module, you will investigate overfitting in classification in significant detail, and obtain broad practical insights from some interesting visualizations of the classifiers' outputs. You will then add a regularization term to your optimization to mitigate overfitting. You will investigate both L2 regularization to penalize large coefficient values, and L1 regularization to obtain additional sparsity in the coefficients. Finally, you will modify your gradient ascent algorithm to learn regularized logistic regression classifiers. You will implement your own regularized logistic regression classifier from scratch, and investigate the impact of the L2 penalty on real-world sentiment analysis data.
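
The effect of an L2 penalty can be sketched with a small variation on gradient ascent: the objective becomes the log-likelihood minus `l2_penalty` times the sum of squared coefficients, so each gradient component gains a `-2 * l2_penalty * w[j]` term. The example below is a minimal sketch under assumptions: the toy data is made up and linearly separable (the setting where unregularized coefficients keep growing), and leaving the intercept unpenalized is a common convention rather than something stated on this page.

```python
import math

def sigmoid(score):
    return 1.0 / (1.0 + math.exp(-score))

def fit_l2_logistic(features, labels, l2_penalty, step_size=0.1, iterations=500):
    """Gradient ascent on: log-likelihood - l2_penalty * sum(w[j]**2 for j >= 1)."""
    w = [0.0] * len(features[0])
    for _ in range(iterations):
        gradient = [0.0] * len(w)
        for x, y in zip(features, labels):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, x)))
            error = (1.0 if y == +1 else 0.0) - p
            for j, xj in enumerate(x):
                gradient[j] += xj * error
        # Penalty term; w[0] is the intercept and is left unpenalized (assumption).
        for j in range(1, len(w)):
            gradient[j] -= 2.0 * l2_penalty * w[j]
        w = [wj + step_size * gj for wj, gj in zip(w, gradient)]
    return w

# Toy, linearly separable data (made up): first column is the constant feature.
features = [[1.0, 2.0], [1.0, 1.0], [1.0, -1.0], [1.0, -2.0]]
labels = [+1, +1, -1, -1]

w_plain = fit_l2_logistic(features, labels, l2_penalty=0.0)
w_l2 = fit_l2_logistic(features, labels, l2_penalty=1.0)

# Predicted probability of +1 for the input x = 2.0 under each model.
p_plain = sigmoid(w_plain[0] + w_plain[1] * 2.0)
p_l2 = sigmoid(w_l2[0] + w_l2[1] * 2.0)
```

On this data the penalized model ends up with a smaller coefficient and a less extreme predicted probability than the unpenalized one, which is the overconfidence effect the module describes.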

What's included

13 videos, 2 readings, 2 quizzes

13 videos (total 65 minutes)

Evaluating a classifier (3 minutes)
Review of overfitting in regression (3 minutes)
Overfitting in classification (5 minutes)
Visualizing overfitting with high-degree polynomial features (3 minutes)
Overfitting in classifiers leads to overconfident predictions (5 minutes)
Visualizing overconfident predictions (4 minutes)
(OPTIONAL) Another perspective on overfitting in logistic regression (8 minutes)
Penalizing large coefficients to mitigate overfitting (5 minutes)
L2 regularized logistic regression (4 minutes)
Visualizing effect of L2 regularization in logistic regression (5 minutes)
Learning L2 regularized logistic regression with gradient ascent (7 minutes)
Sparse logistic regression with L1 regularization (7 minutes)
Recap of overfitting & regularization in logistic regression (0 minutes)

2 readings (total 20 minutes)

Slides presented in this module (10 minutes)
Logistic Regression with L2 regularization (10 minutes)

2 quizzes (total 60 minutes)

Overfitting & Regularization in Logistic Regression (30 minutes)
Logistic Regression with L2 regularization (30 minutes)

Decision Trees

Along with linear classifiers, decision trees are amongst the most widely used classification techniques in the real world. This method is extremely intuitive, simple to implement and provides interpretable predictions. In this module, you will become familiar with the core decision tree representation. You will then design a simple, recursive greedy algorithm to learn decision trees from data. Finally, you will extend this approach to deal with continuous inputs, a fundamental requirement for practical problems. In this module, you will investigate a brand new case study in the financial sector: predicting the risk associated with a bank loan. You will implement your own decision tree learning algorithm on real loan data.
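
The greedy step at the heart of such an algorithm can be sketched as follows: for each candidate feature, split the data, predict the majority class on each side, and keep the feature with the lowest classification error. This is a minimal sketch assuming binary (0/1) features; the feature names `good_credit` and `long_term` and the six loan rows are hypothetical.

```python
from collections import Counter

def majority_errors(labels):
    """Number of mistakes made by predicting the majority class."""
    if not labels:
        return 0
    return len(labels) - Counter(labels).most_common(1)[0][1]

def best_splitting_feature(rows, labels, features):
    """Greedy step: pick the 0/1 feature whose split has the lowest error."""
    best_feature, best_error = None, float("inf")
    for feature in features:
        left = [y for row, y in zip(rows, labels) if row[feature] == 0]
        right = [y for row, y in zip(rows, labels) if row[feature] == 1]
        error = (majority_errors(left) + majority_errors(right)) / len(labels)
        if error < best_error:
            best_feature, best_error = feature, error
    return best_feature, best_error

# Hypothetical loan data: +1 = safe loan, -1 = risky loan.
rows = [
    {"good_credit": 1, "long_term": 0},
    {"good_credit": 1, "long_term": 1},
    {"good_credit": 1, "long_term": 0},
    {"good_credit": 0, "long_term": 1},
    {"good_credit": 0, "long_term": 0},
    {"good_credit": 0, "long_term": 1},
]
labels = [+1, +1, +1, -1, -1, +1]

feature, error = best_splitting_feature(rows, labels, ["good_credit", "long_term"])
```

Here splitting on `good_credit` misclassifies one of six loans, while splitting on `long_term` misclassifies two, so the greedy step picks `good_credit`. A full learner would apply the same step recursively to each side of the split.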

What's included

13 videos, 3 readings, 3 quizzes

13 videos (total 46 minutes)

Predicting loan defaults with decision trees (3 minutes)
Intuition behind decision trees (1 minute)
Task of learning decision trees from data (3 minutes)
Recursive greedy algorithm (4 minutes)
Learning a decision stump (3 minutes)
Selecting best feature to split on (6 minutes)
When to stop recursing (4 minutes)
Making predictions with decision trees (1 minute)
Multiclass classification with decision trees (2 minutes)
Threshold splits for continuous inputs (6 minutes)
(OPTIONAL) Picking the best threshold to split on (3 minutes)
Visualizing decision boundaries (5 minutes)
Recap of decision trees (0 minutes)

3 readings (total 30 minutes)

Slides presented in this module (10 minutes)
Identifying safe loans with decision trees (10 minutes)
Implementing binary decision trees (10 minutes)

3 quizzes (total 74 minutes)

Decision Trees (30 minutes)
Identifying safe loans with decision trees (14 minutes)
Implementing binary decision trees (30 minutes)

Preventing Overfitting in Decision Trees

Out of all machine learning techniques, decision trees are amongst the most prone to overfitting. No practical implementation is possible without including approaches that mitigate this challenge. In this module, through various visualizations and investigations, you will explore why decision trees suffer from significant overfitting problems. Using the principle of Occam's razor, you will mitigate overfitting by learning simpler trees. At first, you will design algorithms that stop the learning process before the decision trees become overly complex. In an optional segment, you will design a very practical approach that learns an overly complex tree, and then simplifies it with pruning. Your implementation will investigate the effect of these techniques on mitigating overfitting on our real-world loan data set.
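
Early stopping usually comes down to a handful of checks made before a node is split. The sketch below shows one plausible set of conditions (a depth limit, a minimum node size, and a minimum reduction in classification error). The parameter names and default values are assumptions chosen for illustration, not settings prescribed by the course.

```python
def should_stop(labels, depth, error_before, error_after,
                max_depth=6, min_node_size=10, min_error_reduction=0.0):
    """Return a reason to stop splitting this node, or None to keep going.

    labels: class labels of the data points reaching the node.
    error_before / error_after: classification error without and with
    the best candidate split.
    """
    if len(set(labels)) <= 1:
        return "pure node"
    if depth >= max_depth:
        return "max depth reached"
    if len(labels) <= min_node_size:
        return "node too small"
    if error_before - error_after <= min_error_reduction:
        return "split does not reduce error enough"
    return None

# Example: a mixed node at depth 2 with 40 points and a useful split.
reason = should_stop([+1, -1] * 20, depth=2, error_before=0.5, error_after=0.2)
```

In this example `reason` is `None`, so the learner would go ahead and split. Tightening any of the three limits yields a simpler tree.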

What's included

8 videos, 2 readings, 2 quizzes

8 videos (total 40 minutes)

A review of overfitting (2 minutes)
Overfitting in decision trees (5 minutes)
Principle of Occam's razor: Learning simpler decision trees (5 minutes)
Early stopping in learning decision trees (6 minutes)
(OPTIONAL) Motivating pruning (8 minutes)
(OPTIONAL) Pruning decision trees to avoid overfitting (6 minutes)
(OPTIONAL) Tree pruning algorithm (3 minutes)
Recap of overfitting and regularization in decision trees (1 minute)

2 readings (total 20 minutes)

Slides presented in this module (10 minutes)
Decision Trees in Practice (10 minutes)

2 quizzes (total 60 minutes)

Preventing Overfitting in Decision Trees (30 minutes)
Decision Trees in Practice (30 minutes)

Handling Missing Data

Real-world machine learning problems are fraught with missing data. That is, very often, some of the inputs are not observed for all data points. This challenge is very significant, happens in most cases, and needs to be addressed carefully to obtain great performance. Yet this issue is rarely discussed in machine learning courses. In this module, you will tackle the missing data challenge head on. You will start with the two most basic techniques to convert a dataset with missing data into a clean dataset, namely skipping missing values and imputing missing values. In an advanced section, you will also design a modification of the decision tree learning algorithm that builds decisions about missing data right into the model. You will also explore these techniques in your real-data implementation.
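
The two basic strategies can be sketched in a few lines of Python. This is a minimal sketch under assumptions: missing values are represented as `None`, numeric columns are filled with the median and other columns with the most common value (one reasonable choice among several), and the column names and loan rows are made up.

```python
from collections import Counter
from statistics import median

def skip_missing(rows):
    """Strategy 1: drop every row that has at least one missing (None) value."""
    return [row for row in rows if all(v is not None for v in row.values())]

def impute_missing(rows):
    """Strategy 2: fill gaps with the column median (numeric) or mode (other).

    Assumes every column has at least one observed value.
    """
    fill = {}
    for column in rows[0]:
        observed = [row[column] for row in rows if row[column] is not None]
        if all(isinstance(v, (int, float)) for v in observed):
            fill[column] = median(observed)
        else:
            fill[column] = Counter(observed).most_common(1)[0][0]
    return [{c: (fill[c] if v is None else v) for c, v in row.items()}
            for row in rows]

# Hypothetical loan records with gaps.
loans = [
    {"income": 50, "term": "3 years"},
    {"income": None, "term": "5 years"},
    {"income": 70, "term": None},
    {"income": 90, "term": "3 years"},
]

kept = skip_missing(loans)       # only the two complete rows remain
filled = impute_missing(loans)   # all four rows remain, gaps filled in
```

Skipping is simple but discards half of this tiny dataset, while imputing keeps every row at the cost of introducing guessed values (here an income of 70 and a term of "3 years").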

What's included

6 videos, 1 reading, 1 quiz

6 videos (total 24 minutes)

Challenge of missing data (3 minutes)
Strategy 1: Purification by skipping missing data (4 minutes)
Strategy 2: Purification by imputing missing data (4 minutes)
Modifying decision trees to handle missing data (4 minutes)
Feature split selection with missing data (5 minutes)
Recap of handling missing data (1 minute)

1 reading (total 10 minutes)

Slides presented in this module (10 minutes)

1 quiz (total 30 minutes)

Handling Missing Data (30 minutes)

Boosting

One of the most exciting theoretical questions asked about machine learning is whether simple classifiers can be combined into a highly accurate ensemble. This question led to the development of boosting, one of the most important and practical techniques in machine learning today. This simple approach can boost the accuracy of any classifier, and is widely used in practice, e.g., it's used by more than half of the teams who win the Kaggle machine learning competitions. In this module, you will first define the ensemble classifier, where multiple models vote on the best prediction. You will then explore a boosting algorithm called AdaBoost, which provides a great approach for boosting classifiers. Through visualizations, you will become familiar with many of the practical aspects of this technique. You will create your very own implementation of AdaBoost, from scratch, and use it to boost the performance of your loan risk predictor on real data.
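
The core loop of AdaBoost (fit a weak learner on weighted data, compute its coefficient from the weighted error, reweight the data to focus on mistakes, normalize) can be sketched as below. This is a minimal sketch assuming one-dimensional inputs, labels in {+1, -1}, and threshold decision stumps as the weak learner; the toy data and function names are made up.

```python
import math

def stump_predict(stump, x):
    """A stump is (threshold, sign): predict sign if x > threshold, else -sign."""
    threshold, sign = stump
    return sign if x > threshold else -sign

def best_stump(xs, ys, weights):
    """Weak learner: the threshold stump with the lowest weighted error."""
    best, best_error = None, float("inf")
    for threshold in [min(xs) - 1.0] + xs:
        for sign in (+1, -1):
            error = sum(w for x, y, w in zip(xs, ys, weights)
                        if stump_predict((threshold, sign), x) != y)
            if error < best_error:
                best, best_error = (threshold, sign), error
    return best, best_error

def adaboost(xs, ys, rounds):
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump, error = best_stump(xs, ys, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # guard against log(0)
        coefficient = 0.5 * math.log((1 - error) / error)
        ensemble.append((coefficient, stump))
        # Mistakes get heavier, correct points lighter; then normalize.
        weights = [w * math.exp(-coefficient * y * stump_predict(stump, x))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def ensemble_predict(ensemble, x):
    """Weighted vote of all stumps in the ensemble."""
    score = sum(c * stump_predict(s, x) for c, s in ensemble)
    return +1 if score > 0 else -1

# Toy data (made up) that no single threshold stump can classify perfectly.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [+1, +1, -1, -1, +1, +1]

ensemble = adaboost(xs, ys, rounds=3)
mistakes = sum(ensemble_predict(ensemble, x) != y for x, y in zip(xs, ys))
```

On this data a single stump gets two of six points wrong, while the three-stump ensemble classifies all six training points correctly.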

What's included

13 videos, 3 readings, 3 quizzes

13 videos (total 58 minutes)

The boosting question (3 minutes)
Ensemble classifiers (5 minutes)
Boosting (5 minutes)
AdaBoost overview (3 minutes)
Weighted error (4 minutes)
Computing coefficient of each ensemble component (4 minutes)
Reweighing data to focus on mistakes (4 minutes)
Normalizing weights (2 minutes)
Example of AdaBoost in action (5 minutes)
Learning boosted decision stumps with AdaBoost (4 minutes)
The Boosting Theorem (3 minutes)
Overfitting in boosting (5 minutes)
Ensemble methods, impact of boosting & quick recap (4 minutes)

3 readings (total 30 minutes)

Slides presented in this module (10 minutes)
Exploring Ensemble Methods (10 minutes)
Boosting a decision stump (10 minutes)

3 quizzes (total 90 minutes)

Exploring Ensemble Methods (30 minutes)
Boosting (30 minutes)
Boosting a decision stump (30 minutes)

Precision-Recall

In many real-world settings, accuracy and error are not the best quality metrics for classification. You will explore a case study that highlights this issue: using sentiment analysis to display positive reviews on a restaurant website. Instead of accuracy, you will define two metrics, precision and recall, which are widely used in real-world applications to measure the quality of classifiers. You will explore how the probabilities output by your classifier can be used to trade off precision against recall, and dive into this spectrum using precision-recall curves. In your hands-on implementation, you will compute these metrics with your learned classifier on real-world sentiment analysis data.
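
The trade-off can be seen directly by computing both metrics at different probability thresholds. The sketch below assumes labels in {+1, -1}; the probabilities and labels are made-up values, and returning 1.0 when a denominator is zero is just one convention.

```python
def precision_recall(probabilities, labels, threshold=0.5):
    """Precision and recall when predicting +1 for probability >= threshold."""
    predictions = [+1 if p >= threshold else -1 for p in probabilities]
    pairs = list(zip(predictions, labels))
    true_pos = sum(1 for yhat, y in pairs if yhat == +1 and y == +1)
    false_pos = sum(1 for yhat, y in pairs if yhat == +1 and y == -1)
    false_neg = sum(1 for yhat, y in pairs if yhat == -1 and y == +1)
    # Convention (assumption): report 1.0 when the denominator is zero.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 1.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 1.0
    return precision, recall

# Made-up classifier outputs P(y = +1 | x) and true sentiment labels.
probabilities = [0.95, 0.85, 0.70, 0.60, 0.40, 0.20]
labels = [+1, +1, -1, +1, +1, -1]

default = precision_recall(probabilities, labels, threshold=0.5)   # (0.75, 0.75)
cautious = precision_recall(probabilities, labels, threshold=0.8)  # (1.0, 0.5)
```

Raising the threshold from 0.5 to 0.8 makes every displayed review truly positive (precision 1.0) but finds only half of the positive reviews (recall 0.5). Sweeping the threshold over its whole range traces out a precision-recall curve.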

What's included

8 videos, 2 readings, 2 quizzes

8 videos (total 31 minutes)

Case-study where accuracy is not best metric for classification (3 minutes)
What is good performance for a classifier? (3 minutes)
Precision: Fraction of positive predictions that are actually positive (5 minutes)
Recall: Fraction of positive data predicted to be positive (3 minutes)
Precision-recall extremes (2 minutes)
Trading off precision and recall (4 minutes)
Precision-recall curve (5 minutes)
Recap of precision-recall (1 minute)

2 readings (total 20 minutes)

Slides presented in this module (10 minutes)
Exploring precision and recall (10 minutes)

2 quizzes (total 60 minutes)

Precision-Recall (30 minutes)
Exploring precision and recall (30 minutes)

Scaling to Huge Datasets & Online Learning

With the advent of the internet, the growth of social media, and the embedding of sensors in the world, the amount of data that our machine learning algorithms must handle has grown tremendously over the last decade. This effect is sometimes called "Big Data". Thus, our learning algorithms must scale to bigger and bigger datasets. In this module, you will develop a small modification of gradient ascent called stochastic gradient, which provides significant speedups in the running time of our algorithms. This simple change can drastically improve scaling, but makes the algorithm less stable and harder to use in practice. In this module, you will investigate the practical techniques needed to make stochastic gradient viable, and thus to obtain learning algorithms that scale to huge datasets. You will also address a new kind of machine learning problem, online learning, where the data streams in over time, and we must learn the coefficients as the data arrives. This task can also be solved with stochastic gradient. You will implement your very own stochastic gradient ascent algorithm for logistic regression from scratch, and evaluate it on sentiment analysis data.
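
The "small modification" amounts to updating the coefficients after looking at a single data point instead of summing the gradient over the whole dataset. Below is a minimal sketch for logistic regression, assuming labels in {+1, -1} and a constant first feature. Reshuffling on every pass and returning the average of the coefficients over the final pass (rather than the last iterate) are illustrative choices, and the toy data is made up.

```python
import math
import random

def sigmoid(score):
    return 1.0 / (1.0 + math.exp(-score))

def stochastic_gradient_ascent(features, labels, step_size=0.1, passes=50, seed=0):
    """Logistic regression learned one data point at a time."""
    rng = random.Random(seed)
    w = [0.0] * len(features[0])
    order = list(range(len(features)))
    w_sum = [0.0] * len(w)
    for _ in range(passes):
        rng.shuffle(order)  # visit the data in a fresh random order
        w_sum = [0.0] * len(w)
        for i in order:
            x, y = features[i], labels[i]
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, x)))
            error = (1.0 if y == +1 else 0.0) - p
            # Update from this single point's contribution to the gradient.
            w = [wj + step_size * xj * error for wj, xj in zip(w, x)]
            w_sum = [sj + wj for sj, wj in zip(w_sum, w)]
    # Average over the final pass instead of trusting the last coefficients.
    return [sj / len(order) for sj in w_sum]

# Toy, non-separable data (made up): first column is the constant feature.
features = [[1.0, 2.0], [1.0, 1.0], [1.0, -1.0],
            [1.0, -2.0], [1.0, 0.5], [1.0, -0.5]]
labels = [+1, +1, -1, -1, -1, +1]

w = stochastic_gradient_ascent(features, labels)
```

Because each update only touches one data point, the same loop body can also be applied to data that arrives as a stream, which is the online learning setting.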

What's included

16 videos, 2 readings, 2 quizzes

16 videos (total 51 minutes)

Gradient ascent won't scale to today's huge datasets (3 minutes)
Timeline of scalable machine learning & stochastic gradient (4 minutes)
Why gradient ascent won't scale (3 minutes)
Stochastic gradient: Learning one data point at a time (3 minutes)
Comparing gradient to stochastic gradient (3 minutes)
Why would stochastic gradient ever work? (4 minutes)
Convergence paths (2 minutes)
Shuffle data before running stochastic gradient (2 minutes)
Choosing step size (3 minutes)
Don't trust last coefficients (1 minute)
(OPTIONAL) Learning from batches of data (3 minutes)
(OPTIONAL) Measuring convergence (4 minutes)
(OPTIONAL) Adding regularization (3 minutes)
The online learning task (3 minutes)
Using stochastic gradient for online learning (3 minutes)
Scaling to huge datasets through parallelization & module recap (1 minute)

2 readings (total 20 minutes)

Slides presented in this module (10 minutes)
Training Logistic Regression via Stochastic Gradient Ascent (10 minutes)

2 quizzes (total 60 minutes)

Scaling to Huge Datasets & Online Learning (30 minutes)
Training Logistic Regression via Stochastic Gradient Ascent (30 minutes)

Instructors

Instructor ratings

4.7 (145 ratings)

Emily Fox

University of Washington

6 courses, 472,205 learners

Carlos Guestrin

University of Washington

8 courses, 472,946 learners

Offered by

University of Washington

Learner reviews

4.7 (3,712 reviews)

5 stars: 76.80%
4 stars: 18.56%
3 stars: 3.04%
2 stars: 0.61%
1 star: 0.96%

Frequently asked questions

When will I have access to the lectures and assignments?

Access to lectures and assignments depends on your type of enrollment. If you take a course in audit mode, you will be able to see most course materials for free. To access graded assignments and to earn a Certificate, you will need to purchase the Certificate experience, during or after your audit. If you don't see the audit option:

- The course may not offer an audit option. You can try a Free Trial instead, or apply for Financial Aid.
- The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. This also means that you will not be able to purchase a Certificate experience.

What will I get if I subscribe to this Specialization?

When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.

What is the refund policy?

If you subscribed, you get a 7-day free trial during which you can cancel at no penalty. After that, we don't give refunds, but you can cancel your subscription at any time. See our full refund policy.

Modules in this course:

Welcome!
Linear Classifiers & Logistic Regression
Learning Linear Classifiers
Overfitting & Regularization in Logistic Regression
Decision Trees
Preventing Overfitting in Decision Trees
Handling Missing Data
Boosting
Precision-Recall
Scaling to Huge Datasets & Online Learning

