Predicting Wine Quality with Random Forest and Scikit-Learn

Offered By
Coursera Community Project Network
In this Guided Project, you will:

Perform Exploratory Data Analysis.

Apply a Random Forest Classifier.

Analyze Random Forest Importances.

2.5 hours
No download needed
Split-screen video
English
Desktop only

Classification problems come up constantly in practice: deciding whether an email is spam, whether a credit card transaction is fraudulent, or which label a phone should assign to the image in its camera, such as a flower, a dog, or a person. Machine learning techniques help us handle these problems. In this guided project, we will predict red wine quality with a Random Forest Classifier, implemented in Python using the classifier provided by the Scikit-Learn package. You will learn to train the classifier, calibrate it, tune its hyperparameters, and evaluate the accuracy of its predictions. You will also learn how to perform cluster analysis to handle collinearity and reduce the number of predictors without sacrificing model accuracy. In addition, you will draw various graphs to help you interpret the results.

This project is intended for beginners. The prerequisites are basic knowledge of Python, Pandas, NumPy, Matplotlib, Seaborn, Scikit-Learn, SciPy, and Random Forest algorithms.

Note: This course runs on Rhyme, Coursera's hands-on project platform, inside a virtual browser. From that browser you will connect to Google Colaboratory to write and execute Python code in a Jupyter Notebook, without having to install any software. All you need is a Google account.

This Guided Project was created by a Coursera community member.
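
The workflow described above (split the data, train a Random Forest Classifier, evaluate its accuracy, and inspect its importances) can be sketched roughly as follows. This is only an illustration, not the project's own notebook: the data is a synthetic stand-in generated with `make_classification` rather than the red wine dataset, and the split sizes, `n_estimators=200`, and all variable names are assumptions chosen for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Assumption: synthetic stand-in for the wine table
# (11 numeric predictors and a binary "quality" label).
X, y = make_classification(
    n_samples=600, n_features=11, n_informative=5, n_redundant=3,
    random_state=0,
)

# Hold out 120 rows for testing, then 120 more for validation.
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=120, stratify=y, random_state=0
)
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=120, stratify=y_trainval, random_state=0
)

# Fit the classifier on the training rows only.
clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)

# Accuracy on the validation rows, which the model has not seen.
val_accuracy = accuracy_score(y_val, clf.predict(X_val))

# Impurity-based importances, one value per predictor.
importances = clf.feature_importances_
ranking = np.argsort(importances)[::-1]  # most important first

print(f"validation accuracy: {val_accuracy:.3f}")
print("predictor indices by importance:", ranking.tolist())
```

The test rows are left untouched here so that they can serve as a final check once modeling choices have been made on the validation set.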

Skills you will develop

Machine Learning

Exploratory Data Analysis

Clustering Analysis

Learn step-by-step

In a video that plays in a split-screen with your work area, your instructor will walk you through these steps:

  1. Getting Started

  2. Defining Problem, Importing Libraries and Downloading Data

  3. Cleaning Data

  4. Performing Exploratory Data Analysis (part 1)

  5. Performing Exploratory Data Analysis (part 2)

  6. Generating Training, Validation and Testing Datasets

  7. Creating a Data Visualizer

  8. Applying a Random Forest Classifier

  9. Analyzing Random Forest Importances

  10. Clustering Analysis

  11. Performing Hyperparameter Tuning
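
The last two steps above, clustering analysis and hyperparameter tuning, might look something like the sketch below. It makes several assumptions that the page itself does not state: predictors are grouped by hierarchical (Ward) clustering on Spearman rank correlations, one predictor is kept per cluster, the cut-off `t=1.0` is arbitrary, and tuning uses a small `GridSearchCV` grid with 3-fold cross-validation. The data is again synthetic, and every name is hypothetical.

```python
import numpy as np
from scipy.cluster import hierarchy
from scipy.spatial.distance import squareform
from scipy.stats import spearmanr
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Assumption: synthetic data with deliberately redundant (collinear) predictors.
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=4, n_redundant=4,
    random_state=0,
)

# Spearman rank correlation between every pair of predictors.
corr = spearmanr(X).correlation
corr = (corr + corr.T) / 2       # enforce exact symmetry
np.fill_diagonal(corr, 1.0)

# Turn correlation into a distance: strongly correlated predictors are close.
distance = 1 - np.abs(corr)
linkage = hierarchy.ward(squareform(distance, checks=False))

# Cut the tree at an arbitrary height (assumption) to get predictor clusters.
cluster_ids = hierarchy.fcluster(linkage, t=1.0, criterion="distance")

# Keep the first predictor encountered in each cluster.
first_in_cluster = {}
for idx, cid in enumerate(cluster_ids):
    first_in_cluster.setdefault(cid, idx)
selected = sorted(first_in_cluster.values())
X_reduced = X[:, selected]

# Small illustrative grid; real searches usually cover more values.
param_grid = {"n_estimators": [50, 100], "max_depth": [None, 5]}
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid, cv=3, scoring="accuracy",
)
search.fit(X_reduced, y)

print("kept predictors:", selected)
print("best parameters:", search.best_params_)
print(f"best cross-validated accuracy: {search.best_score_:.3f}")
```

Comparing the accuracy obtained with `X_reduced` against the accuracy obtained with all predictors is one way to check whether dropping collinear predictors costs anything.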

How Guided Projects work

Your workspace is a cloud desktop right in your browser; no download is required.

In a split-screen video, your instructor guides you step-by-step

Frequently asked questions


More questions? Visit the Learner Help Center.