IBM

Data Analysis with R

This course is part of multiple programs.

Instructors: Tiffany Zhu, Yiwen Li, Gabriela de Queiroz

34,800 already enrolled

Gain insight into a topic and learn the fundamentals.
4.7 (339 reviews)

Intermediate level
Some related experience required
Flexible schedule
2 weeks at 10 hours a week
Learn at your own pace
95%
Most learners liked this course

What you'll learn

  • Prepare data for analysis by handling missing values, formatting and normalizing data, binning, and turning categorical values into numeric values.

  • Compare and contrast predictive models using simple linear, multiple linear, and polynomial regression methods.

  • Examine data using descriptive statistics, data grouping, analysis of variance (ANOVA), and correlation statistics.

  • Evaluate a model for overfitting and underfitting conditions and tune its performance using regularization and grid search.

Details to know

Shareable certificate

Add to your LinkedIn profile

Assessments

11 assignments¹

AI-graded (see disclaimer)
Taught in English

Build your subject-matter expertise

This course is available as part of
When you enroll in this course, you'll also be asked to select a specific program.
  • Learn new concepts from industry experts
  • Gain a foundational understanding of a subject or tool
  • Develop job-relevant skills with hands-on projects
  • Earn a shareable career certificate

There are 6 modules in this course

All data analysis starts with a problem that you need to solve. Understanding your data, and the types of questions you can answer about it, is a key part of solving that problem. The R programming language provides the tools you need to conduct powerful data analysis and acts as the conduit between your data and the real-world problems you want to solve. In this module, you’ll review a type of problem that you can solve in R and the underlying data that forms the basis for your analysis. You’ll also learn about the R packages for data analysis, which provide a set of tools that you’re likely to use in everyday data analyses. Finally, you’ll see how to import data and gain basic insights from the dataset.

What's included

6 videos, 1 reading, 2 assignments, 1 app item, 1 plugin

Data wrangling, or data pre-processing, is an essential first step toward an accurate and complete analysis of your data. This process transforms your raw data into a format that can be easily categorized or mapped to other data, which creates predictable relationships between them and makes it easier to build the models you need to answer questions about your data. This module introduces data pre-processing in R and then gives you the tools you need to identify and handle missing values in your dataset, transform data formats so that they align with other data you want to compare against, normalize your data, create categories of information through data binning, and convert categorical variables into quantitative values that can be used in numeric analyses.

What's included

6 videos, 1 reading, 2 assignments, 1 app item, 1 plugin
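
The pre-processing steps named in the data wrangling module (handling missing values, normalizing, binning, and converting categories to numbers) can be sketched in a few lines. This is a minimal illustration in plain Python rather than the R code the course teaches; the toy values, the column meanings, and every function name below are assumptions made up for the example.

```python
from statistics import mean

# Hypothetical toy data (not from the course): None marks a missing value.
delays = [12.0, None, 45.0, 3.0, None, 30.0]
carriers = ["AA", "DL", "AA", "UA", "DL", "UA"]


def impute_mean(values):
    """Replace missing entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max(values):
    """Rescale values linearly so the smallest is 0 and the largest is 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def bin_equal_width(values, labels):
    """Assign each value to one of len(labels) equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / len(labels)
    binned = []
    for v in values:
        # The maximum value would land one past the last bin, so clamp it.
        idx = min(int((v - lo) / width), len(labels) - 1)
        binned.append(labels[idx])
    return binned


def one_hot(categories):
    """Turn a categorical column into 0/1 indicator columns, one per level."""
    levels = sorted(set(categories))
    rows = [[1 if c == lvl else 0 for lvl in levels] for c in categories]
    return levels, rows


filled = impute_mean(delays)
scaled = min_max(filled)
binned = bin_equal_width(filled, ["low", "medium", "high"])
levels, dummies = one_hot(carriers)
```

Mean imputation and equal-width bins are only one choice among several; dropping incomplete rows or using quantile-based bins are equally common, and which fits depends on the dataset.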

Exploratory data analysis, or EDA, is an approach to analyzing data that summarizes its main characteristics and helps you gain a better understanding of the dataset, uncover relationships between different variables, and extract important variables for the problem you are trying to solve. The main question you are trying to answer in this module is: "What causes flight delays?" In this module, you’ll learn some useful exploratory data analysis techniques that will help answer this question.

What's included

5 videos, 1 reading, 2 assignments, 1 app item, 1 plugin
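
Two of the exploratory techniques listed for this course, data grouping and correlation statistics, can be illustrated with a short sketch. It is written in plain Python for illustration only (the course itself works in R), and the flight records and helper names are hypothetical.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean

# Hypothetical flights: (carrier, departure delay, arrival delay), in minutes.
flights = [
    ("AA", 5.0, 7.0),
    ("AA", 20.0, 26.0),
    ("DL", 0.0, -3.0),
    ("DL", 10.0, 12.0),
    ("UA", 40.0, 44.0),
    ("UA", 15.0, 14.0),
]


def group_mean(rows, key_idx, value_idx):
    """Group rows by one column and average another column within each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key_idx]].append(row[value_idx])
    return {key: mean(vals) for key, vals in groups.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


mean_arr_delay = group_mean(flights, key_idx=0, value_idx=2)
r = pearson([f[1] for f in flights], [f[2] for f in flights])
```

A correlation close to 1 here only says that the two delay columns move together in this toy sample; it does not by itself establish what causes a delay.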

You have identified the problem that you’re trying to solve and have pre-processed the dataset you’ll use in your analysis, and you have conducted some exploratory data analysis to answer some of your initial questions. Now, it’s time to develop your model and assess the strength of your assumptions. In this module, you will examine model development by trying to predict the arrival delay of a flight using the Airline dataset. You’ll learn regression techniques for determining the correlation between variables in your dataset, and evaluate the result both visually and through the calculation of metrics.

What's included

7 videos, 1 reading, 2 assignments, 1 app item, 1 plugin
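
As a rough picture of what fitting a simple linear regression and evaluating it with metrics involves, the sketch below estimates a least-squares line and computes the mean squared error and R-squared. It is plain Python for illustration, not the course's R code, and the delay values and function names are assumptions, not taken from the Airline dataset.

```python
from statistics import mean


def fit_simple_linear(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope


def mean_squared_error(ys, preds):
    """Average squared difference between observed and predicted values."""
    return mean([(y - p) ** 2 for y, p in zip(ys, preds)])


def r_squared(ys, preds):
    """Share of the variance in ys that the predictions account for."""
    my = mean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Hypothetical departure and arrival delays, in minutes.
dep_delay = [0.0, 10.0, 20.0, 30.0, 40.0]
arr_delay = [1.0, 12.0, 19.0, 32.0, 41.0]

intercept, slope = fit_simple_linear(dep_delay, arr_delay)
predictions = [intercept + slope * x for x in dep_delay]
mse = mean_squared_error(arr_delay, predictions)
r2 = r_squared(arr_delay, predictions)
```

Multiple linear and polynomial regression follow the same pattern with more predictor columns; the metrics are computed the same way from observed and predicted values.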

You have a firm understanding of your data and have pre-processed it to ensure the best possible outcomes. And you have conducted exploratory data analysis and developed your model. Everything looks good so far, but how can you be certain your model works in the real world and performs optimally? In this module, you’ll learn how to use the tidymodels framework to evaluate your model. Tidymodels is a collection of packages for modeling and machine learning using tidyverse principles. Using these packages, you’ll learn how to cross-validate your models, identify potential problems, like overfitting and underfitting, and handle overfitting problems using a technique called regularization. You’ll also learn how to tune your models using grid search.

What's included

4 videos, 1 reading, 2 assignments, 1 app item, 1 plugin
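
The ideas behind cross-validation, regularization, and grid search can be shown without any modeling framework. The sketch below is plain Python and does not use or imitate the tidymodels API; it assumes a one-predictor ridge regression with an unpenalized intercept, interleaved folds, and a hypothetical dataset and penalty grid.

```python
from statistics import mean


def fit_ridge(xs, ys, penalty):
    """One-predictor ridge fit; penalty=0 reduces to ordinary least squares."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / (sxx + penalty)  # a larger penalty shrinks the slope toward 0
    return my - slope * mx, slope


def k_fold_indices(n, k):
    """Split row indices 0..n-1 into k interleaved folds."""
    return [list(range(i, n, k)) for i in range(k)]


def cv_mse(xs, ys, penalty, k=3):
    """Average held-out mean squared error across k folds."""
    fold_errors = []
    for test_idx in k_fold_indices(len(xs), k):
        held_out = set(test_idx)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        b0, b1 = fit_ridge(train_x, train_y, penalty)
        errors = [(ys[i] - (b0 + b1 * xs[i])) ** 2 for i in test_idx]
        fold_errors.append(mean(errors))
    return mean(fold_errors)


def grid_search(xs, ys, penalties, k=3):
    """Score every candidate penalty and return the one with the lowest CV error."""
    scores = {p: cv_mse(xs, ys, p, k) for p in penalties}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical data and candidate penalties, chosen only for illustration.
xs = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0]
ys = [1.0, 12.0, 19.0, 32.0, 41.0, 49.0]
best_penalty, cv_scores = grid_search(xs, ys, [0.0, 1.0, 10.0, 100.0])
```

Scoring each candidate on held-out folds rather than on the training rows is what lets the search detect overfitting: a penalty that only helps on data the model has already seen will not win.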

What's included

4 readings, 1 assignment, 1 peer review, 2 app items, 3 plugins

Earn a career certificate

Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.

Instructors

Instructor ratings
4.6 (84 ratings)
Tiffany Zhu
IBM
2 courses, 47,987 learners
Yiwen Li
IBM
2 courses, 47,987 learners
Gabriela de Queiroz
IBM
2 courses, 47,987 learners

Offered by

IBM

Learner reviews

4.7

339 reviews

  • 5 stars: 80.93%
  • 4 stars: 12.02%
  • 3 stars: 3.22%
  • 2 stars: 1.46%
  • 1 star: 2.34%

¹ Some assignments in this course are AI-graded. For these assignments, your data will be used in accordance with Coursera's Privacy Notice.