Developing insights about your organization, business, or research project depends on effective modeling and analysis of the data you collect. Building effective models requires understanding the different types of questions you can ask and how to map those questions to your data. Different modeling approaches can be chosen to detect interesting patterns in the data and identify hidden relationships.



Modeling Data in the Tidyverse
This course is part of the Tidyverse Skills for Data Science in R Specialization



Instructors: Shannon Ellis, PhD
1,575 already enrolled
What you'll learn
- Describe different types of data analytic questions 
- Conduct hypothesis tests of your data 
- Apply linear modeling techniques to answer multivariable questions 
- Apply machine learning workflows to detect complex patterns in your data 
Skills you'll gain
- Rmarkdown
- Machine Learning
- Statistical Methods
- Tidyverse (R Package)
- Probability & Statistics
- Descriptive Statistics
- Regression Analysis
- Statistical Inference
- Data Analysis
- Statistical Hypothesis Testing
- Data Science
- Data Modeling
- Predictive Modeling
- R Programming
- Classification And Regression Tree (CART)
- Exploratory Data Analysis
- Statistical Analysis
There are 11 modules in this course
Developing insights about your organization, business, or research project depends on effective modeling and analysis of the data you collect. Building effective models requires understanding the different types of questions you can ask and how to map those questions to your data. Different modeling approaches can be chosen to detect interesting patterns in the data and identify hidden relationships.
What's included
16 readings, 1 assignment
Inferential analysis is what analysts carry out after they have described and explored their dataset. With a better understanding of the data, analysts often try to infer something from it, and this is done using statistical tests. We discussed briefly how models can be used for both inference and prediction. What does this mean?
What's included
3 readings, 1 assignment
Linear models are the most commonly used models in data analysis because of their computational efficiency and their ease of interpretation. Having a solid understanding of linear models and how they work is critical for any work in data science. The tidyverse provides a set of tools for making linear modeling more efficient and streamlined.
What's included
12 readings, 1 assignment
Multiple linear regression is needed when you want to include confounding factors or other predictors in your model for the response. R provides a straightforward way to do this via the formula interface to the lm() function.
What's included
1 reading, 1 assignment
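As a minimal sketch of the formula interface to lm() described for multiple linear regression, the example below fits a model with two predictors. It uses the built-in mtcars dataset and base R only; the dataset and the choice of variables (mpg, wt, hp) are assumptions for illustration and are not taken from the course materials.

```r
# Multiple linear regression with the formula interface to lm().
# mtcars ships with base R; the variables here are illustrative only.
# Formula reads: model mpg as a function of wt, adjusting for hp.
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Coefficient table: one row per term (intercept, wt, hp), with
# columns for the estimate, standard error, t value, and p-value.
coefs <- summary(fit)$coefficients
print(coefs)
```

Adding another predictor or potential confounder is a matter of extending the right-hand side of the formula, for example `mpg ~ wt + hp + cyl`.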
While we’ve focused on linear regression in this lesson on inference, linear regression isn’t the only analytical approach out there. However, it is arguably the most commonly used. Beyond that, many statistical tests and approaches are slight variations on linear regression, so a solid understanding of linear regression makes these other tests and approaches much simpler to understand. For example, what if you didn’t want to measure the linear relationship between two variables, but instead wanted to know whether the observed average differs from an expected value?
What's included
3 readings
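The question of whether an observed average differs from an expected value is commonly handled with a one-sample t-test. The sketch below, again using the built-in mtcars dataset and an arbitrary hypothesized mean of 20 (both assumptions for illustration, not from the course), runs the test with t.test() and then shows the same test expressed as an intercept-only linear model, which is one way to see it as a variation on linear regression.

```r
# One-sample t-test: does the mean of mpg differ from a hypothesized value?
# mu0 = 20 is an arbitrary value chosen for illustration.
mu0 <- 20
result <- t.test(mtcars$mpg, mu = mu0)
print(result$estimate)  # sample mean of mpg
print(result$p.value)   # two-sided p-value

# The same test as an intercept-only linear model: the intercept
# estimates mean(mpg - mu0), and its t value matches the t-test statistic.
fit0 <- lm(I(mpg - mu0) ~ 1, data = mtcars)
lm_coefs <- summary(fit0)$coefficients
print(lm_coefs)
```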
Hypothesis testing describes a family of statistical techniques for determining whether the data you collect provides evidence for the value of an unknown parameter of interest. The goal of hypothesis tests is to make inferences while accounting for variability in the data that can lead to spurious results.
What's included
3 readings, 1 assignment, 1 plugin
Prediction modeling is an essential activity in data science and involves building systems for making predictions based on previously observed data. These models are typically very flexible and can capture a range of different relationships.
What's included
12 readings, 1 assignment
There are incredibly helpful packages available in R thanks to the work of RStudio. There are hundreds of different machine learning algorithms, and the tidymodels R packages put many of them into a single framework, allowing you to use many different machine learning models easily.
What's included
5 readings, 1 assignment
This case study demonstrates an approach to building a model for predicting outdoor air pollution concentrations in the United States.
What's included
17 readings, 1 ungraded lab
The tidymodels collection of packages can be overwhelming at first glance. Here, we provide a quick summary chart to help navigate all of the packages and when they should be used.
What's included
1 reading
In this project, you will practice building models with the tidyverse for classifying consumer complaints data from the Consumer Financial Protection Bureau (CFPB). This project includes both a Peer Review step in which you'll upload R Markdown and knitted HTML files AND a Quiz step in which you'll answer questions about the predictions made by your classification algorithm.
What's included
1 reading, 1 assignment, 1 peer review
¹ Some assignments in this course are AI-graded. For these assignments, your data will be used in accordance with Coursera's Privacy Notice.


