When you enroll in this course, you'll also be enrolled in this Specialization.
There are 6 modules in this course
Data never arrive in the condition you need for effective data analysis. They need to be reshaped, rearranged, and reformatted so that they can be visualized or fed into a machine learning algorithm. This course addresses the problem of wrangling your data so that you can bring them under control and analyze them effectively. The key goal in data wrangling is transforming non-tidy data into tidy data.
This course covers many of the critical details of handling tidy and non-tidy data in R, such as converting from wide to long formats, manipulating tables with the dplyr package, understanding different R data types, processing text data with regular expressions, and conducting basic exploratory data analyses. Investing the time to learn these data wrangling techniques will make your analyses more efficient, more reproducible, and more understandable to your data science team.
In this specialization we assume familiarity with the R programming language. If you are not yet familiar with R, we suggest you first complete R Programming before returning to complete this course.
Data never arrive in the condition you need for effective data analysis. They need to be reshaped, rearranged, and reformatted so that they can be visualized or fed into a machine learning algorithm. This module addresses the problem of wrangling your data so that you can bring them under control and analyze them effectively. The key goal in data wrangling is transforming non-tidy data into tidy data.
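As a rough illustration of what moving from a wide layout to a long (tidy) layout means, here is a minimal sketch. It is written in plain Python rather than the R used in the course, purely to show the idea; in R, the tidyr package provides pivot_longer() for this kind of reshaping. The table contents, the column names, and the helper wide_to_long() are all hypothetical.

```python
# Hypothetical wide table: one row per country, one column per year.
wide = [
    {"country": "A", "2019": 10, "2020": 12},
    {"country": "B", "2019": 7, "2020": 9},
]


def wide_to_long(rows, id_col, names_to, values_to):
    """Turn every non-id column into its own (name, value) row."""
    long_rows = []
    for row in rows:
        for col, value in row.items():
            if col == id_col:
                continue  # the identifier is repeated on each output row instead
            long_rows.append({id_col: row[id_col], names_to: col, values_to: value})
    return long_rows


# Each output row now holds exactly one observation: (country, year, cases).
long_table = wide_to_long(wide, id_col="country", names_to="year", values_to="cases")
for record in long_table:
    print(record)
```

In the long layout each variable has its own column and each observation its own row, which is the shape most plotting and modeling tools expect.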
What's included
19 readings, 2 assignments
19 readings•Total 155 minutes
About This Course•3 minutes
Tidy Data Review•2 minutes
Reshaping Data•2 minutes
Wide Data•5 minutes
Long Data•5 minutes
Reshaping Data•30 minutes
Data Wrangling•0 minutes
R Packages•15 minutes
The Pipe Operator•15 minutes
Filtering Data•20 minutes
Reordering•15 minutes
Creating New Columns•5 minutes
Separating Columns•5 minutes
Merging Columns•5 minutes
Cleaning Column Names•5 minutes
Combining Data Across Data Frames•5 minutes
Grouping Data•5 minutes
Summarizing Data•3 minutes
Operations Across Columns•10 minutes
2 assignments•Total 60 minutes
Reshaping Data Quiz•30 minutes
Data Wrangling Quiz•30 minutes
Working With Factors, Dates, and Times
Module 2•2 hours to complete
Module details
In R, categorical data are handled as factors. By definition, a categorical variable can take only a fixed set of possible values. For example, there are 12 months in a calendar year, so each observation of a month variable must take one of these twelve values, which makes month a categorical variable. Categorical data, which will be referred to as factors for the rest of this lesson, appear regularly in real datasets. Learning how to work with this type of variable effectively will be incredibly helpful.
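The idea of a variable restricted to a fixed, ordered set of levels can be sketched as follows. This is plain Python used only for illustration, not the course's R code; R factors similarly store integer codes alongside a set of levels. The helper encode() and the sample data are hypothetical.

```python
# The fixed set of allowed values ("levels"), in calendar order.
LEVELS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def encode(values, levels):
    """Map each value to its 0-based level index; reject values outside the level set."""
    index = {level: i for i, level in enumerate(levels)}
    unknown = sorted(set(values) - set(levels))
    if unknown:
        raise ValueError(f"not a valid level: {unknown}")
    return [index[v] for v in values]


observed = ["Mar", "Jan", "Dec", "Mar"]
codes = encode(observed, LEVELS)

# Sorting by code follows the level order (calendar), not alphabetical order.
in_calendar_order = [LEVELS[c] for c in sorted(codes)]
print(codes)
print(in_calendar_order)
```

Keeping the levels explicit is what lets months sort as Jan, Mar, Dec instead of alphabetically, and it catches misspelled categories early.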
What's included
14 readings, 2 assignments
14 readings•Total 75 minutes
Working with Factors•5 minutes
Factor Review•5 minutes
Manually Changing the Labels of Factor Levels: fct_relevel()•5 minutes
Keeping the Order of the Factor Levels: fct_inorder()•5 minutes
Advanced Factoring•5 minutes
Re-ordering Factor Levels by Frequency: fct_infreq()•5 minutes
Reversing Order Levels: fct_rev()•5 minutes
Re-ordering Factor Levels by Another Variable: fct_reorder()•5 minutes
Combining Several Levels into One: fct_recode()•5 minutes
Converting Numeric Levels to Factors: ifelse() + factor()•5 minutes
Dates and Times Basics•5 minutes
Creating Dates and Date-Time Objects•10 minutes
Working with Dates•5 minutes
Time Spans•5 minutes
2 assignments•Total 60 minutes
Working With Factors Quiz•30 minutes
Working With Dates Quiz•30 minutes
Working With Strings and Text and Functional Programming
Module 3•3 hours to complete
Module details
Working with text data is increasingly common in data science projects. Text manipulation is often needed to clean up messy datasets and to create numerical measurements out of text input. In addition, the text itself is often the data, and this module covers tools to extract information from it.
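Creating a numerical measurement out of messy text input with a regular expression can be sketched like this. The example uses Python's standard re module for illustration only; in R, the stringr package offers comparable functions such as str_extract(). The sample strings and the helper extract_cm() are hypothetical.

```python
import re

# Hypothetical free-text entries with inconsistent spacing and capitalization.
raw = ["height: 172 cm", "Height 180cm", "n/a", "165 CM "]

# One or more digits, optional whitespace, then the unit "cm" in any letter case.
pattern = re.compile(r"(\d+)\s*cm", re.IGNORECASE)


def extract_cm(text):
    """Return the height in centimeters as an int, or None if no match is found."""
    match = pattern.search(text)
    return int(match.group(1)) if match else None


heights = [extract_cm(s) for s in raw]
print(heights)
```

Returning None for unparseable entries keeps missing values explicit instead of silently dropping rows.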
What's included
13 readings, 2 assignments
13 readings•Total 135 minutes
Working with Strings•5 minutes
stringr•5 minutes
String Basics•15 minutes
Regular Expressions•3 minutes
glue•15 minutes
Tidy Text Format•15 minutes
Sentiment Analysis•15 minutes
Word and Document Frequency•30 minutes
Functional Programming•5 minutes
For Loops vs. Functionals•2 minutes
map Functions•5 minutes
Multiple Vectors•15 minutes
Anonymous Functions•5 minutes
2 assignments•Total 60 minutes
Working With Strings Quiz•30 minutes
Functional Programming Quiz•30 minutes
Exploratory Data Analysis
Module 4•1 hour to complete
Module details
The goal of an exploratory analysis is to examine, or explore, the data and find relationships that weren’t previously known. Exploratory analyses explore how different measures might be related to each other but do not confirm that relationship as causal, i.e., one variable causing another. You’ve probably heard the phrase “Correlation does not imply causation,” and exploratory analyses lie at the root of this saying. Just because you observe a relationship between two variables during exploratory analysis, it does not mean that one necessarily causes the other.
What's included
2 readings
2 readings•Total 35 minutes
Exploratory Data Analysis•10 minutes
General Principles of EDA•25 minutes
Case Studies
Module 5•3 hours to complete
Module details
Now we will demonstrate how to import data using our case study examples. When working through the steps of the case studies, you can use either RStudio on your own computer or Coursera lab spaces provided for each case study.
What's included
11 readings, 2 ungraded labs
11 readings•Total 180 minutes
Case Studies•10 minutes
Healthcare Coverage Data•20 minutes
Healthcare Spending Data•20 minutes
Join the Data•30 minutes
Census Data•15 minutes
Violent Crime•15 minutes
Brady Scores•15 minutes
The Counted Fatal Shootings•15 minutes
Unemployment Data•15 minutes
Population Density: 2015•15 minutes
Firearm Ownership•10 minutes
2 ungraded labs•Total 20 minutes
Case Study #1: Health Expenditures•10 minutes
Case Study #2: Firearms•10 minutes
Project: Wrangling data in the Tidyverse
Module 6•1 hour to complete
Module details
In this project, you will practice data exploration and data wrangling with the tidyverse using consumer complaint data from the Consumer Financial Protection Bureau (CFPB).
What's included
1 reading, 1 assignment
1 reading•Total 5 minutes
Important information before you start the project•5 minutes
1 assignment•Total 60 minutes
Wrangling Data in the Tidyverse Course Project•60 minutes
The mission of The Johns Hopkins University is to educate its students and cultivate their capacity for life-long learning, to foster independent and original research, and to bring the benefits of discovery to the world.
When will I have access to the lectures and assignments?
To access the course materials and assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. This also means that you will not be able to purchase a Certificate experience.
What will I get if I subscribe to this Specialization?
When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile.
Is financial aid available?
Yes. In select learning programs, you can apply for financial aid or a scholarship if you can’t afford the enrollment fee. If financial aid or a scholarship is available for your learning program selection, you’ll find a link to apply on the description page.