There are 6 modules in this course
Getting data into your statistical analysis system can be one of the most challenging parts of any data science project. Data must be imported and harmonized into a coherent format before any insights can be obtained. You will learn how to get data into R from commonly used formats and how to harmonize different kinds of datasets from different sources. If you work in an organization where different departments collect data using different systems and different storage formats, this course will provide essential tools for bringing those datasets together and making sense of the wealth of information in your organization.
This course introduces the Tidyverse tools for importing data into R so that it can be prepared for analysis, visualization, and modeling. Common data formats are introduced, including delimited files, spreadsheets and relational databases, and techniques for obtaining data from the web are demonstrated, such as web scraping and web APIs.
In this specialization we assume familiarity with the R programming language. If you are not yet familiar with R, we suggest you first complete R Programming before returning to complete this course.
A basic data type in the tidyverse is the tibble. Tibbles store tabular data and are a modern take on the standard R data frame. They have many user-friendly features that are an improvement over standard data frames when doing interactive data analysis. The remainder of this module covers tabular data in spreadsheet formats like Excel, CSV, TSV, and other delimited files.
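The course itself works in R. As a language-neutral illustration of the point that CSV, TSV, and other delimited files differ only in the character that separates fields, here is a minimal sketch using Python's standard library. The helper name `read_delimited` and the sample data are hypothetical and are not taken from the course materials.

```python
import csv
import io


def read_delimited(text, delimiter):
    """Parse delimited text into a list of row dictionaries keyed by the header row."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [dict(row) for row in reader]


# The same hypothetical table, once comma-separated and once tab-separated.
csv_text = "name,score\nada,90\ngrace,95\n"
tsv_text = "name\tscore\nada\t90\ngrace\t95\n"

csv_rows = read_delimited(csv_text, ",")
tsv_rows = read_delimited(tsv_text, "\t")

# Only the delimiter differs, so both parses yield identical rows.
print(csv_rows == tsv_rows)
```

Note that every value is read as text here; converting columns to numeric or date types is a separate step.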
What's included
15 readings, 1 assignment
15 readings•Total 166 minutes
About This Course•5 minutes
Tibbles•10 minutes
Creating a tibble•20 minutes
Subsetting•10 minutes
Spreadsheets•1 minute
Excel files•30 minutes
Google Sheets•45 minutes
CSVs•10 minutes
Downloading CSV files•5 minutes
Reading CSVs into R•10 minutes
TSVs•2 minutes
Reading TSV Files into R•5 minutes
Delimited Files•3 minutes
Reading Delimited Files into R•5 minutes
Exporting Data from R•5 minutes
1 assignment•Total 30 minutes
Importing and Exporting Data Quiz•30 minutes
JSON, XML, and Databases
Module 2•3 hours to complete
Module details
Data can come in non-tabular formats, especially unstructured data or data that otherwise would not fit into a table. JSON and XML are common formats for storing arbitrarily structured data, and this module covers the packages used to read in those formats. In addition, relational databases are commonly used to store very large collections of tables where you do not need to read in the entire dataset at once. There are many relational database formats; we will cover SQLite, which is compact and simple to use.
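The course covers these ideas with R packages. As a language-neutral sketch only, the Python standard-library example below shows the two points made above: a JSON record can nest lists and objects that do not fit a flat table, and a SQLite query can return just the rows you need instead of the whole table. The record, the `visits` table, and all column names are hypothetical.

```python
import json
import sqlite3

# A hypothetical nested JSON record: the list and inner object do not map
# onto a single flat row without extra reshaping.
record = json.loads(
    '{"id": 1, "name": "ada", "tags": ["r", "sql"], "address": {"city": "Baltimore"}}'
)
city = record["address"]["city"]
n_tags = len(record["tags"])

# An in-memory SQLite database with a hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (patient_id INTEGER, year INTEGER, cost REAL)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?)",
    [(1, 2019, 120.0), (2, 2020, 80.5), (3, 2020, 42.0)],
)
conn.commit()

# Ask the database for only the 2020 rows rather than loading every row.
rows_2020 = conn.execute(
    "SELECT patient_id, cost FROM visits WHERE year = ? ORDER BY patient_id",
    (2020,),
).fetchall()
conn.close()

print(city, n_tags, rows_2020)
```

The filtering happens inside the database engine, which is what makes this approach practical when the full table is too large to read at once.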
What's included
10 readings, 1 assignment
10 readings•Total 132 minutes
JSON•30 minutes
XML•15 minutes
Databases•2 minutes
Relational Data•15 minutes
Relational Databases: SQL•5 minutes
Connecting to Databases: RSQLite•10 minutes
Working with Relational Data: dplyr & dbplyr•5 minutes
Mutating Joins•30 minutes
Filtering Joins•10 minutes
How to Connect to a Database Online•10 minutes
1 assignment•Total 30 minutes
JSON, XML, and Databases Quiz•30 minutes
Web Scraping and APIs
Module 3•2 hours to complete
Module details
Reading in data from various Internet sources can be a useful way to build analyses that need to be regularly updated. The rvest and httr packages are useful for connecting to web sites, web APIs and other online sources of data.
What's included
11 readings, 1 assignment
11 readings•Total 105 minutes
Web Scraping•10 minutes
rvest Basics•0 minutes
SelectorGadget•10 minutes
Web Scraping Example•10 minutes
A final note: SelectorGadget•2 minutes
API•5 minutes
Getting Data: httr•5 minutes
Example 1: GitHub’s API•30 minutes
Example 2: Obtaining a CSV•20 minutes
read_csv() from a URL•3 minutes
API keys•10 minutes
1 assignment•Total 30 minutes
Getting Data from the Internet Quiz•30 minutes
Foreign Formats, Images, and googledrive
Module 4•2 hours to complete
Module details
Working with others in a data science project often involves reading output or data produced using other statistical analysis packages or other software. This module covers packages for reading in these foreign formats, as well as images and data from Google Drive.
What's included
3 readings, 1 assignment
3 readings•Total 65 minutes
haven•15 minutes
Images•30 minutes
googledrive•20 minutes
1 assignment•Total 30 minutes
Foreign Formats, Images and googledrive Quiz•30 minutes
Case Studies
Module 5•4 hours to complete
Module details
Now we will demonstrate how to import data using our case study examples. When working through the steps of the case studies, you can use either RStudio on your own computer or Coursera lab spaces provided for each case study.
What's included
11 readings, 2 ungraded labs
11 readings•Total 142 minutes
Case Study #1: Health Expenditures•5 minutes
Healthcare Coverage Data•45 minutes
Healthcare Spending Data•30 minutes
New Case Study #2: Firearms•2 minutes
Census Data•5 minutes
Counted Data•5 minutes
Suicide Data•10 minutes
Brady Data•10 minutes
Crime Data•10 minutes
Land Area Data•10 minutes
Unemployment Data•10 minutes
2 ungraded labs•Total 120 minutes
Health Expenditures Lab•60 minutes
Firearms Case Study Lab•60 minutes
Project: Importing Data into R
Module 6•1 hour to complete
Module details
This project will give you the opportunity to read in data from multiple sources and conduct some simple operations on those data.
What's included
2 readings, 1 assignment
2 readings•Total 20 minutes
Introduction and Background•10 minutes
Datasets•10 minutes
1 assignment•Total 30 minutes
Importing Data into R Project•30 minutes
The mission of The Johns Hopkins University is to educate its students and cultivate their capacity for life-long learning, to foster independent and original research, and to bring the benefits of discovery to the world.
Learner reviews
4.7
51 reviews
5 stars: 78%
4 stars: 18%
3 stars: 4%
2 stars: 0%
1 star: 0%
EL (5 stars), reviewed on Nov 22, 2022
Excellent. While there were no lectures, and it is possible to simply read the authors' book, having the quizzes makes the difference between just reading and actually learning. Thanks!
VM (5 stars), reviewed on Mar 27, 2021
Great for beginners. Clearly explained, and easy to follow.
FC (5 stars), reviewed on Jan 28, 2021
Excellent tutorial for importing data into the tidyverse environment
When will I have access to the lectures and assignments?
To access the course materials, assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. This also means that you will not be able to purchase a Certificate experience.
What will I get if I subscribe to this Specialization?
When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile.
Is financial aid available?
Yes. In select learning programs, you can apply for financial aid or a scholarship if you can’t afford the enrollment fee. If financial aid or a scholarship is available for your learning program selection, you’ll find a link to apply on the description page.