Johns Hopkins University

Getting and Cleaning Data

Before you can work with data, you have to get some. This course covers the basic ways data can be obtained: from the web, from APIs, from databases, and from colleagues in various formats. It also covers the basics of data cleaning and how to make data “tidy”. Tidy data dramatically speeds up downstream data analysis tasks. The course also covers the components of a complete data set, including raw data, processing instructions, codebooks, and processed data. Together, these are the basics needed for collecting, cleaning, and sharing data.

Status: Data Integration
Status: Data Processing
Course · 20 hours

Featured reviews

RR

4.0 · Reviewed Jul 9, 2017

I found the last project insufficiently explained. I struggled to understand what the task was. A clearer task description (as in Course 2) would be really appreciated.

AB

5.0 · Reviewed Oct 15, 2017

This course is very enlightening. The techniques demonstrated in this course are critical for gathering raw data from various sources and turning it into useful data for analysis.

HS

5.0 · Reviewed May 2, 2020

This course provides an introduction to some important concepts and tools for a very important aspect of data science: cleaning and organizing data before any analysis. A must for any data scientist.

NA

5.0 · Reviewed Jun 7, 2020

A very useful course. The audio quality of some lectures (especially those by the main instructor) was not good. This course completes the sister course of R programming and they work together.

NK

4.0 · Reviewed May 18, 2020

The 'cleaning data' part was explained pretty well... I do feel he could've gone into more detail for the 'gathering data' part, especially the web scraping part. Other than that, great course!

XX

4.0 · Reviewed Aug 14, 2018

The Swirl practice part is great! But there is a big gap between what we learned from the videos and Swirl and the course project! The project is much harder than what I learned from the course.

AT

4.0 · Reviewed Nov 19, 2017

Very interesting, and I enjoyed doing the assignment, but the assignment instructions are not clear. A lot of time was wasted trying to figure out what the data is and what we are interested in.

AC

5.0 · Reviewed Jan 1, 2019

It was pretty hard for someone like me who has a weakness in programming but it provided sufficient exposure and tasks for me to learn within my capabilities. I did enjoy its challenges.

CW

4.0 · Reviewed Aug 6, 2020

I think that the level of difficulty of the exercises and final assignment does not match with the depth of the lectures; without a textbook, I feel lost, don't have a reference, and have to guess.

RD

4.0 · Reviewed Nov 9, 2016

Good coverage of topics. A bit scattered in the early slides, and assignments were often inconsistent with the coursework. Overall a great introduction to what to expect when gathering data.

SB

5.0 · Reviewed Mar 16, 2018

A very informative and interesting course. I have learned much about cleaning data and getting it from different sources. Finally, thanks to the Coursera team for giving us the opportunity.

BD

5.0 · Reviewed Oct 25, 2016

This course is really challenging and compulsory for anyone who wants to be a data scientist or work with any sort of data. It teaches you how to make a very palatable data set from messy data.