Johns Hopkins University

Getting and Cleaning Data

Before you can work with data you have to get some. This course will cover the basic ways that data can be obtained. The course will cover obtaining data from the web, from APIs, from databases and from colleagues in various formats. It will also cover the basics of data cleaning and how to make data “tidy”. Tidy data dramatically speed downstream data analysis tasks. The course will also cover the components of a complete data set including raw data, processing instructions, codebooks, and processed data. The course will cover the basics needed for collecting, cleaning, and sharing data.

Status: MySQL
Status: Data Manipulation
Course: 20 hours

Featured reviews

NK

4.0
Reviewed May 18, 2020

The 'cleaning data' part was explained pretty well... I do feel he could've gone into more detail for the 'gathering data' part, especially the web scraping part. Other than that, great course!

WC

5.0
Reviewed Oct 31, 2016

This course is amazing! I have spent the majority of my time in R merely doing analytics. This course taught me the tools needed to go out and grab the data that I need for those analytics.

JS

4.0
Reviewed May 4, 2016

Actually, very interesting and helpful class. The one area around more complex structures (API, XML) warrants more attention, since I assume those are more dominant access methods.

AB

5.0
Reviewed Oct 15, 2017

This course is very enlightening. The techniques demonstrated in this course are critical for gathering raw data from various sources and turning it into useful data for analysis.

NA

5.0
Reviewed Jun 7, 2020

A very useful course. The audio quality of some lectures (especially those by the main instructor) was not good. This course completes the sister course of R programming and they work together.

DH

5.0
Reviewed Feb 1, 2016

Easy, mostly instructive course. The assignments and quizzes are quite good and illustrate the lessons very well. See the videos for general presentation, but spend your energy on the exercises.

XX

4.0
Reviewed Aug 14, 2018

The Swirl practice part is great! But there is a big gap between what we learned from the videos and Swirl and the course project! The project is much harder than what I learned from the course.

RD

4.0
Reviewed Nov 9, 2016

Good coverage of topics. A bit scattered in early slides and assignments were often inconsistent with coursework. Overall a great introduction of what to expect when gathering data.

AT

4.0
Reviewed Nov 19, 2017

Very interesting, and I enjoyed doing the assignment, but the assignment instructions are not clear. A lot of time was wasted trying to figure out which data is which and what we are interested in.

BD

5.0
Reviewed Oct 25, 2016

This course is really challenging and a must for anyone who wants to be a data scientist or work with any sort of data. It teaches you how to make a very palatable data set from messy data.

AG

4.0
Reviewed May 7, 2020

The course was very helpful & guided but since I don't have a strong coding background I felt myself getting lost often. It would be really helpful if there is some guidance in assignments.

BP

5.0
Reviewed Feb 3, 2018

The course is an excellent introduction to the dplyr package and string manipulation in R. I thought the assignment at the end of the course was a little vague and hard to understand.

All reviews

Showing: 20 of 1,315

William Stewart
2.0
Reviewed Feb 4, 2018
T M
1.0
Reviewed Feb 1, 2019
Matt Kerns
1.0
Reviewed Jul 17, 2018
Sebastián Lucas
2.0
Reviewed Jan 12, 2018
Bhawesh Singhania
2.0
Reviewed Apr 4, 2019
Mohammad Amir Aghaee
3.0
Reviewed May 13, 2019
Thej
1.0
Reviewed Nov 29, 2018
THI A ALLGOOD
1.0
Reviewed Feb 16, 2019
Les Schmidt
2.0
Reviewed Apr 8, 2017
Pietro Pollo
2.0
Reviewed Jan 25, 2019
Javier R Lores Gil
1.0
Reviewed Nov 17, 2018
Kyle Rozic
2.0
Reviewed Jun 1, 2020
Jennifer Sargent
1.0
Reviewed Jul 22, 2020
1.0
Reviewed Apr 9, 2020
Erin Aylsworth
5.0
Reviewed Dec 9, 2019
Mathew Knudson
4.0
Reviewed Dec 29, 2019
Viktor Kusnezh
1.0
Reviewed Mar 23, 2020
Dan Kjeldstrøm Hansen
5.0
Reviewed Feb 2, 2016
Moshe Pilsky
3.0
Reviewed Mar 13, 2019
Akshay Khatter
2.0
Reviewed Apr 9, 2018