


6,014 ratings

This course covers the essential exploratory techniques for summarizing data. These techniques are typically applied before formal modeling commences and can help inform the development of more complex statistical models. Exploratory techniques are also important for eliminating or sharpening potential hypotheses about the world that can be addressed by the data. We will cover in detail the plotting systems in R as well as some of the basic principles of constructing data graphics. We will also cover some of the common multivariate statistical techniques used to visualize high-dimensional data....

CC

Jul 28, 2016

This is the second course I have taken from Roger Peng and both were outstanding. I have a strong math background, but not much of a background in stats, but this course was very approachable for me.

Y

Sep 23, 2017

Very good course! It provided me the foundation in learning how to plot and interpret data. This will definitely strengthen my "R programming" to generate publication-type figures for my genomics data!


By JM

• Jul 11, 2018

Once it got to the clustering section the lessons were inscrutable. Extremely difficult to understand and not explained well.

By Roman

• Aug 30, 2018

Cons:

- Too much focus on hopelessly outdated R functions.
- Lectures are mostly powerpoint karaoke along the lines of "You can do that thing. And you can also do that other thing. And also you do this third thing" without much real-world application.
- ggplot2 is the only modern viz package that gets mentioned.

Pros:

- The swirl exercises are great (but very buggy on Mac).

By Luca R

• Jun 10, 2017

The videos merely repeated the content from swirl, with absolutely no added value.

By Dilyan D

• Feb 12, 2018

This is the worst of the Data Science courses so far (they've all been pretty good up to this point).

It's called Exploratory Data Analysis, but is actually all about the graphics systems in R. And it does a botched job on those as well.

All quizzes and assignments are about the graphics systems. The only portion of the course that deviates from that is Week 3 (for which there is no quiz or project) where we "learn" about clustering and dimension reduction. However, that material is presented really poorly: not enough depth for someone who is already familiar with the subject matter; and not nearly well enough explained for newbies.

On the graphics side, none of the systems is explored in great depth. The lattice system is essentially just mentioned in passing.

To cap it all off, the brief for the last assignment is really ambiguous, which often causes perfectly valid work to be graded poorly by peers. (Just look at the forums, if you need proof.)

By Beverly A

• Sep 19, 2016

When it comes down to it, there's simply not the support to assist a student who has a really hard problem. "Hacker mentality" seems to equate to "figure it out on your own cuz nobody's going to help you". If things do not work perfectly for you, then you are likely never to be able to finish, because your "peers" don't know any better either. The way this class is set up makes me angry every time I have to deal with it. I would probably be just as well served doing just the swirl() exercises. I would quit if I hadn't paid all the way through in advance. I can't believe Johns Hopkins is the type of school to produce a course of this quality, but I guess I have to.

By Dan H

• May 13, 2019

Provides a solid overview of the base plotting system and a discussion (better elsewhere) of others. Introduces some higher level exploratory methods, without much information on either the theory or application (simply walks through the recipe). Assessments do not match the lecture material, so the credential is essentially meaningless. Read the associated book, watch the video lectures if you'd like. Don't bother with paying for the certificate.

By Paul R

• Mar 11, 2019

This course covers plotting (base, lattice, ggplot) then takes a confusing tour into heavy topics of clustering and dimension reduction, then flips back to coloring in charts. The order of the lectures is confusing and PCA/SVD needs more background, clearer explanation and treatment (gets covered a bit more later under regression). Assignments are good and swirl courses helped solidify the lectures.

By Faben W

• Feb 4, 2019

This lesson could have been significantly improved if there were at least one assignment on clustering/dimensional reduction. Those are probably the hardest concepts taught thus far, so it would have been extremely useful to have at least one challenge to work through.

By Rok B

• May 15, 2019

This course is basically plotting with R and clustering/dimensionality reduction. There is not enough emphasis on the latter, in my opinion. The final assignment focuses only on plotting, which is a shame.

By Pamela M

• Jun 4, 2016

Alas, after only 10 minutes of the first video, I am reminded that this instructor does not gear his lectures to the true Beginners among us. He speaks much more for an audience of grad students. I do want to complete this Specialization, so I will try again perhaps after learning more - about statistics and R and who knows what else. I fought my way through the first three courses, but now I'm going to work smarter by finding other ways to acquire this knowledge. Then return to him; maybe. This course should be labelled Intermediate and Statistics should be listed as a prerequisite. (I think; since I don't know what it is that I don't know, I am making a guess as to the missing piece of the puzzle.)

By M C

• Jan 14, 2021

These courses need to be updated.

At least one of the Swirl packages references a retired command, gather.

The use of Swirl is nice, but it can get very tiring when the computer picks up spaces and makes you correct small details over and over.

I believe that all of these courses need to share some practice questions before the quizzes. This allows people to discuss the problems they have without feeling like they are cheating when discussing quiz questions.

Six years is a long time to have course material for teaching. I suggest it is getting too OLD.

By Eswara K

• Jun 6, 2020

Awesome course that expands on your R knowledge. Only nitpick is that some of the links don't work and the videos need an overhaul as there seem to be little to no updates since 2015/2016.

By Josh H

• Nov 5, 2017

I can't help but feel lied to. The FAQ for the specialization says the following: "We also suggest a working knowledge of mathematics up to algebra (neither calculus or linear algebra are required)." If no linear algebra background is required, then why do you assume that I know what a singular value decomposition is? Or principal components analysis? Terrible course.

By Sergey K

• May 10, 2016

This course is mostly about how to use plotting libraries in R.

By omar k

• Oct 10, 2017

limited and monotonous explanation

By NISHANT P

• Oct 5, 2017

Very insightful course!!!

The swirl packages and course projects in the "Exploratory Data Analysis" course have really helped me to understand the power of R in performing introductory graphical analyses towards initial inferences. It has good hands-on exercises to really put to action various sophisticated graphs and plots for boardroom conversations on how to go deeper into the data analysis in order to find meaningful business insights or build powerful predictive models. As I advance through the specialization, I am coming to realize how powerful Statistical Learning through R is for quick business action and automation.

By Dale O J

• Oct 16, 2018

This has been a challenging course for me, for whatever reason. I have devoted a great deal of time to reading Dr. Peng's books as well as reviewing the work product of other students to get a better grasp of the logic and methodology. I have enjoyed this course more than any of the preceding courses. And the struggle, I believe, will be worth the effort and facilitate my completion of the data science specialization program.

By Chandrakanth C

• Jun 18, 2018

Well organised course for Exploratory Data Analysis. After this course you will be thorough with the basics of Exploratory Data Analysis. The peer-graded assignment is, I felt, one level higher than the concepts taught in the lectures. Overall, it is worth taking this course. Thanks a lot, Coursera.

By Rishabh J

• Aug 22, 2017

I found Prof. Roger Peng to be the best instructor in this specialization. This course just proves my point further. He teaches different concepts in a lucid manner. These concepts were presented in a way that could be applied to real world data sets right away. Awesome course!

By Farah N

• Aug 28, 2019

I enjoyed taking this course, especially the projects and swirl practice. If the clustering were a bit more detailed, it would be useful. Also, if we could do a project using the 3 different approaches, it would be interesting. Nevertheless, it was fantastic with the amazing professors.

By Diego A Q

• Jun 18, 2019

Great course, it teaches you a lot about how to create plots, charts and other tools using R code. This course is focused on "get to know your data" by using all these tools during a research process. It is like the preliminary step you have to do before going into any analytics.

By Linwood C I

• Mar 7, 2016

I loved this course!! All of the classes taken in the specialization come together for practical use. Course 2 is where it really kicked in. Students will learn how to use R to explore data sets that send you down interesting paths.

By Clare S

• Mar 21, 2016

Really nice course. Good to put the graphics functions in R to use. I think it would be helpful to have a summary page somewhere that compares the format of how to generate simple plots using each of the 3 packages - just for reference.

By Deleted A

• Jul 7, 2019

This course teaches how to use three different plotting systems in R. Given the dominance of the tidyverse/ggplot2 paradigm, I really appreciate the opportunity to learn the base plotting system and the lattice plot system.

By Amir h F N

• Nov 7, 2021

Very informative and useful course for data analysts and scientists. It presented plotting systems, data reduction / normalizing, and many other different modules in analysis. I advise others to undertake it.
