This course covers the essential exploratory techniques for summarizing data. These techniques are typically applied before formal modeling commences and can help inform the development of more complex statistical models. Exploratory techniques are also important for eliminating or sharpening potential hypotheses about the world that can be addressed by the data. We will cover in detail the plotting systems in R as well as some of the basic principles of constructing data graphics. We will also cover some of the common multivariate statistical techniques used to visualize high-dimensional data.



Exploratory Data Analysis
This course is part of multiple programs.



Instructors: Roger D. Peng, PhD
182,977 already enrolled
(6,085 reviews)
What you'll learn
- Understand analytic graphics and the base plotting system in R 
- Use advanced graphing systems such as the Lattice system 
- Make graphical displays of very high dimensional data 
- Apply cluster analysis techniques to locate patterns in data 

There are 4 modules in this course
This week covers the basics of analytic graphics and the base plotting system in R. We've also included some background material to help you install R if you haven't done so already.
What's included
15 videos, 6 readings, 1 assignment, 5 programming assignments, 1 peer review
Welcome to Week 2 of Exploratory Data Analysis. This week covers some of the more advanced graphing systems available in R: the Lattice system and the ggplot2 system. While the base graphics system provides many important tools for visualizing data, it was part of the original R system and lacks many features that may be desirable in a plotting system, particularly when visualizing high-dimensional data. The Lattice and ggplot2 systems also simplify plot layout, making it a much less tedious process.
What's included
7 videos, 1 reading, 1 assignment, 5 programming assignments
Welcome to Week 3 of Exploratory Data Analysis. This week covers some of the workhorse statistical methods for exploratory analysis. These methods include clustering and dimension reduction techniques that allow you to make graphical displays of very high-dimensional data (many, many variables). We also cover novel ways to specify colors in R so that you can use color as an important and useful dimension when making data graphics. All of this material is covered in chapters 9-12 of my book Exploratory Data Analysis with R.
What's included
12 videos, 1 reading, 4 programming assignments
This week, we'll look at two case studies in exploratory data analysis. The first involves the use of cluster analysis techniques, and the second is a more involved analysis of some air pollution data. How one goes about doing EDA is often personal, but I'm providing these videos to give you a sense of how you might proceed with a specific type of dataset.
What's included
2 videos, 2 readings, 1 programming assignment, 1 peer review




Learner reviews
6,085 reviews
- 5 stars: 74.33%
- 4 stars: 21.09%
- 3 stars: 3.38%
- 2 stars: 0.73%
- 1 star: 0.44%
Reviewed on May 23, 2019
Amazing! Learning so much about how to explore the data for the first time. This is a must do for anyone who wants to be a data scientist. Now I can use ggplot without any trouble. Thanks!
Reviewed on Oct 17, 2018
It seems this type of course in an online learning MOOC would be better if it were more direct, hands-on "how to" and less focused on explanatory fluff (academic style).
Reviewed on Jan 11, 2017
I did learn more about putting together a set of graphs that help to explore the data. I did see how subsetting and aggregating data helps to give a better understanding of the data.
¹ Some assignments in this course are AI-graded. For these assignments, your data will be used in accordance with Coursera's Privacy Notice.

