This course focuses on the concepts and tools behind reporting modern data analyses in a reproducible manner. Reproducible research is the idea that data analyses, and more generally scientific claims, are published with their data and software code so that others may verify the findings and build upon them. The need for reproducibility is increasing dramatically as data analyses become more complex, involving larger datasets and more sophisticated computations. Reproducibility lets people focus on the actual content of a data analysis rather than on superficial details reported in a written summary. In addition, reproducibility makes an analysis more useful to others because the data and code used to conduct the analysis are available. This course will focus on literate statistical analysis tools, which let you publish a data analysis in a single document that others can easily execute to obtain the same results.



Reproducible Research
This course is part of multiple programs.



Instructors: Roger D. Peng, PhD
107,202 already enrolled
(4,179 reviews)
What you'll learn
- Organize data analysis to help make it more reproducible 
- Write up a reproducible data analysis using knitr 
- Determine the reproducibility of an analysis project
- Publish reproducible web documents using Markdown 
Details to know

2 assignments

There are 4 modules in this course
This week will cover the basic ideas of reproducible research since they may be unfamiliar to some of you. We also cover structuring and organizing a data analysis to help make it more reproducible. I recommend that you watch the videos in the order that they are listed on the web page, but watching the videos out of order isn't going to ruin the story.
What's included
9 videos, 4 readings, 1 assignment
This week we cover some of the core tools for developing reproducible documents. We cover the literate programming tool knitr and show how to integrate it with Markdown to publish reproducible web documents. We also introduce the first peer assessment which will require you to write up a reproducible data analysis using knitr.
What's included
9 videos, 1 assignment, 1 peer review
This week covers what one could call a basic checklist for ensuring that a data analysis is reproducible. Following the checklist is not by itself sufficient, but it provides a necessary minimum standard that applies to almost any area of analysis.
What's included
10 videos
This week presents two case studies for you to watch on the importance of reproducibility in science.
What's included
5 videos, 1 reading, 1 peer review




Learner reviews
- 5 stars: 68.67%
- 4 stars: 22.94%
- 3 stars: 5.67%
- 2 stars: 1.65%
- 1 star: 1.05%
Reviewed on Apr 29, 2020
Great topic which is discussed well with a good case study. I'd like to see more up-to-date content and more detailed analytical techniques. However, it's a nice introduction!
Reviewed on Aug 9, 2019
Without taking this course, I wouldn't have fully understood the importance of reproducible research in data science. Thank you so much. I recommend this course for all data scientists.
Reviewed on Mar 30, 2022
I took this course as part of the Data Science specialization without any real expectation and realized that this subject is probably one of the most important in data analysis.


