About this Course
4.5
2,665 ratings
398 reviews
This course focuses on the concepts and tools behind reporting modern data analyses in a reproducible manner. Reproducible research is the idea that data analyses, and more generally, scientific claims, are published with their data and software code so that others may verify the findings and build upon them. The need for reproducibility is increasing dramatically as data analyses become more complex, involving larger datasets and more sophisticated computations. Reproducibility allows people to focus on the actual content of a data analysis, rather than on superficial details reported in a written summary. In addition, reproducibility makes an analysis more useful to others because the data and code that actually conducted the analysis are available. This course will focus on literate statistical analysis tools, which allow one to publish data analyses in a single document that others can easily execute to obtain the same results....

Course 5 of 10 in the


100% online courses

Start instantly and learn on your own schedule.

Flexible deadlines

Reset deadlines in accordance with your schedule.

Suggested: 4-9 hours/week

Approx. 10 hours to complete

English

Subtitles: English

What you will learn

  • Determine the reproducibility of an analysis project
  • Organize a data analysis to help make it more reproducible
  • Publish reproducible web documents using Markdown
  • Write up a reproducible data analysis using knitr

Skills you will gain

Knitr, R Programming, Data Analysis, Markup Language

Syllabus - What you will learn from this course

1

Section
2 hours to complete

Week 1: Concepts, Ideas, & Structure

This week will cover the basic ideas of reproducible research since they may be unfamiliar to some of you. We also cover structuring and organizing a data analysis to help make it more reproducible. I recommend that you watch the videos in the order that they are listed on the web page, but watching the videos out of order isn't going to ruin the story. ...
9 videos (Total 72 min), 3 readings, 1 quiz
9 videos
  • What is Reproducible Research About? (8 min)
  • Reproducible Research: Concepts and Ideas (part 1) (7 min)
  • Reproducible Research: Concepts and Ideas (part 2) (5 min)
  • Reproducible Research: Concepts and Ideas (part 3) (3 min)
  • Scripting Your Analysis (4 min)
  • Structure of a Data Analysis (part 1) (12 min)
  • Structure of a Data Analysis (part 2) (17 min)
  • Organizing Your Analysis (11 min)
3 readings
  • Syllabus (10 min)
  • Pre-course survey (10 min)
  • Course Book: Report Writing for Data Science in R (10 min)
1 practice exercise
  • Week 1 Quiz (20 min)

2

Section
3 hours to complete

Week 2: Markdown & knitr

This week we cover some of the core tools for developing reproducible documents. We cover the literate programming tool knitr and show how to integrate it with Markdown to publish reproducible web documents. We also introduce the first peer assessment which will require you to write up a reproducible data analysis using knitr. ...
9 videos (Total 59 min), 2 quizzes
9 videos
  • Markdown (5 min)
  • R Markdown (6 min)
  • R Markdown Demonstration (7 min)
  • knitr (part 1) (7 min)
  • knitr (part 2) (4 min)
  • knitr (part 3) (4 min)
  • knitr (part 4) (9 min)
  • Introduction to Course Project 1 (4 min)
1 practice exercise
  • Week 2 Quiz (10 min)

3

Section
1 hour to complete

Week 3: Reproducible Research Checklist & Evidence-based Data Analysis

This week covers what one could call a basic checklist for ensuring that a data analysis is reproducible. While following the checklist is not absolutely sufficient, it provides a necessary minimum standard that would be applicable to almost any area of analysis....
10 videos (Total 60 min)
10 videos
  • RPubs (3 min)
  • Reproducible Research Checklist (part 1) (8 min)
  • Reproducible Research Checklist (part 2) (10 min)
  • Reproducible Research Checklist (part 3) (6 min)
  • Evidence-based Data Analysis (part 1) (3 min)
  • Evidence-based Data Analysis (part 2) (3 min)
  • Evidence-based Data Analysis (part 3) (4 min)
  • Evidence-based Data Analysis (part 4) (4 min)
  • Evidence-based Data Analysis (part 5) (7 min)

4

Section
3 hours to complete

Week 4: Case Studies & Commentaries

This week there are two case studies for you to watch on the importance of reproducibility in science....
5 videos (Total 59 min), 1 reading, 1 quiz
5 videos
  • Case Study: Air Pollution (14 min)
  • Case Study: High Throughput Biology (30 min)
  • Commentaries on Data Analysis (2 min)
  • Introduction to Peer Assessment 2 (2 min)
1 reading
  • Post-Course Survey (10 min)

35%

started a new career after completing these courses

83%

got a tangible career benefit from this course

Top Reviews

By AA, Feb 13th 2016

My favorite course, at least it gives me an argument why scripted statistics is awesome and can be applied to a number of data related activities. Recycling chunks of code has proven useful to me.

By AS, Jun 23rd 2017

Of course, I liked this course. There was even an extra non-graded assignment. Plus two graded assignments. Quality instruction videos and lots of practice. Everything a learner needs.

Instructors

Roger D. Peng, PhD

Associate Professor, Biostatistics
Bloomberg School of Public Health

Jeff Leek, PhD

Associate Professor, Biostatistics
Bloomberg School of Public Health

Brian Caffo, PhD

Professor, Biostatistics
Bloomberg School of Public Health

About Johns Hopkins University

The mission of The Johns Hopkins University is to educate its students and cultivate their capacity for life-long learning, to foster independent and original research, and to bring the benefits of discovery to the world....

About the Data Science Specialization

Ask the right questions, manipulate data sets, and create visualizations to communicate results. This Specialization covers the concepts and tools you'll need throughout the entire data science pipeline, from asking the right kinds of questions to making inferences and publishing results. In the final Capstone Project, you’ll apply the skills learned by building a data product using real-world data. At completion, students will have a portfolio demonstrating their mastery of the material....

Frequently Asked Questions

  • Once you enroll for a Certificate, you’ll have access to all videos, quizzes, and programming assignments (if applicable). Peer review assignments can only be submitted and reviewed once your session has begun. If you choose to explore the course without purchasing, you may not be able to access certain assignments.

  • When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.

More questions? Visit the Learner Help Center.