
Learner Reviews & Feedback for Introduction to Probability and Data by Duke University

3,776 ratings
869 reviews

About the Course

This course introduces you to sampling and exploring data, as well as basic probability theory and Bayes' rule. You will examine various types of sampling methods, and discuss how such methods can impact the scope of inference. A variety of exploratory data analysis techniques will be covered, including numeric summary statistics and basic data visualization. You will be guided through installing and using R and RStudio (free statistical software), and will use this software for lab exercises and a final project. The concepts and techniques in this course will serve as building blocks for the inference and modeling courses in the Specialization....

Top reviews


Jan 24, 2018

This course taught me a lot. The concepts were beautifully explained, and the way it was delivered, the exercises, and the difficulty of the problems made it more challenging and enjoyable.


Mar 31, 2018

The tutor makes it really simple. The given examples really helped me understand the concepts and apply them to a wide range of problems. Thank you for this. Wish I could complete the assignments too.


776 - 800 of 851 Reviews for Introduction to Probability and Data

By Daud M M

Nov 14, 2017

Excellent course for beginners

By Mulenga M

Aug 19, 2016

It was a well taught course

By clement l

Apr 20, 2017

Nice intro to stats and R.

By heleny2

Aug 20, 2018

Need more courses about R

By Shashank M

Apr 16, 2018

5 for Stats and 3 for R.

By George G R

May 06, 2017

The classes are good.

By Hennie d n

Aug 15, 2016

great course so far!!

By Shikhar M

Feb 08, 2018

Comprehensive Course

By Marildo G F

Jul 26, 2017

Excellent course!!!

By V S S

Jul 29, 2019

Great explanation

By Emmanuel k S

Feb 28, 2019

very interesting

By Indrani S

May 25, 2020

very helpful

By 김인수

Jun 25, 2019

good lecture

By Md M H

Nov 13, 2018


By Sanjeev A

Jul 02, 2018

Very Good

By jaime p

Mar 15, 2019


By Zhai H

Oct 10, 2017


By 徐天宇

Nov 15, 2018


By FangXinyi

Jul 23, 2018


By Subhadra M

May 30, 2017


By Marcin W

Apr 29, 2017


By Philippe R

Sep 05, 2016

Very mixed feelings about this course.

Generally speaking, the course lectures are informative and well organized. The mentors are really of great help and are honestly doing a great job: they are very active, they give good insights, and they know the subject matter.

But in the course lectures, there are occasions where concepts are used which were not formally introduced before their actual use.

One example: in the lectures on probability, the first "slide" talks about random processes, outcomes of random processes, and so on. On the next slide, the notion of the probability of an event is introduced, but the very notion of "event" was never introduced. It is introduced in the accompanying book, but if the book chapters should be read PRIOR to watching the course videos, that fact should be made clear.

Further in the course on probability, some words are used "interchangeably" without the context making it clear why they can be used interchangeably. For instance, on some occasions, the concept of independent events is used, but then, later on, the discussion talks of independent processes. Which is which??? Is there a difference? If so, what is it? When do I need to use independent events as opposed to independent processes?

The graded assignments are of varying quality. The most disturbing thing about them is that, on some occasions, concepts are used in the quiz questions (either directly in the questions and answer choices, or indirectly in the "correction" for the quiz after you have submitted it) that were never touched upon in the course.

I have had two occasions of concepts not introduced in the course but used in the graded assignments.

The first occurrence of a gap between course content and quiz questions was on a quiz question about inference. I failed the question, and understood why I failed based on the course content literally minutes after failing it (and one mentor actually rightly corrected me). But the question "correction" (the explanation text you receive after submitting, as justification for what the correct answer is) referred to the concept of "two-sided hypothesis test". Where did THAT come from?? I checked and rechecked the course videos, no mention at all of it. I checked the accompanying book, and the first mention of two-sided hypothesis test is way way way further in the book, in a chapter that focuses entirely on inference.

The second occurrence was in week 4. The course lectures cover two distributions: normal and binomial. The recommended reading in the book also focuses on these two distributions (the recommended reading actually skips the section on the geometric distribution, if I remember correctly). But in one of the quiz questions, one of the possible answers referred to the geometric distribution. If we are supposed to know and understand geometric distributions, then the course content should cover the subject. Or at the very least, the course lectures should state clearly that learners are advised to read about it in the accompanying book.

The guidelines for the project assignment (week 5) are not all that clear as to what is expected from the learners. Sure, there are instructions on where to find the info, what structure should be followed, and so on. There is also a very nice "example" project (designed by one of the mentors), which provides a lot of useful info (such as how to filter missing values from variables).

But there is no real hint as to the depth of analysis we are expected to complete. This is definitely a source of confusion, not only for me, but also for a few other learners, from what I gathered in the discussion forums. The result is that the projects you get to review are of very disparate levels. Some end up calculating one figure per research question without any attempt at deriving trends or patterns, others do not include any plots at all, and so on. The thing is that the peer review criteria do not really provide a good basis to ensure that learners did indeed assimilate the course contents. Most of the questions in the peer review assignment have a lot more to do with following a template and not so much with the course substance itself.

For instance, some of the peer review criteria have to do with the narratives for computed statistics and plots. The criteria are: "Is each plot/R output followed by a narrative", "Does the narrative correctly interpret the plots, or statistics", "Does the narrative address the research question". But when the research question is of the type "What is the IQR for income per state", for instance, the narrative can be very short: "IQR per state shows that the state with the highest variability of income is...". So, the narrative meets the 3 evaluation criteria: there is a narrative, it does address the research question, and it does correctly interpret the statistics. But it is not particularly useful.

I do understand that Internet-based peer review is challenging, and that you have to settle for "neutral" criteria that are easy for learners to assess. But the peer review grading "grid" as it currently stands is not "that" helpful in assessing whether the course content has been assimilated.

To conclude, when I took the course, my initial plan was to follow the entire specialization. But after having completed the first course of the specialization, I have radically changed my mind, and will look for alternatives "elsewhere" to get the knowledge/skillset that I am after.

By Casey S

Nov 12, 2017

To me, this course had some very clear but unstated limitations, along with pros and cons:

- The lectures are fantastic and have a good sequence for beginners

- The course is very holistic in its approach, meaning that it covers theory and application very broadly and gives you a good sense of how different aspects of the field of statistics relate to each other

- The coverage of the R programming language is insufficient for what the final assignment requires; I can't stress this enough for beginners. I highly suggest you take a foundational course in R that highlights the syntactical structure of the language before taking this course

- The labs are great for learning the primary components of R, but they don't give you real practice coding. There is little to no explanation of certain functions in R and there are no videos on it. At the end of this course I do not feel I have a very good understanding of the structure of the R language. I do, however, feel I was assessed as if I should have.

- I felt the quizzes were appropriately rigorous for a beginner such as myself.

The most important bottom line is: if you are a true beginner like me, I urge you to first take a course more targeted to R before starting this specialization. Otherwise, I think you will feel very overwhelmed at the end, as I did.

By Jeremy L

Jul 06, 2018

The course is divided into 5 sections, each of which you have a week to complete (if you want a certificate). The first 4 sections/weeks are well designed and involve a mixture of lectures (most were good), reading assignments in a textbook (free online access), practice problems, and a weekly quiz. Along the way students learn how to use R through a handful of walk-through examples. In general this works. That said, the last two R assignments are a mess. For the 4th week, the instructors put together a demonstration for using R to ask and answer some basic research questions. The document they put together for this demonstration, however, is so full of typos and grammar mistakes, and worse, heaps of nearly incomprehensible sentences and phrasings, that it is almost worthless. It was really painful to get through it. The final R task is to work with a real-world data set, ask a few research questions, and use R to do some basic statistical analysis of the data. Working with a real-world data set is great. That said, I felt as if the instructors were asking students to do far more with R and statistics than we had learned in the class. I saw many similar opinions about this assignment online. And in grading my peers, I noticed that other students didn't know how to complete the project either.

By Nayyer I

Apr 28, 2020

The course is great in terms of building foundational concepts for data analysis, and the lab assignments were OK. I think the time listed for each class is often a slight underestimate of the time commitment. The course offered me a lot of new concepts to learn and was a good refresher for many others. But the final project was a huge disappointment for me. The students are given a huge dataset that leaves them struggling to figure out where to start. In order to understand the data, students have to go to various links to see what the data is, how it was collected, and the definition of each variable. Then there are more than 300 variables, and you need to pick a few to do something you think is interesting. Finally, the project needs a good level of expertise in R, and the course does not teach you that at all. I would suggest that future courses reduce the number of variables, depending on what most students have been using; draft a quick summary and report about the key information on the data and share that on the course page rather than links to web pages; and finally, let students use whatever software they find interesting. Not everyone is skilled in R.