Probability sampling minimizes the selection threat to external validity. Before I discuss the different types of probability sampling, let's consider its essential feature, and how this feature helps to minimize the risk of systematic bias in our selection of participants. The essential feature of probability sampling is that, for each element in the sampling frame, the probability of being included in the sample is known and non-zero. In other words, some form of random selection is required, where any element could, in principle, end up in the sample. To use probability sampling, we need a sampling frame: a list of all elements in the population that can be accessed or contacted. A sampling frame is necessary to determine each element's probability of being selected.
Now, let's see why probability sampling minimizes the threat of systematic bias in our selection of participants. Reducing systematic bias means reducing the risk of over- or underrepresentation of any population subgroup with a systematically higher or lower value on the property we want to measure. If such a subgroup is over- or underrepresented, our sample value is unlikely to represent the population value accurately. We've already seen a method to eliminate systematic bias in participant characteristics. Remember how we eliminated the selection threat to internal validity? We used random assignment to get rid of systematic differences between the experimental and control conditions. In the long run, any specific participant characteristic will be divided equally over the two groups. This means that any characteristic associated with a systematically higher or lower score on the dependent variable cannot bias the results, in the long run. The same principle can be applied, not in the assignment, but in the selection of participants. We avoid systematic differences between the sample and the population by randomly selecting elements from the population.
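To make the essential feature concrete, here is a minimal Python sketch of random selection from a sampling frame. It is an illustration only, not something from the lecture: the ten-element frame, the sample size of three, and the function names `draw_sample` and `inclusion_rates` are all assumptions. It uses simple random sampling, where every element's inclusion probability is known in advance (sample size divided by frame size, so 3/10 here) and is non-zero.

```python
import random
from collections import Counter


def draw_sample(frame, n, rng):
    """Draw a simple random sample of n elements, without replacement."""
    return rng.sample(frame, n)


def inclusion_rates(frame, n, repeats=20_000, seed=0):
    """Estimate how often each frame element ends up in a sample of size n."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    counts = Counter()
    for _ in range(repeats):
        counts.update(draw_sample(frame, n, rng))
    return {element: counts[element] / repeats for element in frame}


# Hypothetical sampling frame: a list of every element we can contact.
frame = [f"person_{i}" for i in range(10)]
known_probability = 3 / len(frame)  # inclusion probability is known up front

rates = inclusion_rates(frame, n=3)
for element, rate in rates.items():
    print(f"{element}: observed {rate:.3f}, expected {known_probability:.3f}")
```

Over many repeated draws, each element's observed inclusion rate should settle close to the known probability, and no element is left out systematically.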
In the long run, any specific participant characteristic will be represented in the sample in proportion to its presence in the population. We call this a representative sample. Suppose a population consists of 80% women. With repeated random sampling, we can expect the samples to contain 80% women on average, in the long run. Each individual element has the same probability of being selected, and since there are more women, female elements will be selected more often.
Besides resulting in a representative sample in the long run, probability sampling has another advantage: it allows us to assess the accuracy of our sample estimate. Probability sampling allows us to determine that, with repeated sampling, in a certain percentage of the samples, the sample value will differ from the real population value by no more than a certain margin of error. This sounds complicated, and it is. But it basically means that we can judge how accurate our sample estimate is, in the long run. Given a certain risk of getting it wrong, we can assess what the margin of error is, meaning the most by which the sample value and the population value will differ, except in that small share of samples where we get it wrong. Consider an election between Conservative candidate A and Democratic candidate B. We want to estimate the proportion of people in the population that will vote for candidate A, as accurately as possible. Random sampling allows us to make statements such as this: if we were to sample voters repeatedly, then in 90% of the samples, the true population proportion of votes for A would lie within 8 percentage points of our sample estimate. So, if we find that 60% of our sample indicates that they will vote for A, then we can say we are fairly confident that the true proportion lies somewhere between 52% and 68%. This interval is called a confidence interval. Of course, this particular interval could be wrong, because in 10% of the samples, the sample value will lie farther than 8 percentage points from the true value. 
This could be one of those samples, so we can never be certain.
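The election example can be checked with a small simulation. The sketch below is an illustration under stated assumptions, not the lecture's own method: it assumes a true proportion of 60% for candidate A, a sample size of 100 (a size for which the usual normal-approximation formula gives a margin of roughly 8 percentage points at 90% confidence), and hypothetical function names `margin_of_error` and `coverage`.

```python
import random
from statistics import NormalDist


def margin_of_error(p_hat, n, confidence=0.90):
    """Normal-approximation margin of error for a sample proportion."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.645 for 90%
    return z * (p_hat * (1 - p_hat) / n) ** 0.5


def coverage(true_p=0.60, n=100, confidence=0.90, repeats=5_000, seed=1):
    """Share of repeated random samples whose interval contains true_p."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    hits = 0
    for _ in range(repeats):
        # Each sampled voter supports A with probability true_p (assumed).
        votes_for_a = sum(rng.random() < true_p for _ in range(n))
        p_hat = votes_for_a / n
        if abs(p_hat - true_p) <= margin_of_error(p_hat, n, confidence):
            hits += 1
    return hits / repeats


moe = margin_of_error(0.60, 100)
cov = coverage()
print(f"margin of error for 60% in a sample of 100: {moe:.3f}")
print(f"share of samples whose interval contains the true value: {cov:.3f}")
```

The share of intervals that contain the true value should come out close to 0.9, which is the long-run sense in which we are "90% confident". Any single interval, like the one from 52% to 68%, either contains the true value or it does not.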