0:03

So before we begin discussing probability, we need some very basic mathematics.

Now, everyone listening to this lecture will have encountered set notation at some point, whether from a very basic or a more advanced mathematical perspective. In probability, set notation follows the same rules, of course, since it is just ordinary set notation. However, the interpretations of the notation are slightly different.

Usually when you talk about set notation, you talk about some universal space that contains everything. In statistics, we call this the sample space, which we usually denote with an uppercase omega, Ω, and it is the collection of all possible outcomes of an experiment.

As a simple example, let's conduct an experiment: we roll a die, so the possible outcomes are one, two, three, four, five, or six. Here we're not going to play the sort of mental games where the die could land on an edge or a corner or something like that; it has to land on one of the numbers, so the sample space is the integers from one to six, Ω = {1, 2, 3, 4, 5, 6}.

An event is any subset of the sample space. So, for example, you could have the event E that the die rolls even, that is, E = {2, 4, 6}.
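Since events are just sets, the die-roll example can be sketched directly with Python sets (a minimal illustration; the names `sample_space` and `even` are my own, not from the lecture):

```python
# The sample space Omega for one roll of a die, and the event "the roll is even".
sample_space = {1, 2, 3, 4, 5, 6}  # Omega: all possible outcomes
even = {2, 4, 6}                   # E: the die roll is even

# An event is any subset of the sample space.
print(even.issubset(sample_space))  # True
```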

Certain kinds of events are so commonly talked about that we give them separate names. An elementary or simple event is a particular result of an experiment. So, for example, if the die roll is a four, we usually denote this with a lowercase omega: ω = 4.

Here, we don't tend to split hairs about the difference between the actual number four and the set containing the element four. In the traditional definition, a simple event is the element 4 itself, not the set {4}, but for our purposes that distinction won't be necessary. It's also always useful to be able to talk about nothing: the null event, or empty set, is the event that nothing occurs, and it's usually denoted ∅.

Again, the sets in probability theory follow all the same rules as ordinary set notation, of course, because it is exactly ordinary set notation, just with different interpretations. So when we say that an elementary event ω is an element of an event E, written ω ∈ E, that implies that E occurs when ω occurs. For example, looking back at the previous slide: if our elementary event is that the die roll is a four and the event is that the roll is even, then if you roll a four, the roll is even. If the elementary event is not in an event, ω ∉ E, that implies that E does not occur when ω occurs. For example, if the elementary event is a five, five not being in the set of even numbers means that when you roll a five, you have not rolled an even number.
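This elementhood interpretation is just a set-membership test; a small sketch (variable names are mine):

```python
even = {2, 4, 6}   # E: the event that the die roll is even

# "omega in E" means the event E occurs when the outcome omega occurs.
print(4 in even)   # True: rolling a four means the roll is even
print(5 in even)   # False: rolling a five means the roll is not even
```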

We can follow this logic further. E being a subset of F, written E ⊂ F, implies that the occurrence of E implies the occurrence of F. For example, let's take E as the event that the die roll is even, E = {2, 4, 6}, and F as the event that the die roll is either even or a five, so F = {2, 4, 5, 6}. Then the occurrence of E implies the occurrence of F. That is, if you roll an even die roll, you have also rolled an element of the set of even die rolls plus five.
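The subset relationship, and the implication it carries, can be checked directly with Python's subset operator (a sketch with my own variable names):

```python
E = {2, 4, 6}      # the die roll is even
F = {2, 4, 5, 6}   # the die roll is even or a five

# E <= F tests whether E is a subset of F:
# every outcome that makes E occur also makes F occur.
print(E <= F)                   # True
print(all(w in F for w in E))   # True: each even roll is also in F
```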

The standard set intersection, E ∩ F, is the event that both E and F occur. To give you a specific example, imagine E is the event that the die roll is even and F is the event that the die roll is a prime number. The prime numbers on a die are two, three, and five, so F = {2, 3, 5}. Then E ∩ F means the die roll is both even and prime, which is just the number two. So the event E ∩ F occurring means that you get a number that is both even and prime, which of course, in this case, means you get a two.

E ∪ F is the standard set notation for union, but in the probabilistic interpretation it means that at least one of E or F occurs. So in my previous example, it means I get either an even number or a prime number, or both in the case of a two.
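Python's set operators `&` and `|` compute the intersection and union directly; here is a sketch of the even/prime example (names are mine):

```python
E = {2, 4, 6}   # the die roll is even
F = {2, 3, 5}   # the die roll is prime

print(E & F)    # {2}: the roll is both even and prime
print(E | F)    # the roll is even or prime (or both): {2, 3, 4, 5, 6}
```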

If E ∩ F is the null set, that means that E and F cannot simultaneously occur. So imagine E is the set of even numbers and F is the set of odd numbers; you cannot roll a die that is both even and odd, so E ∩ F = ∅. That situation is important enough that we give it its own name, shown in bold here: mutually exclusive. Saying that two events are mutually exclusive means that they cannot both occur. You frequently hear people use the phrase mutually exclusive incorrectly; what it technically means is that two things cannot both simultaneously occur.
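Mutual exclusivity is just an empty intersection, which Python can test with `isdisjoint`; a sketch (names are mine):

```python
E = {2, 4, 6}   # even rolls
F = {1, 3, 5}   # odd rolls

# The intersection is the null set, so E and F are mutually exclusive.
print(E & F)            # set(): the empty set
print(E.isdisjoint(F))  # True
```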

5:42

And then the complement of an event, written E^c or sometimes as E with a little bar on top, is the event that E did not occur. So in our case, where E is the even numbers {2, 4, 6}, E^c is the odd numbers {1, 3, 5}. Since something and its opposite cannot simultaneously occur, their intersection is always the null set. So E and E^c are always mutually exclusive.
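The complement is just set difference from the sample space; a sketch (names are mine):

```python
sample_space = {1, 2, 3, 4, 5, 6}  # Omega
E = {2, 4, 6}                      # even rolls
E_c = sample_space - E             # E complement: the event that E did not occur

print(E_c)                # {1, 3, 5}: the odd rolls
print(E.isdisjoint(E_c))  # True: E and its complement are always mutually exclusive
```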

There are some standard set theory facts that we should also remind you of, including the famous De Morgan's laws. The first is (A ∩ B)^c = A^c ∪ B^c, and the way to think about it is that the little complement symbol distributes itself across the parentheses onto A and B, giving A^c and B^c, and it flips the cap (∩) into a cup (∪). What's nice is that in the second De Morgan's law, (A ∪ B)^c = A^c ∩ B^c, the same thing happens: the complement distributes itself across the parentheses, so you get A^c and B^c, and in this case the cup turns into a cap. So De Morgan's laws basically say that if you complement across either an intersection or a union, the complement distributes itself, but it flips everything, all the cups and caps.
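Both De Morgan's laws can be verified by brute force over every pair of events on the die's sample space; a sketch, assuming complements are taken within Ω = {1, ..., 6} (all names are mine):

```python
from itertools import chain, combinations

sample_space = {1, 2, 3, 4, 5, 6}

def all_events(omega):
    """Every subset of the sample space, i.e. every possible event."""
    items = list(omega)
    return [set(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

def comp(event):
    """Complement of an event within the sample space."""
    return sample_space - event

# Check (A n B)^c == A^c u B^c and (A u B)^c == A^c n B^c for all pairs.
events = all_events(sample_space)
for A in events:
    for B in events:
        assert comp(A & B) == comp(A) | comp(B)
        assert comp(A | B) == comp(A) & comp(B)
print("De Morgan's laws hold for every pair of events on a die roll")
```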

So, I struggled to come up with a verbal example of De Morgan's laws, and here's the best I could do. Let A be the event that you're an alligator and B be the event that you're a turtle, so A ∪ B is the event that you are either an alligator or a turtle.

7:30

And then complementing that, (A ∪ B)^c means: if an alligator or a turtle you are not. De Morgan's law says that this equals A^c ∩ B^c, where A^c is that you are not an alligator and B^c is that you are not a turtle. So the set theory translated into English would be: if an alligator or a turtle you are not, then you are not an alligator and you are also not a turtle. I think everyone would agree that those two sentences say the same thing.

Another example, this time for the other De Morgan's law: let A be the event that your car is hybrid and B the event that your car is diesel, and complement their intersection. If your car is not both hybrid and diesel, then your car is either not hybrid or not diesel; that is, (A ∩ B)^c = A^c ∪ B^c.

Some other small facts that I'm sure you remember from set theory: (A^c)^c = A, so if you do not not get an even number, you get an even number. Also, (A ∪ B) ∩ C = (A ∩ C) ∪ (B ∩ C). The way to remember this is to think of the union as a sort of plus and the intersection as multiplication; then this rule looks exactly like the distributive property. C gets multiplied by A, C gets multiplied by B, and it distributes across the plus signs, the unions. That's how you can remember that one.
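Both small facts, the double complement and the distributive rule, can be checked on concrete events; a sketch (names are mine):

```python
sample_space = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # even rolls
B = {5}         # a five
C = {1, 2, 3}   # rolls of three or less

# Double complement: (A^c)^c == A.
print(sample_space - (sample_space - A) == A)   # True

# Distributive rule: (A u B) n C == (A n C) u (B n C).
print(((A | B) & C) == ((A & C) | (B & C)))     # True
```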

So that gives you a very basic Rosetta Stone, taking ordinary set notation and connecting it to how we think about it in probability. Next, we're going to actually use the set notation to develop probability.

This is a very brief section, and in this discussion we're just going to talk about probability at a very conceptual level. In the next section, we'll talk about probability at its mathematical foundation, but I wanted to spend a minute talking about where we're going with probability as a modeling tool to analyze data. Here's a strategy that underlies much of science, and the idea is this.

For a given experiment, attribute everything you know to a systematic model. Good examples of this are things like lines, planes, and hyperplanes, where people presume that an outcome, say something like hypertension, depends on a collection of predictors in a linear fashion. That relationship is either known, or theorized, or assumed for the sake of convenience, but it relates known predictors to the known outcome. Then attribute everything else to randomness.

Now, this is a very difficult pill to swallow, I think, for many people, because in nearly all applications of probability, what the word random means is very difficult to pin down. As an example, earlier in the lecture we talked about retrospectively sampling hospital records. In that case, if you were to model the outcome of whether or not a person had a disease as predicted by their history, where we perform some form of retrospective sampling, it's not exactly clear where the randomness is coming from, or even what randomness means in this context.

12:11

And as you can imagine from this discussion, all three of these first bullet points come with quite a bit of baggage in terms of assumptions and things that you cannot evaluate at all. So what we'd like to do is check how sensitive our conclusions are to the assumptions in these models. In some cases, we can actually directly verify them: we can check whether the relationship between the response and the predictors looks kind of like a line, so we're okay modeling it as a line. In other cases, they involve assumptions that we can't possibly check; they involve variables that we did not collect, or variables that we do not even know about. In that case, we have to evaluate our sensitivity to the model in terms of unknowns; we have to evaluate how robust our approach is to the unknowns. This comes from the study of how the data was collected, how the statistics were used, and what exactly the probability is actually modeling.

In what follows, we're going to cover the mathematics of probability, but hopefully also touch on these subjects. Now, I want to emphasize that these are very, very difficult topics that many people struggle with if thought about with sufficient depth, and what we hope to do in this class is mostly get you started thinking about them. If you do just one thing when thinking about probability in the data you're analyzing, let it be this: when you say you have a 95% confidence interval, or your p-value is such-and-such, or anything else where you actually use probability in your data analysis, go through the exercise of trying to think about what it is that you're modeling as random. What are the sources of this randomness, and how good a job do you think your probability statements do at characterizing it?

This is the end of Mathematical Biostatistics Boot Camp lecture one. In this lecture, we covered basic conceptual ideas, and next lecture we're going to cover much of the basic mathematics that underlies probability. So make sure you have plenty of coffee to get ready.

[MUSIC]
