We can describe a random variable directly from its definition: for each outcome, we specify which value the random variable takes on that outcome. However, this can be too cumbersome, and there is a more compact way to describe a random variable. To do so, we have to introduce the notion of a probability distribution. Let us return to the example where X is the number of heads in two tosses of a coin. We can ask questions such as: what is the probability that X equals zero? We can answer all such questions using this table. We know that in our case, when the coin is fair, the probability of each outcome is one fourth. If we ask what the probability is that X equals zero, we are interested in the probability of the event that satisfies this condition. The only outcome that belongs to this event is this one, tail tail. So we can write the probability that X equals zero. This is shorthand notation: it denotes the probability of the event that consists of all outcomes for which this condition holds. So we just have to look at this table and select all rows in which X takes the given value. The only such row is this one, with this outcome. So this is the probability of tail tail, which equals one fourth. In the same way, we can ask: what is the probability that X equals one? What do you think the answer is? Let us find it. The probability that X equals one is the probability of the event that consists of the outcomes for which this condition holds, so of these two outcomes. So we have the probability of the event head tail, tail head. The probability of this event equals two fourths, that is, one half, because it contains two equally probable outcomes. In the same way, we can answer the question: what is the probability that X equals two? As you can easily check, this probability equals one fourth. Now we can write down the following table.
In the first row, we write all possible values that our random variable X can take. There are exactly three: zero, one, and two. In the second row, we write the probability of the corresponding event, that is, the probability that our random variable takes the corresponding value. So here we write one fourth; this is this one fourth. Here we write one half; this is this one half. And here we write one fourth; this is this one fourth. Now we have a table that describes the same random variable as the one here, but in a different way. As you can see, we can use this table to answer probabilistic questions about X. For example, what is the probability that X is greater than or equal to one? Let us find it using this table only. Again, this is shorthand notation for the probability of the event that consists of all outcomes satisfying this condition. We could use the table of outcomes to find this probability, but instead we will use the new table. If X is greater than or equal to one, we see from this table that X can be either one or two; no other options are available. So we can write that this probability equals the probability of the following event: X equals one or X equals two. These two events are disjoint, because if X equals one, it cannot equal two. It means that this probability can be written as a sum of two probabilities, and both of them are written in this table. This term equals one half and this term equals one fourth, so the sum equals three fourths. So we see that we can use this table to answer any probabilistic question about the random variable X. This table is called the distribution of X. Now let us define the general notion of a probability distribution. Consider a random variable X that takes on a finite number of possible values. We'll denote these values by a small letter x with indices: x1, x2, and so on, xn.
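As an aside, the distribution table for the two-toss example and the calculation of the probability that X is at least one can be sketched in code. This is a minimal Python sketch, assuming a fair coin so that each of the four outcomes has probability one fourth. The names `outcomes`, `X`, `distribution`, and `prob` are illustrative and do not come from the lecture.

```python
from fractions import Fraction
from itertools import product

# All outcomes of two tosses: HH, HT, TH, TT.
# Assumption: the coin is fair, so each outcome has probability 1/4.
outcomes = list(product("HT", repeat=2))
p_outcome = Fraction(1, len(outcomes))


def X(outcome):
    """Random variable: the number of heads in the outcome."""
    return outcome.count("H")


# Build the distribution table: for each value of X, add up the
# probabilities of the outcomes on which X takes that value.
distribution = {}
for outcome in outcomes:
    value = X(outcome)
    distribution[value] = distribution.get(value, 0) + p_outcome


def prob(condition):
    """Probability that X satisfies `condition`, using only the table.

    The events X = x for different x are disjoint, so their
    probabilities can simply be added.
    """
    return sum(p for value, p in distribution.items() if condition(value))


print(distribution)
print(prob(lambda x: x >= 1))
```

Here `prob(lambda x: x >= 1)` adds the entries for the values one and two, which gives the same three fourths as the calculation by hand. Exact fractions are used instead of floating-point numbers so that the results can be compared directly with the table.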
To draw this kind of table, we have to specify the probability with which each of these values is taken. So we have to specify a sequence of numbers that we will denote by P1, P2, and so on, Pn. So the probability that X takes the value xj is Pj, where j runs from one to n. This pair of sequences is called the distribution of the random variable X. It is easy to see that not just any numbers can serve as the P's in this distribution. Each of these numbers is the probability of some event, so it has to be a non-negative number. Also, the random variable X must take one of these values, so the probability that at least one of them is taken has to equal one. Since the corresponding events are disjoint, it means that the sum of the P's has to equal one. The converse also holds: if we have such a pair of sequences, and the values P1, P2, and so on, Pn satisfy these two properties, then we can introduce a random variable that takes these values with the given probabilities. It means that this pair of values and probabilities is the distribution of the corresponding random variable. So formally, we say that we have a probability distribution if we have such a pair of sequences for which these two properties are satisfied. This definition applies when the random variable takes on a finite number of possible values. It is possible to generalize the notion of a probability distribution to the case when the number of possible values is infinite, and we will discuss it later. For now, I just want to stress that a probability distribution is a good way to think about random variables. In many cases, we will specify a random variable by its distribution. Let me also introduce another notion that is a useful tool for describing a distribution in a mathematically convenient way. It is called the probability mass function. For a random variable that takes only a finite number of values, the probability mass function is the following function.
We will denote it by pmf. The probability mass function of the random variable X is just the function whose value at a point small x is the probability that X takes the value small x. This is called the probability mass function. Let us find the probability mass function for our random variable in this example and draw its graph. The only values at which the probability mass function is non-zero are exactly those values that our random variable can take. So we have three points at which the corresponding probability is non-zero. The first point is zero, and the corresponding probability is one fourth. The second point is one, and the corresponding probability is one half. Finally, we have the value two, and the corresponding probability is, again, one fourth. So the graph of our function consists of these points. At all other points, the value of the probability mass function is zero. So the graph looks like this. It will be useful for us to use the probability mass function as a way to visualize probability distributions. For example, here we see that the probability of getting the value one is twice as large as the probability of getting the value zero. Now let us use these ways of describing probability distributions to consider some other random variables.
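The two conditions on the probabilities and the definition of the probability mass function can also be sketched in code. This is a minimal Python sketch under stated assumptions, not a definitive implementation. The helper names `is_distribution` and `make_pmf` are hypothetical, the distribution is stored as a pair of lists (values and probabilities), and the values are assumed to be distinct.

```python
from fractions import Fraction


def is_distribution(values, probs):
    """Check that (values, probs) is a valid finite distribution.

    The two properties from the definition are that every probability
    is non-negative and that the probabilities sum to one. This sketch
    additionally requires one probability per value and distinct values.
    """
    return (
        len(values) == len(probs)
        and len(set(values)) == len(values)
        and all(p >= 0 for p in probs)
        and sum(probs) == 1  # exact with Fraction; floats would need a tolerance
    )


def make_pmf(values, probs):
    """Return the probability mass function of the given distribution."""
    if not is_distribution(values, probs):
        raise ValueError("not a probability distribution")
    table = dict(zip(values, probs))

    def pmf(x):
        # Zero at every point that the random variable cannot take.
        return table.get(x, 0)

    return pmf


# Distribution of the number of heads in two tosses of a fair coin.
values = [0, 1, 2]
probs = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]
pmf = make_pmf(values, probs)

print(pmf(0), pmf(1), pmf(2), pmf(3))
```

With this distribution, `pmf(1)` is twice `pmf(0)`, and `pmf(3)` is zero because three is not a value that the number of heads in two tosses can take.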