We will now talk about expectation maximization, a very general machine learning approach that is applied to a wide range of bioinformatics problems. Please note that there is one question that remains unaddressed for both coin flipping and k-means clustering. In coin flipping, we have not yet discussed what to do if the probability of a sequence being generated by the blue coin is exactly the same as the probability of it being generated by the green coin. And likewise, in k-means clustering, we have not discussed how to assign a point that is equidistant to two clusters. In hard assignments, we have no choice but to make a decision, either blue or red in this case. But in soft assignments, we saw that there is yet another possibility: to assign 50% of the point to the blue cluster and 50% to the red cluster. But how do we assign these responsibilities?

Let's recall how we derived HiddenVector in the previous lecture. Let's assume the parameters are 0.60 for the blue coin and 0.82 for the green coin. Which coin is more likely to have generated the first sequence, with its four heads? In that case, we simply computed the probability of the sequence being generated by the blue coin and the probability of it being generated by the green coin, compared them, found that the blue coin is more likely, and that's how we constructed HiddenVector. But now let's try to be a little more accurate. Instead of deciding blue or green, let's compute the responsibilities of the blue and green coins for the first sequence of flips. What are the responsibilities of the coins for this sequence? The probability of this sequence being generated by the blue coin is 0.000531, but the probability of it being generated by the green coin is much smaller. So it makes sense to define the responsibility of the blue coin as the probability of the blue coin divided by the sum of the probabilities of the blue and green coins. And likewise, the responsibility of the green coin will be equal to the probability of the green coin divided by the sum of the probabilities of the green and blue coins. As a result, we derive that the responsibility of the blue coin for the first series of coin tosses is 0.97, and the responsibility of the green coin is correspondingly 0.03. So we are starting to build HiddenMatrix. What are the responsibilities for the second sequence? Once again, we compute the probability of the second sequence being generated by the blue coin and by the green coin, and similarly compute the corresponding responsibilities. Proceeding in this way, we compute the entire HiddenMatrix.

So, the expectation maximization algorithm works in the following way. You start from Data and an arbitrary choice of Parameters and perform the E-step of the expectation maximization algorithm: from Data and Parameters, compute HiddenMatrix. Now what we would like to do, given HiddenMatrix and Data, is to compute Parameters, and it is still unclear how we can do this. That's where what we learned about the dot product will help us. Let's recall how to use the dot product to compute parameters. Remember that the bias of the blue coin was estimated as the dot product of Data and HiddenVector, while the bias of the green coin was estimated using the dot product of Data and 1 − HiddenVector. Now let's take a look at HiddenVector and ask how the HiddenMatrix corresponding to this HiddenVector would look.
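Before answering that question, here is a minimal sketch in Python of the E-step described above. It assumes each data point is the number of heads in a sequence of ten flips; the names e_step, biases, and heads_per_sequence are illustrative and not from the lecture.

```python
# Minimal sketch of the E-step for the coin-flipping example (assumption:
# each data point is the number of heads observed in a sequence of 10 flips).

def e_step(heads_per_sequence, biases, flips_per_sequence=10):
    """Return the HiddenMatrix of responsibilities (one row per coin)."""
    hidden_matrix = []
    for bias in biases:
        row = []
        for h in heads_per_sequence:
            # Probability of this particular sequence of flips under this coin
            row.append(bias ** h * (1 - bias) ** (flips_per_sequence - h))
        hidden_matrix.append(row)

    # Normalize each column so the responsibilities for a sequence sum to 1
    for j in range(len(heads_per_sequence)):
        column_sum = sum(hidden_matrix[i][j] for i in range(len(biases)))
        for i in range(len(biases)):
            hidden_matrix[i][j] /= column_sum
    return hidden_matrix

# With biases 0.60 (blue) and 0.82 (green) and a first sequence with 4 heads,
# this reproduces the responsibilities 0.97 and 0.03 from the lecture.
print(e_step([4], [0.60, 0.82]))
```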
This is a simple question, because the first row of this matrix will simply be exactly HiddenVector, while the second row will be 1 − HiddenVector. Once we figure this out, since the first row of HiddenMatrix is HiddenVector, we can rewrite the expression for the bias of the blue coin as the dot product of Data and the first row of HiddenMatrix. And we can rewrite the bias of the green coin as the dot product of Data and the second row of HiddenMatrix, divided by the corresponding expression on the right, of course. Now that we have figured out how to compute the biases of the coins through HiddenMatrix, we can apply these formulas to the entire HiddenMatrix shown here, and we have accomplished our task of moving from HiddenMatrix and Data to Parameters. So far, I have shown you how to do this for the simple HiddenMatrix built from the binary vectors derived from HiddenVector. But whatever the rows of HiddenMatrix are, we can use the same formulas shown in the boxes on the slide. In this case, the parameters will be the ones shown on the slide: 0.48 for the blue coin and 0.81 for the green coin.

The only question left to address before we move to soft k-means clustering is what to do in our coin-flipping experiment if there were k coins, for example three rather than two. If you watch how I derived the parameters using dot products, you will see that Parameters, which is a vector of length k, can be computed simply as the product of the k × n HiddenMatrix and the n-dimensional vector Data. And this simple, elegant formula will help us move toward soft k-means clustering in the next segment.
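As a companion to the E-step sketch above, here is a minimal sketch of the M-step and of the full EM alternation, under the same assumptions: Data is the vector of fractions of heads per sequence, and each bias is the dot product of a HiddenMatrix row with Data, normalized by the dot product of that row with the all-ones vector. It reuses the hypothetical e_step from the earlier sketch; the names m_step and expectation_maximization are likewise illustrative.

```python
# Minimal sketch of the M-step and the EM loop (assumptions as stated above).

def m_step(data, hidden_matrix):
    """Return one bias per coin from Data and the HiddenMatrix."""
    parameters = []
    for row in hidden_matrix:
        numerator = sum(r * x for r, x in zip(row, data))  # row . Data
        denominator = sum(row)                             # row . (1, ..., 1)
        parameters.append(numerator / denominator)
    return parameters

def expectation_maximization(heads_per_sequence, biases, flips=10, steps=100):
    """Alternate E- and M-steps starting from an arbitrary choice of biases."""
    data = [h / flips for h in heads_per_sequence]
    for _ in range(steps):
        hidden_matrix = e_step(heads_per_sequence, biases, flips)  # E-step
        biases = m_step(data, hidden_matrix)                       # M-step
    return biases
```

The loop makes the alternation described above explicit: each E-step recomputes HiddenMatrix from Data and the current Parameters, and each M-step recomputes Parameters from Data and HiddenMatrix.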