And of course, this new generation of parameters would allow us to further
adjust the inferred latent variable, or hidden variable, values.
So we get a new generation of values
from the E-step, which is now based on the new generation of parameters.
And these new inferred values of the Zs will then give us
another generation of estimates of the word probabilities.
And so on and so forth, so this is what actually happens when we compute
these probabilities using the EM algorithm.
As you can see in the last row, where we show the log-likelihood,
the likelihood is increasing as we do the iterations.
And note that the log-likelihood is negative, because the probability is
between 0 and 1, and when you take a logarithm, it becomes a negative value.
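To make this concrete, here is a minimal sketch of such an EM loop, assuming a fixed, known background distribution, a fixed mixing weight, and a toy set of word counts; the names and numbers are illustrative and not taken from the lecture's example. It alternates the E-step and M-step and prints the log-likelihood, which stays negative and does not decrease across iterations.

```python
# A minimal sketch of the two-component EM updates described above.
# All data here are toy values, not the lecture's example.
import math

counts = {"the": 4, "text": 2, "mining": 2}          # c(w, d): toy document counts
p_bg   = {"the": 0.5, "text": 0.25, "mining": 0.25}  # p(w | background), known and fixed
lam    = 0.5                                          # weight of the background component

# Initialize the unknown topic distribution p(w | theta) uniformly.
p_topic = {w: 1.0 / len(counts) for w in counts}

for it in range(5):
    # E-step: infer p(z = topic | w), the probability the word came from the topic.
    post_topic = {
        w: (1 - lam) * p_topic[w] / ((1 - lam) * p_topic[w] + lam * p_bg[w])
        for w in counts
    }
    # M-step: re-estimate p(w | theta) from counts weighted by the inferred z values.
    weighted = {w: counts[w] * post_topic[w] for w in counts}
    total = sum(weighted.values())
    p_topic = {w: weighted[w] / total for w in counts}

    # Log-likelihood of the document: negative, and non-decreasing over iterations.
    ll = sum(c * math.log((1 - lam) * p_topic[w] + lam * p_bg[w])
             for w, c in counts.items())
    print(f"iteration {it + 1}: log-likelihood = {ll:.4f}")
```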
Now, what's also interesting is the last column.
These are the inferred word splits:
the probabilities that a word is believed to
have come from one distribution, in this case the topic distribution, all right.
And you might wonder whether this would also be useful,
because our main goal is to estimate these word distributions.
So this is our primary goal:
we hope to obtain a more discriminative word distribution.
But the last column, although just a by-product,
can actually also be very useful.
You can think about that.
One way we can use it, for
example, is to estimate to what extent this document has covered background words.
And when we add these up, or
take the average, we will roughly know to what extent it has covered background
words versus content words that are not explained well by the background.
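For instance, here is a small illustrative computation, assuming we already have the inferred background probability p(z = background | w) for each word together with the word counts; the numbers are made up. A count-weighted average of these posteriors gives a rough estimate of how much of the document is explained by the background.

```python
# Illustrative only: toy counts and toy inferred background probabilities.
counts  = {"the": 4, "text": 2, "mining": 2}
post_bg = {"the": 0.9, "text": 0.3, "mining": 0.2}   # inferred p(z = background | w)

# Count-weighted average: a rough fraction of the document explained by the background.
coverage = sum(counts[w] * post_bg[w] for w in counts) / sum(counts.values())
print(f"estimated background coverage: {coverage:.2f}")
```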