0:41

If they belong to the same partition T and they are also in the same cluster C, then this is a true positive.

The true positive count is the number of such pairs, okay? That is, for any two points x_i and x_j, if they have the same true partition label and they also have the same cluster label, meaning they belong to the same cluster, then that pair is a true positive.

For example, just look at this case. These two blue points belong to the same ground truth T2, and they also belong to the same cluster C2. That's a true positive.

Okay.

Then what is a false negative?

A false negative means the two points have the same ground-truth partition label, but they are not in the same cluster. For example, just look at these two brown points. They have the same ground truth T1, but they belong to different clusters. So that's the false negative case.

Well, what is a false positive?

A false positive means the two points actually have different partition labels, but they are in the same cluster. For example, just look at this blue one and this brown one. They do not have the same partition label, but they are in the same cluster, okay.

What is a true negative?

A true negative pair means the two points do not have the same partition label, and they are also not in the same cluster. For example, look at this black one and this blue one. These two points do not have the same partition label, and they are also not in the same cluster. So that's a good case.
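
As a minimal sketch of the four cases just described, assuming each point carries one ground-truth label and one cluster label, you could count the pairs directly. The function name `pair_counts` and the toy labels are hypothetical; they are not the points from the figure.

```python
from itertools import combinations

def pair_counts(truth, clusters):
    """Count (TP, FN, FP, TN) over all unordered pairs of points."""
    tp = fn = fp = tn = 0
    for i, j in combinations(range(len(truth)), 2):
        same_partition = truth[i] == truth[j]
        same_cluster = clusters[i] == clusters[j]
        if same_partition and same_cluster:
            tp += 1   # same partition, same cluster
        elif same_partition:
            fn += 1   # same partition, different clusters
        elif same_cluster:
            fp += 1   # different partitions, same cluster
        else:
            tn += 1   # different partitions, different clusters
    return tp, fn, fp, tn

# Hypothetical toy data: six points, three partitions, three clusters.
truth    = ["T1", "T1", "T1", "T2", "T2", "T3"]
clusters = ["C1", "C1", "C2", "C2", "C2", "C3"]
print(pair_counts(truth, clusters))  # prints (2, 2, 2, 9)
```

This loop looks at every pair, so it is quadratic in the number of points; it is meant only to make the definitions concrete.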

Then let's see how we can calculate the four measures.

First, given n points in the data set, the number of possible pairs you need to examine is n choose 2, so that's the reason you have this formula.

Then, for the true positive case, you look at every partition and every cluster together: where they agree, you have n_ij points, and from those you choose any two, so that cell gives you n_ij choose 2 pairs. Summing over all the cells, you get the number of true positives.

3:13

Then what is false negative?

For false negatives, you count all the pairs that share a partition, and then take out the ones that are true positives, okay.

What is false positive?

For false positives, you count all the pairs that share a cluster, and again take out the ones that are true positives.

Then what are the true negative cases?

Out of all the pairs, those that do not belong to any of the above three cases are the true negatives.
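
Written out, the four counts described above look roughly like this. The notation is an assumption made for this sketch: n_ij is the number of points in both cluster C_i and partition T_j, n_i is the size of cluster C_i, m_j is the size of partition T_j, and there are r clusters and k partitions.

```latex
\begin{aligned}
N  &= \binom{n}{2} \\
TP &= \sum_{i=1}^{r} \sum_{j=1}^{k} \binom{n_{ij}}{2} \\
FN &= \sum_{j=1}^{k} \binom{m_j}{2} - TP \\
FP &= \sum_{i=1}^{r} \binom{n_i}{2} - TP \\
TN &= N - (TP + FN + FP)
\end{aligned}
```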

Okay, so for that computation we can simply use those formulas.
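
As a hedged sketch of that computation, here is one way to get the four counts from the cell and margin sizes instead of looping over every pair. The function name `pair_counts_from_table` and the toy labels are hypothetical, and `math.comb` assumes Python 3.8 or later.

```python
from collections import Counter
from math import comb

def pair_counts_from_table(truth, clusters):
    """Compute (TP, FN, FP, TN) from contingency-table counts."""
    n = len(truth)
    n_ij = Counter(zip(clusters, truth))  # points in cluster i AND partition j
    n_i = Counter(clusters)               # cluster sizes
    m_j = Counter(truth)                  # partition sizes

    total = comb(n, 2)                                   # all possible pairs
    tp = sum(comb(c, 2) for c in n_ij.values())          # pairs agreeing on both
    fn = sum(comb(c, 2) for c in m_j.values()) - tp      # same partition, not TP
    fp = sum(comb(c, 2) for c in n_i.values()) - tp      # same cluster, not TP
    tn = total - (tp + fn + fp)                          # everything else
    return tp, fn, fp, tn

# Hypothetical toy data: six points, three partitions, three clusters.
truth    = ["T1", "T1", "T1", "T2", "T2", "T3"]
clusters = ["C1", "C1", "C2", "C2", "C2", "C3"]
print(pair_counts_from_table(truth, clusters))  # prints (2, 2, 2, 9)
```

Only the table counts are needed here, so this avoids examining all n choose 2 pairs one by one.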

3:49

With the introduction of these four measures (true positive, false negative, false positive and true negative), we can calculate other measures like the Jaccard coefficient and the Rand statistic. We still take this figure as our illustrative example.

For the Jaccard coefficient, remember we defined the Jaccard coefficient before. This one is the definition for the pairwise measures, and it has a similar kind of flavor, as you can probably see. You have the true positives divided by all the cases, except that you ignore the true negative cases, okay.

4:26

That means it is the fraction of true positive pairs, after ignoring the true negative cases. Therefore, in this computation positives and negatives are treated differently; it is an asymmetric measure. However, for a perfect clustering the Jaccard coefficient should be 1, because, you know, every case you count is a true positive.

4:50

You don't have any false cases.

For the Rand statistic, you take all the true cases, true positives plus true negatives, divided by all the possible pairs. Okay? In a perfect clustering you have no false positives and no false negatives. Okay? That's why, for a perfect clustering, the Rand statistic should be 1.
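
To make the two ratios concrete, here is a minimal sketch that takes the four pair counts as plain numbers. The function names and the example counts are hypothetical, and the sketch assumes the denominators are nonzero.

```python
def jaccard_coefficient(tp, fn, fp):
    """True positives over all pairs except the true negatives."""
    return tp / (tp + fn + fp)

def rand_statistic(tp, fn, fp, tn):
    """True positives plus true negatives over all pairs."""
    return (tp + tn) / (tp + fn + fp + tn)

# Hypothetical pair counts for a small data set with 15 pairs.
tp, fn, fp, tn = 2, 2, 2, 9
print(jaccard_coefficient(tp, fn, fp))     # 2 / 6, about 0.33
print(rand_statistic(tp, fn, fp, tn))      # 11 / 15, about 0.73

# A perfect clustering has no false cases, so both measures are 1.
print(jaccard_coefficient(3, 0, 0))        # prints 1.0
print(rand_statistic(3, 0, 0, 12))         # prints 1.0
```

Notice that the true negatives appear only in the Rand statistic, which is why the Jaccard coefficient treats positives and negatives differently.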

5:22

Then there is another interesting measure called the Fowlkes-Mallows measure. This measure is the geometric mean of precision and recall. Remember we just studied the F measure, which is the harmonic mean of precision and recall. The Fowlkes-Mallows measure, as a geometric mean, is essentially precision times recall, and then you take the square root.
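
As a small sketch of that geometric mean, assume the pairwise precision is TP / (TP + FP) and the pairwise recall is TP / (TP + FN). The function names and the example counts are hypothetical, and the sketch assumes neither denominator is zero.

```python
from math import sqrt

def pairwise_precision(tp, fp):
    """Fraction of same-cluster pairs that also share a partition."""
    return tp / (tp + fp)

def pairwise_recall(tp, fn):
    """Fraction of same-partition pairs that also share a cluster."""
    return tp / (tp + fn)

def fowlkes_mallows(tp, fn, fp):
    """Geometric mean of pairwise precision and pairwise recall."""
    return sqrt(pairwise_precision(tp, fp) * pairwise_recall(tp, fn))

# Hypothetical pair counts: precision 0.5, recall 0.5.
print(fowlkes_mallows(2, 2, 2))  # prints 0.5

# A perfect clustering has no false cases, so the measure is 1.
print(fowlkes_mallows(3, 0, 0))  # prints 1.0
```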

5:49

And once we have introduced all these measures, if we want to evaluate, for example, any table like this green one, we can use these measures to calculate all the numbers. We will leave this calculation as an exercise, instead of spending lecture time on it here.

[MUSIC]