0:01

So the cluster dendrogram that's produced by default in R is relatively nice, but it's possible to make it a little bit nicer. This myplclust function, which is available on the course website, can be downloaded straight from there. Once you source it into R, you get a function that produces a cluster dendrogram in which the points in the individual clusters are (a) labeled by their cluster label and (b) colored separately. So you can see that cluster one is labeled with all ones and colored black, cluster two is colored red, and cluster three is colored green.

Â 0:38

So of course, in order to do this, you have to specify how many clusters there are before you can actually label them one, two, three, and so on.
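As a sketch of what this might look like in practice (the simulated data and the file name myplclust.R are illustrative; the function itself comes from the course website):

```r
set.seed(1234)
# Simulate 12 points in 3 loose groups of 4
x <- rnorm(12, mean = rep(1:3, each = 4), sd = 0.2)
y <- rnorm(12, mean = rep(c(1, 2, 1), each = 4), sd = 0.2)
hClustering <- hclust(dist(data.frame(x = x, y = y)))

# Source the helper (downloaded from the course website), then plot:
# each leaf is labeled 1, 2, or 3 and colored by its true cluster
if (file.exists("myplclust.R")) {
  source("myplclust.R")
  myplclust(hClustering, lab = rep(1:3, each = 4),
            lab.col = rep(1:3, each = 4))
}
```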


Now you can go to the R Graph Gallery to see some more interesting cluster diagrams and to look at software that people have used to create really nice clustering diagrams. For example, there is a cluster dendrogram there that has labels on the leaf nodes, colors each of the groups, and adds a little bit of other information. You can take a look at that gallery, which has a lot of other really nice plots on it too.

Â 1:22

So the other issue when it comes to using hierarchical clustering is how you merge points together: when you merge points, what represents the new location? One approach is called average linkage. The idea is that if you take two points, their new coordinate location is just the average of their x coordinates and their y coordinates, so it's roughly the center of gravity, or the middle, of that group of points. That seems logical, and it will lead to a certain type of clustering result.

The other approach is called complete linkage. There, the idea is that to measure the distance between two clusters of points, you take the two farthest points from the two clusters as the distance. So for example, in this picture, if you look at the upper cluster and the lower-left cluster, the distance between those two clusters is equal to the distance between the two farthest points. That gives you the complete linkage approach. Average linkage, of course, gives you the distance between the two centers of gravity; that's the difference between the two pluses here. You can see that in the complete linkage example that distance is relatively far, whereas in the average linkage example the distance is somewhat shorter. It's not that there's one right and one wrong way to do it; the point is that the two different merging approaches can give you very different results, so it's often useful to try both approaches to see what kind of clustering results you get in the end and whether one makes more sense than the other.
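As a sketch on simulated data, the merging strategy is controlled by the method argument of hclust (note that hclust's "average" is the average of all pairwise distances between the two clusters, which is close in spirit to the center-of-gravity idea above):

```r
set.seed(1234)
x <- rnorm(12, mean = rep(1:3, each = 4), sd = 0.2)
y <- rnorm(12, mean = rep(c(1, 2, 1), each = 4), sd = 0.2)
distxy <- dist(data.frame(x = x, y = y))  # Euclidean by default

# Complete linkage (the default): distance between two clusters is
# the distance between their two farthest points
hComplete <- hclust(distxy, method = "complete")

# Average linkage: average of all pairwise distances between clusters
hAverage <- hclust(distxy, method = "average")

# Plot the two dendrograms side by side to compare merge heights
par(mfrow = c(1, 2))
plot(hComplete, main = "Complete linkage")
plot(hAverage, main = "Average linkage")
```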

Â 3:06

The last function I want to highlight here is the heatmap function, which is a really nice function for visualizing matrix data. If you have an extremely large table, or a large matrix of numbers that are similarly scaled, and you just want to take a look at them really quickly in an organized way, you can call the heatmap function. What the heatmap function does is essentially run a hierarchical cluster analysis on the rows of the table and on the columns of the table.

So imagine the rows of the table are observations; it runs a cluster analysis on the rows of the table. Then think of the columns of the table as sets of observations too, even if they're not actually observations; the columns, for example, might be variables or something like that. But if you think of them as just different observations, you can run a cluster analysis on them as well.

The idea with the heatmap function is that it uses the hierarchical clustering function to organize the rows and the columns of the table so that you can visualize them: you can see groups or blocks of observations within the table using the image function. So what it does is create an image plot, and it reorders the columns and the rows of the table according to the hierarchical clustering algorithm. Here you can see, for example, that along the rows I've got this cluster dendrogram, which shows that there are probably three clusters, and the rows in those clusters are all grouped together. Then there are only two columns in the data frame, so it's not particularly interesting to do a hierarchical cluster analysis on those. But if you had many, many columns, the columns would be reorganized so that the closer ones would be closer together and the farther ones would be farther apart. So the heatmap function is really useful for very quickly visualizing this kind of high-dimensional table data.
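A minimal sketch with simulated data (heatmap expects a matrix, so a data frame has to be converted first):

```r
set.seed(1234)
x <- rnorm(12, mean = rep(1:3, each = 4), sd = 0.2)
y <- rnorm(12, mean = rep(c(1, 2, 1), each = 4), sd = 0.2)
dataMatrix <- as.matrix(data.frame(x = x, y = y))

# Runs hierarchical clustering on rows and columns, reorders both,
# and draws the image plot with dendrograms in the margins
heatmap(dataMatrix)
```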

Â 4:58

So to summarize hierarchical clustering: it's a really useful technique for looking at high-dimensional data. It organizes the data in a logical and intuitive way, and in particular, functions like the heatmap function are really useful for quickly looking at table or matrix data.

Â 5:17

And so, of course, you need to define a notion of what it means for two points to be close or far apart, and you have to have a merging strategy; we talked about complete linkage and average linkage. Given those two things, the hierarchical clustering algorithm will run, and it will produce a cluster dendrogram, which shows you how the points got merged together through the algorithm.

Â 5:39

Now, a couple of issues come up with hierarchical clustering, and this is often true for many clustering algorithms. One is that the clustering picture may be unstable: it may change if a few points were to change, or, for example, if you have some outliers.

Â 6:06

So a useful thing is often to try different distance metrics, to see how sensitive the result is to the choice of metric. You might also change the merging strategy to see what kind of picture comes out. Clustering algorithms can also be sensitive to the scaling of a given variable: if one variable, for example, is measured in units that are much larger than another variable's, that can sometimes throw off the algorithm. So it may be useful to scale certain variables so that they're more comparable to each other.
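For example, a sketch of both checks: rescaling with scale(), which centers each column and divides by its standard deviation, and switching the distance metric:

```r
set.seed(1234)
x <- rnorm(12, mean = rep(1:3, each = 4), sd = 0.2)
# Second variable on a much larger scale: unscaled, it would
# dominate the Euclidean distance calculation
y <- rnorm(12, mean = rep(c(1, 2, 1), each = 4), sd = 0.2) * 1000
dataFrame <- data.frame(x = x, y = y)

scaled <- scale(dataFrame)  # each column now has mean 0, sd 1
hScaled <- hclust(dist(scaled))

# Try a different distance metric to see how sensitive the result is
hManhattan <- hclust(dist(scaled, method = "manhattan"))
```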

Â 6:39

One nice thing about hierarchical clustering, at least the algorithm as discussed here, is that it is deterministic: there's no random starting point, there's no randomness in it. If you run it with the same parameters and the same data, it will give you the same picture every time.

Â 6:52

Of course, a key question in any clustering approach is where to cut, and the general idea is determining how many clusters there are. It's not always obvious how to figure out how many clusters there are, but a number of algorithms have been proposed to try to figure that out.
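In R, cutting the tree is done with the base function cutree, which takes either a number of clusters k or a height h at which to cut:

```r
set.seed(1234)
x <- rnorm(12, mean = rep(1:3, each = 4), sd = 0.2)
y <- rnorm(12, mean = rep(c(1, 2, 1), each = 4), sd = 0.2)
hClustering <- hclust(dist(data.frame(x = x, y = y)))

# Cut the dendrogram so that exactly 3 clusters remain
clusters <- cutree(hClustering, k = 3)
table(clusters)  # how many points fell into each cluster

# Alternatively, cut at a specific height on the dendrogram:
# cutree(hClustering, h = 1.5)
```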

But I think the best use of hierarchical clustering is probably just for exploratory analysis: looking at your data, visualizing your data, and getting a sense of what patterns are there, if there are any patterns at all. Then you can take these ideas and formalize them later in more sophisticated models. I've given you some links with more details about these approaches, and I encourage you to take a look at them to explore further.
