Usually, defining and implementing a marketing strategy requires segmenting the market, identifying target customer segments, and defining a value proposition for those segments, a process otherwise known as segmentation, targeting and positioning. To do this, we can use some advanced marketing research techniques called cluster analysis, factor analysis and conjoint analysis. Segmentation, for example, means that you want to break the market up into different groups, or, to put it differently, you want to put customers into different buckets. You want those buckets to be such that customers within each bucket are very similar to one another, but different from customers in other buckets. That's the core issue of segmentation. The question is, can you identify such buckets? And how do we create them? What is important to consider here is that, whatever differences there might be between those groups, you want segments to have two characteristics. One, you want the segmentation criteria to be actionable: you want something that you can use to define a new marketing strategy. The other is that you want those differences to be statistically significant. Think about the following example. Do men drink more soft drinks than women? To answer this question, we can perform a simple statistical test, using gender as the segmentation criterion. However, more often than not, you are interested in situations where customers are described by more than one variable, in other words, by a series of variables. Now that you have more than one characteristic for each customer, you need to apply a multivariate method to segment customers, that is, to put them into different buckets. This is where you are going to use cluster analysis to group customers together.
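To make the soft drink question a bit more concrete, here is a minimal sketch of one possible simple test. The lecture does not name a specific test, so the choice of a permutation test on the difference in means is an assumption, and so are the function name `permutation_test`, its parameters, and the consumption numbers, which are invented purely for illustration.

```python
import random

def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Hypothetical helper for illustration; not taken from the lecture.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = sum(group_a) / len(group_a) - sum(group_b) / len(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Shuffle the labels: under the null hypothesis, gender does not matter.
        rng.shuffle(pooled)
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / (len(pooled) - n_a)
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value

# Made-up weekly soft drink counts (assumption, for illustration only).
men = [5, 7, 6, 8, 9, 7, 6, 8]
women = [4, 5, 3, 6, 5, 4, 5, 4]

observed_diff, p_value = permutation_test(men, women)
print(f"difference in means: {observed_diff:.2f}")
print(f"estimated p-value: {p_value:.4f}")
```

A small p-value would suggest that the difference between the two buckets is unlikely to be due to chance alone, which is the "statistically significant" requirement. Whether gender is also an actionable criterion is a separate, managerial question.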
Cluster analysis is a broad term that encompasses various machine learning techniques, such as hierarchical clustering, K-means clustering (one of the most widely used techniques in marketing), distribution-based clustering, and density-based clustering. The common feature of these techniques is that they create clusters and then assign objects, customers in our case, to each of them. The idea here is that there is some measure of association between the entities, which is not unlike a big correlation matrix. In other words, you want to define some proximity measure for customers or entities. The last assumption is that those clusters, or sub-groups, actually exist in the data. The goal of the marketing researcher is to find and identify those clusters. Here you see a chart with an example of K-means clustering, where you summarize a series of customer variables into two dimensions, the X and the Y axes. Then you try to understand where each customer is located in this two-dimensional plot. Here you can see that you have two groups of customers: customers represented by red circles, and customers represented by black circles. Now the question is, what are these bold dots in the centers? Those dots are usually called the centroids, basically the centers of the clusters. What K-means clustering does is measure how far each individual point is from the center of its cluster. That's how the notion of proximity, or distance, is measured in the K-means clustering method. As you can see, once you have identified different clusters of customers, you can then decide which group you want to target.
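The centroid idea described above can be sketched in a few lines of code. This is only an illustrative version of the standard K-means loop (often called Lloyd's algorithm), not the exact procedure behind the chart. The function name `kmeans`, the starting centroids, and the six customer points are all assumptions made up for the example, and Euclidean distance is assumed as the proximity measure.

```python
import math

def kmeans(points, initial_centroids, max_iter=100):
    """Minimal K-means sketch: alternate assignment and centroid updates.

    Hypothetical helper for illustration; assumes Euclidean distance.
    """
    centroids = [tuple(c) for c in initial_centroids]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each customer joins the cluster with the nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the loop has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its assigned customers.
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids

# Made-up customers already summarized on two dimensions (X, Y).
customers = [(1.0, 1.5), (1.5, 2.0), (2.0, 1.0),
             (8.0, 8.5), (8.5, 9.0), (9.0, 8.0)]

# Assumption: start from two of the customers as initial centroids.
labels, centroids = kmeans(customers, [customers[0], customers[3]])
print("cluster label per customer:", labels)
print("centroids:", centroids)
```

In this sketch the labels play the role of the red and black circles, and the returned centroids play the role of the bold dots in the chart. In practice the result can depend on the starting centroids, so the algorithm is usually run from several different starting points.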