In this course, what we're going to be looking at is common social media listening techniques and how to incorporate analytics into them. How can we use data that's generated on social media platforms to search for marketing insights? This is the first module, and we're going to be focusing on common tactics for social media monitoring. We're going to look at some common metrics: what information do they provide to us, where can we use that information, and where might we come up short?
All right, so just jumping into the content. One of the most common measures that we look at when we're engaged in social media listening is a volume measure: literally, how much conversation is happening around a particular topic? If we're talking about volume on Twitter, we might look at both original tweets and the volume of retweeting activity. How frequently are posts from users being shared? In this particular example you can see Justin Bieber's tweets shared to a very high degree. For the first tweet, we can see that the message had been shared over 30,000 times. So in this particular example, celebrity status is really driving some of those volume measures.
To give you another example of looking at volume measures: following the 2012 nominating conventions for the presidential election, multiple news organizations reported the volume of tweeting activity following each of the speeches. Following Michelle Obama's speech there were 28,000 tweets per minute, whereas following Mitt Romney's speech there were only 14,000 tweets per minute. So we're starting to see volume metrics from social media reported as if those numbers carry information. And that's really what we want to dig into. What do these numbers actually mean?
One last example to look at: this is the volume of social media mentions, largely driven by Twitter, and that's something we'll talk about in a little bit.
These are mentions of the phrase "Tiger Woods." In this case you can see very clear spikes in activity. The first spike surrounds news reports about his personal life. The second spike corresponds to the time when he announced that he was planning to take a break from golf. So we can look at these volume measures as a reflection of what's going on: where is people's interest? That's one way that we can interpret the volume numbers. But what we have to be careful about is reading too much into those volume numbers. We'll talk about establishing appropriate baselines for interpretation.
All right, so let me highlight one of the problems that we have when we try to read too much into these volume numbers. This goes back to the 2012 presidential election and the Republican primary. This was a study released in part by Facebook, and it shows the number of posts mentioning different candidates. You can see Ron Paul leading the pack for quite a while compared to his competition, Mitt Romney, who ultimately got the nomination. Starting out, it looks like Romney is tied for second, maybe third place. Following the Iowa caucus, it looks like he's again in third place, so he's not leading the pack early on. If we were to look only at these volume numbers, we would think there's huge interest in Ron Paul as a candidate.
Well, this is from December through the beginning of January, so let's look at polling numbers. I pulled these from Real Clear Politics at roughly the same point in time, looking at where Ron Paul's numbers were. Ron Paul is in orange. Never does he approach front-runner status when we're looking at what the polls are telling us. All right, so we see a little bit of a disconnect here. On the one hand, we have a measure of volume based on social media activity.
On the other hand, we have more of a traditional research technique, say, offline polling, and you can think of it the same way that we think about using surveys in marketing research to understand perceptions of brands. The two are not always going to be aligned with each other. And why is that? Well, part of that has to do with who is participating in the social media conversations versus who is participating in traditional polls. When it comes to social media, we don't have the same control over the people who are selected to participate in the conversations. The decision that somebody makes to participate in that conversation is probably indicative of their level of interest. So if you start posting about a politician on Facebook or on Twitter, whether it's a presidential candidate or a Senate candidate, chances are that you're very interested in that race. Whether you're a supporter or a detractor, you probably have a strong interest in politics compared to the average person. Whereas, if we look at people who are participating in polls, not everyone has that same level of interest. So we're potentially dealing with two different populations, and they may not be representing the same groups. What we see on social media might be more of those hardcore individuals with strong preferences. With traditional polls and traditional marketing research studies, on the other hand, we go out of our way to make sure that the sample is representative of the population of interest. And that representativeness is going to be one of the big concerns that we have when we're trying to use social media data for marketing insights.
So what can we do to get around the misleading inferences that come from volume? Well, first and foremost, we want to make sure that we establish an appropriate baseline.
What is the normal level of conversation, whether it's around an individual, around a particular topic, or around a product? We need to know what the normal level of conversation is. Once we've established that baseline, what we should be focusing on are deviations from the baseline. Am I getting above-average or below-average volume of conversation for a sustained period of time? If that's the case, there's been a systematic change, and that change is something that I might want to spend a little bit more time investigating.
Lastly, and probably most importantly, does the measure of volume actually matter to you? This is something that's going to require linking social media activity to the KPIs that organizations have. So what are the metrics that are most important to you, and is there a relationship between social media volume and those performance indicators? Just to give you two examples: when I pulled up these numbers, Coca-Cola had 73 million Facebook fans and only 1.9 million followers on Twitter. Then we looked at Lady Gaga. She had 59 million Facebook fans and 40 million Twitter followers. I doubt that Coca-Cola is worried about the fact that a performer has more Twitter followers than they do. So we've got to understand how we can use these metrics. Is it something that ultimately matters to us? Should we be investing in building our social media followings? Well, what's the argument for doing that? One argument is that the more social media followers I have, the more potential exposure my marketing messages can get with those individuals. And if you expect that your marketing activity is going to drive that volume, and you're using volume as a gauge for how engaged your customers are with you, well, that's a reason to have a focus on volume. But if it's just to say, well, we've got this many followers, we've got this many fans, now we're just using these as vanity metrics, and they don't really convey any insight.
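
To make the baseline idea from earlier in this discussion concrete, here is a minimal sketch of how you might flag sustained deviations in daily mention counts. Everything specific in it is an assumption for illustration and not something from the lecture: the function name `sustained_deviations`, the choice of mean and standard deviation over an initial window as the baseline, the threshold of two standard deviations, the three-day minimum run, and the made-up daily counts.

```python
from statistics import mean, stdev


def sustained_deviations(daily_volume, baseline_days, threshold=2.0, min_run=3):
    """Find sustained runs of unusually high or low volume.

    The first `baseline_days` values define the "normal" level (assumed
    here to be their mean and sample standard deviation). Later days are
    flagged when they sit more than `threshold` standard deviations from
    that mean, and only runs of at least `min_run` consecutive flagged
    days on the same side are returned as (start, end, direction).
    """
    if baseline_days < 2 or len(daily_volume) <= baseline_days:
        raise ValueError("need at least 2 baseline days plus later data")
    baseline = daily_volume[:baseline_days]
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        raise ValueError("baseline has no variation; use a longer window")

    runs = []
    run_start, run_sign = None, 0
    # One extra iteration with sign 0 closes a run that is still open
    # on the last day of data.
    for i in range(baseline_days, len(daily_volume) + 1):
        if i < len(daily_volume):
            z = (daily_volume[i] - mu) / sigma
            sign = 1 if z > threshold else -1 if z < -threshold else 0
        else:
            sign = 0
        if sign != run_sign:
            if run_sign != 0 and i - run_start >= min_run:
                direction = "above" if run_sign > 0 else "below"
                runs.append((run_start, i - 1, direction))
            run_start, run_sign = i, sign
    return runs


# Hypothetical daily mention counts: 10 quiet days, then later activity.
baseline_period = [100, 102, 98, 101, 99, 100, 103, 97, 100, 100]
later_days = [101, 150, 160, 155, 99, 300, 100]
print(sustained_deviations(baseline_period + later_days, baseline_days=10))
# [(11, 13, 'above')]
```

Notice that the one-day spike to 300 is not reported, because it does not last. Only the three-day stretch above the baseline counts as a sustained change that might be worth investigating. Whether that change matters is still a separate question that depends on your KPIs.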
All right, so let's move beyond volume, because that's one measure that's commonly reported. The other measure that's commonly reported is sentiment. Sentiment is reported in a number of different ways. One way that's common is a scale from 0 to 100. This graphic was pulled from USA Today during the 2012 election. It was a study performed by a company called Topsy, which has since been acquired by Apple. The way that Topsy had done sentiment, 0 corresponded to negative sentiment and 100 corresponded to positive sentiment. This is taking the average across all the comments posted on a daily basis to say, what was the social media sentiment for each of the candidates? As I said, that's one way that we can implement sentiment scoring: 0 is negative, 100 is positive, and we take an average and scale things so that we get a score between 0 and 100.
What that relies on is that each individual comment is being classified as positive, neutral, or negative. In a common coding scheme, a positive comment is coded as 1, a negative comment is coded as -1, and a neutral comment is coded as 0. Then, how do we process this information? Well, what Topsy did was to take a composite score: average all the comments contributed on a particular day together, and rescale that. So instead of the average being between -1 and 1, it is rescaled to be between 0 and 100.
Another approach that's used is to look at the ratio of positive comments to negative comments. The higher that ratio, the more positive the conversation. The lower that ratio, the more negative the conversation. A ratio of 1 means that you have an equal number of positive and negative comments. Comparing these two approaches, what's the difference? Well, when we start looking at positive-to-negative ratios, we're throwing away all of those neutral comments.
All right, and so we lose a little bit of information there. If I had 80% neutral comments, 10% positive, and 10% negative, I end up with the same ratio as if I had 50% positive and 50% negative comments. So the ratio might not truly reflect the degree to which the contributions are polarized. The composite score, on the other hand, is going to take into account those neutral comments. But that's going to have its own limitations because of the averaging that we're doing: an evenly split, polarized conversation averages out to the same middle score as a conversation that is entirely neutral. If you wanted to convey the most information possible, what you would end up reporting is the full distribution of comments. What fraction of comments were positive? What fraction of comments were neutral? What fraction of comments were negative? From an analysis standpoint, if you have that full distribution, you can incorporate those fractions into your subsequent analysis. You can construct your own composite score. You can construct your own ratio. So whatever metrics you might want to calculate based on the raw data, the full distribution would allow you to do that.
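
To see how the three ways of reporting sentiment relate, here is a small sketch that computes all of them from the same set of classified comments. It assumes each comment has already been labeled "positive", "neutral", or "negative" by some classifier. The function name `sentiment_summary` is hypothetical, and the linear rescaling from the -1 to 1 range onto 0 to 100 is an assumption: the lecture only says that the average is rescaled, not exactly how Topsy did it.

```python
from collections import Counter


def sentiment_summary(labels):
    """Summarize classified comments three ways.

    `labels` is a list of strings, each "positive", "neutral", or
    "negative". Coding: positive = +1, neutral = 0, negative = -1.
    """
    if not labels:
        raise ValueError("need at least one classified comment")
    counts = Counter(labels)
    n = len(labels)
    pos, neu, neg = counts["positive"], counts["neutral"], counts["negative"]

    # Composite score: average of the +1/0/-1 codes, which lies in [-1, 1],
    # mapped linearly onto [0, 100] (assumed rescaling).
    mean_score = (pos - neg) / n
    composite = (mean_score + 1) / 2 * 100

    # Positive-to-negative ratio: neutral comments are ignored entirely.
    # Undefined when there are no negative comments, so return None then.
    ratio = pos / neg if neg else None

    # Full distribution: the fraction of comments in each class.
    distribution = {"positive": pos / n, "neutral": neu / n, "negative": neg / n}
    return {"composite": composite, "ratio": ratio, "distribution": distribution}


# The two cases from the discussion above, with 100 comments each.
mostly_neutral = ["neutral"] * 80 + ["positive"] * 10 + ["negative"] * 10
polarized = ["positive"] * 50 + ["negative"] * 50

for name, labels in [("mostly neutral", mostly_neutral), ("polarized", polarized)]:
    summary = sentiment_summary(labels)
    print(name, summary["composite"], summary["ratio"], summary["distribution"])
```

In this sketch both conversations come out with a composite score of 50.0 and a ratio of 1.0, even though one is 80% neutral and the other is completely split. Only the full distribution tells them apart, which is why the distribution is the most informative thing to keep. The composite and the ratio can always be rebuilt from it.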