So in this segment, we're gonna talk about how to map collaboration networks. We now have some insight into the building blocks that we can use to describe collaboration networks. But suppose we wanted to actually map and draw out a collaboration network for our own organization, or for a particular organization that we're interested in. How do we come up with those network maps? This was the new product development team that we've been looking at: 15 people, so a pretty small team. How do we understand which direction these arrows go in? Who actually collaborates with whom inside this particular team, or inside the larger organization? Now, this is a pretty simple network map. As I said, we only have 15 people here, but mapping networks can get much more complicated than this. Here is a network map from a much larger organization, where you've got several hundred people who are collaborating and you're trying to figure out who collaborates with whom. So this kind of network map looks like a giant fuzzy hairball, right? This is pretty typical if you're trying to map collaboration in a larger organization with more employees. What you often see in these kinds of network maps, the ones that look like giant fuzzy hairballs, is that you have clusters of employees who collaborate a lot, and then there are bridges to other clusters of employees who collaborate a lot with each other. So this kind of collaboration map, as you can imagine, is actually pretty complex, right? And so how do we go about generating this kind of map for our organization? What I wanna do is first give you a sense of what we're trying to get to here, what kind of data we want to collect, and then we'll back up and look at how we collect that data. So this is an example of the kind of network data that we're trying to collect, and this example is for the 15-member product development team.
So what we're trying to generate, basically, is a matrix that covers every individual in the organization, or in this product development team, all 15 of them. If you list them down the first column, and then you list the same 15 people along the row at the top, you're trying to find what values should be in each cell. In other words, take Alex as one member of this team. You wanna know how frequently Alex seeks information from Ali, from Bill, from Carl, from David, and so on, all the way down the column, all right? Now, this kind of matrix, for this particular network question, is also asymmetric. That means the values you've got in the bottom left corner are possibly gonna be different from the values you have in the top right corner. Alex could seek information from Ali, Bill, and Carl. But if you take Ali, Ali might not seek the same information from Alex as Alex seeks from Ali. In other words, Ali can have a different answer to that question: if I say I seek information from you, you don't necessarily say you seek information from me. So for each of these cells, you want to figure out the person's answer to the question, how frequently do I seek information from this person? Ali, Bill, Carl, David, all the way down. So this is the kind of matrix we're trying to populate. We're trying to figure out the values in each of these cells, because this is what we need in order to generate a network, right? So how do we get this data in the first place? There are two main ways. Probably the most commonly used way is through surveys, and the other is from other sources. There are lots of sources, and more and more sources are becoming available for collecting network data. But surveys are the main way that network data is usually collected inside organizations, so we're gonna focus mostly on that.
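To make the asymmetry concrete, here's a minimal sketch of that kind of matrix for just four people. The scores, the 0 to 5 frequency scale, and the choice to put the person seeking information on the rows are all assumptions for illustration, not values from the actual team.

```python
# Hypothetical members and scores, for illustration only.
members = ["Alex", "Ali", "Bill", "Carl"]

# seeks[i][j] = how frequently members[i] reports seeking information
# from members[j], on an assumed scale of 0 (never) to 5 (very often).
seeks = [
    [0, 4, 2, 1],  # Alex's answers
    [1, 0, 3, 0],  # Ali's answers
    [0, 2, 0, 5],  # Bill's answers
    [3, 0, 1, 0],  # Carl's answers
]


def asymmetric_pairs(names, matrix):
    """Return (a, b, a_to_b, b_to_a) for each pair whose two reports differ."""
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if matrix[i][j] != matrix[j][i]:
                pairs.append((names[i], names[j], matrix[i][j], matrix[j][i]))
    return pairs


for a, b, a_to_b, b_to_a in asymmetric_pairs(members, seeks):
    print(f"{a} seeks from {b}: {a_to_b}, but {b} seeks from {a}: {b_to_a}")
```

Comparing each cell with its mirror image across the diagonal is what shows the top right and bottom left corners holding different values.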
So how do you collect network data via surveys? This is fairly straightforward if you're used to administering surveys inside organizations and you know the basics of survey administration, but I'm gonna highlight a couple of things that are distinctive for network data. We're gonna have to think about how to identify our sample, create the survey, administer and monitor the survey, and clean and enter the data. So let's take each of those in turn, starting with identifying the sample. When you're thinking about a network, the first thing you have to do is figure out the boundaries of that sample. In other words, who is it that I want to survey? For a network, it's pretty important that you understand the boundaries, because you want to get everybody inside them. Maybe it's a formal unit or division or department of the organization. Maybe it's a particular geographical location: I wanna survey everybody in the New York office and the Michigan office and the Shanghai office. Maybe it's a community of practice: I want everybody who belongs to that particular community of practice. Maybe it's cohorts: I want the more senior executives, or I want people who have been in the organization between five and ten years. Or maybe it's a particular set of teams that you want to survey. So you need to figure out who it is that you want to survey, what the boundaries of that sample are. The second element for a network survey is that you want to be very careful about the number of people inside those boundaries, the number of people in the sample. You don't want too few, because there's not much point. If you're doing a network survey of fewer than 25 people, you're usually gonna pretty much know who collaborates with whom if you know anything about that unit already. You don't necessarily need to go through this whole process in order to find that out.
So if it's a very small sample, it might not be worth the effort. But you also have to be very careful on the other side that the sample isn't too large. We're gonna look in a minute at how you ask network questions, but essentially what you're doing is asking people to report on their ties with every single other person in the sample, so 300 is already a lot of people. If you're getting above 300, you're asking each person to report on 299 or more other people that they might interact with, and that's just very time consuming. It's the kind of thing people don't like doing in a survey. They'll stop answering the survey and you won't have any data. So you really need to watch that the sample doesn't get too large. Now, in terms of creating the survey, again, a lot of this is stuff that you'll know already if you have designed surveys before. You wanna be very careful to write a good opening statement where you explain what the purpose of the survey is. Also very important in a network survey, because you are asking about people's interactions with others by name, is to preserve confidentiality and assure people that if they report on their relationship with somebody else, it's not gonna be passed on to anybody else. You need to design network questions, and I'm gonna come to that in a little more detail in a second. You need to think about what other questions you want beyond the network questions. Maybe you wanna know about people's rank, or about their ambitions, or about whether they wanna leave the organization within a few years. Whatever other questions you wanna ask, you need to think very carefully, as in all surveys, about the order of the questions and how they're formatted. You need to test and refine the survey before you send it out, and as with most surveys, you wanna keep it to 10 to 15 minutes or so. It's very easy for network surveys to get very long, because again you're asking people to report on their relationships with lots of other people.
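To put numbers on that reporting burden, here's a small sketch of the arithmetic. It assumes the full-roster format described above, where every respondent is asked about every other person in the sample; the function names are just for illustration.

```python
def ties_per_respondent(sample_size):
    """Each respondent reports on everyone in the sample except themselves."""
    return sample_size - 1


def total_cells(sample_size):
    """Number of off-diagonal cells in the full who-seeks-from-whom matrix."""
    return sample_size * (sample_size - 1)


for n in (15, 25, 300):
    print(f"{n} people: {ties_per_respondent(n)} ties each, "
          f"{total_cells(n)} cells in total")
# 15 people: 14 ties each, 210 cells in total
# 25 people: 24 ties each, 600 cells in total
# 300 people: 299 ties each, 89700 cells in total
```

The burden on each person grows in step with the sample, but the amount of data you have to collect and clean grows roughly with the square of it.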
But trying to keep it to no more than 15 minutes is really gonna help your response rates. Let me just say a quick word on the next slide about the network questions, cuz this is obviously the heart of the kind of data you're trying to collect in a network survey. When you're asking network questions, as I've already indicated, you are asking the people responding to the survey to report on their relationships with everybody else in the sample. So here's an example of what that would look like for the 15-member product development team that we've been looking at. This network question says: below is a list of all the members of your product development team; how frequently do you go to each of these individuals to seek information related to your work? So if you take Alex, Alex is gonna have to fill in one of those bubbles for Ali, for Bill, for Carl, for David, and so on. This is fine if it's a 15-member team. If it's 25 people, it's already quite a lot, and if you're talking about 300 people, it's really a lot. When I've done network surveys inside organizations, the largest I've done has been about 280, and usually we try to whittle that number down pretty fast and say, well, just report on your most important relationships, and then let me ask you more questions about those. So there are ways you can handle this, but keeping an eye on the number of people that you're asking respondents to report on is gonna be very important. Okay, so back to the stages. The third stage is to administer and monitor the survey. Again, this is not rocket science. It really helps to have a cover note from a senior sponsor saying why we're doing the survey and that we really want you to fill it in. It also helps to think carefully about the timing of the survey: are you sending it out at a time when people are actually gonna be able to respond to it? But the piece I want to emphasize here is the incentives and the need for a high response rate.
So for a network survey, it's very important to have a very high response rate. There are many kinds of surveys you can do where, if you get a 10, 15, 20% response rate, you're gonna be happy, right? Climate surveys in organizations, for example. But in a network survey, that's nowhere near good enough. In a network survey, you're trying to map everybody's connections with each other, so if half the people don't respond, you're not gonna know what those people's networks are at all, right? You're gonna have a big gap in your data. So the rule of thumb is that you probably need a response rate of at least around 80% for a survey to be successful and for the data to be good enough to do network analysis on. That means you've gotta really think about these timing issues, the sponsorship issues, and maybe offering some kinds of incentives for people to respond to the survey. In a recent survey that I did with my research team, for example, we offered an incentive: if you fill in the survey and return it on time, we'll give $5 to Special Olympics. So it can be something tied to a current event, or something that the organization's invested in at the time. People sometimes say, we'll put your name in a hat and you could win an iPad. So there are lots of possibilities here. But because response rates are so important, thinking creatively about this might be more important than it is for some other kinds of surveys. And then the last element in terms of collecting network data by surveys: once you've got all the surveys returned, you collate, clean, and enter the data as you would for any survey. Very often for network data we do this in Excel, and we create a spreadsheet that looks a lot like the matrix that I showed you earlier.
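As a rough illustration of the collate-and-check step, here's a minimal sketch that computes the response rate and lays the returned answers out as a matrix. The names, the scores, the choice to treat unanswered cells as 0, and the use of `None` for non-respondents are all assumptions; the 0.80 threshold is the rule of thumb mentioned above, not a hard cutoff.

```python
# Hypothetical five-person sample; David never returned his survey.
sample = ["Alex", "Ali", "Bill", "Carl", "David"]
responses = {
    "Alex": {"Ali": 4, "Bill": 2, "Carl": 1, "David": 3},
    "Ali": {"Alex": 1, "Bill": 3},
    "Bill": {"Ali": 2, "Carl": 5},
    "Carl": {"Alex": 3, "David": 1},
}

MIN_RESPONSE_RATE = 0.80  # rule of thumb, treated here as an assumption


def response_rate(people, returned):
    """Share of people in the sample who returned a survey."""
    responded = [p for p in people if p in returned]
    return len(responded) / len(people)


def build_matrix(people, returned):
    """Rows are respondents, columns are the people they report on.

    A non-respondent's row is filled with None to mark missing data.
    Cells a respondent left blank are assumed to mean 0 (never).
    """
    matrix = []
    for person in people:
        answers = returned.get(person)
        if answers is None:
            matrix.append([None] * len(people))
        else:
            matrix.append(
                [0 if other == person else answers.get(other, 0)
                 for other in people]
            )
    return matrix


rate = response_rate(sample, responses)
print(f"Response rate: {rate:.0%}")
print("Meets rule of thumb:", rate >= MIN_RESPONSE_RATE)
for name, row in zip(sample, build_matrix(sample, responses)):
    print(name, row)
```

Notice that a non-respondent leaves a whole row empty, which is the big gap in the data that a low response rate produces.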
Once you've got it in Excel, you can use network analysis packages to analyze this data. I'm not gonna go into them here. It's a whole other module at least, and possibly a whole course by itself, to understand how to analyze the data using the available software packages. But at this point I'll just say that you can buy, or sometimes get for free, visualization packages that will help you plot those networks. With the giant fuzzy hairball, plotting that by hand is more or less impossible. You can get packages that will help you do that, and that will analyze the data and run the metrics, for example the measures of density and centrality that we were talking about. Some of the best known are UCINET and NetDraw, which help you analyze and visualize the data. So again, where that hopefully gets us is this matrix in Excel that tells us who seeks information from whom, in this case, and then we can analyze that data using these packages. So collecting network data via surveys is the most common way in which it's done, but there are some drawbacks that it's important to be aware of. The pros are that you can ask your sample exactly the questions you want and get customized, quite detailed, specific information. On the con side, just to underscore the issues that I've already mentioned: you've gotta have high response rates, the network can't be too large, and you can't make the survey too long. You can't ask everything in the world that you want to ask. You've got to decide what you really need to ask. You also have to be very careful in how you word the questions and how the questions get interpreted, as in any survey, right? If you ask people a specific question, who do you seek information from? Do you wanna ask them only about information related to their tasks? Do you wanna ask them about other kinds of advice?
Just understanding exactly what you're asking, and what people are gonna be thinking when they answer that question, is gonna make a big difference to whether you can actually understand what they say and take action based on it. Confidentiality, again, is very important because you're asking people to report on each other by name. And this, like many surveys, is a relatively costly method of data collection. It takes a lot of time and attention to develop a network survey and administer it. So what are our other options? Well, there are other sources that we can get network data from. I'll list some of them here, but there may be others that you are already aware of, and certainly there are others coming online all the time. Probably the biggest other source, the one that everybody is thinking about increasingly right now, is big data, right? Because we suddenly have all this access to what have been called the breadcrumbs of our interactions. We leave behind breadcrumbs when we interact with each other, and those can be used for other purposes. When we correspond by email, or by phone, conferencing, or videoconferencing, when we answer questions on bulletin boards, or when we interact via social media, each of those interactions leaves a record, and there are huge data sets available that you can use to generate network maps, right? Because those are dyadic interactions between people, you can put them into a matrix exactly like the one we were just looking at for the 15-person development team, and apply network methods to understand and analyze that data. So big data is a big opportunity to collect network data. There are also archival records, often from inside organizations. I use these in some of the research I do.
We go and get proprietary information that organizations are willing to share with us on things like how people have worked together in the past: have they worked on common projects, did they go to the same school, do they go to similar events? Anything that is a proxy for, if not a direct measure of, how much they might have interacted. So those are corporate databases. You can also sometimes use public databases to get, again, measures of how much people collaborate. A lot of research is being done on patenting. People patent together, or they co-author, or there are co-citations to their manuscripts, and those are all interactions of some kind between those people. And then finally, there's the more field-intensive approach, really going out there and trying to collect data from people as they're interacting. You can observe people as they're working together. You can ask people to keep diaries of who they're talking to and how much. And sometimes you can even have people wear electronic tags that monitor who they're talking to as they walk down the corridor. Each of those can provide the basis for network data. They can all provide this dyadic data about who is interacting with whom. Now, often you don't know exactly what's going on in that interaction, but they do let you know that some interactions are happening. So network data from other sources such as these has some advantages, right? Sometimes, particularly with big data, you can get information on much larger networks than you can get through a survey. It's less invasive, since you're often not asking people for anything. It may be less expensive if the data is available anyway. And maybe the measures are more objective than asking people in a survey to report what they're doing: here you actually have a record. Oh well, they emailed each other, so I know they did something. But on the con side, there are serious privacy concerns about using some of these kinds of data.
And if you have measures that are proxies, we don't necessarily know what they mean. We may know that people are emailing each other, but we don't necessarily know whether it's about work, or maybe it's about their social life or their kids. It could be anything, right? So maybe it's not such a good measure unless you start looking at the content of that interaction. And then finally, these large data sets are very promising, but as we know, large data sets can often generate statistically significant findings that look like there's something there when the effect is just not that big. They don't really matter that much. So these other sources have some disadvantages as well as some advantages, and both are worth noting. So that is it for mapping collaboration networks. I hope you have a good sense at least of the kind of data that we're trying to collect when we map networks, and then how we can use that data to start generating network maps that we can then analyze, interpret, and go on to evaluate.
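As a closing illustration of the breadcrumbs idea, here's a minimal sketch of how dyadic records from another source, such as an email log, could be tallied into the same kind of matrix as the survey data. The log entries and names are made up, and the counts only say that a message was sent, not what it was about, which is exactly the proxy problem noted above.

```python
from collections import Counter

# Hypothetical email log: one (sender, recipient) pair per message.
email_log = [
    ("Alex", "Ali"),
    ("Alex", "Ali"),
    ("Ali", "Alex"),
    ("Bill", "Alex"),
    ("Alex", "Carl"),
]
people = ["Alex", "Ali", "Bill", "Carl"]


def tally_interactions(log, names):
    """Count messages per ordered pair and lay them out as a matrix.

    Rows are senders and columns are recipients, so the matrix can be
    asymmetric. The diagonal is left at 0.
    """
    counts = Counter(log)
    return [
        [0 if sender == recipient else counts[(sender, recipient)]
         for recipient in names]
        for sender in names
    ]


for name, row in zip(people, tally_interactions(email_log, people)):
    print(name, row)
```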