Welcome to this module on Societal Impact. Even if you've done your data analysis in a fair way, even if your analysis is valid, your data is valid, you haven't made any mistakes, you don't intend to discriminate, and you don't have any privacy problems, you could still have societal impact in ways that you don't expect. I want to talk about a few of these things in this module, specifically the four bullets that I point to here, and let's begin by talking about distributional unfairness.

To explain what I mean by this, let's start with an example. We all hate potholes, and a few years ago some people had the idea that cell phones have accelerometers, sensors that can tell that the phone moved. If the same kind of technology can figure out how many steps you took, it can also figure out that your car just drove over a pothole. Cell phones also have GPS location. So the idea was that if you had the Street Bump app running in your car, every time you hit a pothole it could immediately report that to the city, with the precise location of the pothole. This application was deployed in Boston in 2012, and it's a very clever idea: crowd-sourced reporting by citizens. One would think that this would all work very well.

One issue that the Boston city government recognized, in fact even before they deployed this app, is that with this kind of crowd-sourced reporting, the reports are going to concentrate on the roads driven by people who are rich enough to have a car and a smartphone. In other words, poor neighborhoods might be underserved (a small simulation at the end of this segment illustrates the effect). So they actively worked to compensate, by having city employees in city vehicles drive around poorer neighborhoods, just to make sure that potholes in all parts of the city got reported, and not just the potholes that better-off people driving cars with smartphones would report. Here is a positive example of somebody proactively thinking about the disparate impact of an innovative technology.

So, when one does data analytics of any sort, one needs to think about the impact one could have on certain social groups, for example the poor, as in the preceding example. Unfortunately, all too often this notion of evaluating impact is far removed from the everyday concerns of the data scientists developing the algorithms. They're concerned with getting their algorithms to do the right thing, and that's hard enough; and if that's where the focus is, this notion of disparate impact feels too far away. A related point is that demographic groups have different views on technology and, for example, on how they view privacy. Sociologists have found that these differences are significant and disproportionate, so we shouldn't assume that the values we have as individuals are reflected in all segments of society. People's personal experience with technology, and with how certain technologies have been used, may make them react in very different ways.

Let's look at another example. When you travel internationally and enter a country, you're subject to a customs inspection. The point of this inspection is to keep people from smuggling in things they shouldn't be bringing. We know that most travelers aren't smugglers, and inspection is inconvenient, time-consuming, and expensive. The customs authorities don't have enough personnel to search every traveler and every piece of luggage, and so most travelers are not searched.
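Here is the small simulation promised in the Street Bump discussion above. It is only a sketch with made-up numbers (the neighborhood names, pothole counts, and reporting rates are all hypothetical), but it shows how crowd-sourced reports can under-count problems in neighborhoods where fewer residents drive with a smartphone app.

```python
import random

random.seed(0)

# Hypothetical neighborhoods: the same number of potholes, but very
# different rates of residents driving with the reporting app installed.
neighborhoods = {
    "affluent":   {"actual_potholes": 100, "reporting_rate": 0.60},
    "low_income": {"actual_potholes": 100, "reporting_rate": 0.15},
}

for name, info in neighborhoods.items():
    # A pothole is reported only if an app-equipped driver happens to hit it.
    reports = sum(
        1 for _ in range(info["actual_potholes"])
        if random.random() < info["reporting_rate"]
    )
    print(f"{name:10s} actual={info['actual_potholes']:4d} reported={reports:4d}")
```

Both neighborhoods have exactly the same number of potholes, but the raw report counts alone would suggest spending several times as much repair effort in the affluent one, which is precisely the bias Boston tried to compensate for.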
The customs authorities have the right to search everybody, but they don't search everybody. The question is, how do you choose which travelers to search? Obviously, they search the travelers who are most likely to be smugglers. Of course, even the most likely candidates are actually not very likely to be smugglers; they're just more likely than the rest of us, who are really, really unlikely. The more important, more interesting question is: how do they figure out who is most likely to be a smuggler? Presumably, an experienced customs agent has antennae that pick up on somebody who is behaving in a suspicious way, or has unusual luggage, or whatever. But let's suppose for a moment that an algorithm is doing this. What if the algorithm chooses the "most likely" travelers based on various characteristics, but it turns out that the choice is driven primarily by country of origin? Now what we find is that travelers from certain "chosen" countries are likely to be stopped for a customs search, and travelers originating from other countries are not. What's going to happen is that travelers from such a "chosen" country are going to feel discriminated against, because they're going to be stopped and searched all the time. They'll exchange stories on social media about customs harassment, and now we have a segment of the population that feels there is something wrong about customs enforcement, and perhaps about the nation as a whole. And note that all of this started from a possibly not unreasonable choice made by a data-driven algorithm.

One way to think about this is to compare it against stereotypes. As humans, we build stereotypes. They're a shorthand, because our brains have lots of things to do; there's usually a grain of truth behind them, and we jump to conclusions. It's a common human tendency. We know objectively that it's completely unfair to the individual who's being typecast; we know that we're inflicting on this individual our opinion of some group that this individual belongs to. But it's a shortcut: when there's somebody we don't know a lot about, we just use the stereotype to guess what they might be like. Algorithms, in effect, are doing the same thing. They're using various attributes that they know about an individual to classify them, for example to classify them as potential smugglers. And if this classification is based on somebody's membership in some group, based on the value of some attribute, well, that's what the algorithm was asked to do. So, in effect, we are doing the same sort of thing as we do with stereotypes, except that now we have a measurable, objective basis for building the stereotype.

Let's look at some numbers to see how this works. I have no idea how many terrorist sympathizers there are in the U.S., but let's suppose there are a thousand such people, and let's even suppose that the vast majority of them are Muslim: say 900 Muslim terrorist sympathizers in the U.S. and 100 non-Muslim terrorist sympathizers. Then it's not unreasonable to say that being a terrorist sympathizer in the U.S. implies, with 90% probability, that the person is a Muslim. Note that this is a directional implication; it does not mean that somebody being a Muslim implies that they're a terrorist sympathizer. There are three million Muslims in the U.S.
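Before drawing the conclusion, it's worth working out both directions of the implication. This is just the arithmetic on the made-up numbers from the example, written out as a tiny calculation:

```python
# Made-up numbers from the example above.
sympathizers_total = 1_000
sympathizers_muslim = 900
muslim_population = 3_000_000

# P(Muslim | sympathizer): the direction the assumed data supports.
p_muslim_given_sympathizer = sympathizers_muslim / sympathizers_total

# P(sympathizer | Muslim): the direction a naive profiler acts on.
p_sympathizer_given_muslim = sympathizers_muslim / muslim_population

print(f"P(Muslim | sympathizer) = {p_muslim_given_sympathizer:.2f}")   # 0.90
print(f"P(sympathizer | Muslim) = {p_sympathizer_given_muslim:.5f}")   # 0.00030
print(f"That is about 1 in {muslim_population // sympathizers_muslim}")  # 1 in 3333
```

The 90% figure and the 0.03% figure come from exactly the same assumed data; acting on the second as if it were the first is the mistake the example is warning about.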
So, even assuming the highly skewed sympathizer population that I assumed in this example, less than one in 3,000 Muslims would be a terrorist sympathizer. If we go from "terrorist sympathizer implies Muslim" to "Muslim implies terrorist sympathizer", we're being really unfair to the 2,999 out of every 3,000 Muslims who are not terrorist sympathizers, just because there is one who might be.

This sort of problem arises when we think about using data science for decisions where we're actually going to arrest somebody, or prevent somebody from doing things, in a way that severely restricts their freedom. If you say, "I want to preemptively arrest a person with criminal intent," well, that's something that in general you can't do. You might say terrorism is sufficiently bad that, as a society, we agree there is a possibility of doing this. But if you're going to go there, you need to recognize that prediction is probabilistic. It only indicates likelihood, and therefore it justifies, at most, greater surveillance. So it's appropriate to do things like auditing more of the tax returns of people who are more likely to have cheated. Yes, audits are inconvenient, they're stressful, and they can take up a lot of time, but the harm we do to somebody by unnecessarily subjecting them to a tax audit is much smaller than the harm we do to somebody by locking them up.

Even if one gets past this probability issue, there is another problem with things like predictive policing. Let's say we are deploying police forces to higher-crime areas in greater numbers. We're not actually arresting anybody; we're not doing anything that really hurts anybody; we're just deploying police to particular neighborhoods. Well, one thing we have to recognize is that more surveillance can lead to more detected crime. Consider, for example, a system in which a police car follows known bad drivers every time they get on the road. Every little mistake they make results in an automatic ticket, and unless they drove absolutely perfectly, they would just keep racking up points and would never get off the list of known bad drivers. In contrast, the rest of us, who are far from being perfect drivers, can get away with a few small mistakes every now and then. You know, drive a few miles over the speed limit, not use a turn signal when switching lanes, or roll through a stop sign. We break traffic rules in minor ways, and most of the time there are no consequences, because we just aren't policing every little infraction for most of us, most of the time. In other words, we have a self-fulfilling prediction: the prediction leads to surveillance, which leads to finding faults (the small simulation at the end of this segment illustrates the loop).

The same idea of predictive policing can be applied in a broader societal context. China has the goal of preventing another uprising like the Tiananmen Square protests, and has said it is using predictive analytics to create a pre-crime unit similar to the one in the film Minority Report. This builds on the notion of a citizen's history file, or "dang'an", and is a new system that they intend to deploy initially in troubled regions: Xinjiang and Tibet.

So, to sum up: we've got to own the consequences of data science. Our predictions are probabilistic; we know they will sometimes be wrong. And if we know that, we've got to understand the societal cost of the errors we make.
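Here is the simulation promised above for the self-fulfilling prediction. It is a deliberately simplified sketch with made-up numbers (the two areas, patrol counts, and detection probabilities are all hypothetical): both areas generate exactly the same number of infractions each period, detection depends on patrol presence, and patrols are reallocated in proportion to last period's detected crime.

```python
import random

random.seed(1)

# Two areas with the SAME underlying number of infractions per period.
true_infractions_per_period = 100
periods = 10
total_patrols = 20

# Start with a slightly uneven allocation of patrols.
patrols = {"area_A": 12, "area_B": 8}

for t in range(periods):
    detected = {}
    for area, n_patrols in patrols.items():
        # The chance an infraction is noticed grows with patrol presence.
        p_detect = min(1.0, 0.04 * n_patrols)
        detected[area] = sum(
            1 for _ in range(true_infractions_per_period)
            if random.random() < p_detect
        )
    # Next period: reallocate patrols in proportion to detected crime.
    total_detected = sum(detected.values()) or 1
    patrols = {
        area: round(total_patrols * d / total_detected)
        for area, d in detected.items()
    }
    print(f"period {t}: detected={detected} patrols_next={patrols}")
```

Detected crime tracks patrol presence rather than underlying behavior: the initial imbalance tends to persist, even though the true infraction rates in the two areas are identical. That is the feedback loop behind the bad-drivers example.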
One thing we have to recognize is that the costs of errors are usually different for errors of type 1 and type 2: that is, type 1 errors, where we falsely classify somebody as dangerous, or a criminal, or a bad potential employee, or something like that, versus type 2 errors, where we wrongly classify somebody as safe, or not a criminal, or a perfectly good employee. Given this asymmetry in the costs of errors, we need to factor it into the algorithm itself and tune the algorithm so that we minimize the societal cost. We actually know, technically, how to do this.

In fact, when we build search engines, for example, we have a trade-off between recall, which is how comprehensive the result set is, and precision, which is how little irrelevant stuff gets included. Search engines intentionally dial this trade-off towards being comprehensive rather than being precise. We could do the same thing in any classification algorithm.

The hard part is that we need to own the weights we assign to the two types of errors. We can easily see that the costs are asymmetric, but the algorithm needs them to be quantified, and since that quantification is difficult, data scientists will often say, "I don't know what those weights should be, so I'm going to assume they're equal." I think that even if one doesn't know exactly what the weights should be, one can do a lot better than just assuming they're equal.
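To make that last point concrete, here is a minimal sketch of one standard way to put explicit, unequal weights on the two kinds of errors: derive the decision threshold from the assumed costs instead of defaulting to 0.5. The cost numbers are made up for illustration; the point is only that any explicit, defensible choice is better than silently assuming the costs are equal.

```python
def decision_threshold(cost_false_positive: float, cost_false_negative: float) -> float:
    """Probability above which flagging a case minimizes expected cost.

    Flag when cost_fp * (1 - p) <= cost_fn * p,
    i.e. when p >= cost_fp / (cost_fp + cost_fn).
    """
    return cost_false_positive / (cost_false_positive + cost_false_negative)

# Equal costs: the familiar 0.5 threshold falls out automatically.
print(decision_threshold(1.0, 1.0))    # 0.5

# Hypothetical asymmetry: wrongly flagging somebody as dangerous (a type 1
# error) is judged ten times as harmful as missing a genuinely risky case.
print(decision_threshold(10.0, 1.0))   # ~0.91: flag only when very confident

# The reverse asymmetry, e.g. selecting tax returns for audit, where a missed
# cheat costs more than an unnecessary (but comparatively low-harm) audit.
print(decision_threshold(1.0, 5.0))    # ~0.17: flag more readily
```

The same idea carries over to the precision/recall dial in search: how far you turn it should come from an explicit judgment about which error hurts more, not from a default.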