That brings us to our next point, which is discrimination. Discrimination and personalization go together, because in the previous example we had a personalized recommendation of who to invite to a job interview, and we can take this further. For example, when you shop online, say on a travel search engine or an online travel agency, we see price discrimination. You and I might search for the same place but be offered different prices. Now, from the point of view of economic efficiency, that is actually in agreement with economic theory: supply and demand adjust, and when they adjust at the individual level this is called first-degree price discrimination, which is optimal for economic efficiency. But it still feels like: why do you pay more than I do? Often it is more subtle; the prices are not actually different, just the order is different. They would recommend me a different hotel than they would recommend to you, so the ordering of hotels differs, which is because of personalization. So yes, personalization: they know the kinds of hotels I want and the kinds of hotels you like, from content-based and collaborative filtering, but it feels like discrimination as well. We just get discriminated against here. There is a growing body of studies that shows how unintentionally this discrimination comes about when we use these machine learning algorithms to make decisions.

For example, you probably took the SAT before you came to college. The Princeton Review's online SAT tutoring packages turn out to cost between $6,600 and $8,400, depending on the ZIP code. If you ask the Princeton Review, they say pricing is based on the cost of running their business. Yes, in some neighborhoods it might be more expensive to operate than in others, and on the competitive attributes of the given market. Whether a market is competitive or not also matters, and that makes sense too: San Francisco is more expensive than other places, and within Los Angeles some neighborhoods are more expensive than others. So yes, it makes sense that you adjust your price; not everything costs the same everywhere. Prices follow supply and demand; some people have money, some don't. But what turned out, once they did that, is that Asians are more likely to be among those charged higher prices by the Princeton Review. Now, why might that be? The Princeton Review says they did not single out Asians in order to charge them more; that would be ethnic or racial discrimination. So why did it happen? Why do you think that happened? Well, the explanation is right here on the sheet. Asians make up about five percent of the US population overall, but they account for more than eight percent of the population in areas where the Princeton Review charges higher prices for its SAT packages. So what actually happens is that Asians just happen to live in more expensive areas, but looked at the other way around, it turns out that Asians pay more. Did they discriminate against Asians? No, it was unintentional.
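To make that mechanism concrete, here is a minimal sketch in Python with entirely made-up numbers (the ZIP codes, populations, and group shares are hypothetical, not the Princeton Review's actual data). It simply compares a group's share of the overall population with its share of the population in high-priced ZIP codes, which is the kind of disparate-impact check behind the five-percent-versus-eight-percent observation.

```python
# Minimal sketch (hypothetical data): how a ZIP-code-based pricing rule
# can produce disparate impact on a group that was never used as an input.
import pandas as pd

# Hypothetical ZIP-code-level data: price tier charged and the share of
# each ZIP's population belonging to the group of interest.
zips = pd.DataFrame({
    "zip":         ["94110", "94016", "60601", "73301", "30301"],
    "high_price":  [True,    True,    False,   False,   False],
    "population":  [70_000,  55_000,  80_000,  65_000,  90_000],
    "group_share": [0.12,    0.09,    0.04,    0.03,    0.04],
})

zips["group_pop"] = zips["population"] * zips["group_share"]

overall_share = zips["group_pop"].sum() / zips["population"].sum()
high = zips[zips["high_price"]]
high_price_share = high["group_pop"].sum() / high["population"].sum()

print(f"Group share of total population:      {overall_share:.1%}")
print(f"Group share in high-priced ZIP codes: {high_price_share:.1%}")
# If the second number is clearly larger than the first, the ZIP-based
# pricing has a disparate impact on the group, even though group
# membership never entered the pricing rule itself.
```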
They discriminate according to ZIP codes, but Asians happen to live in those ZIP codes, so it turns out you discriminate against Asians. Don't you? Do you? How could you prove it? That is a very contentious issue, for researchers as well, because you have this disparate impact in the data analysis. By definition, data mining is always a form of statistical, and therefore seemingly rational, discrimination. That is what data mining does. Indeed, the very point of data mining is to provide a rational basis upon which to distinguish between individuals, and to reliably confer on individuals the qualities possessed by those who seem statistically similar, as we did, for example, in collaborative filtering. So the idea is always to look for similarities and differences. Data analysis means you look for differences. Data discriminates: you look at how you can put people in different boxes, and once you are dealing with boxes, you discriminate. Data analysis is inherently that, and it can have these unintended consequences.

Now, I gave you a couple of examples of how artificial intelligence can be racist, can discriminate against race, or can have these unintended consequences, just because we feed it biased data, or the data itself is biased, or some other aspect leads to discrimination. The good news is that we can work on that. The even better news is that it is much easier to change an algorithm than to change a human brain. We know we all have prejudices. We are all biased. We all have stereotypes; evolutionarily, they had a very important role. Stereotypes help us take decisions quickly that protect us, potentially wrongly, but better safe than sorry. So evolutionarily, stereotypes had, and still have, a very important role, but we cannot get them out of our minds. Even if you go on to sit on the Supreme Court and you have 40 years of training as a judge to be completely impartial, study after study has shown that you will still not be impartial. You will still be biased, you will still have prejudices and stereotypes and so forth. We cannot get them out of this information processor. In algorithms, however, and this is a growing, blooming, and very active field of research, we can train an algorithm that is still accurate, that still discriminates in the sense of putting people in boxes so we can make predictions, but that does not discriminate against the few things that are protected by our constitutions and laws. For example, if you know that what is protected by law is gender equality, race and ethnicity, and religion, you can take these three or four variables and make sure that while you build your machine learning algorithm, for example a decision tree or whatever you want to use, these variables are not discriminated against. Now, you lose some information, because with more data you can make better predictions. These variables you do not want to use; you actually want to make sure that they are equally represented in the outcome. So you lose accuracy. From a computer science or statistical perspective, from a data analytics perspective, losing accuracy sounds horrible. But it has been shown that you actually have to lose very little accuracy. Instead of an accuracy of 90 percent you might have 89.5 percent. So you lose very little accuracy, but the outcome is that you can completely eradicate the bias.
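As an illustration of that trade-off, here is a minimal sketch with synthetic data and one deliberately simple fairness fix: per-group decision thresholds that equalize selection rates across the protected group. This is only one of several possible techniques, and all the variable names and numbers below are assumptions, not a description of any particular system. It compares accuracy and per-group selection rates before and after the adjustment.

```python
# Minimal sketch (synthetic data): trade a little accuracy for equal
# selection rates across a protected group, via per-group thresholds.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000
group = rng.integers(0, 2, n)                        # protected attribute (0/1)
skill = rng.normal(0, 1, n)                          # legitimate predictor
proxy = skill + 0.8 * group + rng.normal(0, 1, n)    # proxy correlated with group
y = (skill + rng.normal(0, 0.5, n) > 0).astype(int)  # "good candidate" label

X = np.column_stack([skill, proxy])
clf = LogisticRegression().fit(X, y)
scores = clf.predict_proba(X)[:, 1]

def report(pred, label):
    acc = (pred == y).mean()
    sel0 = pred[group == 0].mean()
    sel1 = pred[group == 1].mean()
    print(f"{label}: accuracy={acc:.3f}, selection rate g0={sel0:.2f}, g1={sel1:.2f}")

# Unadjusted: one global threshold; the proxy pulls one group's scores up.
report((scores > 0.5).astype(int), "unadjusted")

# Adjusted: pick a threshold per group so both groups are selected at the
# same overall rate (demographic parity); accuracy drops only slightly.
target_rate = (scores > 0.5).mean()
pred_fair = np.zeros(n, dtype=int)
for g in (0, 1):
    mask = group == g
    thresh = np.quantile(scores[mask], 1 - target_rate)
    pred_fair[mask] = (scores[mask] > thresh).astype(int)
report(pred_fair, "adjusted  ")
```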
So if you used an adjusted algorithm like that to decide who to invite to a job interview, it would still make extremely good predictions, and the variables you predefined, gender, race, and religion, for example, would then be equally represented among your job interviews; you can guarantee that. No human resources manager can guarantee that, because we simply cannot help it, and our prejudices will creep in, intentionally or unintentionally. So that is the good news, and a blooming field of research: how to make algorithms that discriminate, that learn to distinguish on some aspects, who the beneficial clients are, who the good job candidates are and who are not, but that do not discriminate on others, the things that we want to have protected, that we do not want to discriminate against.