A balancing act: privacy, explainability, fairness. In this lesson, our goal is to evaluate the tradeoffs involved in ethical model production. How do we put these privacy methods into practice? Well, in previous courses in this specialization, we've said that our definition of an ethical model is one that is, first, accurate; second, explainable; and third, fair. In the first course, we covered what makes algorithms accurate and how machine learning works. In the second course, we discussed what makes models fair and how to reduce bias and increase fairness for the different groups being fed into our model. And now we arrive at explainability. To explore it, we need to look at explainability in the context of privacy, because on the surface it would seem that the more protected a model is, the more of a black box it becomes. So how do we balance individual user privacy with ethical, explainable models?

Well, first, why build an ethical model at all? The truth is that in a standard commercial application, a typical model has many advantages over an ethical model. First, it is more accurate than the competition, which leads to increased profits and a lead in the space. Second, it's less explainable, which means competitors cannot copy and improve upon the algorithm, and that helps increase the lead. And third, it's less fair, which doesn't really matter much to the business as long as people aren't paying attention and there's no press around it. With an ethical model, you have to give up some accuracy for fairness. In addition, an explainable model brings your algorithm's reasoning closer to the surface, therefore making it easier for competitors to copy your model. And of course it's more fair, which may matter to us, but in a business sense that might not matter much beyond its use as a marketing tactic. So, as a more explainable model gets closer to open source, does this really make business sense?

Well, there are some hidden advantages to explainable models. The first is broad adoption. With an explainable model, you are able to tap into the machine learning community, and this makes your company more like a research institution, publishing papers and participating in the advancement of the overall machine learning community. And this has broad effects. Second is primary credibility, which helps with brand strength and security, because having primary credibility means that anyone who needs help with the model, or with one of the subjects the research papers cover, is going to reach out to your company and not someone else for assistance. And third is recruiting leverage. Companies with explainable models and published research have an advantage in that they can attract graduates coming out of top universities who are working on these topics. You can imagine a PhD graduate reading a paper published by a company with an explainable model; they are now more likely to apply to that company. We have actually seen this trend over the last couple of years. In early 2020, the famously closed and secretive company Apple launched a machine learning research blog, and as we saw previously in our reading, they've also been very open about their differential privacy practices. The trend is that, especially for well-established companies, the advantages of ethical modeling are paying off and outweighing the advantages of typical models. And this is an important trend to follow. However, there are some limits to openness, especially when it comes to security.
If we think about the many AI companies working with governments or government contractors, there is a lot of sensitive data here, and we have to apply limits to openness, if not for ethical reasons then for legal ones. There have to be certain groups protected and certain information that cannot be made public. Second is individual privacy versus the aggregate. Even though differential privacy works very well to protect individuals, it cannot protect against information that is still sensitive in the aggregate. Take what happened in 2018, when the fitness app Strava, which used a heat map to show popular running routes and fed those routes into its recommendation model, accidentally leaked sensitive military information. Individual soldiers were wearing fitness trackers, and whether the service was protecting them with k-anonymity or with differential privacy on the dataset (we're not exactly sure which), it did not matter: in aggregate it was possible to see base patrol routes. And this led to quite the scandal. So it's very important to realize that as we move toward open models, we still have some limits to openness, and we'll explore more in the next lesson.
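To make that aggregate-leak point concrete, here is a minimal sketch in Python, assuming a simple per-cell Laplace mechanism over a toy grid of map cells. This is an illustration of the general failure mode, not a claim about how Strava actually processed its data: even when the noise is large enough to hide any one individual's contribution, a route that many individuals share still stands out in the aggregate heat map.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy world: a 20x20 grid of map cells. 50 soldiers each log the same
# patrol route, plus a few random pings elsewhere on the map.
GRID = 20
route_cells = [(r, 5) for r in range(3, 17)]           # the shared patrol route
counts = np.zeros((GRID, GRID))

for _ in range(50):                                     # 50 individuals
    for cell in route_cells:                            # everyone walks the route
        counts[cell] += 1
    for _ in range(10):                                 # plus 10 random pings each
        counts[rng.integers(GRID), rng.integers(GRID)] += 1

# "Protect" each individual with the Laplace mechanism on the cell counts.
# Sensitivity is roughly how many cells a single person can touch.
epsilon = 1.0
sensitivity = len(route_cells) + 10
noisy = counts + rng.laplace(scale=sensitivity / epsilon, size=counts.shape)

# The aggregate still gives the route away: noisy counts along the patrol
# route remain far above the background, even though no single person's
# contribution can be pinned down.
route_mean = np.mean([noisy[c] for c in route_cells])
background_mean = noisy.mean()
print(f"mean noisy count on route:  {route_mean:.1f}")
print(f"mean noisy count overall:   {background_mean:.1f}")
```

The per-record noise protects each individual, but the route shared by everyone dominates the aggregate, which is exactly the kind of failure the Strava incident exposed.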