The reason that ethics matters in data science is usually that whatever we do when we practice data science has an impact on humans. To understand how people have thought about this, in this module we're going to look at human subjects research and the concept of informed consent. Let's begin by talking about what human subjects research means.
There is an infamous study conducted at Tuskegee, Alabama, run by the U.S. Public Health Service in collaboration with what was then the Tuskegee Institute, and later overseen by the Centers for Disease Control, where the idea was to try to understand the development of untreated syphilis. In 1932, syphilis didn't have good treatments, and so understanding the kind of debilitation it caused was possibly a reasonable thing to do from a medical investigation point of view. To be able to conduct the study, the organizers recruited 600 black sharecroppers in Alabama and set up a fairly conventional-looking study. There were 399 participants who had syphilis and 201 who didn't, so the researchers could compare the disease group against the control group on various measures. What the participants were required to do was show up at the clinic on a regular basis, where they'd be given a health exam and health care for other small things that might be affecting them. They'd get hot meals, and so they participated. At these visits there could also be things like blood draws, and the blood specimens would then get analyzed as part of the study.
The problem is, antibiotics became available. Penicillin became the standard treatment for syphilis by 1947, but nobody ever bothered to tell the subjects of the study. Nobody treated the subjects of the study with penicillin. And the study just went on, and on, and on, until in 1972 there was a public outcry driven by some whistleblowers.
And there was a strong general feeling at that point that whatever medical value the study may have had, the harm that was being done, in terms of having several hundred poor people not being treated for syphilis when treatment was available, was just not conscionable. The result was the creation of what is called an IRB, which we'll talk about in a minute. The thing that these IRBs, these review boards, monitor is a process called informed consent. The idea is that if any research is conducted on a human subject, then this human subject must be informed about the experiment, must consent to the experiment voluntarily without any coercion, and must have the right to withdraw consent at any time. So even if they agreed to participate in the experiment and serve as a subject, if after a while they change their mind and wish to drop out, they should have the right to do so.
These principles of informed consent were not met at Tuskegee because the subjects were not informed about the experiment. They were misinformed: they were not told about possible treatments, and they were not told that their syphilis would not be treated. With this incorrect information, they did voluntarily consent to the experiment. And maybe they had a right to withdraw consent at any time, but it's not clear that anybody ever told them that they had that right.
The big issue here is that the experimenter was assessing the benefit versus the harm. Usually in these sorts of situations, the harm is to the individual subject, and the benefit is to society or to science. And if the benefit is to one party and the harm is to another party, the assessment is sometimes different than if the harm is to the same party as the one that benefits. A key principle of informed consent is that the party that might potentially be harmed, that is, the human subject, is the one who has to decide that, on balance, the benefit to society, and possibly the way that's reflected back to them in terms of payments or free food or whatever it is that the human subjects were getting, is worth the risk to them.
Since it is difficult for the human subject to be fully informed about the details of the benefits and harms, there is this notion of an institutional review board, or an IRB. What this IRB is supposed to do is look at the human subject study, try to weigh the harm versus the benefits, and make sure that the informed consent principles are appropriately followed. This board has a diversity of membership, including non-scientists. It's supposed to include some scientists who can make a pitch for what the scientific value is, but also non-scientists who represent society in broad terms. The institutional review board has to approve the study. And in particular, it approves the informed consent form that the human subjects will actually sign before they can participate in the study.
Now it turns out that informed consent has exceptions. For example, in psychology there is often something that somebody wants to study, say, people's reactions to race. If that's what they want to study, they may have study assistants, the people who conduct the experiment, of different races. The study itself is trying to see how the human subjects react to scientists of different races, let's say, but that's not what they're told. What they're told is that the study is about something completely irrelevant, which is just an excuse for their participating in the study. This kind of ruse is a very common thing in psychology experiments. And the problem with having any kind of ruse is that you're actually not informing the subject.
The Institutional Review Board is particularly careful when there are violations of the informed consent principle. It tries to make sure that the violation is the minimum necessary for the study to be conducted, and that the type of violation is such that harm to the human subject is unlikely. In settings where harm is much more likely, such as medical studies, one usually doesn't need to violate informed consent, because the study doesn't need a ruse. And therefore one would not expect an IRB to approve such violations.