Large, complex health systems exist in an environment of explicit incentives intended to drive the behavior of patients and healthcare providers. For example, increasing copayments for services such as emergency care has been shown to decrease utilization of those services dramatically. These incentives are often intended to improve the efficiency of healthcare. If that's your goal, there are two ways of achieving it: first, you could improve the quality of healthcare, or second, you could reduce healthcare costs or the utilization of healthcare resources.

What does this have to do with AI? The existence of incentive structures built into healthcare systems through payment models means that physicians, hospitals, and insurers need to measure things related to the quality of care, costs, and utilization. These incentive structures have also led to classifying and predicting features of patients and patient care. It turns out that AI is particularly good at classifying and predicting, especially using really large datasets and especially in terms of probabilities. AI-based predictive analytics are feverishly being applied to health data for exactly these tasks. These AI models, like more traditional predictive models, use statistical techniques to correlate features of patients, such as vital signs or diagnoses, with outcomes, such as mortality or length of stay in the hospital. But there are now so-called big data: really large numbers of patients and huge amounts of electronic information, collected from individual patients, from medical records, digital images, and monitors. This means that AI is not only useful for analyzing all these data, but actually necessary for improving the accuracy of classification and prediction.

AI-based models can help healthcare systems, providers, and insurers classify and predict things that allow them to respond to the incentives I just talked about. That is, the models are designed to predict or identify sources of risk, where risk means the probability of harm. One way I'm defining risk here is as the probability of facing a financial loss associated with the use of healthcare. AI can be particularly good at identifying risk that is unexpected or less obvious, but, as I'll talk about next, it has limitations when applied to health data, and these limitations have ethical implications. Another type of risk is the probability of medical harm associated with the use of healthcare. This type of harm could come from medical error in diagnosis or treatment, or from undertreatment or overtreatment driven by uncertainty or by incentive structures. Note, however, that financial risks are largely measured and predicted from the perspective of healthcare systems, providers, and insurers. That means they are considering mainly financial risk to themselves, not to patients.

What are some examples of applications that use AI-based predictive analytics? Here's one. An insurance company builds an AI model that uses insurance claims, electronic health records, and consumer data to predict which of its members are likely to incur the highest costs of care over the next year. An ethical challenge arises in building this model, because we know that vulnerable populations, such as the poor, people with disabilities, and members of racial and ethnic minorities, tend to incur disproportionately high healthcare costs.
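To make the insurer example concrete, here is a minimal sketch of what such a cost-prediction model might look like. Everything in it is an assumption for illustration: the feature names, the synthetic data, and the choice of a gradient-boosted regressor all stand in for the proprietary claims, EHR, and consumer data a real insurer would use.

```python
# A minimal, hypothetical sketch of a next-year cost-prediction model.
# All column names and data here are synthetic stand-ins, not a real
# insurer's pipeline.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
n = 5_000

# Synthetic member-level features standing in for claims/EHR variables.
members = pd.DataFrame({
    "age": rng.integers(18, 90, n),
    "num_chronic_conditions": rng.poisson(1.5, n),
    "prior_year_cost": rng.gamma(2.0, 2_000, n),
    "er_visits_last_year": rng.poisson(0.4, n),
})

# Hypothetical target: next-year cost, driven partly by the features
# plus random noise.
next_year_cost = (
    500
    + 40 * members["age"]
    + 3_000 * members["num_chronic_conditions"]
    + 0.5 * members["prior_year_cost"]
    + 2_500 * members["er_visits_last_year"]
    + rng.gamma(2.0, 1_000, n)
)

X_train, X_test, y_train, y_test = train_test_split(
    members, next_year_cost, test_size=0.2, random_state=0
)

model = GradientBoostingRegressor(random_state=0)
model.fit(X_train, y_train)

pred = model.predict(X_test)
print(f"MAE on held-out members: ${mean_absolute_error(y_test, pred):,.0f}")

# Flag the predicted top 5%: the members an insurer would target
# for outreach or care management.
threshold = np.quantile(pred, 0.95)
print(f"Members flagged as highest predicted cost: {(pred >= threshold).sum()}")
```

The ethical challenge described above sits exactly here: whoever lands above that threshold gets treated differently, so any group-level distortion in the predictions is carried straight into action.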
But that could be because these populations are sicker when they do enter the healthcare system and thus need emergency or specialty care, which in turn could be because these populations have low access to preventive or primary care. To minimize the chances that predictive models discriminate against patients in these vulnerable groups, it's important for the model to risk-adjust accurately. Models need to distinguish patients who incur high costs because they are sicker or have greater medical needs from patients who incur high costs for other reasons, and to separate the effects of variables such as cost, severity of illness, or medical need from the effects of characteristics such as race.

It is also important to be aware that in the healthcare domain in particular, the data that are available rarely reflect what we actually want to measure. That means the measurements we use for predicting and classifying are proxy measures. Using proxy measures introduces the possibility of systematic error, or bias, because proxies are always imperfect. And wherever the possibility of bias is introduced, there is also the possibility of discrimination, which is bias that leads to negative consequences for certain groups. The sketch after this section's closing paragraph shows how an imperfect proxy can produce exactly that kind of group-level harm.

It's also important to recognize the limitations of models created when you lack the data necessary to really explain the outcome of interest. For example, risk adjustment models that use health data typically do not include social factors such as income or education, even though these factors are well known to have strong effects on all health-related outcomes, including hospital readmissions.

This example illustrates some specific features of AI model design that have ethical implications and could lead to harm to patients. Deep knowledge of the clinical characteristics of the patients represented in datasets, and familiarity with the limitations of the available data, are necessary to avoid ethical pitfalls in model design.
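Here is the hedged sketch of the proxy problem referenced above. It uses entirely synthetic data and hypothetical group labels; the point is only to show the mechanism: when observed cost is used as a proxy for medical need, a group with lower access to care must be sicker before it gets flagged.

```python
# Hypothetical audit of a proxy measure: is actual medical need
# comparable across groups among the members a cost-based score flags?
# Groups, access effects, and the "need" measure are all synthetic.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 10_000

group = rng.choice(["A", "B"], n)   # hypothetical population groups
need = rng.poisson(2.0, n)          # true medical need (condition count)

# Assumption for illustration: group B has lower access to care, so the
# same level of need generates systematically lower observed cost.
access = np.where(group == "B", 0.6, 1.0)
observed_cost = need * 4_000 * access + rng.gamma(2.0, 500, n)

df = pd.DataFrame({"group": group, "need": need, "cost": observed_cost})

# A cost-prediction model would flag the highest-cost members; here we
# use observed cost directly as the risk score to isolate the proxy effect.
df["flagged"] = df["cost"] >= df["cost"].quantile(0.90)

# Audit: among flagged members, compare actual need across groups.
print(df[df["flagged"]].groupby("group")["need"].mean())
# If cost were an unbiased proxy for need, flagged members in A and B
# would have similar mean need. With unequal access, flagged B members
# are sicker, meaning equally sick B members are being missed.
```

This is the same pattern the text warns about: the model can be accurate about cost yet still discriminatory about need, and nothing in standard accuracy metrics will surface that unless someone runs an audit like this one.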