Hello, everyone. Now that we've defined the concept of validity, we're going to be talking in this lecture about threats to validity, specifically for designed data collections. Let's get back to the big picture of the total data quality framework. Again, our focus right now is on the validity dimension among all the different measurement dimensions. We're thinking about a theoretical construct, and then we're thinking about what variables, what questions, what fields we could use to measure that particular construct. We want to make sure that those variables, those fields, those questions are going to provide valid measures of the theoretical construct that we're interested in. Now, we're thinking about threats to validity in terms of going from our construct to our variable of interest.

Let's think about some threats to validity for designed data. The question here really is: what reduces the validity of survey items or questions? In particular, our focus right now is on designed data collections, like we talked about last week with our SurveyMonkey web survey example. What exactly poses a risk to the validity of survey items or survey questions? Here's a list of some possible threats that we might be worried about. First of all, poor questionnaire construction: spelling errors, phrases that are difficult to comprehend, response options that don't make a whole lot of sense. Here's an example, on the right-hand side, of a survey question that was not very well designed. You can see that the question isn't really even a question; it just says, "Depth: The material presented was the right technological depth." At first glance, you'd think the response options for this kind of question would be strongly agree, agree, neutral, disagree, strongly disagree, but you can see that the response options that were chosen don't really match the question itself.
"Depth: The material presented was the right technological depth," followed by excellent, very good, good. There's an inconsistency between the response options and the actual question being asked. If a person has been invited to a web survey and they see a question like this, they're immediately going to be turned off and say, "This doesn't make any sense; I'm no longer interested in taking this web survey; these questions don't mean anything." That's a serious risk to designed data collection, so we have to be very careful to make sure the questions are carefully worded and the response options make sense. Questionnaire pretesting is also very important to make sure that these issues don't emerge. We want to carefully pretest our questionnaires before we actually put them out in the real world, out in the wild, as we might say, where people from all walks of life are going to be responding to these questions.

Translation issues could also be a serious threat to validity, where different cultures may interpret question texts differently. We'll come back to that issue in a second. Specification error, more specifically, is a gap between the theoretical construct of interest and what it is that we're actually measuring. This is an error that can emerge in designing questionnaires, where you have a theoretical construct in mind, but what you're actually measuring via a poorly worded question or poor response options is different from your construct of interest. Specification error is very closely related to validity: we want to make sure that the questions we're specifying are in fact capturing our theoretical constructs of interest. Construct bias is also a potential threat to validity, where you could be looking at non-identical constructs measured across cultures or countries. These constructs may mean different things in different cultures or countries. All of these are threats to the validity of survey items.
Here are some examples. One example: questions related to euthanasia posed a challenge in several countries, such as Ethiopia, where the entire concept of euthanasia doesn't even exist. There's a reference here, the World Values Survey in 2005. This is an example of specification error: you're interested in a theoretical construct, this idea of euthanasia, or maybe how people feel about it or their experiences with it, but in that particular country, that particular culture, the concept doesn't exist. We have a mismatch between our theoretical construct and the question we're asking, which is not really getting at any meaningful construct. Another example: the interpretation of filial piety, which is behaviors associated with being a good son or a good daughter, differs between Western societies and Chinese society. Again, here's another citation, where you can see more of a description of what was meant. Some of the threats that emerged in this particular study included translation issues, where that particular term meant something different in Western societies and Chinese society, and also construct bias: thinking about this theoretical construct and whether people from different cultures are going to interpret it the same way when they respond to the survey question. That too could pose a threat to the validity of that particular question, depending on the culture in which we're asking it. There are more details in the 1997 reference here.

Some additional examples. In the World Values Survey, dramatic between-country differences were found in support for military rule; in general, higher support was found in Vietnam, Iran, and Albania. This was explained by substantive changes that were made to the translations of these items in those countries. There's a reference here by Kurtzman in 2014 that talks about this in more detail.
This was due to translation issues and also poor questionnaire pretesting. Again, it's very important, if you make any kind of change to the way that items are being asked, that you carefully pretest those questionnaires and make sure that respondents in your country of interest can understand everything. Difficulties in understanding survey questions have also been reported more often among ethnic minorities; there's an article by Dutwin and Lopez in 2014 on this. Here, again, translation issues can emerge, along with problems with construct bias and problems with pretesting. A lot of these threats to validity can come up in multiple different contexts. A complete reference list will be available at the end of the course, where you can see all the different articles to which we're referring; they discuss all of these issues in more detail. These threats to validity, again, make the responses to the survey questions less valid, and that ultimately has an effect on the quality of any estimates based on the surveys that you're reporting.

Here's another example. A survey measure with perfect validity can still produce a biased survey estimate. For example, suppose that you ask a question about a person's weight in pounds, and suppose that everyone answers with their true weight minus 10 pounds. Maybe there was something odd about the way that question was worded or how exactly it asked individuals to report their weight, but everybody responds with 10 pounds less than their actual weight. The correlation of these responses with the true values, as a measure of the validity of this particular question, won't be changed by simply subtracting 10 from everybody's observation; you're going to get the same correlation of those reports minus 10 pounds with the true values of weight, because we subtracted 10 pounds from everybody.
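To see this constant-shift point concretely, here's a minimal simulation sketch (not part of the original lecture; the weight distribution parameters are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical true weights in pounds (distribution chosen only for illustration)
true_weight = rng.normal(170, 25, size=1000)

# Everyone reports their true weight minus 10 pounds
reported = true_weight - 10

# Correlation with the truth -- a common validity measure -- is
# unchanged by subtracting a constant from every response
r = np.corrcoef(true_weight, reported)[0, 1]
print(round(r, 4))  # 1.0: "perfect" validity by this criterion

# But the estimated mean weight is biased low by 10 pounds
bias = round(reported.mean() - true_weight.mean(), 4)
print(bias)  # -10.0
```

So the correlation-based validity measure looks flawless even though every estimate of average weight computed from these reports is off by 10 pounds.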
Everybody's on the same footing; they're just reporting a value that's 10 pounds less than it should be. A common measure of validity is the correlation of the survey responses with the true values of the variable of interest, and that correlation is going to be the same even if everybody subtracts 10 pounds from the truth. But when you go to compute an estimate based on all those survey reports, say the average weight in some population, that resulting mean weight is going to be biased low. This is a case where, when we get to analysis, there's going to be some analytic error. Given that correlation, the survey question seems to provide a valid measure of weight, but the overall analysis is going to be affected because everyone was reporting biased responses relative to the true values. This gets back to the idea of questionnaire pretesting: making sure that everybody's interpreting the question correctly and reporting their actual weight, not their weight minus 10 pounds.

What's next in terms of our discussion of validity? Given that we've talked about threats to validity for designed data, we'll start to think about tools for mitigating these threats, starting with a technique known as cognitive interviewing. This is a very important form of pretesting survey questions, making sure that survey questions can be understood by a variety of different people, so that some of these threats are mitigated. We're also going to discuss automated question-scoring tools for designed surveys. We're going to look at something called the Survey Quality Predictor, and we'll do a demonstration of that online, again, as a tool for assessing the overall quality of survey questions, taking their text and response options into account. Then, we'll turn to a discussion of threats for gathered data.
So far we've been talking about threats for survey questions in designed data collections. Next, we'll turn to a discussion of threats to validity for gathered data. Thank you.