0:00

[BLANK_AUDIO]

So far we talked about a variety of different,

more or less qualitative techniques to test questions.

And in this segment, we'll shift the

focus a little bit toward more quantitative techniques.

And, you know, this shouldn't turn into a statistics class, even though we'll talk about

statistical models for question testing, latent class

analysis and a hint at structural equation modeling.

0:33

So, you know, we're now going to talk about these techniques in general

just to give you a sense that this is out there, where to look if you were to dig deeper,

and in what kind of arena you'll find the research and the practical tools.

0:48

We'll also talk a little bit about field tests,

and then in the final segment we'll talk about

other question testing methods that have come up more recently.

So latent class analysis is not specific to question

design, it is used in all kinds of fields,

1:05

but it is a technique that allows for, you

know, mitigating the effect of measurement error

or estimating measurement error with particular items.

Brief background: you basically have a set of

questions that you observe, that you measure, multiple indicators.

And observables in latent variable modeling notation

are displayed in boxes here.

And they all are a function of a latent variable.

So let's say someone is pregnant or not pregnant, right, that

would be, in the early weeks, the unobservable construct that we try to measure.

And then you do tests of various sorts, you know,

you can ask the person, she may or may not know.

You can take a urine test; these days they are usually pretty good, but maybe in

earlier days they weren't quite as good, so maybe you want to take a few of those.

Let's say you have a flawed urine test, and three of them.

Any errors associated with them, you know, should be independent of each other.

So each test, if they weren't bought in the same place and produced in

the same batch, should have its own likelihood of being erroneous or not.

But, given that the person is pregnant or not pregnant, they in

principle should all lead to the same result. So that's the spirit here.

Meaning that you have, you know, in latent class analysis a setting

where you have indicators that don't necessarily have to be error free.

Any error associated with the indicators is assumed to be independent

conditional on the latent variable that you actually try to measure.
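
To make the independence assumption concrete, here is a minimal sketch of how the probability of an answer pattern is built up in a two-class model with three binary indicators. All numbers are made up for illustration, and the names `pattern_probability`, `class_probs`, and `item_probs` are hypothetical, not taken from any particular software.

```python
from itertools import product


def pattern_probability(pattern, class_probs, item_probs):
    """P(pattern) = sum over classes c of P(c) * product over items j of P(u_j | c).

    The product over items is exactly the conditional independence assumption:
    given the latent class, the indicators do not depend on each other.
    """
    total = 0.0
    for c, p_class in enumerate(class_probs):
        p = p_class
        for j, answer in enumerate(pattern):
            p_positive = item_probs[c][j]
            p *= p_positive if answer == 1 else 1.0 - p_positive
        total += p
    return total


# Hypothetical values: class 0 = not pregnant, class 1 = pregnant.
class_probs = [0.7, 0.3]          # unconditional (latent class) probabilities
item_probs = [
    [0.05, 0.10, 0.08],           # P(test j positive | not pregnant): false positive rates
    [0.90, 0.85, 0.95],           # P(test j positive | pregnant): 1 minus false negative rates
]

# Probability of each of the 2**3 possible answer patterns.
probs = {
    pattern: pattern_probability(pattern, class_probs, item_probs)
    for pattern in product([0, 1], repeat=3)
}

for pattern, p in sorted(probs.items()):
    print(pattern, round(p, 6))
```

An estimation routine would go the other way: it observes the pattern frequencies and searches for the class and item probabilities that reproduce them best.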

2:41

And so what you estimate are unconditional and

conditional probabilities, that's sort of happening in the process.

For question testing, two conditional probabilities are important: one is the

false positive rate and the other one the false negative rate.

Indicators with high error rates are usually assumed to be bad questions.

So if you think of this in a 2 x 2 table, you can have an indicator, u1 and u2,

and your latent outcome, c1 and c2.

Every indicator can be correct.

So, your indicator u1, the question, for example, for drug use in your survey,

says "yes". And in your latent class, the actual construct, this is a drug user,

it is also "yes". This would be a correct measurement and

likewise this one, "no" in both, would be a correct measurement.

What you don't want is a lot of values in

these off-diagonals, where your measurement device has a false

negative. So it would be "no" in the measurement, but

"yes" in the true underlying construct, and likewise here, a false positive.

The unconditional probabilities are what we refer to as the

actual probabilities to be in one or the other latent class.

3:56

So, inside the cells are the

conditional probabilities, down here, the unconditional probabilities.

In a study that we did with University of Maryland alumni, Ting Yan, Roger Tourangeau, and I

fielded three different questions that all got at grades, whether students ever had

a failing grade in their time at the University of Maryland.

We had administrative records to compare these answers to.

We didn't know who these respondents were,

this is all, you know, stripped of identifiers,

but we were able to link the true score record to the respondent answers.

So, we can examine the error rate relative to the actual values.
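
When record data are available, the error rates can be computed directly by cross-classifying answers and records. The following is a minimal sketch with invented data; the function name `error_rates` and the 0/1 coding are assumptions for illustration and not the procedure used in the study.

```python
def error_rates(answers, records):
    """Return (false_positive_rate, false_negative_rate).

    answers and records are equal-length lists coded 1 = "yes", 0 = "no".
    A false positive is a "yes" answer where the record says "no";
    a false negative is a "no" answer where the record says "yes".
    """
    pairs = list(zip(answers, records))
    false_pos = sum(1 for a, r in pairs if a == 1 and r == 0)
    false_neg = sum(1 for a, r in pairs if a == 0 and r == 1)
    n_record_no = sum(1 for r in records if r == 0)
    n_record_yes = sum(1 for r in records if r == 1)
    fpr = false_pos / n_record_no if n_record_no else float("nan")
    fnr = false_neg / n_record_yes if n_record_yes else float("nan")
    return fpr, fnr


# Invented example: 4 respondents with a failing grade on record, 6 without.
records = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
answers = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]

fpr, fnr = error_rates(answers, records)
print("false positive rate:", round(fpr, 3))
print("false negative rate:", round(fnr, 3))
```

Both rates are conditional on the record value, which is why the denominators are the record counts rather than the full sample size.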

And so, what you see here are the error rates for these three different questions,

relative to the truth, in this particular study.

So, you know, Q12 has a very low error rate and Q18b

a very low error rate, and Q18a had a very high error rate.

So these are all false negative rates that you see going up here,

and these are all false positive rates. You see a dashed and a solid line.

Well, the solid line is the comparison to the actual records, just comparing the

two. The dashed line is the result of the LCA, the latent class analysis.

And you see that the technique

gave us the same pattern although it did not quite

give us the same point estimates for the false negative rates.

You can read more about this in the paper that is provided to you.

5:24

There are some limitations here, you know, you need to have

a separate data collection, the sample size can't be too small.

With two latent classes, you need at least

three items in order for the model to be identified.
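
The three-item requirement can be seen from a simple counting argument, sketched here for binary items. The function name `lca_degrees_of_freedom` is hypothetical, and the count is only a necessary condition: a model with non-negative degrees of freedom can still fail to be identified for other reasons.

```python
def lca_degrees_of_freedom(n_items, n_classes=2):
    """Degrees of freedom of an unrestricted latent class model with binary items.

    Data side:  2**n_items answer patterns, minus 1 because proportions sum to 1.
    Model side: (n_classes - 1) class probabilities plus one conditional
                probability per item and class.
    """
    free_cells = 2 ** n_items - 1
    n_params = (n_classes - 1) + n_classes * n_items
    return free_cells - n_params


for n_items in (2, 3, 4):
    print(n_items, "items:", lca_degrees_of_freedom(n_items), "degrees of freedom")
```

With two items there are more parameters than independent pieces of data, with three items the two counts match exactly, and from four items on there are degrees of freedom left over for testing model fit.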

If you don't have that,

assumptions can be made to achieve identifiability.

A lot of the work that Paul Biemer and colleagues have done is

going through these assumptions. He's using early work from Hui and Walter,

to have grouping variables that help with the identifiability.

5:55

However, you can get biased estimates of these

error rates when the assumptions are not met, and

that also is discussed in the paper that I

just mentioned and that you have in your course pack.

You can identify bad survey items this way, if

the model assumptions hold or you have enough indicators.

One problem, though, is that it doesn't help you

to know why a particular problem exists in the question.

So it doesn't suggest a fix, unless the fix

is to take out that question that you didn't like.

6:27

Another sort of latent variable modeling/

structural equation modeling technique that's out there

is at the core of the SQP software.

So this is a piece of software written by Daniel Oberski

and Willem Saris and developed in Willem Saris's group.

The link is provided online for this unit's course pack.

6:51

What this SQP software does,

it has a collection of results from a

series of multitrait-multimethod experiments that were conducted on

multiple items across multiple countries, in order to

assess, generally, the validity and reliability of survey questions.

So the idea being, if you try to measure the same thing,

the same underlying trait,

7:13

with different methods, then all these methods, all these results,

should correlate highly, if they're related to this trait. And

the answers to this question should, you know,

be correlated much less with a different trait.

But there is, of course, some method effect going on, the kind

of scale you use, or, you know, where you ask the question,

they might contribute to measurement error. So you would see

correlations across items of different traits measured with the same method.

So you can separate out method effects and others, you know, in general, when

trying to estimate validity and reliability.
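
To illustrate why a shared method inflates correlations, here is a minimal sketch under an assumed standardized true-score model: each answer is y = r*T + e, each true score is T = v*F + m*M with m = sqrt(1 - v**2), traits F and methods M are uncorrelated with each other, and errors e are uncorrelated with everything. Here r stands for a reliability coefficient and v for a validity coefficient. This is one common way to write such a model and is not necessarily the exact specification behind SQP; the function name `expected_correlation` and all numbers are hypothetical.

```python
import math


def expected_correlation(r1, v1, r2, v2, trait_corr, same_method):
    """Expected correlation between two observed items under the assumed model.

    Trait part:  r1 * v1 * trait_corr * v2 * r2
    Method part: r1 * m1 * m2 * r2, present only if both items share a method.
    """
    m1 = math.sqrt(1.0 - v1 ** 2)   # method effect coefficient of item 1
    m2 = math.sqrt(1.0 - v2 ** 2)   # method effect coefficient of item 2
    trait_part = r1 * v1 * trait_corr * v2 * r2
    method_part = r1 * m1 * m2 * r2 if same_method else 0.0
    return trait_part + method_part


# Two different traits that truly correlate 0.3, each measured with r = v = 0.9.
different_methods = expected_correlation(0.9, 0.9, 0.9, 0.9, 0.3, same_method=False)
shared_method = expected_correlation(0.9, 0.9, 0.9, 0.9, 0.3, same_method=True)

print("different methods:", round(different_methods, 5))
print("same method:      ", round(shared_method, 5))
```

Under these assumptions the observed correlation is attenuated below the true 0.3 when the methods differ, and pushed above it when the method is shared, which is the pattern a multitrait-multimethod design is built to separate.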

So this is a huge amount of work,

way too much, we could have a whole lecture on how this was done,

8:00

but a good tool to look up and we have the references in the pack.

When you use this piece of software, you code your question characteristics in

it, and it uses, you know, underlying regression models to

predict overall reliability and validity scores for you.

8:21

When you use the program, as I say, you enter your survey

question, you code your question, and then

you get these predicted validity and reliability scores.

8:32

So when you use the SQP program, as I said, you enter your survey

question, you code your questions, and

then you get validity and reliability scores.

I strongly encourage you to try that out.

Look at the website for the link that gives you more information here.

8:48

Now after these two more quantitative techniques

I'll move to the last one that we had here, field tests.

So field tests are sometimes referred to as conventional

pretests, sometimes referred to as dress rehearsals or pilot studies.

All of those share the notion that you implement the questionnaire for a smaller

sample, could be, you know, 15 to

35 respondents similar to your actual respondents

or if it's, you know, a real dress rehearsal

maybe even thousands, if you're rehearsing the next US Census.

The idea is to adopt a data collection protocol similar

to the one that you will actually use in fielding the survey

with the goal of finding practical problems. Is

there an issue with interviewers? Is there an issue with the respondents?

Does the length work? You want to time

on the question level, on the section level, for the full instrument to

see if you are in the scope, if you match your production targets.

And definitely you want to look at the

answer distribution on your key variables, tabulation, missing data.
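
A first look at field test data can be as simple as the following sketch, which tabulates one key variable and reports its missing data rate. The data are invented, the function name `tabulate` is hypothetical, and coding missing answers as `None` is an assumption about how the data file is set up.

```python
from collections import Counter


def tabulate(responses, missing=None):
    """Return sample size, missing data rate, and answer distribution among valid cases."""
    counts = Counter(responses)
    n = len(responses)
    n_missing = counts.pop(missing, 0)
    n_valid = n - n_missing
    distribution = {
        answer: count / n_valid for answer, count in counts.items()
    } if n_valid else {}
    return {
        "n": n,
        "missing_rate": n_missing / n if n else float("nan"),
        "distribution": distribution,
    }


# Invented answers to one key question from ten field test respondents.
responses = ["yes", "no", "yes", None, "no", "yes", None, "yes", "yes", "no"]

summary = tabulate(responses)
print(summary)
```

A high missing rate or an answer category that nobody uses is the kind of practical problem a field test is meant to surface before production.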

This field test data is also a great

resource: you can actually use the results and start,

you know, coding up your analysis in accordance

with your analysis plan that we talked about in unit one.

Because now you can double check, triple check for that matter,

"Do I really have all the variables in my data set?",

"Are there any issues?", "Should I have measured them on a different scale?", "Do

I need to make any changes in order to implement this in the right way?"

10:22

With the field test, you can also do, in addition, behavior coding, which

we talked about before. You can build in

cognitive probes, we learned about those. You

can do some respondent debriefing, ask the

respondent afterwards, you know, "Any experiences you

want to share with us with respect to

this questionnaire?" You could debrief the interviewers.

And then you can start your statistical analysis, validity,

reliability, the latent class models, the structural equation models.

All that is possible, if your field test is large enough and

you actually have enough cases to do any of the more quantitative techniques.

The more cases you have, the more costly of course this

will be, especially if you have these add-ons among your goals here.
