We've talked about the gold standard method of sampling households for household surveys. Now I want to talk about some other methods of sampling households that you may have heard about or may even have used yourself. There are a variety of methods for sampling households in household surveys, and they vary in their level of rigor. Some are not rigorous, and we don't recommend them; we'll explain why. Others are novel approaches that are still being tested but are potentially promising, and still others may be acceptable under specific circumstances. All of these methods tend to be less expensive, faster, and require less technical expertise than the gold standard sampling method. But we generally don't recommend trading rigor for cost. In other words, this is not an area where you want to cut costs. If you use non-rigorous sampling methods, it can call all of your survey results into question.

The first method is using existing lists of households. Rather than doing all this mapping, sending teams out to the cluster to update the maps and develop a list of households, is there an existing list of households that we can use? Some examples of existing lists might be lists maintained by village leaders or by NGOs working in the area. This may be a feasible approach if the list is of good quality and up to date, and if its household definition is consistent with the definition of a household that you are using. This is an important consideration because defining a household is tricky. Lists maintained by village leaders, for example, often include an entire family, including the extended family, in their definition of the household, whereas you may be using a more restrictive definition, or sometimes vice versa. This also works best if you are using villages as your clusters, because lists of households maintained by village leaders usually cover the entire village.
If you're using enumeration areas and there are multiple enumeration areas in your village, this is not going to work as well, because it will be hard to figure out from that list which households are in which enumeration area. As we've noted, though, we don't recommend using villages as your clusters. If you plan to use existing lists, it's really important to check the quality and completeness of the list in a subset of clusters. So pick a couple of clusters at random, do an actual enumeration of households in those clusters, and match that against the list maintained by the village leader or the NGO to check the quality of those lists. If you do all this and decide that you can use these lists, interviewers will still need either a map of the cluster or a very good guide to help them locate the households, because those lists often are not accompanied by a map, and so it may be difficult to figure out where individual households live. This method is acceptable under specific circumstances.

Another method that people may be familiar with is random walk sampling, which is sometimes called spin the pen or EPI sampling. This was originally developed by the EPI program, the Expanded Program on Immunization, in the 1980s for post-immunization surveys that they were doing to assess levels of immunization coverage. I want to note that EPI has changed: since approximately 2015, EPI surveys have been using and recommending standard probability sampling, which is what we talked about in the previous lesson.

The main feature of the random walk approach is that there is no sampling frame. Households are instead sampled by interviewers in the field during a random walk. What do we mean by a random walk? There are lots and lots of variations on this. Here I'll give you the EPI procedure as of 2008, which is pretty representative of other procedures. You generate a random starting household, for example, by spinning a pen.
You spin a pen, and you walk in the direction that the pen points to the edge of the cluster, counting the number of households along the way. Then, let's say there are 10 households between where you started and the edge of the cluster: you choose a number between 1 and 10 at random. You go to that household; that is your starting point. If there are people in the household, you interview the household. Then, to choose the next household, you stand at the doorway of the household you just interviewed, find the nearest dwelling, go to that dwelling, and interview one household. Then you continue on to the next nearest dwelling, and so forth, until you have interviewed enough households.

This method has a number of limitations, which is one of the reasons why it's no longer used by EPI. It can produce a highly clustered sample, where the sample falls within a very small area of the cluster. More problematically, it puts the decision about which households to interview in the hands of the interviewers in the field, and that increases the possibility of both implicit and explicit bias. If the dwelling they're supposed to go to is a little farther away or harder to get to, interviewers may instead choose the one that's easier to get to, and there's really no way to check or supervise that. Those decisions may be implicit or explicit, and lots of factors can influence which households they choose. You also don't have a sampling frame, so there's no possibility of calculating the probability of selection. So there are many issues with this approach. It is not rigorous, and it is not recommended. It is cheap and reasonably easy, and this is why it's done a lot.

Another, newer approach to sampling involves the use of geographic information systems and satellite images. In most of the variants of this method that have been tried, you use satellite images to identify and sample possible residential dwellings.
Interviewers then visit those dwellings and interview the household or, if there are multiple households within the dwelling, for example in the case of an apartment building, they enumerate those households and sample one to interview. This has been piloted in a number of settings, but at relatively small scale. There is ongoing work to test this in a variety of settings with larger sample sizes and to assess the feasibility and cost of the method.

Finally, there's lot quality assurance sampling, or LQAS. This is an approach that was originally developed for quality control in manufacturing. Essentially, you sample and test a small number of units within, say, a lot of cans that you are manufacturing. You don't want to test them all, so you sample a small number to test and determine whether the entire lot is of acceptable quality. Household surveys have used LQAS designs to assess whether lots, often small administrative units, have reached acceptable levels of coverage. Let's say you set a coverage threshold of 80 percent for antenatal care attendance: what LQAS will tell you is whether a particular area or lot has achieved that 80 percent threshold or not. But you won't get precise information about the exact level of coverage in that administrative area. What you can do is aggregate samples across lots to obtain precise estimates over a larger area. But if you're doing that, you still need an adequate sample size and you still need to use appropriate sampling.

LQAS is a good approach if you want to get local-level information for small administrative units for program management purposes. You can also use it to get point estimates for larger areas, but it's not a way to do a survey with a smaller sample size or with easier sampling. That's not the reason to use LQAS. The reason to use it is to get local-level information.
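
To make the LQAS idea a little more concrete, here is a minimal sketch in Python of how a lot-level decision might work. Everything specific in it is an assumption for illustration and not something from this lesson: the sample size of 19 per lot, the decision rule of 13, the lot names, and the counts are all hypothetical. The pooled estimate is a simple unweighted one that assumes lots of roughly equal size; a real survey would need appropriate weights.

```python
from math import comb


def prob_at_least(n, d, p):
    """P(X >= d) for X ~ Binomial(n, p): chance a lot is classified as
    reaching the threshold when its true coverage is p."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(d, n + 1))


def classify_lot(covered, decision_rule):
    """Return True if the lot is classified as reaching the coverage threshold."""
    return covered >= decision_rule


# Hypothetical design values (assumptions, not taken from the lecture).
N_PER_LOT = 19       # households sampled in each lot
DECISION_RULE = 13   # minimum number covered to classify the lot as reaching 80%

# Hypothetical counts of sampled households with antenatal care attendance.
lots = {"lot_A": 16, "lot_B": 11, "lot_C": 14}

# Lot-level result: a yes/no classification, not a precise coverage estimate.
classified = {name: classify_lot(c, DECISION_RULE) for name, c in lots.items()}

# Aggregating across lots gives a point estimate for the larger area
# (unweighted here, which assumes lots of similar size).
pooled = sum(lots.values()) / (N_PER_LOT * len(lots))

print(classified)
print(f"pooled coverage estimate: {pooled:.3f}")
print(f"P(classified as reached | true coverage 80%): "
      f"{prob_at_least(N_PER_LOT, DECISION_RULE, 0.8):.3f}")
print(f"P(classified as reached | true coverage 50%): "
      f"{prob_at_least(N_PER_LOT, DECISION_RULE, 0.5):.3f}")
```

The point of the sketch is the contrast the lesson draws: each lot only gets a reached-or-not answer, while a usable coverage estimate only appears once you pool across lots, and that pooled sample still has to be large enough and properly sampled.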