We now need to consider one last topic on accuracy assessment, to complete the theoretical aspects of remote sensing before we move on to radar. When we looked at training a classifier algorithm, we had to have enough training pixels per class to estimate reliably any parameters in the technique we chose. We now turn to the related matter of choosing independent samples from a thematic map, that is, reference data or testing data, in order to get reliable estimates of the map's accuracy. If we don't choose enough, we will end up with a poor estimate of map accuracy. We will undertake this analysis by looking at how well the testing samples we choose actually sample the thematic map. In this slide, the top diagram represents the thematic map itself. We are going to work through this theory with just two classes, but we will comment on the multi-class situation later. In a sense, we don't have to know what happens with each of the two classes. All we are interested in is whether a pixel in the map is correctly labeled, irrespective of its class. By knowing how many pixels are correct overall, we can get the map accuracy. In this top diagram, the shaded pixels are those which have been labeled incorrectly by the classifier. We can describe the situation by a binary variable y_i, which takes the value one if a pixel has the correct label, and zero if it is in error. The second diagram shows a random sample of testing pixels. Because they are chosen randomly, and because we do not know which pixels in the map above are correct and which ones are in error, we have to choose enough testing pixels to be sure we sample the map well enough that our accuracy estimate is reliable. The bottom diagram shows the testing samples laid over the thematic map. We now define a new binary variable g_i, which takes the value one if a correctly labeled pixel is sampled, and zero otherwise. First, we look at the distribution of errors described by the variable y_i, as shown.
That variable has a binomial distribution. If we take the sum of all the y_i, we get the total number of correctly labeled pixels, because the incorrect ones contribute zero to the value of the sum. That means the accuracy of the map can be expressed as shown here by the formula: P equals one over N times the sum of the y_i values, where N is the total number of pixels in the map. What we are interested in is getting an accurate estimate of P, the map accuracy. Now consider the properties of g_i. It also has a binomial distribution. By definition, little n is much smaller than big N; otherwise, we would be sampling every pixel in the thematic map. The sum of all the g_i is the number of correctly labeled pixels, but in this case as found in the testing set. As before, g_i is zero for testing pixels which are incorrectly labeled, so that they do not contribute to the value of the sum. The proportion of correctly labeled testing set pixels, which is what we always use as a measure of accuracy, is given by the expression for p: that is, one over n times the sum of the g_i. We want p, found from the testing set, to be a good estimate of the actual accuracy of the thematic map, P. The question is, how many testing set pixels n are needed to ensure that? Since the sum of binomials is itself binomially distributed, the map accuracy estimate p is binomially distributed. We can assume that g_i and y_i come from the same underlying distribution. They then have the same means, so that the expected value of p is equal to the actual map accuracy P. Although the expected value of the map accuracy found from the testing pixels is the mean P, the actual value found can be different. Hopefully, it would be in the vicinity of P, but depending on the number of testing pixels taken, it might be quite different. But how different might that be? To answer that, we look at the variance of p about the mean P, and that variance is given by the expression shown on the slide.
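The setup described here can be sketched numerically. This is an illustrative simulation, not part of the lecture: the map size N, the assumed true accuracy, and the sample size n are made-up values chosen only to show that p, computed from a random testing sample, tracks the map accuracy P.

```python
import random

random.seed(0)

# Simulate a thematic map as a list of binary values y_i:
# 1 = pixel correctly classified, 0 = pixel in error.
N = 100_000     # total pixels in the (simulated) map
P_true = 0.85   # assumed underlying accuracy, for illustration only
y = [1 if random.random() < P_true else 0 for _ in range(N)]

# Actual map accuracy: P = (1/N) * sum of the y_i.
P = sum(y) / N

# Draw a random testing sample of n pixels; g_i = 1 if the sampled
# pixel is correctly labeled, 0 otherwise. Estimate p = (1/n) * sum g_i.
n = 500
g = random.sample(y, n)
p = sum(g) / n

print(f"actual map accuracy P = {P:.3f}")
print(f"estimate p from n={n} testing pixels = {p:.3f}")
```

With only 500 of 100,000 pixels sampled, p lands near P but not exactly on it, which is the gap the variance analysis on this slide quantifies.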
Usually, the number of pixels in a remote sensing image far exceeds the number of testing pixels, so that the variance takes the form of the bottom equation on this slide. It is inversely proportional to the number of testing pixels n, so that increasing n reduces the variance and makes the estimate p closer to the actual map accuracy P. Repeating the message from the previous slide: the variance reduces with more testing pixels. Thus, more testing pixels gets us closer to the true value of the accuracy of the thematic map. To a good approximation, little p can be assumed to be normally distributed about the mean big P, as illustrated in the diagram. Two standard deviations about the mean contain 95 percent of the population. Thus, with 95 percent confidence, we can say that our estimate of the accuracy of the thematic map lies within about two standard deviations of the true value. Let's see now how we can use that information. With 95 percent confidence, we know our estimate of the accuracy of the thematic map lies within about two standard deviations of the true value. Now suppose we are happy for our estimate of the accuracy to be within plus or minus epsilon of its true value. That is, we are looking for little p equal to big P plus or minus epsilon. With 95 percent confidence, we know it will be within that range if epsilon represents two standard deviations. Thus, we have the formula epsilon equals two times the square root of P times one minus P divided by n, which gives the number of testing pixels shown on the bottom of the slide. Interestingly, note that it is a function of the true accuracy P. Is that unusual? Well, no. If P were equal to one, in other words, if the map were 100 percent correct, we really wouldn't need to sample it at all. Whereas if P were low, the numerator, P times one minus P, is larger, meaning that more samples are required if the accuracy is relatively poor.
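The effect of n on the width of that 95 percent interval can be seen with a quick calculation. This is a sketch using the slide's formula, with 0.85 as an assumed true accuracy:

```python
import math

# Two-standard-deviation width of the estimate p, assuming n << N,
# from sd(p) = sqrt(P * (1 - P) / n). P = 0.85 is illustrative.
P = 0.85
for n in (100, 400, 1600):
    two_sd = 2 * math.sqrt(P * (1 - P) / n)
    print(f"n = {n:4d}: estimate lies within about +/- {two_sd:.3f} of P")
```

Because the standard deviation falls as the square root of n, quadrupling the number of testing pixels only halves the interval: extra precision is increasingly expensive.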
As an example, suppose we are happy for an estimate to be within plus or minus 0.04 of a true proportion, which is thought to be about 0.85. That is, we are happy with accuracies in the range 0.81 to 0.89. Then at the 95 percent confidence level, n equals 319. That means randomly selecting 319 testing pixels will allow a thematic map accuracy of about 85 percent to be checked with an uncertainty of plus or minus 4 percent, with 95 percent confidence. Further examples are in the table on the next slide. Here we see explicitly that the required number of testing pixels decreases as the actual accuracy of the thematic map increases. Our previous analysis was focused on the need to establish a sufficient number of testing pixels so that the overall accuracy of a thematic map could be assessed. What we want to know now is how to ensure we have enough testing pixels per class, to know that each class is accurately labeled. One approach, based on the paper referenced here, leads to the numbers shown in the table below, in this case with an uncertainty in the estimate of plus or minus 10 percent at the 95 percent confidence level. When looking at overall map accuracy, we were able to use binomial statistics, since only two outcomes are possible: a correctly labeled pixel or an incorrectly labeled pixel. If we wish to look at class-wise accuracies for each pixel in a thematic map, a multinomial probability distribution is needed. We will not go into the development here, but note that it leads to the following conservatively high estimate for the number of testing pixels required per class. When we are interested in whether we have the correct class for a pixel, not just whether it is correctly labeled, we can use the expression shown on the slide. Epsilon is again the level of tolerance we need around the estimate of the class accuracy. B is the upper beta percentile of the chi-squared distribution with one degree of freedom.
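The overall-accuracy example works out as follows. This sketch rearranges the formula epsilon equals two times the square root of P times one minus P over n into n equals four P times one minus P over epsilon squared; the function name is ours, not the lecture's:

```python
import math

# Required number of testing pixels for overall map accuracy,
# from epsilon = 2 * sqrt(P * (1 - P) / n), rearranged to
# n = 4 * P * (1 - P) / epsilon**2 (95% confidence).
def required_testing_pixels(P, epsilon):
    return math.ceil(4 * P * (1 - P) / epsilon ** 2)

# The worked example: accuracy thought to be about 0.85,
# tolerance plus or minus 0.04.
print(required_testing_pixels(0.85, 0.04))  # 319
```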
Beta is the overall precision needed divided by the total number of classes. For the background theory here, see the paper by Tortora. Consider this example from the book by Congalton and Green. Suppose we have eight classes in a thematic map, we want the estimates of the accuracies to be within plus or minus 5 percent, and we want our results at the 95 percent confidence level. In this case, beta equals 0.05 divided by 8, or 0.00625, which corresponds to the 0.625 percentile of the distribution, giving B the value 7.568. Thus, the total number of testing pixels needed is 757, as shown. Therefore, we need just under 100 per class. By way of summary, note that, just as in training, where we need enough labeled reference pixels to get good estimates of the parameters of a classifier, in testing we must have enough labeled reference or testing pixels to ensure we can get good estimates of the accuracy of a thematic map. We need enough testing pixels to ensure that the estimate of the map accuracy obtained is as close as possible to the actual accuracy of the thematic map generated by the classifier. We can look for estimates of the overall map accuracy, or of the accuracies with which each of the individual classes has been mapped. The latter requires more testing pixels. It is important to understand the messages behind each of these three points if you are to be sure you understand the guidelines for selecting the number of testing pixels.
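The multinomial example can be checked with a short calculation. This is a sketch assuming the conservative rule n equals B divided by four epsilon squared, with B taken directly from the tabulated value quoted in Congalton and Green rather than recomputed:

```python
import math

# Per-class sample-size example, assuming n = B / (4 * epsilon**2),
# where B is the upper beta percentile of the chi-squared distribution
# with one degree of freedom, and beta = alpha / k for k classes.
alpha = 0.05     # overall 95 percent confidence
k = 8            # number of classes
epsilon = 0.05   # tolerance of plus or minus 5 percent per class

beta = alpha / k                          # 0.00625, the 0.625 percentile
B = 7.568                                 # tabulated value from the example
n = math.ceil(B / (4 * epsilon ** 2))

print(f"beta = {beta}")                   # 0.00625
print(f"n = {n} total, about {n // k} per class")
```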