Earlier we defined machine learning as the process of generalizing from examples. In this video, we're going to discuss just how limited that generalization is, and see some of the ways machine learning differs from human learning. By the end of this video, you will be able to describe how machine learning systems have limited generalization and rely on a specific problem definition.

In popular articles, it's common to see machine learning programs described in terms of how children learn, but that can be a terribly misleading way to think about things. How well a model generalizes has more to do with thorough testing than with the computer actually knowing anything. More formally, generalization is limited by two things. First, by the examples the system has to learn from, that is, the data you feed it. Second, generalization is limited by the learning algorithm itself, because different learning algorithms produce different kinds of models, and different kinds of models can capture different kinds of knowledge.

You might remember that in 2011, IBM's Watson program played world-champion Jeopardy! players in a televised competition. Watson is a sophisticated AI system that was able to correctly answer questions posed in ordinary English. Watson was also trained to bet amounts on Daily Doubles that would thoroughly beat human players. It is tempting to think that Watson understands the answer-question format. However, Watson makes several mistakes that demonstrate how its generalization is limited.

Look at this screenshot from the episode. The answer is, "Familiarity is said to breed this." The question, which Watson correctly identified, is, "What is contempt?" But take a look at the second and third choices Watson found. Watson's second choice is "contemn," a misspelling of the correct answer, and a mistake a human might make. Not correct, but not so far off. Watson's third choice, however, is "Despised Icon." Despised Icon is a Montreal-based death metal band. This is definitely not the kind of error a human would make. It is unclear why Watson chose Despised Icon as the third most likely phrase, and yet it did.

Here's a different example, one where machine learning has been surprisingly successful: classifying images. As humans, we can easily distinguish between a cat and a dog in a photograph. We can also point out exactly where the animal is in the photo, and even draw an outline around it. For machines, on the other hand, this is much more difficult. In fact, this activity that seems very natural to us must be broken up into several different tasks for the machine.

First, we have single-class classification, where our model tells us what single object is in the picture. In this example, it determines whether or not the image has a cat. It is important to note that this classifier is only able to detect the presence of a cat in an image, not of any other animal. Next, we can combine classification with something called localization. Localization means building a model that can put a box around a single object in the image. But what if our pictures have more than one object? Object detection is about building models that can put boxes around each object in an image, distinguishing them from each other as well as from the background.

Even though recognizing cats and dogs in images feels like a single, straightforward task for humans, for machines it must be broken up into very specific tasks, and each of these tasks involves lots and lots of training. Notice the rigidity of the system: it cannot detect objects it has not been trained to detect.
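To make the distinction between the three tasks concrete, here is a minimal sketch in Python of what their interfaces might look like. All of the names and types here (Box, Detection, classify_cat, and so on) are illustrative assumptions, not any real library's API, and the bodies are left as stubs. The point is only that each task asks a different question and returns a different kind of answer.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Box:
    """An axis-aligned bounding box in pixel coordinates."""
    x: int
    y: int
    width: int
    height: int


@dataclass
class Detection:
    """One detected object: a class label plus its bounding box."""
    label: str
    box: Box


def classify_cat(image) -> bool:
    """Task 1, single-class classification (hypothetical model):
    answers one yes/no question, 'is there a cat in this image?'"""
    ...


def localize_cat(image) -> Box:
    """Task 2, classification plus localization (hypothetical model):
    returns a box around the single cat in the image."""
    ...


def detect_objects(image) -> List[Detection]:
    """Task 3, object detection (hypothetical model): returns a label
    and a box for every object, but only objects from the fixed set of
    classes the model was trained on."""
    ...
```

The rigidity shows up right in the last signature: detect_objects can only ever return labels from the fixed set of classes it was trained on.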
In fact, if the classifier was trained only on images of real cats, it would not be able to correctly classify images of cartoon cats, even though humans, even very young children, can easily classify cartoon objects based on their real-world counterparts.

Here's an example of a system that describes images with a sentence. Notice that this problem is broken up into three tasks: detect words, generate sentences, and then rank those sentences. The final sentence the system came up with is "a woman holding a camera in a crowd." This describes the image quite well. However, if we dig a bit deeper, we can see that the system hasn't understood the picture in the same way we do.

The first task is object detection and classification. The model identifies a crowd, something purple in the image, a camera, and so on. It even recognizes the action of holding. The model also identified the woman's hair as a cat. This, of course, is wrong, but we can understand why it made that mistake.

For the second task, a different model uses these keywords as input and generates sentences. Sure, one of those sentences is "a woman holding a camera in a crowd," but we also have "a purple camera with a woman" and "a woman holding a cat." These are not unreasonable sentences, but they are also not related to the image. No human would suggest those sentences as captions for this photo.

For the third task, yet another model takes the list of sentences from the previous model and ranks them. Here we finally arrive at the answer, "a woman holding a camera in a crowd," a pretty good caption.

Although this system captions images quite well, it's not because it understands images the way we do. Each specific piece of the problem required a different model, and it's not until those models are chained together that we have a complete image captioning system; the sketch at the end of this section shows the shape of that chain. I hope that by now you're convinced that generalization is a difficult thing for machines to do. How well a model is able to generalize depends on both the examples in the training data and the chosen learning algorithm.
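As a rough sketch of how such a pipeline chains together, consider the following Python stubs. The stage names and function signatures are assumptions based on the description above, not the actual system's code; what matters is that caption_image is three separate models glued end to end, and no single one of them understands the picture.

```python
from typing import List


def detect_words(image) -> List[str]:
    """Stage 1 (hypothetical model): emit keywords seen in the image,
    e.g. ['woman', 'crowd', 'camera', 'holding', 'purple', 'cat'],
    including occasional mistakes like 'cat' for a patch of hair."""
    ...


def generate_sentences(words: List[str]) -> List[str]:
    """Stage 2 (hypothetical model): combine the keywords into candidate
    captions, plausible and implausible alike."""
    ...


def rank_sentences(image, candidates: List[str]) -> List[str]:
    """Stage 3 (hypothetical model): score each candidate against the
    image and sort the best caption to the front."""
    ...


def caption_image(image) -> str:
    """The complete captioning system: three separate models chained
    end to end, returning the top-ranked sentence."""
    words = detect_words(image)
    candidates = generate_sentences(words)
    ranked = rank_sentences(image, candidates)
    return ranked[0]
```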