To understand deep learning, let's start by understanding the origins of the term neural network. A neural network is a complex network of neurons in the brain, which work together to perform complex calculations. In 1943, the neurophysiologist Warren McCulloch and the mathematician Walter Pitts worked together to introduce a computational model for how individual neurons in the brain actually work. Multiple signals arrive at the dendrites of a neuron. When the signals come in through the dendrites, they are added together in the cell body, and if the accumulated signal exceeds some threshold, the neuron fires, meaning it's activated, and it passes on an output signal.

It's important to understand that an individual neuron in the brain really can't do a lot. But when connected to thousands of other neurons, these neural networks can achieve very complex calculations. Likewise, McCulloch and Pitts proposed in the early 1940s that complex networks of artificial neurons could achieve very complex calculations and approximate very complex functions. A neural network that contains many layers is referred to as a deep neural network, which is the origin of the term deep learning.
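To make the accumulate-and-fire behavior concrete, here is a minimal sketch of a McCulloch-Pitts style threshold neuron in Python. The particular weights, threshold, and input signals are illustrative assumptions (the original 1943 model used binary signals with a simpler excitatory/inhibitory scheme rather than arbitrary weights):

```python
def threshold_neuron(signals, weights, threshold):
    """Fire (output 1) if the weighted sum of incoming signals exceeds the threshold."""
    # Signals arriving at the "dendrites" are added together in the "cell body".
    accumulated = sum(s * w for s, w in zip(signals, weights))
    # The neuron fires only if the accumulated signal exceeds the threshold.
    return 1 if accumulated > threshold else 0

# Illustrative values: three incoming signals, two of them active.
print(threshold_neuron([1, 0, 1], [0.5, 0.8, 0.7], threshold=1.0))  # -> 1, since 1.2 > 1.0
```

A single unit like this can't do much on its own, which is exactly the point above: the power comes from wiring many of these units together into layered networks.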
So, let's look at the history of neural networks. The original proposal by McCulloch and Pitts in the 1940s launched an era of early research into the field. Researchers and scientists made great progress throughout the 1940s and 1950s, when Frank Rosenblatt proposed the perceptron, the first trainable model of an artificial neuron. Shortly after, the Stanford researchers Widrow and Hoff built the first successfully functioning neural network. One of the challenges that held back research during this period was that researchers didn't have a good way to train large, complex networks of neurons. Advances in the 1980s led to breakthroughs such as the technique called backpropagation, a method for training neural networks that contain multiple layers. Even then, however, progress was held back by a lack of both data and computing power. It really wasn't until the early 2000s that computational power caught up, and that we had sufficiently vast amounts of data to train deeper and deeper neural networks. As neural network models became deeper, with more neurons organized into more layers, the power of these networks to perform very complex calculations continued to grow. As a result, in recent years we've seen a boom in the field of deep learning, where large, powerful neural networks are being used to achieve a wide variety of very complex tasks.

There are a number of key enablers of this recent boom in deep learning. The first, and probably the most important, is that the amount of data available for training large, complex neural networks has grown exponentially. One of the important things to remember about neural networks is that they require very, very large amounts of data to train successfully. It's really only in the last decade or two that enough data has become available to us, through a variety of sources such as computers, connected devices, and pervasive sensors. Scientists, researchers, and engineers have also put in the time and effort to organize and label all of this data in a way that it can be consumed for training neural networks.

Computational power has also caught up to the state of the art in algorithm design, which has allowed for much deeper and much more complex neural network architectures than we could previously achieve. Researchers have also made great advances in the algorithms themselves: there are some inherent limitations in the architecture of a neural network, and recent advances have largely overcome a number of them. As a result, today neural networks can be found all around us, in a wide variety of applications.

Let's look at a couple of representative applications of neural networks. The first is image classification and image recognition. For example, when you take a picture of someone on your phone and upload it to Facebook, and Facebook automatically tags that picture with a friend's name, it's using a neural network model that has been built and trained to recognize pictures of your friends.

Another application of deep learning is found in what's called neural machine translation. Automatic translation websites and applications use complex, specialized types of neural network models to translate easily back and forth between a very large number of languages.

The applications of deep learning in the healthcare space are still in their very early days, but there's tremendous application potential for neural networks within healthcare, driven by the large amount of data our healthcare system collects about patients. One of the early breakthroughs was the use of neural networks as a predictive model to predict the onset of sepsis in ICU patients. Using automated models to predict sepsis onset from physiological sensor signals enables ICU doctors and nurses to better manage, and take proactive care of, patients at high risk of sepsis.

Let's look at one final example of an innovative application of deep learning. One of the major pizza chains here in the US is using a computer vision deep learning model to perform quality control on the pizzas coming out of the ovens in its restaurants. Rather than having human employees perform quality control, they use a camera connected to a deep learning model that looks for things like the ratio of cheese to sauce, whether the pizza has the ingredients the customer actually ordered, and whether the number of pepperoni on the pizza is up to the standard number they're supposed to put on. In this way they're able to automate the quality control process and use their human employees for more sophisticated tasks within the restaurant.

If we think about the examples I just presented, we can see a few common themes in terms of where deep learning really excels. The first is that we need vast amounts of training data to successfully train deep learning models to perform challenging tasks. The second theme is that deep learning excels in applications with a very large number of features, for example unstructured data such as text, imagery, or video. In the case of image classification, we have a very, very large number of features.
For example, each individual pixel within a picture represents a separate feature. So if we have, say, 512 pixels by 512 pixels, we're dealing with over 260,000 potential features, and this is where deep learning applications can really shine. The third theme is that deep learning does very well when we have complex relationships between the input features and the target. Where we have many input features and complex nonlinear relationships between the features and the target, deep learning networks are able, given enough data, to learn those complex relationships. (A small code sketch at the end of this section illustrates both of these last two themes.)

Finally, it's important to note that applications of deep learning generally have a low need for explainability. One of the challenges we'll discuss later is that neural networks are often considered black boxes: they are so complex, with so many equations, that it's really difficult to understand how the network is reaching an output prediction. As a result, we generally focus their use on applications where we don't need to present users with a sophisticated explanation of how the machine reached its prediction. For things like tagging images with your friends' names, or counting the number of pepperoni on a pizza, this is generally not a concern, because interpretability and explainability are not key drivers in these applications. But if we're building models to determine, for example, whether applicants to a graduate school are accepted, or whether somebody is approved for a loan they have applied for, these are applications with high stakes for users and, as a result, a high need for interpretability and explainability. In these types of applications, we need to be really careful about the use of neural networks.
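To close, here is a rough sketch, in plain NumPy, of the "many features, complex nonlinear relationships" themes above. The layer sizes are made-up and the weights are random and untrained; the point is only to show that a flattened 512x512 image yields 262,144 input features, and that each layer applies a linear map followed by a nonlinearity, which is what lets a trained deep network capture complex nonlinear relationships:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    # Simple nonlinearity; without it, stacked layers would
    # collapse into a single linear transformation.
    return np.maximum(0.0, z)

# A flattened 512x512 grayscale image: 512 * 512 = 262,144 input features.
x = rng.random(512 * 512)

# Illustrative (untrained) weights: two hidden layers, one output unit.
W1 = rng.standard_normal((16, 512 * 512)) * 0.01  # layer 1: 262,144 -> 16
W2 = rng.standard_normal((8, 16)) * 0.1           # layer 2: 16 -> 8
w3 = rng.standard_normal(8) * 0.1                 # output:  8 -> 1

h1 = relu(W1 @ x)   # first hidden layer
h2 = relu(W2 @ h1)  # second hidden layer
y = w3 @ h2         # output prediction (a single number here)
print(y)
```

In practice the weights would be learned from data via backpropagation rather than drawn at random, which is exactly why the vast amounts of training data described earlier matter so much.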