Welcome to Module IV of Information Extraction on Health Data. We are now nearing the end of the different approaches we have seen for extracting information from free text in healthcare. In Module I, we looked at templated text and regular expressions as one way to do it. In Module II, we looked at the different lists and thesauri available for getting clinical concept names. And then we looked at supervised machine learning models that build on those resources in a machine learning framework. In this module, we're going to look at deep learning approaches: how advanced approaches with neural networks can be used to identify and extract clinically relevant information. So, let's start with: what is deep learning? Deep learning should be considered in contrast to traditional machine learning approaches. In traditional machine learning, we have some training data, we identify features that are relevant for the concepts of interest, and then we train a machine learning model, a probabilistic model or some other kind of supervised approach, which we then test on unseen data, right? So, we have this train-test split in a supervised machine learning framework. In deep learning, we do not have the luxury of actually looking at features in that detail. In traditional machine learning approaches, there is a high reliance on feature engineering. What features work? Should we encode the features in one particular way? Should we have capitalization as a feature? Or should we have a combination of capitalization and the first few characters, right? Those are all details that become important in the feature engineering setup, and it is hard to do this consistently and with high reliability for unstructured data. For example, with images, speech, or text, it is very hard to get this feature engineering right. What if we could have an automated way of discovering the right representation for features?
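To make the feature engineering idea concrete, here is a minimal sketch (illustrative, not from the lecture) of the kind of hand-crafted token features just described, such as capitalization and the first few characters, which a traditional supervised model for entity recognition would consume. The feature names and the example sentence are made up for illustration.

```python
def token_features(token):
    """Hand-engineered features, as discussed above: capitalization,
    a short prefix, and similar surface cues. A real system would
    add many more (suffixes, word shape, dictionary lookups, ...)."""
    return {
        "is_capitalized": token[:1].isupper(),
        "is_all_caps": token.isupper(),
        "prefix3": token[:3].lower(),
        "length": len(token),
    }

# Each token becomes a feature vector; a classifier (e.g. logistic
# regression or a CRF) would be trained on these in the traditional setup.
features = [token_features(t) for t in "Patient denies chest pain".split()]
```

The point is that every entry in each feature dictionary was chosen by a human; deep learning replaces this manual step with learned representations.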
That was a challenge that started getting addressed when deep learning approaches arrived: machines can learn both the features and how to use those features in a particular task, in a unified fashion. For example, in the traditional setting, when you had documents, you first did feature extraction, then trained a classifier, and that classification module finally decided whether something was green or red, yes or no, right? A typical example would be a sentiment analysis model. So, that was the traditional machine learning approach. In deep learning, we combine these two components: feature extraction and classification are grouped into one deep learning model, so that the effort spent engineering features is taken away, and in fact the model can learn deeper representations of the data directly. So, what are the pros and cons of doing it this way? One of the biggest advantages of deep learning is that it now provides the state-of-the-art models in numerous tasks across different domains. For example, in computer vision, image recognition and visual art recognition are tasks where deep learning is now almost the default approach. The same is true of automatic speech recognition. Natural language processing tasks across the spectrum, like parsing, semantic role labeling, machine translation, text classification, and linking entities to one another, use deep learning approaches. Similarly, in medical informatics, we have entity recognition and adverse drug event models that started with more traditional approaches a few years back, let's say five years back; more recently, deep learning models have been used extensively and with high accuracy. Deep learning is also used in bioinformatics and many other fields. The biggest disadvantage of deep learning is that these models are not very interpretable.
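The combined pipeline can be sketched as a single model whose hidden layer plays the role of automatic feature extraction and whose output layer does the classification, trained jointly on raw input. This is a toy forward pass only, not a trained model: the vocabulary, layer sizes, and random weights are all illustrative assumptions, and a real system would learn the weights from data.

```python
import math
import random

random.seed(0)

# Hypothetical vocabulary; raw bag-of-words input, with no engineered features.
VOCAB = ["pain", "denies", "chest", "fever"]

def bag_of_words(text):
    words = text.lower().split()
    return [float(w in words) for w in VOCAB]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# One hidden layer of 3 units; weights are random here just to show the
# structure. Training would adjust W1 and W2 together (end to end).
HIDDEN = 3
W1 = [[random.uniform(-1, 1) for _ in VOCAB] for _ in range(HIDDEN)]
W2 = [random.uniform(-1, 1) for _ in range(HIDDEN)]

def forward(text):
    x = bag_of_words(text)
    # Hidden layer: the model's own learned representation of the input,
    # replacing hand-crafted feature extraction.
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    # Output layer: classification on top of that learned representation.
    return sigmoid(sum(w * hi for w, hi in zip(W2, h)))

score = forward("Patient denies chest pain")  # a score in (0, 1)
```

Because feature extraction (W1) and classification (W2) sit in one model, a training procedure such as backpropagation updates both at once, which is exactly the unification the lecture describes.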
So, they do not directly reveal the importance of input features. The input is given as-is, and the model learns different representations of that input, but you don't really understand which features are good and which are not, because there is no feature engineering. There is also an inability to explain why the model performs the way it does. So, how do you improve the model? It's mostly a matter of changing the model's configuration and seeing whether a different configuration makes it better, and there are some ways to do this systematically. But you lose the advantage that traditional machine learning models give: identifying which features actually make sense. The fact that entities are capitalized, so we should probably have capitalization as a feature, was a good insight that could be put into those models; that does not happen directly in a deep learning setup. The other big disadvantage is that these models are very likely to overfit when trained on small data, and hence they need much larger datasets than traditional machine learning models. Even with these two big disadvantages, the fact that most state-of-the-art models now use deep learning shows the power of going beyond traditional approaches and linear models to the more non-linear models that deep learning provides. So, in the rest of the videos in this module, we're going to talk about how you configure a deep learning model, what that means, what these individual hidden layers actually mean, and then applications of deep learning.