Hello, and welcome. In this video, we'll provide an overview of recurrent neural networks and explain their architecture. A recurrent neural network, or RNN for short, is a great tool for modeling sequential data. The RNN is able to remember the analysis that was done up to a given point by maintaining a state, or a context, so to speak. You can think of the state as the memory of the RNN, which captures information about what has been previously calculated. This state recurs back into the net with each new input, which is where the network gets its name. Let's take a closer look at how this works.

Imagine that the RNN we're using has only one hidden layer. The first data point flows into the network as input data, denoted as x. As we mentioned before, the hidden units also receive the previous state, or context, denoted as h_prev, along with the input. Then, in the hidden layer, two values are calculated. First, the new or updated state, denoted as h_new, which will be used for the next data point in the sequence. And second, the output of the network, denoted as y. The new state is a function of the previous state and the input data, as shown here. If this is the first data point, some form of initial state is used; it differs depending on the type of data being analyzed, but typically it is initialized to all zeros. Notice that Wx in this equation is the weight matrix between the input and the hidden units, and Wh is the weight matrix that is multiplied by the previous hidden state. The output of the hidden unit is simply the new hidden state multiplied by the output weight matrix. So after processing the first data point, in addition to the output, a new context is generated that represents the most recent point. This context is then fed back into the net with the next data point, and we repeat these steps until all the data is processed.
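The single step described above can be sketched in plain Python. This is a minimal illustration, not a production implementation: the function name `rnn_step`, the tanh activation, and the list-of-lists weight matrices are all assumptions made for the sketch; real RNNs use a matrix library and learned weights.

```python
import math

def rnn_step(x, h_prev, Wx, Wh, Wy):
    """One recurrent step: combine the input x with the previous state
    h_prev, and return the new state h_new and the output y."""
    # h_new = tanh(Wx @ x + Wh @ h_prev): the new state is a function
    # of the input and the previous state (tanh is one common choice).
    h_new = [math.tanh(sum(wx_i[j] * x[j] for j in range(len(x))) +
                       sum(wh_i[j] * h_prev[j] for j in range(len(h_prev))))
             for wx_i, wh_i in zip(Wx, Wh)]
    # y = Wy @ h_new: the output is the new state times the output weights.
    y = [sum(wy_i[j] * h_new[j] for j in range(len(h_new))) for wy_i in Wy]
    return h_new, y
```

To process a whole sequence, you would initialize the state to all zeros, call `rnn_step` once per data point, and feed the returned `h_new` back in as `h_prev` for the next point.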
Recurrent neural networks are extremely versatile and are used in a wide range of applications that deal with sequential data. One of these applications is speech recognition. As you can see, it is a type of many-to-many network; that is, the goal is to consume a sequence of data and then produce another sequence. Another application of RNNs is image captioning. Although it's not purely recurrent, you can create a model that's capable of understanding the elements in an image, and then, using the RNN, string those elements together as words to form a caption that describes the scene. Typically, an RNN has an output at each time step, but this depends on the problem the RNN is addressing. For example, in this type of RNN for captioning, there is one input, the image, and the output is a sequence of words, so it is sometimes called one-to-many. An RNN can also be many-to-one; that is, it consumes a sequence of data and produces just one output. For example, to predict the stock market price, we might only be interested in the price of a particular stock in tomorrow's market; or for sentiment analysis, we may only care about the final output, not the sentiment after each word. We've only covered a few applications, but variants of recurrent models are continuing to solve increasingly complex problems, which are beyond the scope of this video.

Despite all of its strengths, the recurrent neural network is not a perfect model. One issue is that the network needs to keep track of its state at any given time. There could be many units of data, or many time steps, so this becomes computationally expensive; one compromise is to store only a portion of the recent states in a time window. Another issue is that recurrent neural networks are extremely sensitive to changes in their parameters; as a result, gradient-descent optimizers may struggle to train the net.
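The many-to-one pattern mentioned above can be sketched with a toy scalar RNN: the whole sequence is consumed step by step, but only the final output is returned. The function name `many_to_one` and the fixed scalar weights are hypothetical values chosen for the sketch; in practice the weights would be learned.

```python
import math

def many_to_one(xs, wx=0.5, wh=0.8, wy=1.0):
    """Toy scalar RNN that consumes a whole sequence but returns only
    the final output, as in sentiment analysis or price prediction."""
    h = 0.0  # initial state, typically all zeros
    for x in xs:
        h = math.tanh(wx * x + wh * h)  # update the state at each time step
    return wy * h  # single output, produced only after the last time step
```

Contrast this with a many-to-many setup, which would collect an output at every iteration of the loop instead of just the last one.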
Also, the net may suffer from the vanishing gradient problem, where the gradient drops to nearly zero and training slows to a halt. Finally, it may also suffer from the exploding gradient problem, where the gradient grows exponentially toward infinity. In either case, the model's capacity to learn is diminished. By now, you should have a good understanding of the main ideas behind the recurrent neural network model. Thanks for watching this video.
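The vanishing and exploding gradient behavior described above can be illustrated numerically. Backpropagation through time repeatedly multiplies the gradient by the recurrent weight, one multiplication per time step; the function below is a simplified scalar sketch of that effect (the name `gradient_magnitude` is made up for this illustration).

```python
def gradient_magnitude(w, steps):
    """Toy illustration: backpropagating through `steps` time steps
    multiplies the gradient by the recurrent weight w at every step."""
    g = 1.0
    for _ in range(steps):
        g *= w  # one multiplication per time step of backprop
    return g
```

With a weight of magnitude less than 1, the gradient shrinks toward zero over long sequences (vanishing); with magnitude greater than 1, it blows up (exploding). This is one reason gated variants such as LSTMs were developed.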