There are three important techniques or formulae for differentiating more complicated functions in terms of simpler functions, known as the Chain Rule, the Product Rule, and the Quotient Rule. In today's video, we introduce and explain the first of these, the Chain Rule, which is considered by many to be the simplest to remember, explain, and apply. Let y, equal to f of u, be a function of the variable u, whilst u itself is a function of another variable x, say u equals g of x. Using Leibniz's notation, the Chain Rule says simply: dy dx equals dy du times du dx. You can think of the differential du in the numerator canceling with du in the denominator, as though these were ordinary fractions. There's an equivalent statement using the function dash notation for the derivative: f circle g dashed of x equals f dashed of g of x times g dashed of x. The two statements look quite different but in fact carry exactly the same information, and I'll come back to the function dash notation version later. For now, let's focus on the version of the chain rule using Leibniz's notation and work through some examples. Here, y equals negative 3x plus six, all squared, and we want to find the derivative dy dx. We can solve this directly by expanding the brackets to get the quadratic 9x squared minus 36x plus 36, and then differentiating in the usual way to get 18x minus 36, which factorizes as 18 times the quantity x minus two. We can also solve this indirectly using the chain rule. To do this, we put y equals u squared, where u equals minus 3x plus six, so that y becomes a particularly simple function of u. The chain rule says dy dx equals dy du times du dx. But the derivative of y with respect to u is just 2u, since y equals u squared, and the derivative of u with respect to x is just minus three. So, this product becomes 2u times minus three, which is minus 6u.
Converting back into an expression involving x, this becomes minus six times the quantity minus 3x plus six, which quickly simplifies to 18 times the quantity x minus two, which is the same answer that we obtained by the direct method. It's interesting that we get to the same answer by quite different routes. In this example, there's about the same amount of work using either method. In more elaborate examples, the chain rule can save a lot of time and work. With a little bit of practice, you will develop fluency quickly and find the chain rule effective, powerful and easy to use. In the next example, we want to find the derivative of y equals e to the minus x. To set up the chain rule, we put y equal to e to the u, where u equals minus x. Then again dy dx equals dy du times du dx. But y equals e to the u, so dy du is just e to the u, and du dx is minus one. So, this becomes minus e to the u, which becomes minus e to the minus x, getting an expression in terms of x. This shows that the derivative of e to the minus x is minus e to the minus x. More generally, we can consider y equals e to the kx, where k is a constant, and imitate the previous calculation, in which k was equal to negative one. Put u equal to kx so that y equals e to the u. Again, dy dx equals dy du times du dx, which, similarly to before, becomes e to the u times k, which is k times e to the u, which becomes k times e to the kx. This shows that the derivative of e to the kx is k times e to the kx. You may be curious to know why the chain rule works, and we now sketch a proof. We take the heuristic idea of canceling differentials in the numerator and denominator, and relate this to the formal definition of the derivative in terms of limits. Using Leibniz's notation, the derivative dy dx is defined as the limit as delta x tends to zero of the quotient delta y over delta x, where delta x is a small change in x and delta y is the corresponding small change in y.
We also have the variable u somewhere in between, acting like a stepping stone in the composition of functions, and the change in x causes a change in u denoted by delta u. We can suppose in this sketch that delta u is non-zero, and then insert delta u in both the numerator and the denominator of the fraction, so that delta y over delta x becomes delta y over delta u times delta u over delta x, without affecting the overall value of the expression of which we're taking the limit. But the limit of a product is the product of the limits. So, we can rewrite this as the limit as delta x approaches zero of delta y over delta u, multiplied by the limit as delta x approaches zero of delta u over delta x. We can then adjust the way we've written the first limit, taking it as delta u approaches zero instead, because we expect delta u to get really, really small as delta x does; that is, delta u goes to zero as delta x goes to zero. We then recognize the first limit as the definition of dy du and the second limit as the definition of du dx. This completes the sketch of the proof that dy dx equals dy du times du dx. Here's the statement of the chain rule again with both versions. Let's focus now on the version using function dash notation. The explicit connection is that y is a function of u with rule f of u, and u is a function of x with rule g of x. So, we're feeding the rule for g into the rule for f, which creates the composite function f circle g. Then f circle g dashed of x is the derivative of y with respect to x, not with respect to u, which previously we would just have called y dashed, also called dy dx in Leibniz's notation, which is dy du times du dx by the first form of the chain rule that we have discussed already in detail. But dy du becomes f dashed of u and du dx becomes g dashed of x in function notation, so the product is f dashed of g of x times g dashed of x, because u equals g of x, which gives us the required statement of the chain rule using function notation. Let's practice this version of the chain rule on an example. Let f and g be functions with rules f of x equals x cubed and g of x equals x squared minus one.
We wish to find the derivatives of the composite functions fog, that is f circle g, and gof, that is g circle f. Certainly, the derivative of f is 3x squared and the derivative of g is 2x. So, applying the chain rule using function notation, we get that fog dashed of x is equal to f dashed of g of x times g dashed of x, which is f dashed of x squared minus one, times g dashed of x, which is three times x squared minus one, all squared, times 2x, which becomes 6x times x squared minus one, all squared. On the other hand, applying the chain rule to these functions but composing in the other order, we have gof dashed of x equals g dashed of f of x times f dashed of x, which becomes g dashed of x cubed, times f dashed of x, which is two times x cubed, times 3x squared, which simplifies to 6x to the fifth, and this solves the problem. Notice that we didn't need to know the rules for f circle g or g circle f, and can follow the formula for the chain rule quite mechanically. Of course, we can also solve this problem without the chain rule. In the case of f circle g, the rule becomes f of g of x, which becomes f of x squared minus one, equal to x squared minus one, all cubed. This expands out to become x to the sixth, minus 3x to the fourth, plus 3x squared, minus one, with derivative 6x to the fifth minus 12x cubed plus 6x, which factorizes as 6x times the quantity x to the fourth minus 2x squared plus one, which factorizes further as 6x times x squared minus one, all squared, which happily agrees with the answer we found quickly and indirectly using the chain rule. In the case of g circle f, the rule becomes g of f of x, which becomes g of x cubed, which is x cubed, all squared, minus one, equal to x to the sixth minus one, with derivative 6x to the fifth, which again happily agrees with the answer we found using the chain rule.
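The worked examples above can also be checked numerically. What follows is only a minimal Python sketch, not part of the lecture: the helper names `derivative`, `compose` and `chain_rule`, the step size `h`, and the sample point `x0` are all assumptions chosen for illustration. It compares the chain-rule formula, f dashed of g of x times g dashed of x, against a finite-difference estimate of the derivative of the composite function.

```python
import math


def derivative(func, x, h=1e-5):
    """Estimate func'(x) with a central difference (the step h is an assumption)."""
    return (func(x + h) - func(x - h)) / (2 * h)


def compose(outer, inner):
    """Return the composite function x -> outer(inner(x))."""
    return lambda x: outer(inner(x))


def chain_rule(outer_prime, inner, inner_prime):
    """Return x -> outer'(inner(x)) * inner'(x), the chain-rule derivative."""
    return lambda x: outer_prime(inner(x)) * inner_prime(x)


# Example 1: y = u^2 with u = -3x + 6.
square, square_prime = (lambda u: u ** 2), (lambda u: 2 * u)
line, line_prime = (lambda x: -3 * x + 6), (lambda x: -3)

# Example 2: y = e^u with u = kx (k = -1 reproduces e^(-x)).
k = -1.0
scale, scale_prime = (lambda x: k * x), (lambda x: k)

# Example 3: f(x) = x^3 and g(x) = x^2 - 1, composed in both orders.
f, f_prime = (lambda x: x ** 3), (lambda x: 3 * x ** 2)
g, g_prime = (lambda x: x ** 2 - 1), (lambda x: 2 * x)

# Each case: (label, outer, outer', inner, inner').
cases = [
    ("(-3x + 6)^2", square, square_prime, line, line_prime),
    ("e^(kx)", math.exp, math.exp, scale, scale_prime),
    ("f o g", f, f_prime, g, g_prime),
    ("g o f", g, g_prime, f, f_prime),
]

x0 = 1.5  # arbitrary sample point
for label, outer, outer_prime, inner, inner_prime in cases:
    by_rule = chain_rule(outer_prime, inner, inner_prime)(x0)
    by_limit = derivative(compose(outer, inner), x0)
    print(f"{label}: chain rule {by_rule:.6f}, finite difference {by_limit:.6f}")
```

The two printed values in each row should agree to several decimal places. A finite difference is only an approximation of the limit, so this is a sanity check and not a proof.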
In today's video, we introduced the Chain Rule, using both Leibniz's notation and function notation, and applied it to differentiate composite functions in several contrasting examples. Please read the notes and when you're ready please attempt the exercises. Thank you very much for watching and I look forward to seeing you again soon.