In elementary calculus, we learn how to differentiate sums, products, quotients, and composite functions. We now generalize these ideas to functions of several variables, paying particular attention to the differentiation of composite functions. The rule for differentiating composites, called the chain rule, takes on a more profound form for functions of several variables than for those of one variable.
If \(f\) is a real-valued function of one variable, written as \(z = f(y)\), and \(y\) is a function of \(x\), written \(y= g(x)\), then \(z\) becomes a function of \(x\) through substitution, namely, \(z = f(g(x))\), and we have the familiar chain rule: \[ \frac{dz}{dx} = \frac{dz}{dy} \frac{dy}{dx} = f' (g(x)) g' (x). \]
If \(f\) is a real-valued function of three variables \(u\), \(v\), and \(w\), written in the form \(z = f(u,v,w)\), and the variables \(u\), \(v\), \(w\) are each functions of \(x\), namely \(u = g(x)\), \(v = h(x)\), and \(w = k(x)\), then by substituting \(g(x)\), \(h(x)\), and \(k(x)\) for \(u\), \(v\), and \(w\), we obtain \(z\) as a function of \(x\colon\, z = f (g (x), h (x), k(x))\). The chain rule in this case reads: \[ \frac{dz}{dx} = \frac{\partial z}{\partial u} \frac{du}{dx} + \frac{\partial z}{\partial v} \frac{dv}{dx} + \frac{\partial z}{\partial w} \frac{dw}{dx}. \]
One of the goals of this section is to explain such formulas in detail.
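As a minimal illustration of the three-variable formula (the particular functions are chosen here only for the sake of example and do not come from the text), take \(z = uv + w\) with \(u = x^2\), \(v = \sin x\), and \(w = e^x\). Then \(\partial z/\partial u = v\), \(\partial z/\partial v = u\), and \(\partial z/\partial w = 1\), so \[ \frac{dz}{dx} = v\cdot 2x + u\cos x + 1\cdot e^x = 2x\sin x + x^2\cos x + e^x, \] which agrees with differentiating \(z = x^2\sin x + e^x\) directly.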
These rules work just as they do in one-variable calculus.
The proofs of rules (i) through (iv) proceed almost exactly as in the one-variable case, with a slight difference in notation. We shall prove rules (i) and (ii), leaving the proofs of rules (iii) and (iv) as Exercise 27.
Verify the formula for \({\bf D}h\) in rule (iv) of Theorem 10 with \[ f(x,y,z)=x^2+y^2+z^2 \hbox{ and } g(x,y,z)=x^2+1. \]
Solution. Here \[ h(x,y,z)=\frac{x^2+y^2+z^2}{x^2+1}, \] so that by direct differentiation \begin{eqnarray*} {\bf D}h(x,y,z) &\!=\!& \bigg[\frac{\partial h}{\partial x},\frac{\partial h}{\partial y}, \frac{\partial h}{\partial z}\bigg] \!=\!\bigg[\frac{(x^2+1)2x-(x^2+y^2+z^2)2x}{(x^2+1)^2}, \frac{2y}{x^2+1},\frac{2z}{x^2+1}\bigg]\\[6pt] &=&\!\bigg[\frac{2x(1-y^2-z^2)}{(x^2+1)^2},\frac{2y}{x^2+1},\frac{2z}{x^2+1}\bigg]. \end{eqnarray*}
By rule (iv), we get \[ {\bf D}h=\frac{g{\bf D} f-f{\bf D}g}{g^2}= \frac{(x^2+1)[2x,2y,2z]-(x^2+y^2+z^2)[2x,0,0]}{(x^2+1)^2}, \] which is the same as what we obtained directly.
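To make the comparison explicit, one can expand the numerator componentwise: \[ (x^2+1)[2x,2y,2z]-(x^2+y^2+z^2)[2x,0,0] = \big[2x(1-y^2-z^2),\ 2y(x^2+1),\ 2z(x^2+1)\big]. \] Dividing each component by \((x^2+1)^2\) gives \[ \bigg[\frac{2x(1-y^2-z^2)}{(x^2+1)^2},\frac{2y}{x^2+1},\frac{2z}{x^2+1}\bigg], \] the same three entries found by direct differentiation.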
As we mentioned earlier, it is in the differentiation of composite functions that we meet apparently substantial alterations of the formula from one-variable calculus. However, if we use the \({\bf D}\) notation, that is, matrix notation for derivatives, the chain rule for functions of several variables looks similar to the one-variable rule.
Let \(U\subset {\mathbb R}^n\) and \(V \subset {\mathbb R}^m\) be open sets. Let \(g\colon\, U\subset {\mathbb R}^n\rightarrow {\mathbb R}^m\) and \(f\colon\, V \subset {\mathbb R}^m\rightarrow {\mathbb R}^p\) be given functions such that \(g\) maps \(U\) into \(V\), so that \(f \circ g\) is defined. Suppose \(g\) is differentiable at \({\bf x}_0\) and \(f\) is differentiable at \({\bf y}_0=g({\bf x}_0)\). Then \(f \circ g\) is differentiable at \({\bf x}_0\) and \begin{equation*} {\bf D}(f\circ g)({\bf x}_0)={\bf D} f({\bf y}_0){\bf D}g({\bf x}_0).\tag{1} \end{equation*}
The right-hand side is the matrix product of \({\bf D} f({\bf y}_0)\) with \({\bf D} g({\bf x}_0)\).
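The following Python sketch checks formula (1) numerically for one hypothetical pair of maps, \(g\colon\, {\mathbb R}^2\rightarrow {\mathbb R}^3\) and \(f\colon\, {\mathbb R}^3\rightarrow {\mathbb R}^2\). The maps, the point \({\bf x}_0\), and the helper names `matmul` and `numerical_jacobian` are illustrative assumptions, not part of the text. The derivative matrices are written out by hand, multiplied, and compared with a central-difference approximation of \({\bf D}(f\circ g)({\bf x}_0)\).

```python
import math

def g(x):
    # Hypothetical inner map g: R^2 -> R^3.
    x1, x2 = x
    return [x1 * x2, x1 + x2, math.sin(x1)]

def f(y):
    # Hypothetical outer map f: R^3 -> R^2.
    y1, y2, y3 = y
    return [y1 * y1 + y3, y2 * y3]

def Dg(x):
    # 3 x 2 matrix of partial derivatives of g, computed by hand.
    x1, x2 = x
    return [[x2, x1],
            [1.0, 1.0],
            [math.cos(x1), 0.0]]

def Df(y):
    # 2 x 3 matrix of partial derivatives of f, computed by hand.
    y1, y2, y3 = y
    return [[2.0 * y1, 0.0, 1.0],
            [0.0, y3, y2]]

def matmul(A, B):
    # Plain matrix product of nested lists: (p x m) times (m x n).
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def numerical_jacobian(F, x, h=1e-6):
    # Central-difference approximation to the derivative matrix of F at x.
    m, n = len(F(x)), len(x)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        Fp, Fm = F(xp), F(xm)
        for i in range(m):
            J[i][j] = (Fp[i] - Fm[i]) / (2.0 * h)
    return J

x0 = [0.7, -1.3]                      # arbitrary test point
y0 = g(x0)                            # y0 = g(x0), as in the theorem
chain = matmul(Df(y0), Dg(x0))        # right-hand side of formula (1)
direct = numerical_jacobian(lambda x: f(g(x)), x0)  # left-hand side, approximated
max_err = max(abs(chain[i][j] - direct[i][j])
              for i in range(2) for j in range(2))
print("largest entrywise difference:", max_err)
```

Note the order of the factors: \({\bf D} f({\bf y}_0)\) is \(p\times m\) and \({\bf D} g({\bf x}_0)\) is \(m\times n\) (here \(2\times 3\) and \(3\times 2\)), so the product is the \(p\times n\) matrix one expects for \({\bf D}(f\circ g)({\bf x}_0)\). A numerical agreement of this kind is only a sanity check at one point, not a proof.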
We shall now give a proof of the chain rule under the additional assumption that the partial derivatives of \(f\) are continuous, building up to the general case by developing two special cases that are themselves important. (The complete proof of Theorem 11 without the additional assumption of continuity is given in the Internet supplement for Chapter 2.)