13.4 Properties of the Derivative

In elementary calculus, we learn how to differentiate sums, products, quotients, and composite functions. We now generalize these ideas to functions of several variables, paying particular attention to the differentiation of composite functions. The rule for differentiating composites, called the chain rule, takes on a more profound form for functions of several variables than for those of one variable.

If \(f\) is a real-valued function of one variable, written as \(z = f(y)\), and \(y\) is a function of \(x\), written \(y= g(x)\), then \(z\) becomes a function of \(x\) through substitution, namely, \(z = f(g(x))\), and we have the familiar chain rule: \[ \frac{dz}{dx} = \frac{dz}{dy} \frac{dy}{dx} = f' (g(x)) g' (x). \]

If \(f\) is a real-valued function of three variables \(u\), \(v\), and \(w\), written in the form \(z = f(u,v,w)\), and the variables \(u\), \(v\), \(w\) are each functions of \(x\), namely \(u = g(x)\), \(v= h(x)\), and \(w = k (x)\), then by substituting \(g(x)\), \(h(x)\), and \(k (x)\) for \(u\), \(v\), and \(w\), we obtain \(z\) as a function of \(x\colon\, z = f (g (x), h (x), k(x))\). The chain rule in this case reads: \[ \frac{dz}{dx} = \frac{\partial z}{\partial u} \frac{du}{dx} + \frac{\partial z}{\partial v} \frac{dv}{dx} + \frac{\partial z}{\partial w} \frac{dw}{dx}. \]
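
The three-variable formula can be checked numerically. The following is a minimal sketch, assuming the illustrative choices \(f(u,v,w) = uv + w^2\), \(g(x) = \sin x\), \(h(x) = x^2\), and \(k(x) = e^x\) (none of which come from the text). It compares the chain-rule expression with a central-difference approximation of \(dz/dx\).

```python
import math

# Hypothetical example functions, chosen only for illustration.
def f(u, v, w):
    return u * v + w ** 2

def g(x):
    return math.sin(x)

def h(x):
    return x ** 2

def k(x):
    return math.exp(x)

def z(x):
    """z as a function of x alone, obtained by substitution."""
    return f(g(x), h(x), k(x))

def dz_dx_chain(x):
    """dz/dx assembled from the chain rule with hand-computed derivatives."""
    u, v, w = g(x), h(x), k(x)
    dz_du, dz_dv, dz_dw = v, u, 2 * w                      # partials of f
    du_dx, dv_dx, dw_dx = math.cos(x), 2 * x, math.exp(x)  # g', h', k'
    return dz_du * du_dx + dz_dv * dv_dx + dz_dw * dw_dx

def dz_dx_numeric(x, eps=1e-6):
    """Central-difference approximation of dz/dx."""
    return (z(x + eps) - z(x - eps)) / (2 * eps)

x0 = 0.7
print(dz_dx_chain(x0), dz_dx_numeric(x0))
```

The two printed values should agree to several decimal places; the small discrepancy is the error of the finite-difference approximation, not of the chain rule.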

One of the goals of this section is to explain such formulas in detail.

Sums, Products, Quotients

These rules work just as they do in one-variable calculus.

Theorem 10 Sums, Products, Quotients

  • (i) Constant Multiple Rule. Let \(f\colon\, U \subset {\mathbb R}^n \rightarrow {\mathbb R}^m\) be differentiable at \({\bf x}_0\) and let \(c\) be a real number. Then \(h ({\bf x}) = cf ( {\bf x})\) is differentiable at \({\bf x}_0\) and \[ {\bf D} h ( {\bf x}_0) = c {\bf D} f ( {\bf x}_0) \qquad \hbox{(equality of matrices)}. \]
  • (ii) Sum Rule. Let \(f\colon\, U \subset {\mathbb R}^n \rightarrow {\mathbb R}^m\) and \(g\colon\, U \subset {\mathbb R}^n \rightarrow {\mathbb R}^m\) be differentiable at \({\bf x}_0\). Then \(h ( {\bf x}) = f ( {\bf x})+ g ( {\bf x})\) is differentiable at \({\bf x}_0\) and \[ {\bf D} h ( {\bf x}_0) = {\bf D} f ( {\bf x}_0) + {\bf D} g ( {\bf x}_0) \qquad \hbox{(sum of matrices).} \]
  • (iii) Product Rule. Let \(f\colon\, U \subset {\mathbb R}^n \rightarrow {\mathbb R}\) and \(g\colon\, U \subset {\mathbb R}^n \rightarrow {\mathbb R}\) be differentiable at \({\bf x}_0\) and let \(h ( {\bf x}) = g ({\bf x}) f ({\bf x})\). Then \(h\colon\, U \subset {\mathbb R}^n \rightarrow {\mathbb R}\) is differentiable at \({\bf x}_0\) and \[ {\bf D} h ( {\bf x}_0 ) = g ( {\bf x}_0) {\bf D}f ( {\bf x}_0)+ f ( {\bf x}_0) {\bf D} g ( {\bf x}_0). \] (Note that each side of this equation is a \(1\times n\) matrix; a more general product rule is presented in Exercise \(31\) at the end of this section.)
  • (iv) Quotient Rule. With the same hypotheses as in rule (iii), let \(h ( {\bf x}) = f ({\bf x}) / g ( {\bf x})\) and suppose \(g\) is never zero on \(U\). Then \(h\) is differentiable at \({\bf x}_0\) and \[ {\bf D} h ( {\bf x}_0) = \frac{ g ( {\bf x}_0) {\bf D} f ( {\bf x}_0) - f ( {\bf x}_0) {\bf D} g ( {\bf x}_0)}{ [ g ( {\bf x}_0)]^2} . \]

proof

The proofs of rules (i) through (iv) proceed almost exactly as in the one-variable case, with a slight difference in notation. We shall prove rules (i) and (ii), leaving the proofs of rules (iii) and (iv) as Exercise 27.

  • (i) To show that \({\bf D} h ( {\bf x}_0) = c {\bf D}f ({\bf x}_0)\), we must show that \[ {\mathop {{\rm limit} }_{{\bf x} \to {\bf x}_0}} \ \frac{ \| h ( {\bf x} ) - h ( {\bf x}_0) - c {\bf D} f ( {\bf x}_0) ( {\bf x}- {\bf x}_0) \| }{ \| {\bf x} - {\bf x}_0 \| } =0, \] that is, that \[ {\mathop {{\rm limit} }_{{\bf x} \to {\bf x}_0}} \ \frac{ \| cf( {\bf x}) - cf( {\bf x}_0) - c {\bf D} f ( {\bf x}_0) ( {\bf x}- {\bf x}_0) \| }{ \| {\bf x} - {\bf x}_0 \| } =0, \] [see equation (4) of Section 13.3]. This is certainly true, since \(f\) is differentiable and the constant \(c\) can be factored out [see Theorem 3(i), Section 13.2].
  • (ii) By the triangle inequality, we may write \begin{eqnarray*} &&\frac{ \| h ({\bf x}) - h ( {\bf x}_0) - [{\bf D} f ( {\bf x}_0) + {\bf D} g ({\bf x}_0) ] ( {\bf x}- {\bf x}_0) \| }{ \| {\bf x} - {\bf x}_0 \|} \\[3pt] && =\frac{ \| f({\bf x})-f({\bf x}_0)-[{\bf D} f({\bf x}_0)]({\bf x}-{\bf x}_0)+g({\bf x})-g({\bf x}_0)-[{\bf D}g({\bf x}_0)]({\bf x}-{\bf x}_0) \|}{\| {\bf x}-{\bf x}_0 \| } \\[3pt] && \leq \frac{ \| f({\bf x})-f({\bf x}_0)-[{\bf D} f({\bf x}_0)]({\bf x}-{\bf x}_0) \| }{ \| {\bf x}-{\bf x}_0 \| }+ \frac{ \| g({\bf x})-g({\bf x}_0)-[{\bf D}g({\bf x}_0)] ({\bf x}-{\bf x}_0) \| }{ \| {\bf x}-{\bf x}_0 \| }, \end{eqnarray*} and each term approaches 0 as \({\bf x}\rightarrow {\bf x}_0\). Hence, rule (ii) holds.
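
The product rule (iii) can be checked numerically in the same spirit. This is a rough sketch, assuming the illustrative functions \(f(x,y) = x^2 + y\) and \(g(x,y) = y \sin x\) (not from the text) and a hypothetical helper `grad` that approximates the \(1 \times n\) derivative matrix by central differences.

```python
import math

def grad(func, p, eps=1e-6):
    """Approximate the 1 x n derivative matrix of a real-valued func at p
    using central differences (a hypothetical helper for this sketch)."""
    out = []
    for i in range(len(p)):
        hi = list(p)
        lo = list(p)
        hi[i] += eps
        lo[i] -= eps
        out.append((func(hi) - func(lo)) / (2 * eps))
    return out

# Hypothetical example functions, chosen only for illustration.
def f(p):
    x, y = p
    return x ** 2 + y

def g(p):
    x, y = p
    return math.sin(x) * y

def h(p):
    return g(p) * f(p)

x0 = [0.5, 1.5]
Df, Dg = grad(f, x0), grad(g, x0)

# Rule (iii): Dh(x0) = g(x0) Df(x0) + f(x0) Dg(x0), entry by entry.
product_rule = [g(x0) * df + f(x0) * dg for df, dg in zip(Df, Dg)]
direct = grad(h, x0)

print(product_rule)
print(direct)
```

Both printed rows should agree up to the accuracy of the finite-difference approximation.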

example 1

Verify the formula for \({\bf D}h\) in rule (iv) of Theorem 10 with \[ f(x,y,z)=x^2+y^2+z^2 \hbox{ and } g(x,y,z)=x^2+1. \]

solution Here \[ h(x,y,z)=\frac{x^2+y^2+z^2}{x^2+1}, \] so that by direct differentiation \begin{eqnarray*} {\bf D}h(x,y,z) &\!=\!& \bigg[\frac{\partial h}{\partial x},\frac{\partial h}{\partial y}, \frac{\partial h}{\partial z}\bigg] \!=\!\bigg[\frac{(x^2+1)2x-(x^2+y^2+z^2)2x}{(x^2+1)^2}, \frac{2y}{x^2+1},\frac{2z}{x^2+1}\bigg]\\[6pt] &=&\!\bigg[\frac{2x(1-y^2-z^2)}{(x^2+1)^2},\frac{2y}{x^2+1},\frac{2z}{x^2+1}\bigg]. \end{eqnarray*}

By rule (iv), we get \[ {\bf D}h=\frac{g{\bf D} f-f{\bf D}g}{g^2}= \frac{(x^2+1)[2x,2y,2z]-(x^2+y^2+z^2)[2x,0,0]}{(x^2+1)^2}, \] which is the same as what we obtained directly.
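
As a spot check of Example 1, the two expressions for \({\bf D}h\) can be evaluated at a single point. The sketch below assumes the sample point \((1,2,3)\), which is our choice and not part of the example; the function names are likewise hypothetical.

```python
def f(x, y, z):
    return x ** 2 + y ** 2 + z ** 2

def g(x, y, z):
    return x ** 2 + 1

def Dh_direct(x, y, z):
    """Dh from direct differentiation of h = f / g, as in the solution."""
    d = x ** 2 + 1
    return [2 * x * (1 - y ** 2 - z ** 2) / d ** 2, 2 * y / d, 2 * z / d]

def Dh_rule_iv(x, y, z):
    """Dh from the quotient rule (g Df - f Dg) / g^2, entry by entry."""
    Df = [2 * x, 2 * y, 2 * z]
    Dg = [2 * x, 0.0, 0.0]
    fv, gv = f(x, y, z), g(x, y, z)
    return [(gv * a - fv * b) / gv ** 2 for a, b in zip(Df, Dg)]

point = (1.0, 2.0, 3.0)
print(Dh_direct(*point))   # [-6.0, 2.0, 3.0]
print(Dh_rule_iv(*point))  # [-6.0, 2.0, 3.0]
```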

Chain Rule

As we mentioned earlier, it is in the differentiation of composite functions that we meet apparently substantial alterations of the formula from one-variable calculus. However, if we use the \({\bf D}\) notation, that is, matrix notation for derivatives, the chain rule for functions of several variables looks similar to the one-variable rule.

Theorem 11 Chain Rule

Let \(U\subset {\mathbb R}^n\) and \(V \subset {\mathbb R}^m\) be open sets. Let \(g\colon\, U\subset {\mathbb R}^n\rightarrow {\mathbb R}^m\) and \(f\colon\, V \subset {\mathbb R}^m\rightarrow {\mathbb R}^p\) be given functions such that \(g\) maps \(U\) into \(V\), so that \(f \circ g\) is defined. Suppose \(g\) is differentiable at \({\bf x}_0\) and \(f\) is differentiable at \({\bf y}_0=g({\bf x}_0)\). Then \(f \circ g\) is differentiable at \({\bf x}_0\) and \begin{equation*} {\bf D}(f\circ g)({\bf x}_0)={\bf D} f({\bf y}_0){\bf D}g({\bf x}_0).\tag{1} \end{equation*}

The right-hand side is the matrix product of \({\bf D} f({\bf y}_0)\) with \({\bf D} g({\bf x}_0)\).
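
To illustrate formula (1) as a matrix product, the following sketch takes \(n = 2\), \(m = 3\), \(p = 2\) with hypothetical maps \(g(x,y) = (xy,\, x+y,\, \sin x)\) and \(f(u,v,w) = (uv,\, v + w^2)\), neither of which comes from the text. It multiplies the hand-computed \(2\times 3\) and \(3 \times 2\) derivative matrices and compares the result with a finite-difference approximation of \({\bf D}(f\circ g)\).

```python
import math

# Hypothetical example maps, chosen only for illustration.
def g(p):                      # g: R^2 -> R^3
    x, y = p
    return [x * y, x + y, math.sin(x)]

def f(q):                      # f: R^3 -> R^2
    u, v, w = q
    return [u * v, v + w ** 2]

def Dg(p):                     # 3 x 2 matrix of partial derivatives of g
    x, y = p
    return [[y, x], [1.0, 1.0], [math.cos(x), 0.0]]

def Df(q):                     # 2 x 3 matrix of partial derivatives of f
    u, v, w = q
    return [[v, u, 0.0], [0.0, 1.0, 2 * w]]

def matmul(A, B):
    """Plain matrix product of nested lists A (r x s) and B (s x t)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def jacobian_numeric(func, p, eps=1e-6):
    """Central-difference approximation of the derivative matrix of func at p."""
    m = len(func(p))
    cols = []
    for j in range(len(p)):
        hi = list(p)
        lo = list(p)
        hi[j] += eps
        lo[j] -= eps
        fh, fl = func(hi), func(lo)
        cols.append([(fh[i] - fl[i]) / (2 * eps) for i in range(m)])
    # cols[j][i] holds d(func_i)/d(p_j); transpose so that rows index outputs.
    return [[cols[j][i] for j in range(len(p))] for i in range(m)]

x0 = [0.4, 1.3]
chain = matmul(Df(g(x0)), Dg(x0))                    # Df(y0) Dg(x0), 2 x 2
numeric = jacobian_numeric(lambda p: f(g(p)), x0)    # D(f o g)(x0), 2 x 2

print(chain)
print(numeric)
```

Note that \({\bf D}f\) is evaluated at \({\bf y}_0 = g({\bf x}_0)\), not at \({\bf x}_0\), and that the order of the factors matters: the product \({\bf D}g({\bf x}_0){\bf D}f({\bf y}_0)\) would be a \(3\times 3\) matrix here.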

We shall now give a proof of the chain rule under the additional assumption that the partial derivatives of \(f\) are continuous, building up to the general case by developing two special cases that are themselves important. (The complete proof of Theorem 11 without the additional assumption of continuity is given in the Internet supplement for Chapter 2.)

First Special Case of the Chain Rule

Suppose \({\bf c}\colon\, {\mathbb R}\rightarrow {\mathbb R}^3\) is a differentiable path and \(f\colon\, {\mathbb R}^3\rightarrow {\mathbb R}\) is a differentiable function. Let \(h(t)=f({\bf c}(t))=f(x(t),y(t),z(t))\), where \({\bf c}(t)=(x(t),y(t),z(t))\). Then \begin{equation*} \frac{{\it dh}}{{\it dt}}= \frac{\partial f}{\partial x}\frac{{\it dx}}{{\it dt}} +\frac{\partial f}{\partial y}\frac{{\it dy}}{{\it dt}} +\frac{\partial f}{\partial z}\frac{{\it dz}}{{\it dt}}.\tag{2} \end{equation*}

That is, \[ \frac{{\it dh}}{{\it dt}}={\nabla} f({\bf c}(t))\,{ \cdot}\,{\bf c}'(t), \] where \({\bf c}'(t)=(x'(t),y'(t),z'(t))\).

This is the special case of Theorem 11 in which we take \(g = {\bf c}\), \(n = 1\), \(m=3\), and \(f\) real-valued (\(p = 1\)). Notice that \[ \nabla f({\bf c}(t))\,{ \cdot}\,{\bf c}'(t)={\bf D} f({\bf c}(t)) {\bf Dc}(t), \] where the product on the left-hand side is the dot product of vectors, while the product on the right-hand side is matrix multiplication, and where we regard \({\bf D} f({\bf c}(t))\) as a row matrix and \({\bf Dc}(t)\) as a column matrix. The vectors \(\nabla f({\bf c}(t))\) and \({\bf c}'(t)\) have the same components as their matrix equivalents; the notational change indicates the switch from matrices to vectors.
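
The identity \(dh/dt = \nabla f({\bf c}(t)) \cdot {\bf c}'(t)\) can be tried on a concrete path. The sketch below assumes the helix \({\bf c}(t) = (\cos t, \sin t, t)\) and the function \(f(x,y,z) = x^2 y + z\), both chosen for illustration and not taken from the text, and compares the dot product with a central-difference approximation of \(h'(t)\).

```python
import math

# Hypothetical path and function, chosen only for illustration.
def c(t):
    return (math.cos(t), math.sin(t), t)

def c_prime(t):
    return (-math.sin(t), math.cos(t), 1.0)

def f(x, y, z):
    return x ** 2 * y + z

def grad_f(x, y, z):
    return (2 * x * y, x ** 2, 1.0)

def dh_dt_chain(t):
    """dh/dt computed as the dot product of grad f(c(t)) with c'(t)."""
    return sum(a * b for a, b in zip(grad_f(*c(t)), c_prime(t)))

def dh_dt_numeric(t, eps=1e-6):
    """Central-difference approximation of h'(t), where h(t) = f(c(t))."""
    return (f(*c(t + eps)) - f(*c(t - eps))) / (2 * eps)

t0 = 0.9
print(dh_dt_chain(t0), dh_dt_numeric(t0))
```

For this choice, \(h(t) = \cos^2 t \sin t + t\), so the dot product reduces to \(-2\cos t \sin^2 t + \cos^3 t + 1\), which is what one obtains by differentiating \(h\) directly.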

proof of equation (2)

By definition, \[ \frac{dh}{dt}(t_0)=\mathop {{\rm limit}\ }_{t\rightarrow t_0}\frac{h(t)-h(t_0)}{t-t_0}. \]

Adding and subtracting two terms, we write \begin{eqnarray*} \frac{h(t)-h(t_0)}{t-t_0}&=&\frac{f(x(t),y(t),z(t))-f(x(t_0),y(t_0),z(t_0))}{t-t_0}\\[5pt] &=&\frac{f(x(t),y(t),z(t))-f(x(t_0),y(t),z(t))}{t-t_0}\\[5pt] & & +\frac{f(x(t_0),y(t),z(t))-f(x(t_0),y(t_0),z(t))}{t-t_0}\\[5pt] & &+\frac{f(x(t_0),y(t_0),z(t))-f(x(t_0),y(t_0),z(t_0))}{t-t_0}.\\[-15.8pt] \end{eqnarray*}

Now we invoke the mean-value theorem from one-variable calculus, which states: If \(g\colon\, [a,b]\rightarrow {\mathbb R}\) is continuous and is differentiable on the open interval \((a,b)\), then there is a point \(c\) in \((a,b)\) such that \(g(b)-g(a)=g'(c)(b-a)\). Applying this to \(f\) as a function of \(x\), we can assert that for some \(c\) between \(x\) and \(x_0\), \[ f(x,y,z)-f(x_0,y,z)=\bigg[\frac{\partial f}{\partial x}(c,y,z)\bigg] (x-x_0). \]

In this way, we find that \begin{eqnarray*} \frac{h(t)-h(t_0)}{t-t_0}&=&\bigg[\frac{\partial f}{\partial x}(c,y(t),z(t))\bigg] \frac{x(t)-x(t_0)}{t-t_0}+ \bigg[\frac{\partial f}{\partial y}(x(t_0),d,z(t))\bigg] \frac{y(t)-y(t_0)}{t-t_0} \\[6pt] && + \bigg[\frac{\partial f}{\partial z}(x(t_0),y(t_0),e)\bigg] \frac{z(t)-z(t_0)}{t-t_0}, \end{eqnarray*} where \(c,d\), and \(e\) lie between \(x(t)\) and \(x(t_0)\), between \(y(t)\) and \(y(t_0)\), and between \(z(t)\) and \(z(t_0)\), respectively. Taking the limit \(t\rightarrow t_0\), using the continuity of the partials \(\partial f/\partial x, \partial f/\partial y, \partial f/\partial z\), and the fact that \(c,d\), and \(e\) converge to \(x(t_0),y(t_0)\), and \(z(t_0)\), respectively, we obtain formula (2).