# Chain Rule

In calculus, the chain rule is a formula for the derivative of the composition of two functions.

In intuitive terms, if a variable y depends on a second variable u, which in turn depends on a third variable x, then the rate of change of y with respect to x can be computed as the product of the rate of change of y with respect to u and the rate of change of u with respect to x.

Suppose, for example, that one is climbing a mountain at a rate of 0.5 kilometres per hour. The temperature is lower at higher elevations; suppose the rate at which it decreases is 6 °F per kilometre. If one multiplies 6 °F per kilometre by 0.5 kilometres per hour, one obtains a temperature drop of 3 °F per hour. This calculation is a typical chain rule application.
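The mountain example can be sketched in a few lines of Python. The linear models and the starting temperature of 70 °F are assumptions made only for illustration; the text gives just the two rates. The lapse rate is written as a negative number because the temperature falls as elevation rises.

```python
# Illustrative constants; only the two rates come from the text.
CLIMB_RATE_KM_PER_H = 0.5     # d(elevation)/d(time)
LAPSE_RATE_F_PER_KM = -6.0    # d(temperature)/d(elevation), negative: colder higher up


def elevation_km(t_hours):
    """Elevation gained after t_hours of climbing (assumed linear)."""
    return CLIMB_RATE_KM_PER_H * t_hours


def temperature_f(h_km, base_temp_f=70.0):
    """Temperature at elevation h_km; base_temp_f is a hypothetical value."""
    return base_temp_f + LAPSE_RATE_F_PER_KM * h_km


def temperature_rate_f_per_h():
    """Chain rule: dT/dt = (dT/dh) * (dh/dt)."""
    return LAPSE_RATE_F_PER_KM * CLIMB_RATE_KM_PER_H


# Cross-check against the composed function over one hour of climbing.
observed_change = temperature_f(elevation_km(1.0)) - temperature_f(elevation_km(0.0))
print(temperature_rate_f_per_h(), observed_change)
```

Because both models are linear, the change observed over one hour equals the rate given by the chain rule.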

In algebraic terms, the chain rule (of one variable) states that if the function g is differentiable at x and the function f is differentiable at g(x), then the composite function f ∘ g is differentiable at x. Its derivative may be written in any of several equivalent forms:

$(f\circ g)'(x)=f'(g(x))g'(x),\,$

or in the Leibniz notation

${\frac {df}{dx}}={\frac {df}{dg}}\cdot {\frac {dg}{dx}},$

or

${\frac {df}{dx}}={\frac {d}{dx}}f(g(x))=f'(g(x))g'(x).$

In integration, the counterpart to the chain rule is the substitution rule.

## The general power rule

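The general power rule (GPR) can be derived from the chain rule: for a differentiable function g and an exponent n, the derivative of $g(x)^{n}$ is $n\,g(x)^{n-1}g'(x)$.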

### Example I

Consider $f(x)=(x^{2}+1)^{3}$. This function can be written as $h(g(x))$, where $g(x)=x^{2}+1$ and $h(x)=x^{3}$; thus,

$f'(x)=3(x^{2}+1)^{2}(2x)=6x(x^{2}+1)^{2}.$
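As a quick sanity check, the power-of-a-function pattern used in Example I can be written as a small helper and compared with the closed form obtained above. The function names here are illustrative, and the helper assumes the exponent and the value of g(x) are such that the power is defined.

```python
def general_power_rule(g, g_prime, n, x):
    """Derivative of g(x)**n via the chain rule: n * g(x)**(n - 1) * g'(x)."""
    return n * g(x) ** (n - 1) * g_prime(x)


def g(x):
    return x**2 + 1


def g_prime(x):
    return 2 * x


def example_closed_form(x):
    """The simplified result from Example I: 6x(x^2 + 1)^2."""
    return 6 * x * (x**2 + 1) ** 2


for x in (-1.5, 0.0, 2.0):
    print(x, general_power_rule(g, g_prime, 3, x), example_closed_form(x))
```

At each sample point the two values agree, as expected from the algebra in the example.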

### Example II

In order to differentiate the trigonometric function

$f(x)=\sin(x^{2}),$

one can write $f(x)=h(g(x))$ with $h(x)=\sin x$ and $g(x)=x^{2}$. The chain rule then yields

$f'(x)=2x\cos(x^{2})$

since $h'(g(x))=\cos(x^{2})$ and $g'(x)=2x$.
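The result of Example II can be checked numerically. The sketch below compares the chain rule value with a symmetric difference quotient; the helper names, the sample point and the step size are assumptions chosen for illustration.

```python
import math


def central_difference(func, x, h=1e-6):
    """Symmetric difference quotient approximating func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)


def chain_rule(h_prime, g, g_prime, x):
    """Derivative of h(g(x)) computed as h'(g(x)) * g'(x)."""
    return h_prime(g(x)) * g_prime(x)


def g(x):
    return x * x


def g_prime(x):
    return 2 * x


def f(x):
    return math.sin(g(x))


x0 = 1.3
analytic = chain_rule(math.cos, g, g_prime, x0)  # equals 2 * x0 * cos(x0**2)
numeric = central_difference(f, x0)
print(abs(analytic - numeric))  # a small discrepancy from the finite step size
```

The difference quotient is only an approximation, so the two values agree up to a small error rather than exactly.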

## Chain rule for several variables

The chain rule works for functions of several variables as well. For example, if we have a function $f(u(x,y),v(x,y))$ where

$u(x,y)=3x+y^{2}$ and $v(x,y)=\sin(xy)$, and if $f=u+v$,

then

${\frac {\partial f}{\partial x}}={\frac {\partial f}{\partial u}}{\frac {\partial u}{\partial x}}+{\frac {\partial f}{\partial v}}{\frac {\partial v}{\partial x}}=3+y\cos(xy).$
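The same bookkeeping can be sketched in Python. The function below assembles the partial derivative with respect to x from the four factors in the formula and compares it with a difference quotient in x. The partial derivative with respect to y is not worked out in the text; it is included as an extension using the same pattern.

```python
import math


def u(x, y):
    return 3 * x + y**2


def v(x, y):
    return math.sin(x * y)


def f(x, y):
    return u(x, y) + v(x, y)


def df_dx_chain(x, y):
    """df/dx = (df/du)(du/dx) + (df/dv)(dv/dx) for f = u + v."""
    df_du, df_dv = 1.0, 1.0
    du_dx = 3.0
    dv_dx = y * math.cos(x * y)
    return df_du * du_dx + df_dv * dv_dx


def df_dy_chain(x, y):
    """Same pattern for y (an extension, not stated in the text)."""
    du_dy = 2 * y
    dv_dy = x * math.cos(x * y)
    return 1.0 * du_dy + 1.0 * dv_dy


def partial_x(func, x, y, h=1e-6):
    """Symmetric difference quotient in x with y held fixed."""
    return (func(x + h, y) - func(x - h, y)) / (2 * h)


def partial_y(func, x, y, h=1e-6):
    """Symmetric difference quotient in y with x held fixed."""
    return (func(x, y + h) - func(x, y - h)) / (2 * h)


x0, y0 = 0.4, 1.7
print(df_dx_chain(x0, y0), partial_x(f, x0, y0))
print(df_dy_chain(x0, y0), partial_y(f, x0, y0))
```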

## Proof of the chain rule

Let f and g be functions and let x be a number such that f is differentiable at g(x) and g is differentiable at x. Then by the definition of differentiability,

$g(x+\delta )-g(x)=\delta g'(x)+\epsilon (\delta )\,$ where ${\frac {\epsilon (\delta )}{\delta }}\to 0\,$ as $\delta \to 0.$

Similarly,

$f(g(x)+\alpha )-f(g(x))=\alpha f'(g(x))+\eta (\alpha )\,$ where ${\frac {\eta (\alpha )}{\alpha }}\to 0\,$ as $\alpha \to 0.\,$

Now

$f(g(x+\delta ))-f(g(x))=f(g(x)+\delta g'(x)+\epsilon (\delta ))-f(g(x))=\alpha _{\delta }f'(g(x))+\eta (\alpha _{\delta })\,$

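where $\alpha _{\delta }=\delta g'(x)+\epsilon (\delta )\,$. Observe that as $\delta \to 0,$ $\alpha _{\delta }\to 0$ and ${\frac {\alpha _{\delta }}{\delta }}\to g'(x)$. Consequently ${\frac {\eta (\alpha _{\delta })}{\delta }}\to 0$, since ${\frac {\eta (\alpha _{\delta })}{\delta }}={\frac {\eta (\alpha _{\delta })}{\alpha _{\delta }}}\cdot {\frac {\alpha _{\delta }}{\delta }}$ whenever $\alpha _{\delta }\neq 0$, and $\eta (0)=0$. Hence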

${\frac {f(g(x+\delta ))-f(g(x))}{\delta }}\to g'(x)f'(g(x)){\mbox{ as }}\delta \to 0.$

## The fundamental chain rule

The chain rule is a fundamental property of all definitions of derivative and is therefore valid in much more general contexts. For instance, if E, F and G are Banach spaces (which include Euclidean spaces) and f : E → F and g : F → G are functions, and if x is an element of E such that f is differentiable at x and g is differentiable at f(x), then the derivative (the Fréchet derivative) of the composition g ∘ f at the point x is given by

${\mbox{D}}_{x}\left(g\circ f\right)={\mbox{D}}_{{f\left(x\right)}}\left(g\right)\circ {\mbox{D}}_{x}\left(f\right).$

Note that the derivatives here are linear maps and not numbers. If the linear maps are represented as matrices (namely Jacobians), the composition on the right-hand side turns into a matrix multiplication.
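The matrix form can be illustrated with a small sketch in plain Python. The two maps below are hypothetical examples from the plane to the plane, chosen only so that the Jacobians are easy to write by hand; the product of the Jacobians is compared with the Jacobian of the composite computed directly.

```python
def f(x, y):
    """Inner map (hypothetical example)."""
    return (x * y, x + y)


def jacobian_f(x, y):
    return [[y, x],
            [1.0, 1.0]]


def jacobian_g(a, b):
    """Jacobian of the outer map g(a, b) = (a + b**2, a * b), also hypothetical."""
    return [[1.0, 2 * b],
            [b, a]]


def matmul(m1, m2):
    """Product of two matrices given as lists of rows."""
    return [[sum(m1[i][k] * m2[k][j] for k in range(len(m2)))
             for j in range(len(m2[0]))]
            for i in range(len(m1))]


def jacobian_composite_direct(x, y):
    """Jacobian of g(f(x, y)) = (x*y + (x + y)**2, x*y*(x + y)), differentiated by hand."""
    return [[y + 2 * (x + y), x + 2 * (x + y)],
            [2 * x * y + y**2, x**2 + 2 * x * y]]


x0, y0 = 1.0, 2.0
# Chain rule: D(g o f) at the point equals Dg evaluated at f(point), times Df at the point.
chain = matmul(jacobian_g(*f(x0, y0)), jacobian_f(x0, y0))
direct = jacobian_composite_direct(x0, y0)
print(chain)
print(direct)
```

Note the order of the factors: the Jacobian of the outer map, evaluated at f(x), stands on the left, which matches the order of composition in the formula.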

A particularly clear formulation of the chain rule can be achieved in the most general setting: let M, N and P be $C^{k}$ manifolds (or even Banach manifolds) and let

f : M → N and g : N → P

be differentiable maps. The derivative of f, denoted by df, is then a map from the tangent bundle of M to the tangent bundle of N, and we may write

${\mbox{d}}\left(g\circ f\right)={\mbox{d}}g\circ {\mbox{d}}f.$

In this way, the formation of derivatives and tangent bundles is seen as a functor on the category of $C^{\infty }$ manifolds with $C^{\infty }$ maps as morphisms.

## Tensors and the chain rule

See tensor field for an advanced explanation of the fundamental role the chain rule plays in the geometric nature of tensors.

## Higher derivatives

Faà di Bruno's formula generalizes the chain rule to higher derivatives. The first few derivatives are

${\frac {df}{dx}}={\frac {dg}{dx}}{\frac {df}{dg}}$

${\frac {d^{2}f}{dx^{2}}}=\left({\frac {dg}{dx}}\right)^{2}{\frac {d^{2}f}{dg^{2}}}+{\frac {d^{2}g}{dx^{2}}}{\frac {df}{dg}}$

${\frac {d^{3}f}{dx^{3}}}=\left({\frac {dg}{dx}}\right)^{3}{\frac {d^{3}f}{dg^{3}}}+3{\frac {dg}{dx}}{\frac {d^{2}g}{dx^{2}}}{\frac {d^{2}f}{dg^{2}}}+{\frac {d^{3}g}{dx^{3}}}{\frac {df}{dg}}$

${\frac {d^{4}f}{dx^{4}}}=\left({\frac {dg}{dx}}\right)^{4}{\frac {d^{4}f}{dg^{4}}}+6\left({\frac {dg}{dx}}\right)^{2}{\frac {d^{2}g}{dx^{2}}}{\frac {d^{3}f}{dg^{3}}}+\left\{4{\frac {dg}{dx}}{\frac {d^{3}g}{dx^{3}}}+3\left({\frac {d^{2}g}{dx^{2}}}\right)^{2}\right\}{\frac {d^{2}f}{dg^{2}}}+{\frac {d^{4}g}{dx^{4}}}{\frac {df}{dg}}$
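The second-order formula can be checked numerically on the composite $\sin(x^{2})$. In the sketch below the derivatives of the outer function are evaluated at $g(x)$, and the result is compared with a second difference quotient; the helper names, the sample point and the step size are assumptions chosen for illustration.

```python
import math


def second_derivative_of_composite(f1, f2, g1, g2):
    """Second-order formula (g')^2 f'' + g'' f', with f1 = f'(g(x)) and f2 = f''(g(x))."""
    return g1**2 * f2 + g2 * f1


def second_difference(func, x, h=1e-4):
    """Symmetric second difference quotient approximating func''(x)."""
    return (func(x + h) - 2 * func(x) + func(x - h)) / h**2


x0 = 0.7
gx = x0**2  # inner function g(x) = x^2, so g' = 2x and g'' = 2
formula = second_derivative_of_composite(
    f1=math.cos(gx),    # outer function sin: first derivative at g(x)
    f2=-math.sin(gx),   # second derivative at g(x)
    g1=2 * x0,
    g2=2.0,
)
numeric = second_difference(lambda x: math.sin(x**2), x0)
print(formula, numeric)
```

The two printed values agree up to the error of the difference quotient. The higher-order formulas can be checked in the same way, although finite differences become less accurate as the order grows.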