Chain rule

From Example Problems

In calculus, the chain rule is a formula for the derivative of the composition of two functions.

In intuitive terms, if a variable y depends on a second variable u, which in turn depends on a third variable x, then the rate of change of y with respect to x can be computed as the rate of change of y with respect to u multiplied by the rate of change of u with respect to x.

Suppose, for example, that one is climbing a mountain at a rate of 0.5 kilometres per hour. The temperature is lower at higher elevations; suppose it decreases at a rate of 6 °F per kilometre. Multiplying 6 °F per kilometre by 0.5 kilometres per hour gives 3 °F per hour, which is the rate at which the temperature falls for the climber. This calculation is a typical chain rule application.
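
The mountain example can be sketched in a few lines of Python. Only the two rates come from the text; the starting temperature and elevation (BASE_TEMP_F, START_ELEVATION_KM) are hypothetical values chosen for illustration, and the linear temperature model is an assumption.

```python
# Hypothetical starting values; only the two rates come from the text.
BASE_TEMP_F = 70.0          # assumed temperature at the starting point, in °F
START_ELEVATION_KM = 1.0    # assumed starting elevation, in km

CLIMB_RATE_KM_PER_H = 0.5   # dh/dt, from the text
LAPSE_RATE_F_PER_KM = -6.0  # dT/dh, negative because temperature falls with height


def elevation(t_hours):
    """Elevation h(t) in km after t hours of climbing."""
    return START_ELEVATION_KM + CLIMB_RATE_KM_PER_H * t_hours


def temperature(h_km):
    """Temperature T(h) in °F at elevation h, under the assumed linear model."""
    return BASE_TEMP_F + LAPSE_RATE_F_PER_KM * (h_km - START_ELEVATION_KM)


# Chain rule: dT/dt = (dT/dh) * (dh/dt)
dT_dt_chain = LAPSE_RATE_F_PER_KM * CLIMB_RATE_KM_PER_H

# Direct check: temperature change over one hour of climbing
dT_dt_direct = temperature(elevation(1.0)) - temperature(elevation(0.0))

print(dT_dt_chain, dT_dt_direct)
```

Both numbers are negative because the temperature is falling; their common magnitude is the 3 °F per hour quoted above.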

In algebraic terms, the chain rule (of one variable) states that if the function g is differentiable at x and the function f is differentiable at g(x), then the composition f\circ g, defined by (f\circ g)(x)=f(g(x)), is differentiable at x and

{\frac  {df}{dx}}={\frac  {d}{dx}}f(g(x))=f'(g(x))g'(x).

Alternatively, in Leibniz notation, the chain rule can be expressed as:

{\frac  {df}{dx}}={\frac  {df}{dg}}{\frac  {dg}{dx}}

where {\frac  {df}{dg}} denotes the derivative of f with respect to g, with g treated as if it were an independent variable.

In integration, the counterpart to the chain rule is the substitution rule.

The general power rule

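The general power rule (GPR) can be derived from the chain rule: for a differentiable function g and a fixed exponent n, {\frac  {d}{dx}}[g(x)]^{n}=n[g(x)]^{{n-1}}g'(x) wherever both sides are defined.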

Example I

Consider f(x)=(x^{2}+1)^{3}. f(x) can be written as h(g(x)) where g(x)=x^{2}+1 and h(x)=x^{3}. Since h'(x)=3x^{2} and g'(x)=2x, the chain rule gives

f'(x)=3(x^{2}+1)^{2}(2x)=6x(x^{2}+1)^{2}.
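
As a cross-check on Example I, one can expand (x^{2}+1)^{3}=x^{6}+3x^{4}+3x^{2}+1 and differentiate term by term. The sketch below compares that result with the chain-rule answer at a few integer points; the function names are illustrative only.

```python
def f_prime_chain(x):
    """Derivative of (x^2 + 1)^3 from the chain rule: 6x(x^2 + 1)^2."""
    return 6 * x * (x**2 + 1) ** 2


def f_prime_expanded(x):
    """Derivative of the expanded form x^6 + 3x^4 + 3x^2 + 1, term by term."""
    return 6 * x**5 + 12 * x**3 + 6 * x


# Integer inputs keep the arithmetic exact, so the two results must match exactly.
results = [(x, f_prime_chain(x), f_prime_expanded(x)) for x in range(-3, 4)]
for x, chain, expanded in results:
    print(x, chain, expanded)
```

Agreement at sample points is a sanity check rather than a proof; here the two expressions are in fact the same polynomial.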

Example II

In order to differentiate the trigonometric function

f(x)=\sin(x^{2}),

one can write f(x)=h(g(x)) with h(x)=\sin x and g(x)=x^{2}. The chain rule then yields

f'(x)=2x\cos(x^{2})

since h'(g(x))=\cos(x^{2}) and g'(x)=2x.
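
The result of Example II can be checked numerically. The following is a minimal sketch that compares the chain-rule derivative with a central finite difference; the sample point and step size are arbitrary choices, not values from the text.

```python
import math


def f(x):
    return math.sin(x**2)


def f_prime(x):
    """Chain-rule derivative: h'(g(x)) * g'(x) with h = sin and g(x) = x^2."""
    return math.cos(x**2) * 2 * x


def central_difference(func, x, step=1e-6):
    """Symmetric finite-difference approximation of func'(x)."""
    return (func(x + step) - func(x - step)) / (2 * step)


x0 = 1.3  # arbitrary sample point
print(f_prime(x0), central_difference(f, x0))
```

The two printed values should agree to several decimal places; the small remaining gap comes from the finite step and floating-point rounding.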

Chain rule for several variables

The chain rule works for functions of several variables as well. For example, if we have a function f(u(x,y),v(x,y)) where

u(x,y)=3x+y^{2} and v(x,y)=\sin(xy),

then

{\partial f \over \partial x}={\partial f \over \partial u}{\partial u \over \partial x}+{\partial f \over \partial v}{\partial v \over \partial x}=3{\partial f \over \partial u}+y\cos(xy){\partial f \over \partial v}.
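
To see the two-term sum at work, one needs a concrete outer function. The sketch below assumes the hypothetical choice f(u,v)=u^{2}v, which does not come from the text, keeps u and v as defined above, and compares the chain-rule value of the partial derivative with a finite difference in x.

```python
import math


def u(x, y):
    return 3 * x + y**2


def v(x, y):
    return math.sin(x * y)


def f(u_val, v_val):
    """Hypothetical outer function f(u, v) = u^2 * v (an assumption, not from the text)."""
    return u_val**2 * v_val


def df_dx_chain(x, y):
    """df/dx = (df/du)(du/dx) + (df/dv)(dv/dx)."""
    u_val, v_val = u(x, y), v(x, y)
    df_du = 2 * u_val * v_val    # partial of u^2 * v with respect to u
    df_dv = u_val**2             # partial of u^2 * v with respect to v
    du_dx = 3.0                  # partial of 3x + y^2 with respect to x
    dv_dx = y * math.cos(x * y)  # partial of sin(xy) with respect to x
    return df_du * du_dx + df_dv * dv_dx


def df_dx_numeric(x, y, step=1e-6):
    """Central difference in x with y held fixed."""
    upper = f(u(x + step, y), v(x + step, y))
    lower = f(u(x - step, y), v(x - step, y))
    return (upper - lower) / (2 * step)


x0, y0 = 0.7, 1.2  # arbitrary sample point
print(df_dx_chain(x0, y0), df_dx_numeric(x0, y0))
```

Any other differentiable f could be substituted, provided df_du and df_dv are updated to match.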

Proof of the chain rule

Let f and g be functions and let x be a number such that f is differentiable at g(x) and g is differentiable at x. Then by the definition of differentiability,

g(x+\delta )-g(x)=\delta g'(x)+\epsilon (\delta )\, where {\frac  {\epsilon (\delta )}{\delta }}\to 0\, as \delta \to 0.

Similarly,

f(g(x)+\alpha )-f(g(x))=\alpha f'(g(x))+\eta (\alpha )\, where {\frac  {\eta (\alpha )}{\alpha }}\to 0\, as \alpha \to 0.\,

Now

f(g(x+\delta ))-f(g(x))=f(g(x)+\delta g'(x)+\epsilon (\delta ))-f(g(x))=\alpha _{\delta }f'(g(x))+\eta (\alpha _{\delta })\,

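where \alpha _{\delta }=\delta g'(x)+\epsilon (\delta )\,. Observe that as \delta \to 0, \alpha _{\delta }\to 0 and {\frac  {\alpha _{\delta }}{\delta }}=g'(x)+{\frac  {\epsilon (\delta )}{\delta }}\to g'(x). Moreover, {\frac  {\eta (\alpha _{\delta })}{\delta }}\to 0: when \alpha _{\delta }\neq 0 it equals {\frac  {\eta (\alpha _{\delta })}{\alpha _{\delta }}}\cdot {\frac  {\alpha _{\delta }}{\delta }}, the product of a factor tending to 0 and a factor tending to g'(x), and when \alpha _{\delta }=0 it vanishes because \eta (0)=0. Hence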

{\frac  {f(g(x+\delta ))-f(g(x))}{\delta }}\to g'(x)f'(g(x)){\mbox{ as }}\delta \to 0.
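
The remainder term in the proof can be made concrete. For the illustrative choice g(x)=x^{2}, which is not part of the proof itself, \epsilon (\delta )=\delta ^{2}, so \epsilon (\delta )/\delta =\delta . The sketch below computes that ratio for shrinking \delta .

```python
def g(x):
    return x**2


def g_prime(x):
    return 2 * x


def remainder_ratio(x, delta):
    """epsilon(delta) / delta, where epsilon is the remainder defined in the proof."""
    epsilon = g(x + delta) - g(x) - delta * g_prime(x)
    return epsilon / delta


x0 = 3.0  # arbitrary sample point
deltas = [10.0**-k for k in range(1, 5)]
ratios = [remainder_ratio(x0, d) for d in deltas]
for d, r in zip(deltas, ratios):
    print(d, r)
```

Each printed ratio should be close to its \delta , up to floating-point rounding, which illustrates the condition \epsilon (\delta )/\delta \to 0 used in the proof.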

The fundamental chain rule

The chain rule is a fundamental property of all definitions of derivative and is therefore valid in much more general contexts. For instance, if E, F and G are Banach spaces (which includes Euclidean space) and f : E → F and g : F → G are functions, and if x is an element of E such that f is differentiable at x and g is differentiable at f(x), then the derivative of the composition g ∘ f at the point x is given by

{\mbox{D}}_{x}\left(g\circ f\right)={\mbox{D}}_{{f\left(x\right)}}\left(g\right)\circ {\mbox{D}}_{x}\left(f\right).

Note that the derivatives here are linear maps and not numbers. If the linear maps are represented as matrices (namely Jacobians), the composition on the right-hand side becomes a matrix multiplication.
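
The matrix form can be illustrated in the finite-dimensional case. In the sketch below the maps f(x,y)=(xy, x+y) and g(u,v)=(u^{2}, uv) are hypothetical examples chosen for this illustration; the product of their Jacobians is compared with a finite-difference Jacobian of g ∘ f.

```python
def f(p):
    x, y = p
    return [x * y, x + y]


def g(q):
    u, v = q
    return [u**2, u * v]


def jacobian_f(p):
    """Analytic Jacobian of f at p."""
    x, y = p
    return [[y, x], [1.0, 1.0]]


def jacobian_g(q):
    """Analytic Jacobian of g at q."""
    u, v = q
    return [[2 * u, 0.0], [v, u]]


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def numeric_jacobian(func, p, step=1e-6):
    """Central-difference Jacobian; entry [i][j] approximates d(func_i)/d(p_j)."""
    columns = []
    for j in range(len(p)):
        upper = list(p)
        lower = list(p)
        upper[j] += step
        lower[j] -= step
        columns.append([(a - b) / (2 * step)
                        for a, b in zip(func(upper), func(lower))])
    # columns[j][i] holds d(func_i)/d(p_j); transpose into rows
    return [list(row) for row in zip(*columns)]


point = [0.5, 2.0]  # arbitrary sample point
# Jacobian of g evaluated at f(point), times Jacobian of f at point
chain = matmul(jacobian_g(f(point)), jacobian_f(point))
numeric = numeric_jacobian(lambda p: g(f(p)), point)
print(chain)
print(numeric)
```

Note the order of the factors: the Jacobian of g is evaluated at f(x) and multiplied on the left, matching the composition in the formula above.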

A particularly nice formulation of the chain rule can be achieved in the most general setting: let M, N and P be C^k manifolds (or even Banach manifolds) and let f : M → N and g : N → P be differentiable maps. The derivative of f, denoted by df, is then a map from the tangent bundle of M to the tangent bundle of N, and we may write

{\mbox{d}}\left(g\circ f\right)={\mbox{d}}g\circ {\mbox{d}}f.

In this way, the formation of derivatives and tangent bundles is seen as a functor on the category of C^∞ manifolds with C^∞ maps as morphisms.

Tensors and the chain rule

See tensor field for an advanced explanation of the fundamental role the chain rule plays in the geometric nature of tensors.