# Binomial coefficient

In mathematics, particularly in combinatorics, the binomial coefficient of the natural number n and the integer k is defined to be the natural number

$\displaystyle {n \choose k} = \frac{n!}{k!(n-k)!} \quad \mbox{if } n\geq k\geq 0 \qquad \mbox{(1)}$

and

$\displaystyle {n \choose k} = 0 \quad \mbox{if } k<0 \mbox{ or } k>n$

where m! denotes the factorial of m. According to Nicholas J. Higham, the

$\displaystyle {n \choose k}$

notation was introduced by Albert von Ettinghausen in 1826, although these numbers were known for centuries before that; see Pascal's triangle.

The binomial coefficient of n and k is also written as C(n, k), nCk or $\displaystyle C^{k}_{n}$ (C for combination) and read as "n choose k". For compactness, from here on we will use the first of these three notations.

For example,

$\displaystyle \mathrm{C}(7, 3) = \frac{1 \cdot 2 \cdot 3 \cdot 4 \cdot 5 \cdot 6 \cdot 7}{(1 \cdot 2 \cdot 3)(1 \cdot 2 \cdot 3 \cdot 4)} = \frac{7\cdot 6 \cdot 5}{3\cdot 2\cdot 1} = 35.$

This example can be generalized as follows (the identity is easily proved from the definition of the factorial function):

$\displaystyle \mathrm{C} (n, k) = \frac{n!}{k!(n-k)!} = \frac{n \cdot (n-1) \cdots (n-k+1)}{k \cdot (k-1) \cdots 1}.$
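The product form above lends itself to direct computation without forming large factorials. The following is a minimal sketch, not a reference implementation; the function name `binomial` is chosen here for illustration. It multiplies and divides one factor at a time, so every intermediate value is itself a binomial coefficient and the integer division is exact.

```python
def binomial(n: int, k: int) -> int:
    """Return C(n, k) using the product n(n-1)...(n-k+1) / (k(k-1)...1)."""
    if k < 0 or k > n:
        return 0  # convention from the definition
    k = min(k, n - k)  # C(n, k) = C(n, n-k); use the shorter product
    result = 1
    for i in range(1, k + 1):
        # After this step result equals C(n - k + i, i), so // is exact.
        result = result * (n - k + i) // i
    return result


print(binomial(7, 3))  # 35, matching the worked example
```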

The binomial coefficients are the coefficients in the expansion of the binomial $(x + y)^n$ (hence the name):

$\displaystyle (x+y)^n = \sum_{k=0}^{n} {n \choose k} x^{n-k} y^k. \qquad (2)$

This is generalized by the binomial theorem, which allows the exponent n to be negative or a non-integer.

## Derivation from binomial expansion

For exponent 1, $(x+y)^1$ is $x+y$. For exponent 2, $(x+y)^2$ is $(x+y)(x+y)$, which forms terms as follows. The first factor supplies either an $x$ or a $y$; likewise for the second factor. Thus to form $x^2$, the only possibility is to choose $x$ from both factors; likewise for $y^2$. However, the $xy$ term can be formed by $x$ from the first and $y$ from the second factor, or $y$ from the first and $x$ from the second factor; thus it acquires a coefficient of 2. Proceeding to exponent 3, $(x+y)^3$ reduces to $(x+y)^2(x+y)$, where we already know that $(x+y)^2 = x^2+2xy+y^2$. Again the extremes, $x^3$ and $y^3$, arise in a unique way. However, the term $x^2y$ is either $2xy$ times $x$ or $x^2$ times $y$, for a coefficient of 3; likewise $xy^2$ arises in two ways, summing the coefficients 1 and 2 to give 3.

This suggests an induction. Thus for exponent 4, each term has total degree (sum of exponents) 4 (in general, $n$), with $4-k$ factors of $x$ and $k$ factors of $y$. If $k$ is neither 0 nor 4 (the terms $x^4$ and $y^4$), then the term arises in two ways, from adjacent coefficients with total degree 3. For example, $x^2y^2$ is both $xy^2$ times $x$ and $x^2y$ times $y$, thus its coefficient is $3+3 = 6$. This is the origin of Pascal's triangle, discussed below.

Another perspective is that to form $x^{n-k}y^k$ from $n$ factors of $(x+y)$, we must choose $y$ from $k$ of the factors and $x$ from the rest. To count the possibilities, consider all $n!$ permutations of the factors. Represent each permutation as a shuffled list of the numbers from 1 to $n$. Select an $x$ from the first $n-k$ factors listed, and a $y$ from the remaining $k$ factors; in this way each permutation contributes to the term $x^{n-k}y^k$. For example, the list 〈4,1,2,3〉 selects $x$ from factors 4 and 1, and selects $y$ from factors 2 and 3, as one way to form the term $x^2y^2$.

$\displaystyle (x +_1 y)\,(x +_2 y)\,(x +_3 y)\,(x +_4 y)$

But the distinct list 〈1,4,3,2〉 makes exactly the same selection; the binomial coefficient formula must remove this redundancy. The $n-k$ factors for $x$ have $(n-k)!$ permutations, and the $k$ factors for $y$ have $k!$ permutations. Therefore $n! / ((n-k)!\,k!)$ is the number of truly distinct ways to form the term $x^{n-k}y^k$.
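The permutation argument can be checked by brute force for small cases. The sketch below (the helper name `distinct_selections` is hypothetical) lists every ordering of the n factors, lets the last k listed factors supply y, and counts how many different selections actually occur.

```python
from itertools import permutations
from math import factorial


def distinct_selections(n: int, k: int) -> int:
    """Count distinct choices of the k factors supplying y, over all n! orderings."""
    seen = set()
    for order in permutations(range(1, n + 1)):
        # The first n-k listed factors supply x; the remaining k supply y.
        y_factors = frozenset(order[n - k:])
        seen.add(y_factors)
    return len(seen)


n, k = 4, 2
print(factorial(n))                # 24 orderings in total
print(distinct_selections(n, k))   # 6 distinct selections
print(factorial(n) // (factorial(n - k) * factorial(k)))  # 6 from the formula
```

Each selection is produced by exactly (n-k)! k! orderings, which is why dividing n! by that product gives the count.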

This discussion extends to the case when each factor is a sum of multiple variables, leading naturally to the definition of a multinomial coefficient. A convenient notation uses a list of variables, $\mathbf{x} = (x_1,\ldots,x_m)$, with the exponents given as another list, $E = (e_1,\ldots,e_m)$, called a multi-index. The terms in the expansion of $(x_1+\cdots+x_m)^n$ have the form

$\displaystyle {\mathbf x}^E = x_1^{e_1} x_2^{e_2} \cdots x_m^{e_m},$

where $|E| = e_1+\cdots+e_m = n$, and the coefficient of such a term is the multinomial coefficient

$\displaystyle \frac{n!}{e_1! e_2! \cdots e_m!}.$

The simple binomial coefficients are the case m = 2.
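As an illustration of the multinomial formula, here is a small sketch that evaluates n! / (e_1! e_2! ... e_m!) for a given multi-index. The function name `multinomial` and the list-of-exponents interface are assumptions made for this example.

```python
from math import factorial


def multinomial(exponents):
    """Return n! / (e_1! * ... * e_m!) where n is the sum of the exponents."""
    n = sum(exponents)
    result = factorial(n)
    for e in exponents:
        # Each partial quotient is an integer, so floor division is exact.
        result //= factorial(e)
    return result


print(multinomial([2, 2]))     # 6, the binomial case C(4, 2)
print(multinomial([2, 1, 1]))  # 12, coefficient of x1^2 x2 x3 in (x1+x2+x3)^4
```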

## Pascal's triangle

Pascal's rule is the important recurrence relation

$\displaystyle \mathrm{C}(n,k) + \mathrm{C}(n,k+1) = \mathrm{C}(n+1,k+1), \qquad (3)$

which follows directly from the definition. This recurrence relation can be used to prove by mathematical induction that C(n, k) is a natural number for all n and k, a fact that is not immediately obvious from the definition.

It also gives rise to Pascal's triangle:

row 0                     1
row 1                   1   1
row 2                 1   2   1
row 3               1   3   3   1
row 4             1   4   6   4   1
row 5           1   5   10  10   5   1
row 6         1   6   15  20  15   6   1
row 7       1   7   21  35  35   21  7   1
row 8     1   8   28  56  70  56   28  8   1


Row number n contains the numbers C(n, k) for k = 0,...,n. It is constructed by starting with ones at the outside and then always adding two adjacent numbers and writing the sum directly underneath. This method allows the quick calculation of binomial coefficients without the need for fractions or multiplications. For instance, by looking at row number 5 of the triangle, one can quickly read off that

$\displaystyle (x + y)^5 = 1x^5 + 5x^4y + 10x^3y^2 + 10x^2y^3 + 5xy^4 + 1y^5.$
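The construction rule for the triangle (ones on the outside, every inner entry the sum of the two adjacent entries in the row above) translates into a few lines of code. This is a sketch with a hypothetical helper name, `pascal_rows`; it uses only additions, as the text describes.

```python
def pascal_rows(num_rows):
    """Return rows 0 .. num_rows-1 of Pascal's triangle as lists of integers."""
    rows = [[1]]
    for _ in range(num_rows - 1):
        prev = rows[-1]
        # Each inner entry is the sum of the two adjacent entries above it.
        inner = [prev[i] + prev[i + 1] for i in range(len(prev) - 1)]
        rows.append([1] + inner + [1])
    return rows[:num_rows]  # handles num_rows == 0


for row in pascal_rows(6):
    print(row)
```

The last row printed is `[1, 5, 10, 10, 5, 1]`, the coefficients of the expansion of the fifth power read off above.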

The differences between consecutive elements on a diagonal are the elements of the previous diagonal, a consequence of the recurrence relation (3) above.

In the treatise Precious Mirror of the Four Elements (1303 AD), Zhu Shijie mentioned the triangle as an ancient method for computing binomial coefficients, indicating that the method was known to Chinese mathematicians five centuries before Pascal.

## Combinatorics and statistics

Binomial coefficients are of importance in combinatorics, because they provide ready formulas for certain frequent counting problems:

• Every set with n elements has $\displaystyle \mathrm{C}(n, k)$ different subsets having k elements each (these are called k-combinations).
• The number of strings of length n containing k ones and n − k zeros is $\displaystyle \mathrm{C}(n, k).$
• There are $\displaystyle \mathrm{C}(n+1, k)$ strings consisting of k ones and n zeros such that no two ones are adjacent.
• The number of sequences consisting of n natural numbers (zero allowed) whose sum equals k is $\displaystyle \mathrm{C}(n+k-1, k)$ ; this is also the number of ways to choose k elements from a set of n if repetitions are allowed.
• The Catalan numbers have an easy formula involving binomial coefficients; they can be used to count various structures, such as trees and parenthesized expressions.
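The counting claims in the list above can be spot-checked by exhaustive enumeration for small n and k. The following sketch is illustrative only: the helper names are hypothetical, natural numbers are taken to include zero, and `math.comb` (available from Python 3.8) stands in for C(n, k).

```python
from itertools import combinations, product
from math import comb


def count_subsets(n, k):
    """Number of k-element subsets of an n-element set."""
    return sum(1 for _ in combinations(range(n), k))


def count_strings_no_adjacent_ones(n, k):
    """Binary strings with k ones and n zeros in which no two ones are adjacent."""
    total = 0
    for bits in product("01", repeat=n + k):
        s = "".join(bits)
        if s.count("1") == k and "11" not in s:
            total += 1
    return total


def count_sequences_with_sum(n, k):
    """Sequences of n natural numbers (zero allowed) whose sum is k."""
    return sum(1 for seq in product(range(k + 1), repeat=n) if sum(seq) == k)


print(count_subsets(5, 2), comb(5, 2))
print(count_strings_no_adjacent_ones(4, 2), comb(4 + 1, 2))
print(count_sequences_with_sum(3, 4), comb(3 + 4 - 1, 4))
```

Each printed pair should agree; the enumeration grows exponentially, so it is only practical for small arguments.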

The binomial coefficients also occur in the formula for the binomial distribution in statistics and in the formula for a Bézier curve.

## Formulas involving binomial coefficients

The following formulas are occasionally useful:

$\displaystyle \mathrm{C}(n,k)= \mathrm{C}(n, n-k)\qquad\qquad(4)\,$

This follows from expansion (2) by using $(x + y)^n = (y + x)^n$, and is reflected in the numerical "symmetry" of Pascal's triangle.

$\displaystyle \sum_{k=0}^{n} \mathrm{C}(n,k) = 2^n \qquad (5)$

This follows from expansion (2) with x = y = 1. Equivalently, the elements in row n of Pascal's triangle add up to $2^n$.

$\displaystyle \sum_{k=1}^{n} k \mathrm{C}(n,k) = n 2^{n-1} \qquad (6)$

This follows from expansion (2) after differentiating with respect to y and substituting x = y = 1.

$\displaystyle \sum_{j} \mathrm{C}(m,j) \mathrm{C}(n-m,k-j) = \mathrm{C}(n,k) \qquad (7a)$

As C(n, k) is defined to be zero if k < 0 or k > n, the sum is actually finite. The identity follows by expanding $(1+x)^m (1+x)^{n-m} = (1+x)^n$ with (2) and comparing the coefficients of $x^k$. Equation (7a) generalizes equation (3). It is Vandermonde's convolution formula (after Alexandre-Théophile Vandermonde) and is essentially a form of the Chu-Vandermonde identity. It can be shown to hold for arbitrary complex-valued $\displaystyle m$ and $\displaystyle n$.

$\displaystyle \sum_{m} \mathrm{C}(m,j) \mathrm{C}(n-m,k-j) = \mathrm{C}(n+1,k+1) \qquad (7b)$

Equation (7a) holds for every value of m, whereas equation (7b), in which the sum runs over m = 0, …, n, holds for every j with 0 ≤ j ≤ k.

$\displaystyle \sum_{k=0}^{n} \mathrm{C}(n,k)^2 = \mathrm{C}(2n,n) \qquad (8)$

This follows from (7a), with n replaced by 2n and m = k = n, together with (4).

$\displaystyle \sum_{k=0}^{n} \mathrm{C}(n-k,k) = \mathrm{F}(n+1) \qquad (9)$

Here, F(n + 1) denotes the (n + 1)-th Fibonacci number, with F(1) = F(2) = 1. This formula about the diagonals of Pascal's triangle can be proven by induction using (3).

$\displaystyle \sum_{j=k}^{n} \mathrm{C}(j,k) = \mathrm{C}(n+1,k+1) \qquad (10)$

This can be proven by induction on n using (3).
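A numerical spot check is no substitute for the proofs sketched above, but it is a quick way to catch a misread index. The sketch below tests identities (4) through (10) for small n. The helper names `C`, `fib` and `identities_hold` are hypothetical, the sum in (7b) is taken over m = 0, …, n with 0 ≤ j ≤ k, and the Fibonacci numbers are assumed to start with F(1) = F(2) = 1.

```python
from math import comb


def C(n, k):
    """C(n, k) with the convention that it is 0 when k < 0 or k > n."""
    return comb(n, k) if 0 <= k <= n else 0


def fib(n):
    """Fibonacci numbers with F(1) = F(2) = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def identities_hold(n):
    """Return True if identities (4)-(10) all hold for this n >= 1."""
    ks = range(n + 1)
    checks = [
        all(C(n, k) == C(n, n - k) for k in ks),                       # (4)
        sum(C(n, k) for k in ks) == 2 ** n,                            # (5)
        sum(k * C(n, k) for k in ks) == n * 2 ** (n - 1),              # (6)
        all(sum(C(m, j) * C(n - m, k - j) for j in ks) == C(n, k)
            for m in ks for k in ks),                                  # (7a)
        all(sum(C(m, j) * C(n - m, k - j) for m in ks) == C(n + 1, k + 1)
            for k in ks for j in range(k + 1)),                        # (7b)
        sum(C(n, k) ** 2 for k in ks) == C(2 * n, n),                  # (8)
        sum(C(n - k, k) for k in ks) == fib(n + 1),                    # (9)
        all(sum(C(j, k) for j in range(k, n + 1)) == C(n + 1, k + 1)
            for k in ks),                                              # (10)
    ]
    return all(checks)


print(all(identities_hold(n) for n in range(1, 10)))
```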

## Divisors of binomial coefficients

The prime divisors of C(n, k) can be interpreted as follows: if p is a prime number and $p^r$ is the highest power of p which divides C(n, k), then r is equal to the number of natural numbers j such that the fractional part of $k/p^j$ is bigger than the fractional part of $n/p^j$. In particular, C(n, k) is always divisible by n/gcd(n, k).
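The fractional-part criterion can be evaluated with integer arithmetic alone, because the fractional part of k divided by the j-th power of p is (k mod p^j) / p^j. The following sketch compares that count with the exponent obtained by repeated division; the function names are hypothetical, 0 ≤ k ≤ n and n ≥ 1 are assumed, and only powers of p not exceeding n are tried, since larger powers cannot contribute.

```python
from math import comb, gcd


def valuation(m, p):
    """Largest r such that p**r divides the positive integer m."""
    r = 0
    while m % p == 0:
        m //= p
        r += 1
    return r


def valuation_by_fractional_parts(n, k, p):
    """Count j >= 1 with frac(k / p**j) > frac(n / p**j)."""
    r = 0
    power = p
    while power <= n:
        # Comparing remainders is the same as comparing fractional parts.
        if k % power > n % power:
            r += 1
        power *= p
    return r


n, k = 10, 3  # C(10, 3) = 120 = 2^3 * 3 * 5
for p in (2, 3, 5, 7):
    print(p, valuation(comb(n, k), p), valuation_by_fractional_parts(n, k, p))

# Divisibility by n / gcd(n, k)
print(comb(n, k) % (n // gcd(n, k)) == 0)
```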

## Bounds for binomial coefficients

The following bounds for C(n, k) hold for 1 ≤ k ≤ n:

• $\displaystyle \mathrm{C}(n, k) \le \frac{n^k}{k!}$
• $\displaystyle \mathrm{C}(n, k) \le \left(\frac{n\cdot e}{k}\right)^k$
• $\displaystyle \mathrm{C}(n, k) \ge \left(\frac{n}{k}\right)^k$
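To get a feel for how tight these bounds are, they can be evaluated for concrete values. This sketch assumes 1 ≤ k ≤ n; the name `bounds_hold` is hypothetical. The two bounds without e are checked in exact integer arithmetic by clearing denominators, while the bound involving e uses floating point.

```python
from math import comb, e, factorial


def bounds_hold(n, k):
    """Check the three bounds on C(n, k) for 1 <= k <= n."""
    c = comb(n, k)
    upper_factorial = c * factorial(k) <= n ** k   # C(n,k) <= n^k / k!
    upper_exp = c <= (n * e / k) ** k              # C(n,k) <= (n e / k)^k
    lower = c * k ** k >= n ** k                   # C(n,k) >= (n / k)^k
    return upper_factorial and upper_exp and lower


n, k = 20, 5
print(comb(n, k))              # exact value
print((n / k) ** k)            # lower bound
print(n ** k / factorial(k))   # first upper bound
print((n * e / k) ** k)        # second upper bound
print(all(bounds_hold(n, k) for n in range(1, 41) for k in range(1, n + 1)))
```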

## Generalization to real and complex argument

The binomial coefficient $\displaystyle {z\choose k}$ can be defined for any complex number z and any natural number k as follows:

$\displaystyle {z\choose k} = \prod_{n=1}^{k}{z-k+n\over n}= \frac{z(z-1)(z-2)\cdots (z-k+1)}{k!} \qquad (11)$

This generalization is known as the generalized binomial coefficient. It is used in the formulation of the binomial theorem and satisfies properties (3) and (7a).
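Definition (11) is straightforward to evaluate. The sketch below restricts z to rational values so that `fractions.Fraction` gives exact results; that restriction, and the function name `gen_binomial`, are choices made for this example rather than part of the definition, which allows any complex z.

```python
from fractions import Fraction


def gen_binomial(z, k):
    """Return z(z-1)...(z-k+1) / k! exactly for rational z and natural k."""
    result = Fraction(1)
    for i in range(k):
        # Multiply by (z - i) and divide by (i + 1), one factor at a time.
        result = result * (Fraction(z) - i) / (i + 1)
    return result


print(gen_binomial(7, 3))               # 35, agrees with the integer case
print(gen_binomial(Fraction(1, 2), 2))  # -1/8
print(gen_binomial(-1, 3))              # -1

# Pascal's rule (3) still holds for a non-integer upper argument.
z = Fraction(-5, 3)
print(gen_binomial(z, 2) + gen_binomial(z, 3) == gen_binomial(z + 1, 3))
```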

For fixed k, the expression $\displaystyle f(z)={z\choose k}$ is a polynomial in z of degree k with rational coefficients.

$f(z)$ is the unique polynomial of degree k satisfying

$\displaystyle f(0)=f(1)=\cdots=f(k-1)=0$

and

$\displaystyle f(k)=1.$

Any polynomial p(z) of degree d can be written in the form

$\displaystyle p(z) = \sum_{k=0}^{d} a_k {z\choose k}$

This is important in the theory of difference equations and can be seen as a discrete analog of Taylor's theorem.
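One standard way to find the coefficients in this expansion, not spelled out in the text, is to take repeated forward differences of the values p(0), p(1), …, p(d): the k-th coefficient is the k-th forward difference at 0. The sketch below assumes that technique; the names `newton_coefficients` and `evaluate` are hypothetical, and the polynomial is passed as a Python callable.

```python
from fractions import Fraction


def gen_binomial(z, k):
    """Return z(z-1)...(z-k+1) / k! exactly for rational z."""
    result = Fraction(1)
    for i in range(k):
        result = result * (Fraction(z) - i) / (i + 1)
    return result


def newton_coefficients(p, degree):
    """Coefficients a_0..a_d with p(z) = sum of a_k * C(z, k)."""
    values = [Fraction(p(i)) for i in range(degree + 1)]
    coeffs = []
    while values:
        coeffs.append(values[0])  # k-th forward difference at 0
        values = [b - a for a, b in zip(values, values[1:])]
    return coeffs


def evaluate(coeffs, z):
    """Evaluate the expansion at a rational point z."""
    return sum(a * gen_binomial(z, k) for k, a in enumerate(coeffs))


def cube(z):
    return z ** 3


coeffs = newton_coefficients(cube, 3)
print([int(a) for a in coeffs])
print(evaluate(coeffs, Fraction(5, 2)) == Fraction(5, 2) ** 3)
```

For the cube this yields the coefficients 0, 1, 6, 6, that is, z^3 = C(z, 1) + 6 C(z, 2) + 6 C(z, 3).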

Newton's binomial series gets the simple form

$\displaystyle (1+z)^{\alpha} = \sum_{r = 0}^{\infty}{\alpha\choose r}z^r = 1+{\alpha\choose1}z+{\alpha\choose 2}z^2+\cdots.$

It is not hard to show that the radius of convergence of this series is 1, unless α is a natural number, in which case the series is a finite sum.
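Partial sums of the binomial series give a simple numerical illustration. The sketch below works in floating point and assumes a real exponent and |z| < 1; the function name `binomial_series` and the number of terms are arbitrary choices for this example. Each coefficient is obtained from the previous one by multiplying by (α − r) / (r + 1).

```python
def binomial_series(alpha, z, terms):
    """Sum the first `terms` terms of the series for (1 + z)**alpha."""
    total = 0.0
    coeff = 1.0  # generalized binomial coefficient C(alpha, 0)
    for r in range(terms):
        total += coeff * z ** r
        coeff *= (alpha - r) / (r + 1)  # C(alpha, r+1) from C(alpha, r)
    return total


print(binomial_series(0.5, 0.2, 60), 1.2 ** 0.5)    # square root of 1.2
print(binomial_series(-1.0, 0.5, 60), 1.0 / 1.5)    # geometric series case
print(binomial_series(3, 2.0, 10), 3.0 ** 3)        # natural exponent: finite sum
```

In the last call the exponent is a natural number, so the coefficients vanish from r = 4 onward and the sum is exact even though |z| > 1.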

## Generalization to q-series

The binomial coefficient has a q-analog generalization known as the Gaussian binomial.