# Formal power series

In mathematics, formal power series are devices that make it possible to employ much of the analytical machinery of power series in settings that do not have natural notions of "convergence". They are also useful, especially in combinatorics, for providing compact representations of sequences and for finding closed formulas for recursively defined sequences; this is known as the method of generating functions.

## Informal introduction

A formal power series can be loosely thought of as a "polynomial with infinitely many terms". Alternatively, for those familiar with power series (or Taylor series), one may think of a formal power series as a power series in which we ignore questions of convergence. For example, consider the series

$\displaystyle A = 1 - 3x + 5x^2 - 7x^3 + 9x^4 - 11x^5 + \cdots.$

If we studied this as a power series, its properties would include, for example, that its radius of convergence is 1. However, as a formal power series, we may ignore this completely; all that is relevant is the sequence of coefficients [1, −3, 5, −7, 9, −11, ...]. In other words, a formal power series is just an object that records a sequence of coefficients.

Arithmetic on formal power series is carried out by simply pretending that the series are polynomials. For example, if

$\displaystyle B = 2x + 4x^3 + 6x^5 + \cdots,$

then we add A and B term by term:

$\displaystyle A + B = 1 - x + 5x^2 - 3x^3 + 9x^4 - 5x^5 + \cdots.$

We can multiply formal power series, again just by treating them as polynomials:

$\displaystyle AB = 2x - 6x^2 + 14x^3 - 26x^4 + 44x^5 + \cdots.$

Notice that each coefficient in the product AB only depends on a finite number of coefficients of A and B. For example, the x5 term is given by

$\displaystyle 44x^5 = (1\times 6x^5) + (5x^2 \times 4x^3) + (9x^4 \times 2x).\,\!$

For this reason, one may multiply formal power series without worrying about the usual questions of absolute, conditional and uniform convergence which arise in dealing with power series in the setting of analysis.

Similarly, many other operations that are carried out on polynomials can be extended to the formal power series setting, as explained below.

## Formal development

### Two definitions of the formal power series ring

We start with a commutative ring R. We want to define the ring of formal power series over R in the variable X, denoted by R[[X]]; elements of this ring should be thought of as power series whose coefficients are elements of R.

Perhaps the most efficient definition of R[[X]] is as the completion of the polynomial ring R[X] with respect to the I-adic topology determined by the ideal I of R[X] generated by X. This results in a complete topological ring containing R[X] as a dense subspace. This method determines the ring structure and topological structure simultaneously.

However, it is possible to describe R[[X]] more explicitly and with less algebraic machinery, giving the ring structure and topological structure separately, as follows.

#### Ring structure

We begin with the set RN of all infinite sequences in R. We define addition of two such sequences by

$\displaystyle \left( a_n \right) + \left( b_n \right) = \left( a_n + b_n \right)$

and multiplication by

$\displaystyle \left( a_n \right) \times \left( b_n \right) = \left( \sum_{k=0}^n a_k b_{n-k} \right).$

This type of product is called the Cauchy product of the two sequences of coefficients, and is a sort of discrete convolution. With these operations, RN becomes a commutative ring with zero element (0, 0, 0, ...) and multiplicative identity (1, 0, 0,...).

If we identify the element a of R with the sequence (a, 0, 0, ...) and define X := (0, 1, 0, 0, ...), then using the above definitions of addition and multiplication, we find that every sequence with only finitely many nonzero terms can be written as the finite sum

$\displaystyle (a_0, a_1, a_2, \ldots, a_N, 0, 0, \ldots) = a_0 + a_1 X + \cdots + a_N X^N = \sum_{n=0}^N a_n X^n.$

#### Topological structure

We would like to extend the above formula to a similar one for arbitrary sequences in RN, that is, we would like

$\displaystyle (a_0, a_1, a_2, a_3, \ldots) = \sum_{n=0}^\infty a_n X^n \qquad (1)$

to hold. However, for the infinite sum on the right to make sense, we need a notion of convergence in RN, which involves introducing a topology on RN. There are several equivalent ways to define the appropriate topology.

$\displaystyle d((a_n), (b_n)) = 2^{-k},\,\!$
where k is the smallest natural number such that akbk; if there is no such k, then the two sequences are identical, so we set their distance to be zero.
• We may give RN the I-adic topology, where I = (X) is the ideal generated by X, which consists of all sequences whose first term a0 is zero.

All of these definitions of the topology amount to declaring that two sequences (an) and (bn) are "close" if their first few terms agree; the more terms agree, the closer they are.

Now we can make sense of equation (1); the partial sums of the infinite sum certainly converge to the sequence on the left hand side. In fact, any rearrangement of the series converges to the same limit.

One must check that this topological structure, together with the ring operations described above, form a topological ring. This is called the ring of formal power series over R and is denoted by R[[X]].

### Universal property

The ring R[[X]] may be characterized by the following universal property. If S is a commutative associative algebra over R, if I is an ideal of S such that the I-adic topology on S is complete, and if x is an element of I, then there is a unique Φ : R[[X]] → S with the following properties:

• Φ is an R-algebra homomorphism
• Φ is continuous
• Φ(X) = x.

### Operations on formal power series

#### Inverting series

The series

$\displaystyle \sum_{n=0}^\infty a_n X^n$

in R[[X]] is invertible in R[[X]] if and only if its constant coefficient a0 is invertible in R. An important special case is that the geometric series formula is valid in R[[X]]:

$\displaystyle \left( 1 - X \right)^{-1} = \sum_{n \ge 0} X^n.$

#### Composition of series

Given formal power series

$\displaystyle f(X) = \sum_{n=1}^\infty a_n X^n = a_1 X + a_2 X^2 + \cdots$

and

$\displaystyle g(X) = \sum_{n=0}^\infty b_n X^n = b_0 + b_1 X + b_2 X^2 + \cdots,$

one may form the composition

$\displaystyle g(f(X)) = \sum_{n=0}^\infty b_n f(X)^n = \sum_{n=0}^\infty c_n X^n,$

where the coefficients cn are determined by "expanding out" the powers of f(X). A more explicit description of these coefficients is provided by Faà di Bruno's formula.

The critical point here is that this operation is only valid when f(X) has no constant term, so that the series for g(f(X)) converges in the topology of R[[X]]. In other words, each cn depends on only a finite number of coefficients of f(X) and g(X).

##### Example

If we denote by exp(X) the formal power series

$\displaystyle \exp(X) = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots,$

then the expression

$\displaystyle \exp(\exp(X) - 1) = 1 + x + x^2 + \frac{5x^3}6 + \frac{5x^4}8 + \cdots$

makes perfect sense as a formal power series. However, the statement

$\displaystyle \exp(\exp(X)) = e \exp(\exp(X) - 1) = e + ex + ex^2 + \frac{5ex^3}6 + \cdots$

is not a valid application of the composition operation for formal power series. Rather, it is confusing the notions of convergence in R[[X]] and convergence in R; indeed, the ring R may not even contain any number e with the appropriate properties.

#### Formal differentiation of series

Given a formal power series

$\displaystyle f = \sum_{n\geq 0} a_n X^n$

in R[[X]], we define its formal derivative, denoted Df, by

$\displaystyle Df = \sum_{n \geq 1} a_n n X^{n-1}.$

The symbol D is called the formal differentiation operator. The motivation behind this definition is that it simply mimics term-by-term differentiation of a polynomial.

This operation is R-linear:

$\displaystyle D(af + bg) = a Df + b Dg \,\!$

for any a, b in R and any f, g in R[[X]]. Additionally, the formal derivative has many of the properties of the usual derivative of calculus. For example, the product rule is valid:

$\displaystyle D(fg) = f(Dg) + (Df) g; \,\!$

and the chain rule works as well:

$\displaystyle D(f(u)) = (Df)(u) Du, \,\!$

whenever the appropriate compositions of series are defined (see above under composition of series).

In a sense, all formal power series are Taylor series. Indeed, for the f defined above, we find that

$\displaystyle (D^k f)(0) = k! a_k, \,\!$

where Dk denotes the kth formal derivative (that is, the result of formally differentiating k times).

### Algebraic properties of the formal power series ring

R[[X]] is an associative algebra over R which contains the ring R[X] of polynomials over R; the polynomials correspond to the sequences which end in zeros.

The Jacobson radical of R[[X]] is the ideal generated by X and the Jacobson radical of R; this is implied by the element invertibility criterion discussed above.

The maximal ideals of R[[X]] all arise from those in R in the following manner: an ideal M of R[[X]] is maximal if and only if MR is a maximal ideal of R and M is generated as an ideal by X and MR.

Several algebraic properties of R are inherited by R[[X]]:

If R = K is a field, then K[[X]] has several additional properties.

### Topological properties of the formal power series ring

The metric space (R[[X]], d) is complete.

The ring R[[X]] is compact if and only if R is finite. This follows from Tychonoff's theorem and the characterisation of the topology on R[[X]] as a product topology.

## Applications

Formal power series can be used to solve recurrences occurring in number theory and combinatorics. For an example involving finding a closed form expression for the Fibonacci numbers, see the article on generating functions.

One can use formal power series to prove several relations familiar from analysis in a purely algebraic setting. Consider for instance the following elements of Q[[X]]:

$\displaystyle \sin(X) := \sum_{n \ge 0} \frac{(-1)^n} {(2n+1)!} X^{2n+1}$
$\displaystyle \cos(X) := \sum_{n \ge 0} \frac{(-1)^n} {(2n)!} X^{2n}$

Then one can show that

$\displaystyle \sin^2 + \cos^2 = 1$

and

$\displaystyle D \sin = \cos$

as well as

$\displaystyle \sin (X+Y) = \sin(X) \cos(Y) + \cos(X) \sin(Y)$

(the latter being valid in the ring Q[[X,Y]]).

In algebra, the ring K[[X1, ..., Xr]] (where K is a field) is often used as the "standard, most general" complete local ring over K.

## Interpreting formal power series as functions

In mathematical analysis, every convergent power series defines a function with values in the real or complex numbers. Formal power series can also be interpreted as functions, but one has to be careful with the domain and codomain. If f=∑an Xn is an element of R[[X]], S is a commutative associative algebra over R, I is an ideal in S such that the I-adic topology on S is complete, and x is an element of I, then we can define

$\displaystyle f(x) = \sum_{n\ge 0} a_n x^n$

This latter series is guaranteed to converge in S given the above assumptions on x. Furthermore, we have

$\displaystyle (f+g)(x) = f(x) + g(x)$

and

$\displaystyle (fg)(x) = f(x) g(x)$

Unlike in the case of bona fide functions, these formulas are not definitions but have to be proved.

Since the topology on R[[X]] is the (X)-adic topology and R[[X]] is complete, we can in particular apply power series to other power series, provided that the arguments don't have constant coefficients: f(0), f(X2-X) and f( (1-X)-1 - 1) are all well defined for any formal power series fR[[X]].

With this formalism, we can give an explicit formula for the multiplicative inverse of a power series f whose constant coefficient a=f(0) is invertible in R:

$\displaystyle f^{-1} = \sum_{n \ge 0} a^{-n-1} (a-f)^n$

If the formal power series g with g(0) = 0 is given implicitly by the equation

$\displaystyle f(g) = X$

where f is a known power series with f(0) = 0, then the coefficients of g can be explicitly computed using the Lagrange inversion theorem.

## Generalizations

### Formal Laurent series

If R = K is a field, then K[[X]] is an integral domain, so we may consider its quotient field. This is called the ring of formal Laurent series, and is denoted by K((X)). It is a topological field, and its relationship to formal power series is analogous to the relationship between power series and Laurent series. Its elements are of the form

$\displaystyle f = \sum_{n \ge -M} a_n X^n$

where M is an integer which depends on f.

One may define formal differentiation for formal Laurent series in a natural way (term-by-term). In addition to the rules listed above under formal differentiation of series, the quotient rule will also be valid.

### Power series in several variables

It is relatively straightforward to extend the above ideas to define a formal power series ring over R in r variables, denoted R[[X1,...,Xr]]. Elements of this ring may be expressed uniquely in the form

$\displaystyle \sum_{\mathbf{n}\in\Bbb{N}^r} a_\mathbf{n} \mathbf{X^n}$

where now n = (n1,...,nr) ∈ Nr, and Xn denotes the monomial X1n1...Xrnr. This sum converges for any choice of the coefficients anR, and the order of summation is immaterial.

#### Definition

One possible definition of R[[X1,...,Xr]] is to take the completion of the polynomial ring R[X1,...,Xr] in r variables with respect to the I-adic topology, where I is the ideal of R[X1,...,Xr] generated by X1,...,Xr. That is, I is the ideal consisting of polynomials with zero constant term.

Alternatively, one may proceed in a similar way to the more explicit discussion given above for the single-variable case, giving the ring structure first in terms of "multi-dimensional" sequences, and then defining the topology.

The topology on R[[X1,...,Xr]] is the J-adic topology, where J is the ideal of R[[X1,...,Xr]] generated by X1,...,Xr. That is, J is the ideal consisting of series with zero constant term. Therefore, two series are considered "close" if their first few terms agree, where "first few" means terms whose total degree n1 + ... + nr is small.

#### Warning

Although R[[X1, X2]] and R[[X1]][[X2]] are isomorphic as rings, they do not carry the same topology. For example, the sequence of elements

$\displaystyle f_n = X_1^n, \quad n \geq 1$

converges to zero in R[[X1, X2]] as n → ∞; however, in the ring R[[X1]][[X2]], it does not converge, since the copy of R[[X1]] embedded in R[[X1]][[X2]] has been given the discrete topology.

#### Operations

All of the operations defined for series in one variable may be extended to the several variables case.

• Addition is carried out term-by-term.
• Multiplication is carried out simply by "multiplying out" the series.
• A series is invertible if and only if its constant term is invertible in R.
• The composition f(g(X)) of two series f and g is defined only if the constant term of g is zero.

In the case of the formal derivative, there are now r different partial derivative operators, which differentiate with respect to each of the r variables. They all commute with each other, as they do for continuously differentiable functions.

#### Universal property

In the several variables case, the universal property characterizing R[[X1, ..., Xr]] becomes the following. If S is a commutative associative algebra over R, if I is an ideal of S such that the I-adic topology on S is complete, and if x1, ..., xr are elements of I, then there is a unique Φ : R[[X1, ..., Xn]] → S with the following properties:

• Φ is an R-algebra homomorphism
• Φ is continuous
• Φ(Xi) = xi for i = 1, ..., r.

### Replacing the index set by an ordered abelian group

Suppose G is an ordered abelian group, meaning an abelian group with a total ordering "<" respecting the group's addition, so that a < b iff a + c < b + c for all c. Let I be a well-ordered subset of G, meaning I contains no infinite descending chain. Consider the set consisting of

$\displaystyle \sum_{i \in I} a_i X^i$

for all such I, with ai in a commutative ring R, where we assume that for any index set, if all of the ai are zero then the sum is zero. Then R((G)) is the ring of formal power series on G; because of the condition that the indexing set be well-ordered the product is well-defined, and we of course assume that two elements which differ by zero are the same.

Various properties of R transfer to R((G)). If R is a field, then so is R((G)). If R is an ordered field, we can order R((G)) by setting any element to have the same sign as its leading coefficient, defined as the least element of the index set I associated to a non-zero coefficient. Finally if G is a divisible group and R is a real closed field, then R((G)) is a real closed field, and if R is algebraically closed, then so is R((G)).

This theory is due to Hans Hahn, who also showed that one obtains subfields when the number of (non-zero) terms is bounded by some fixed infinite cardinality.