In linear algebra, the Cayley–Hamilton theorem (named after the mathematicians Arthur Cayley and William Rowan Hamilton) states that every square matrix over a commutative ring (such as the real or complex field) satisfies its own characteristic equation.
If A is a given n×n matrix and $I_n$ is the n×n identity matrix, then the characteristic polynomial of A is defined as
$$p(\lambda) = \det(\lambda I_n - A),$$
where det is the determinant operation and λ is a scalar element of the base ring. Since the entries of the matrix $\lambda I_n - A$ are (linear or constant) polynomials in λ, the determinant is also a monic polynomial of degree n in λ. The Cayley–Hamilton theorem states that substituting the matrix A for λ in this polynomial results in the zero matrix,
$$p(A) = 0.$$
The powers of A, obtained by substitution from powers of λ, are defined by repeated matrix multiplication; the constant term of p(λ) gives a multiple of the power $A^0$, which is defined as the identity matrix. The theorem allows $A^n$ to be expressed as a linear combination of the lower matrix powers of A. When the ring is a field, the Cayley–Hamilton theorem is equivalent to the statement that the minimal polynomial of a square matrix divides its characteristic polynomial.
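The reduction of higher powers to lower ones can be sketched for 2×2 matrices, where the theorem gives $A^2 = \operatorname{tr}(A)\,A - \det(A)\,I$, so every power $A^k$ collapses to the form $\alpha A + \beta I$. This is a minimal sketch, not a definitive implementation; the helper names `mat_mul` and `power_as_combination` and the sample matrix are assumptions introduced here for illustration.

```python
def mat_mul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def power_as_combination(A, k):
    """Return (alpha, beta) with A**k == alpha*A + beta*I for a 2x2 matrix A.

    Relies on A**2 = t*A - d*I (Cayley-Hamilton), with t = tr(A), d = det(A).
    """
    t = A[0][0] + A[1][1]
    d = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    alpha, beta = 0, 1  # A**0 = 0*A + 1*I
    for _ in range(k):
        # (alpha*A + beta*I) * A = alpha*A**2 + beta*A
        #                        = (alpha*t + beta)*A - alpha*d*I
        alpha, beta = alpha * t + beta, -alpha * d
    return alpha, beta


A = [[2, 1], [1, 3]]  # illustrative matrix (an assumption, not from the text)
print(power_as_combination(A, 5))  # (275, -375)
```

The loop never forms a matrix product, so the cost of finding the two coefficients is linear in k regardless of the entries of A.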
The theorem was first proved in 1853 in terms of inverses of linear functions of quaternions, a non-commutative ring, by Hamilton. This corresponds to the special case of certain 4 × 4 real or 2 × 2 complex matrices. The theorem holds for general quaternionic matrices.[nb 1] Cayley in 1858 stated it for 3 × 3 and smaller matrices, but only published a proof for the 2 × 2 case. The general case was first proved by Frobenius in 1878.
For a 1×1 matrix $A = (a_{1,1})$, the characteristic polynomial is given by $p(\lambda) = \lambda - a_{1,1}$, and so $p(A) = (a_{1,1}) - a_{1,1} I_1 = (0)$ holds trivially.
As a concrete example, let
Its characteristic polynomial is given by
The Cayley–Hamilton theorem claims that, if we define
We can verify by computation that indeed,
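A verification of this kind can be sketched in a few lines. The matrix below is an assumed illustrative choice, not necessarily the one used in the example above, and the helper names are hypothetical. For a 2×2 matrix the characteristic polynomial is $p(\lambda) = \lambda^2 - \operatorname{tr}(A)\lambda + \det(A)$, so the sketch evaluates $A^2 - \operatorname{tr}(A)A + \det(A)I_2$ entry by entry.

```python
def mat_mul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def evaluate_char_poly_2x2(A):
    """Return p(A) = A^2 - tr(A)*A + det(A)*I for a 2x2 matrix A."""
    t = A[0][0] + A[1][1]
    d = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    A2 = mat_mul(A, A)
    return [[A2[i][j] - t * A[i][j] + (d if i == j else 0) for j in range(2)]
            for i in range(2)]


A = [[1, 2], [3, 4]]  # assumed example: p(x) = x^2 - 5x - 2
print(evaluate_char_poly_2x2(A))  # [[0, 0], [0, 0]]
```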
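For a generic 2×2 matrix,
$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix},$$
the characteristic polynomial is given by $p(\lambda) = \lambda^2 - (a + d)\lambda + (ad - bc)$, so the Cayley–Hamilton theorem states that
$$p(A) = A^2 - (a + d)A + (ad - bc)I_2 = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix},$$
which is indeed always the case, as is evident from working out the entries of $A^2$.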
Determinant and inverse matrix
For a general n×n invertible matrix A, i.e., one with nonzero determinant, $A^{-1}$ can thus be written as an (n − 1)-th order polynomial expression in A. As indicated, the Cayley–Hamilton theorem amounts to the identity
$$p(A) = A^n + c_{n-1}A^{n-1} + \cdots + c_1 A + (-1)^n \det(A)\, I_n = 0.$$
The coefficients $c_i$ are given, up to sign, by the elementary symmetric polynomials of the eigenvalues of A. Using Newton's identities, the elementary symmetric polynomials can in turn be expressed in terms of power sum symmetric polynomials of the eigenvalues:
$$s_k = \sum_{i=1}^{n} \lambda_i^k = \operatorname{tr}(A^k),$$
where $\operatorname{tr}(A^k)$ is the trace of the matrix $A^k$. Thus, we can express $c_i$ in terms of the traces of powers of A.
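The passage from traces to coefficients can be sketched with Newton's identities written as a recurrence, $c_{n-k} = -\tfrac{1}{k}\bigl(\operatorname{tr}(A^k) + \sum_{i=1}^{k-1} c_{n-i}\operatorname{tr}(A^{k-i})\bigr)$ with $c_n = 1$. This is a minimal sketch, assuming exact rational arithmetic via `fractions.Fraction` and the convention $p(\lambda) = \det(\lambda I_n - A)$; the function names are hypothetical.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(X):
    """Sum of the diagonal entries."""
    return sum(X[i][i] for i in range(len(X)))


def char_poly_coeffs(A):
    """Return [c_0, ..., c_n] of det(x*I - A), computed from traces of powers."""
    n = len(A)
    s = [None]  # s[k] = tr(A^k); index 0 is unused
    P = [row[:] for row in A]
    for _ in range(n):
        s.append(trace(P))
        P = mat_mul(P, A)
    c = [Fraction(0)] * (n + 1)
    c[n] = Fraction(1)  # the polynomial is monic
    for k in range(1, n + 1):
        acc = s[k] + sum(c[n - i] * s[k - i] for i in range(1, k))
        c[n - k] = -Fraction(acc) / k
    return c


# Diagonal matrix with eigenvalues 2, 3, 5: (x-2)(x-3)(x-5) = x^3 - 10x^2 + 31x - 30
print([int(v) for v in char_poly_coeffs([[2, 0, 0], [0, 3, 0], [0, 0, 5]])])
# [-30, 31, -10, 1]
```

The division by k is why this approach needs a ring in which the integers 1, …, n are invertible.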
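In general, the formula for the coefficients $c_i$ is given in terms of complete exponential Bell polynomials as [nb 2]
$$c_{n-k} = \frac{(-1)^k}{k!}\, B_k\bigl(\operatorname{tr}A,\; -1!\,\operatorname{tr}A^2,\; 2!\,\operatorname{tr}A^3,\; \ldots,\; (-1)^{k-1}(k-1)!\,\operatorname{tr}A^k\bigr).$$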
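In particular, the determinant of A equals $(-1)^n c_0$. Thus, the determinant can be written as a trace identity
$$\det(A) = \frac{1}{n!}\, B_n\bigl(\operatorname{tr}A,\; -1!\,\operatorname{tr}A^2,\; 2!\,\operatorname{tr}A^3,\; \ldots,\; (-1)^{n-1}(n-1)!\,\operatorname{tr}A^n\bigr).$$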
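Likewise, the characteristic polynomial can be written as
$$-(-1)^n \det(A)\, I_n = A\bigl(A^{n-1} + c_{n-1}A^{n-2} + \cdots + c_1 I_n\bigr),$$
and, by multiplying both sides by $A^{-1}$ (note $-(-1)^n = (-1)^{n-1}$), one is led to an expression for the inverse of A as a trace identity,
$$A^{-1} = \frac{(-1)^{n-1}}{\det A}\bigl(A^{n-1} + c_{n-1}A^{n-2} + \cdots + c_1 I_n\bigr) = \frac{1}{\det A}\sum_{k=0}^{n-1} (-1)^{n+k-1}\, \frac{A^{n-k-1}}{k!}\, B_k\bigl(\operatorname{tr}A,\; -1!\,\operatorname{tr}A^2,\; \ldots,\; (-1)^{k-1}(k-1)!\,\operatorname{tr}A^k\bigr).$$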
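The inverse formula can be sketched directly from the characteristic polynomial: since $p(A) = 0$ and $c_0 = (-1)^n\det(A) \neq 0$, one has $A^{-1} = -\tfrac{1}{c_0}\bigl(A^{n-1} + c_{n-1}A^{n-2} + \cdots + c_1 I_n\bigr)$. The sketch below assumes the caller supplies the coefficients as a list `[c_0, ..., c_n]`; the function name `inverse_from_char_poly` and the sample matrix are hypothetical.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inverse_from_char_poly(A, c):
    """Invert A given c = [c_0, ..., c_n], the coefficients of det(x*I - A)."""
    n = len(A)
    if c[0] == 0:
        raise ValueError("c_0 = 0 means det(A) = 0, so A is not invertible")
    # Horner scheme: B ends up as A^(n-1) + c_(n-1) A^(n-2) + ... + c_1 I.
    B = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for i in range(n - 1, 0, -1):
        B = mat_mul(A, B)
        for j in range(n):
            B[j][j] += c[i]
    return [[-entry / c[0] for entry in row] for row in B]


A = [[1, 2], [3, 4]]  # illustrative matrix with p(x) = x^2 - 5x - 2
inv = inverse_from_char_poly(A, [-2, -5, 1])
print(mat_mul(A, inv) == [[1, 0], [0, 1]])  # True
```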
For instance, the first few Bell polynomials are $B_0 = 1$, $B_1(x_1) = x_1$, $B_2(x_1, x_2) = x_1^2 + x_2$, and $B_3(x_1, x_2, x_3) = x_1^3 + 3x_1x_2 + x_3$.
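Using these to specify the coefficients $c_i$ of the characteristic polynomial of a 2×2 matrix yields
$$c_2 = B_0 = 1, \qquad c_1 = \frac{-1}{1!} B_1(\operatorname{tr}A) = -\operatorname{tr}A, \qquad c_0 = \frac{1}{2!} B_2(\operatorname{tr}A,\, -1!\,\operatorname{tr}A^2) = \tfrac{1}{2}\bigl((\operatorname{tr}A)^2 - \operatorname{tr}(A^2)\bigr).$$
The coefficient $c_0$ gives the determinant of the 2×2 matrix and $c_1$ is minus its trace, while its inverse is given by
$$A^{-1} = \frac{-1}{\det A}\bigl(A + c_1 I_2\bigr) = \frac{-2\bigl(A - (\operatorname{tr}A)\, I_2\bigr)}{(\operatorname{tr}A)^2 - \operatorname{tr}(A^2)}.$$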
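It is apparent from the general formula for $c_{n-k}$, expressed in terms of Bell polynomials, that this expression, $\tfrac{1}{2}\bigl((\operatorname{tr}A)^2 - \operatorname{tr}(A^2)\bigr)$, always gives the coefficient $c_{n-2}$ of $\lambda^{n-2}$ in the characteristic polynomial of any n×n matrix; so, for a 3×3 matrix A, the statement of the Cayley–Hamilton theorem can also be written as
$$A^3 - (\operatorname{tr}A)A^2 + \tfrac{1}{2}\bigl((\operatorname{tr}A)^2 - \operatorname{tr}(A^2)\bigr)A - \det(A)\, I_3 = O,$$
where the right-hand side designates a 3×3 matrix with all entries equal to zero. Likewise, the determinant in the n = 3 case is now
$$\det(A) = \frac{1}{3!} B_3(s_1,\, -1!\,s_2,\, 2!\,s_3) = \tfrac{1}{6}\bigl((\operatorname{tr}A)^3 - 3\operatorname{tr}(A^2)\operatorname{tr}(A) + 2\operatorname{tr}(A^3)\bigr), \qquad s_k = \operatorname{tr}(A^k).$$
This expression gives the negative of the coefficient $c_{n-3}$ of $\lambda^{n-3}$ in the general case, as seen below.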
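The n = 3 trace identity, $\det(A) = \tfrac{1}{6}\bigl((\operatorname{tr}A)^3 - 3\operatorname{tr}(A^2)\operatorname{tr}(A) + 2\operatorname{tr}(A^3)\bigr)$, lends itself to a quick numerical spot check. The following is a sketch with hypothetical helper names and an arbitrarily chosen sample matrix.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(X):
    """Sum of the diagonal entries."""
    return sum(X[i][i] for i in range(len(X)))


def det3_from_traces(A):
    """Determinant of a 3x3 matrix from the traces of A, A^2 and A^3."""
    A2 = mat_mul(A, A)
    A3 = mat_mul(A2, A)
    t1, t2, t3 = trace(A), trace(A2), trace(A3)
    return Fraction(t1 ** 3 - 3 * t1 * t2 + 2 * t3) / 6


A = [[2, 0, 1], [1, 3, 0], [0, 1, 4]]  # sample matrix (an assumption)
print(det3_from_traces(A))  # 25, matching the cofactor expansion 2*12 + 1*1
```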
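Similarly, one can write for a 4×4 matrix A,
$$A^4 - (\operatorname{tr}A)A^3 + \tfrac{1}{2}\bigl((\operatorname{tr}A)^2 - \operatorname{tr}(A^2)\bigr)A^2 - \tfrac{1}{6}\bigl((\operatorname{tr}A)^3 - 3\operatorname{tr}(A^2)\operatorname{tr}(A) + 2\operatorname{tr}(A^3)\bigr)A + \det(A)\, I_4 = O,$$
where, now, the determinant is $c_{n-4}$,
$$\det(A) = \tfrac{1}{24}\bigl((\operatorname{tr}A)^4 - 6\operatorname{tr}(A^2)(\operatorname{tr}A)^2 + 3(\operatorname{tr}(A^2))^2 + 8\operatorname{tr}(A^3)\operatorname{tr}(A) - 6\operatorname{tr}(A^4)\bigr),$$
and so on for larger matrices. The increasingly complex expressions for the coefficients $c_k$ are deducible from Newton's identities or the Faddeev–LeVerrier algorithm.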
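The Faddeev–LeVerrier algorithm can be sketched as follows. Starting from $M_1 = I_n$, it sets $c_{n-k} = -\tfrac{1}{k}\operatorname{tr}(A M_k)$ and $M_{k+1} = A M_k + c_{n-k} I_n$; the last matrix produced equals $p(A)$, so it vanishes by the Cayley–Hamilton theorem. This is a minimal sketch in exact rational arithmetic with hypothetical function names, not a tuned implementation.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def faddeev_leverrier(A):
    """Return (c, residual): c = [c_0, ..., c_n] of det(x*I - A), residual = p(A)."""
    n = len(A)
    c = [Fraction(0)] * (n + 1)
    c[n] = Fraction(1)
    M = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]  # M_1 = I
    for k in range(1, n + 1):
        AM = mat_mul(A, M)
        c[n - k] = -sum(AM[i][i] for i in range(n)) / k
        # M_(k+1) = A*M_k + c_(n-k)*I; after the final step this is p(A).
        M = [[AM[i][j] + (c[n - k] if i == j else 0) for j in range(n)]
             for i in range(n)]
    return c, M


coeffs, residual = faddeev_leverrier([[1, 2], [3, 4]])
print([int(v) for v in coeffs])   # [-2, -5, 1]
print(residual == [[0, 0], [0, 0]])  # True
```

A side benefit of this scheme is that the matrix $M_n$ from the second-to-last step gives the inverse as $-M_n / c_0$ whenever $c_0 \neq 0$.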
Another method for obtaining these coefficients $c_k$ for a general n×n matrix, provided no root is zero, relies on the following alternative expression for the determinant,
$$p(\lambda) = \det(\lambda I_n - A) = \lambda^n \exp\bigl(\operatorname{tr}\log(I_n - A/\lambda)\bigr).$$
Hence, by virtue of the Mercator series,
$$p(\lambda) = \lambda^n \exp\Bigl(-\operatorname{tr}\sum_{m=1}^{\infty} \frac{(A/\lambda)^m}{m}\Bigr),$$
where the exponential only needs to be expanded to order $\lambda^{-n}$, since p(λ) is of order n, the net negative powers of λ automatically vanishing by the Cayley–Hamilton theorem. (Again, this requires a ring containing the rational numbers.) The coefficients of λ can be directly written in terms of complete Bell polynomials by comparing this expression with the generating function of the Bell polynomial.
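Differentiation of this expression with respect to λ allows determination of the generic coefficients of the characteristic polynomial for general n, as determinants of m×m matrices,[nb 3]
$$c_{n-m} = \frac{(-1)^m}{m!} \begin{vmatrix} \operatorname{tr}A & m-1 & 0 & \cdots & 0 \\ \operatorname{tr}A^2 & \operatorname{tr}A & m-2 & \cdots & 0 \\ \vdots & \vdots & & \ddots & \vdots \\ \operatorname{tr}A^{m-1} & \operatorname{tr}A^{m-2} & \cdots & \cdots & 1 \\ \operatorname{tr}A^m & \operatorname{tr}A^{m-1} & \cdots & \cdots & \operatorname{tr}A \end{vmatrix}.$$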
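The determinant form of the coefficients can be sketched as follows, assuming the m×m matrix has $\operatorname{tr}(A^{i-j+1})$ in row i, column j on and below the diagonal, the entries m − 1, m − 2, …, 1 on the superdiagonal, and zeros elsewhere, with prefactor $(-1)^m/m!$. The function names are hypothetical, and the Laplace expansion used for the determinant is chosen for brevity rather than efficiency.

```python
from fractions import Fraction
from math import factorial


def mat_mul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def det(M):
    """Determinant by Laplace expansion along the first row (small m only)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def coeff_from_trace_determinant(A, m):
    """Return c_(n-m) of det(x*I - A) for 1 <= m <= n via the m x m determinant."""
    s = {}  # s[k] = tr(A^k)
    P = [row[:] for row in A]
    for k in range(1, m + 1):
        s[k] = sum(P[i][i] for i in range(len(A)))
        P = mat_mul(P, A)
    T = [[0] * m for _ in range(m)]
    for i in range(m):
        for j in range(m):
            if j <= i:
                T[i][j] = s[i - j + 1]   # traces on and below the diagonal
            elif j == i + 1:
                T[i][j] = m - (i + 1)    # m-1, m-2, ..., 1 above it
    return Fraction((-1) ** m * det(T), factorial(m))


D = [[2, 0, 0], [0, 3, 0], [0, 0, 5]]  # p(x) = x^3 - 10x^2 + 31x - 30
print([int(coeff_from_trace_determinant(D, m)) for m in (1, 2, 3)])
# [-10, 31, -30]
```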
Hamilton proved that for a linear function of quaternions there exists a certain equation, depending on the linear function, that is satisfied by the linear function itself.
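In matrix terms, the identity reads $p(A) = 0$ with $p(\lambda) = \det(\lambda I - A)$, where $I$ is the identity matrix. Cayley verified this identity for $n = 2$ and 3 and postulated that it was true for all $n$. For $n = 2$ and $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, direct verification gives
$$A^2 - (a + d)A + (ad - bc)I = 0.$$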
The Cayley–Hamilton theorem states that an $n \times n$ matrix $A$ is annihilated by its characteristic polynomial, which is monic of degree $n$.