Blog Post
The Spectral Theorem
Symmetric matrices can always be diagonalized by an orthogonal matrix — their eigenvectors form a natural coordinate system for the data. This is the spectral theorem, and it underlies PCA, kernel methods, and graph Laplacians.
Views: –4 min readCite
In Post 5 we found that symmetric matrices have a special property: their eigenvalues are always real and their eigenvectors are always orthogonal. This isn't a coincidence — it's the consequence of one of the deepest results in linear algebra.
The Spectral Theorem says that every real symmetric matrix can be decomposed as:
where is an orthogonal matrix (its columns are the eigenvectors) and is a diagonal matrix of eigenvalues.
This factorization is called the eigendecomposition or spectral decomposition, and it unlocks a clean geometric story.
The Three-Step Geometry
Think of multiplying a vector by :
This happens in three steps:
- — rotate into the eigenbasis (the coordinate system defined by eigenvectors)
- — scale each coordinate independently by the corresponding eigenvalue
- — rotate back to the original frame
The eigenvectors are the natural axes of the transformation. In that coordinate system, is just a scaling operation — diagonal. No rotation, no mixing.
Spectral Decomposition: A = QΛQᵀ
Formal Statement
Theorem (Spectral Theorem for real symmetric matrices). Let with . Then there exists an orthogonal matrix and a diagonal matrix such that:
The columns of are orthonormal eigenvectors of , and is the eigenvalue for .
Since is orthogonal, , so this is equivalent to:
That is, diagonalizes .
The Eigenbasis
Any vector can be expressed in terms of the eigenvectors:
The coordinates are just dot products (projections), because the eigenvectors are orthonormal.
Then:
Each eigenvector direction is scaled independently. This is the key insight: in the eigenbasis, is trivially diagonal.
Eigenbasis — Drag the vector
Matrix Powers Made Easy
One of the most practical consequences is that repeated matrix multiplication becomes trivial.
From the spectral decomposition:
because . By induction:
And is just a diagonal matrix with entries — raising each eigenvalue to the -th power.
This means: the long-term behavior of is dominated by the largest eigenvalue. The eigenvector with largest grows fastest; directions with shrink to zero.
Matrix Powers: Aⁿ = QΛⁿQᵀ (λ₁=2, λ₂=0.5)
Applications
PCA (Principal Component Analysis)
Given a data matrix , the covariance matrix is:
is symmetric positive semidefinite. The spectral theorem gives . The columns of are the principal components — the natural axes of variance. The eigenvalues are the variance along each axis.
Graph Laplacians
Given an undirected graph with adjacency matrix , the graph Laplacian is where is the degree matrix. is symmetric positive semidefinite.
Its eigenvectors are the graph Fourier basis — the natural frequencies of signals on the graph. The smallest non-zero eigenvalue (the Fiedler value) measures how well-connected the graph is.
Kernel Matrices
In kernel methods (like SVMs), the kernel matrix is symmetric positive semidefinite. Its spectral decomposition gives the feature space implicitly defined by the kernel.
Positive Definite Matrices
A symmetric matrix is positive definite (PD) if all its eigenvalues are strictly positive:
Equivalently, for all . Covariance matrices are positive semidefinite (); kernel matrices are typically positive definite. We'll explore these in detail in Post 11.
Summary
The spectral theorem is a complete characterization of symmetric matrices:
| Property | Result |
|---|---|
| Eigenvalues | All real |
| Eigenvectors | Orthogonal (orthonormal if normalized) |
| Decomposition | |
| Inverse | (if ) |
| Matrix power |
Every symmetric matrix is, in its own coordinate system, just a scaling operation. The spectral theorem tells us what that coordinate system is.
In Post 11 we'll focus on the positive definite case and see why it's the natural setting for optimization problems.