8.1 Power Method and Inverse Iteration

If eigenvalues describe the “long-term behavior” of a matrix, the power method is the simplest, most direct way to reveal that behavior. At its core, the algorithm is almost laughably straightforward:

Multiply a vector by a matrix repeatedly — and see where it goes.

But behind this simplicity lies one of the most important ideas in numerical linear algebra: dominance. The components of a vector aligned with the direction of the largest eigenvalue grow faster than the others. Over many iterations, those dominant components overwhelm everything else.

The power method captures this phenomenon and extracts the largest eigenvalue and its eigenvector with surprising elegance — provided the matrix behaves nicely.


Why this works: dominance and repeated multiplication

Suppose a matrix A has eigenvalues:

|λ₁| > |λ₂| ≥ |λ₃| ≥ ...

and an initial vector x₀ that has a nonzero component along the eigenvector for λ₁. (Almost any random vector will do.)

Then:

xₖ = Aᵏ x₀

can be expressed as a combination of eigenvectors scaled by powers of eigenvalues. Writing x₀ = c₁v₁ + c₂v₂ + ... in the eigenvector basis (assuming A has a full set of eigenvectors),

xₖ = c₁λ₁ᵏ v₁ + c₂λ₂ᵏ v₂ + ... = λ₁ᵏ ( c₁v₁ + c₂(λ₂/λ₁)ᵏ v₂ + ... )

Because λ₁ has the largest magnitude, every ratio |λᵢ/λ₁| with i ≥ 2 is less than one, so the trailing terms shrink toward zero as k grows:

Eventually, xₖ points in the direction of the eigenvector v₁ associated with λ₁.

What the power method does is straightforward:

  1. Pick a starting vector x₀.
  2. Compute xₖ₊₁ = A xₖ.
  3. Normalize to avoid overflow.
  4. Repeat until it stops changing.

Once the direction stabilizes, you can estimate the corresponding eigenvalue using a simple ratio:

λ ≈ (xₖᵀ A xₖ) / (xₖᵀ xₖ)

This estimate will come back later under a more elegant name: the Rayleigh quotient.
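
Here is a minimal NumPy sketch of the four steps above, including the ratio estimate at the end. The function name, tolerance, and the choice of a random starting vector are illustrative assumptions, not part of any standard API:

    import numpy as np

    def power_method(A, num_iters=1000, tol=1e-10, rng=None):
        # Estimate the dominant eigenvalue/eigenvector of A by power iteration.
        rng = np.random.default_rng() if rng is None else rng
        x = rng.standard_normal(A.shape[0])       # 1. pick a starting vector x0
        x /= np.linalg.norm(x)
        lam = 0.0
        for _ in range(num_iters):
            y = A @ x                             # 2. multiply by A
            x = y / np.linalg.norm(y)             # 3. normalize to avoid overflow
            lam, lam_old = x @ A @ x, lam         # ratio estimate (x has unit length)
            if abs(lam - lam_old) < tol:          # 4. repeat until it stops changing
                break
        return lam, x

    # Quick check: the dominant eigenvalue of this matrix is (5 + sqrt(5))/2 ≈ 3.618
    A = np.array([[2.0, 1.0], [1.0, 3.0]])
    print(power_method(A)[0])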


Where the power method shines

The power method is especially valuable for large, sparse problems:

  • Google’s original PageRank used a form of power iteration.
  • Spectral clustering relies on approximating leading eigenvectors.
  • PCA can use power iteration instead of a full SVD for high dimensions.
  • Large-scale graph algorithms often rely on variants of the method.

Because each iteration needs only a matrix-vector multiplication, it can handle gigantic matrices that are far too large for dense factorizations like QR or the SVD.


Its fatal limitation

The power method only finds one eigenvalue: the largest in magnitude. Worse, its convergence rate is governed by the ratio |λ₂|/|λ₁|. If your matrix has:

  • a cluster of eigenvalues close in magnitude, or
  • a dominant eigenvalue that is not the one you care about

…then the method slows dramatically or converges to the wrong one.

This brings us to a natural question:

How do we find other eigenvalues?


The inverse power method: flipping dominance

Imagine you want the eigenvalue of smallest magnitude instead of the largest. The trick is brilliant in its simplicity:

Instead of A, iterate with A⁻¹.

Why does this work? Because if λ is an eigenvalue of A (assuming A is invertible), then 1/λ is an eigenvalue of A⁻¹. That means:

  • the eigenvalue of A smallest in magnitude becomes
  • the eigenvalue of A⁻¹ largest in magnitude

and now the power method applies again.

Of course, we do not explicitly compute A⁻¹ — that would be numerical suicide. Instead, each iteration solves a linear system:

A yₖ = xₖ

and sets xₖ₊₁ = yₖ.

This shifts the computational difficulty from eigendecomposition to solving systems — something we now have a strong numerical toolbox for (LU, Cholesky, iterative solvers).
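
As a concrete illustration, here is a minimal sketch of inverse iteration in NumPy/SciPy. It factors A once with scipy.linalg.lu_factor and reuses that factorization to solve A yₖ = xₖ at every step; the function name and stopping rule are illustrative choices, not a prescribed interface:

    import numpy as np
    from scipy.linalg import lu_factor, lu_solve

    def inverse_iteration(A, num_iters=100, tol=1e-10, rng=None):
        # Estimate the eigenvalue of A that is smallest in magnitude.
        rng = np.random.default_rng() if rng is None else rng
        x = rng.standard_normal(A.shape[0])
        x /= np.linalg.norm(x)
        lu_piv = lu_factor(A)                 # factor once, reuse every iteration
        lam = 0.0
        for _ in range(num_iters):
            y = lu_solve(lu_piv, x)           # solve A y = x instead of forming A⁻¹
            x = y / np.linalg.norm(y)
            lam, lam_old = x @ A @ x, lam     # Rayleigh-quotient estimate
            if abs(lam - lam_old) < tol:
                break
        return lam, x

Factoring once and then only back-substituting keeps the cost of each iteration close to that of a plain matrix-vector product.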


Shifted inverse iteration: finding any eigenvalue

Sometimes you don’t want the largest or the smallest eigenvalue—you want one in the middle. In this case, we apply shifted inverse iteration:

(A − μI) yₖ = xₖ

where μ is a shift, your “guess” for the eigenvalue you want.

An eigenvalue λ of A close to μ corresponds to an eigenvalue 1/(λ − μ) of (A − μI)⁻¹ that is enormous in magnitude, so the iteration locks onto the eigenvector nearest the shift, and the closer the guess, the faster the convergence. This technique lies at the heart of many production eigenvalue solvers.
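
The change from plain inverse iteration is a single line: factor A − μI instead of A. A minimal sketch, reusing the same LU approach as above (the function name and parameters are again illustrative):

    import numpy as np
    from scipy.linalg import lu_factor, lu_solve

    def shifted_inverse_iteration(A, mu, num_iters=100, rng=None):
        # Converge to the eigenpair of A whose eigenvalue lies closest to the shift mu.
        rng = np.random.default_rng() if rng is None else rng
        n = A.shape[0]
        x = rng.standard_normal(n)
        x /= np.linalg.norm(x)
        lu_piv = lu_factor(A - mu * np.eye(n))    # factor the shifted matrix once
        for _ in range(num_iters):
            y = lu_solve(lu_piv, x)               # solve (A − μI) y = x
            x = y / np.linalg.norm(y)
        return x @ A @ x, x                       # Rayleigh quotient recovers the eigenvalue near mu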


Why the inverse methods matter in practice

Inverse iteration is the bridge between simple spectral intuition and industrial-strength solvers:

  • It exposes how eigenvalues behave near a chosen shift.
  • It makes it possible to extract the correct eigenvalue even when others dominate.
  • It connects eigenvalue computation to linear system solving — leveraging LU, Cholesky, and preconditioned iterative methods.

In fact, many high-level routines (e.g., scipy.linalg.eigh or LAPACK's dsyevr) use a blend of shifting, iterative improvement, and low-level transformations that are all rooted in this same conceptual framework.


Putting it all together

The power method teaches us how matrices behave under repeated application. Inverse iteration teaches us how to target specific regions of the spectrum. Together, they form the intuitive foundation for more elaborate methods we’ll see next.

But both of these techniques share a powerful idea: The best approximation of an eigenvalue comes from the behavior of A acting on a well-chosen vector.

This leads us naturally to one of the most elegant and useful quantities in numerical linear algebra:

The Rayleigh quotient.

2025-10-07

Shohei Shimoda

Here I have organized and written up what I have learned about this topic.