8.3 The QR Algorithm (High-Level Intuition)

The first time you encounter the QR algorithm in a numerical linear algebra course, it can feel strangely anticlimactic. On the surface, it looks almost too simple:

  1. Factor a matrix A into Q and R.
  2. Form a new matrix by multiplying RQ.
  3. Repeat.

Yet beneath this plain exterior lies one of the most influential algorithms in scientific computing. Nearly every dense eigenvalue routine in libraries like LAPACK, NumPy, MATLAB, and Julia depends on the QR algorithm or one of its relatives. When you call np.linalg.eig, this is (in spirit) the machinery turning behind the scenes.
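
For example, here is a minimal call in NumPy (the 2×2 matrix is only an illustration):

    import numpy as np

    # A small symmetric test matrix (illustrative only).
    A = np.array([[2.0, 1.0],
                  [1.0, 3.0]])

    # Internally, NumPy hands this to LAPACK, which reduces A to Hessenberg
    # form and runs shifted QR iterations, i.e. the procedure sketched below.
    eigenvalues, eigenvectors = np.linalg.eig(A)
    print(eigenvalues)   # approximately 1.382 and 3.618 (order may vary)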

So what makes the QR algorithm so powerful? And why does such a simple sequence of steps (factor, multiply, repeat) reveal eigenvalues with remarkable stability and speed?


The basic loop, in spirit

The QR algorithm begins with a matrix A₀ = A and builds a sequence:

Aₖ₊₁ = Rₖ Qₖ

where Aₖ = Qₖ Rₖ is the QR factorization of Aₖ.

With each iteration, the matrix changes but remains similar to the original matrix A:

Aₖ = Qₖ₋₁ᵀ ··· Q₁ᵀ Q₀ᵀ A Q₀ Q₁ ··· Qₖ₋₁

Since similar matrices have the same eigenvalues, the QR algorithm gently transforms A toward a form where the eigenvalues become visible—typically a triangular matrix where the diagonal entries converge to the actual eigenvalues.
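
To make the loop concrete, here is a minimal sketch in NumPy. It is the unshifted textbook version, not what production libraries do, and the 3×3 test matrix is only an illustration:

    import numpy as np

    def qr_algorithm_basic(A, num_iters=200):
        """Unshifted QR iteration: factor A_k = Q_k R_k, then form A_{k+1} = R_k Q_k."""
        Ak = np.array(A, dtype=float)
        for _ in range(num_iters):
            Q, R = np.linalg.qr(Ak)   # factor the current iterate
            Ak = R @ Q                # multiply back in reverse order (similar to A)
        return Ak                     # approaches triangular form

    A = np.array([[4.0, 1.0, 0.0],
                  [1.0, 3.0, 1.0],
                  [0.0, 1.0, 2.0]])

    print(np.diag(qr_algorithm_basic(A)))   # roughly [4.732, 3.0, 1.268]
    print(np.linalg.eigvals(A))             # reference eigenvalues for comparison

The diagonal of the final iterate matches the eigenvalues, while the off-diagonal entries have decayed toward zero.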

But this still doesn't answer the deeper question:

Why does this work?


Intuition 1: Orthogonal transformations preserve stability

One of the hardest challenges in numerical linear algebra is controlling the growth of roundoff error. But orthogonal matrices—like the Q in the QR factorization—are beautifully well-behaved:

  • They do not magnify vectors.
  • They preserve inner products and norms.
  • They are stable to compute in finite precision arithmetic.

This makes the QR algorithm effectively “error-proofed.” All the heavy lifting happens through orthogonal transformations, which keep numerical blowups under control.
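
A quick numerical check of these properties (just a sketch; the random matrix is illustrative):

    import numpy as np

    rng = np.random.default_rng(0)

    # Q from the QR factorization of a random matrix is orthogonal.
    Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))
    x = rng.standard_normal(5)

    # Orthogonal matrices preserve norms, so applying Q cannot amplify
    # whatever roundoff error x already carries.
    print(np.linalg.norm(Q @ x), np.linalg.norm(x))   # equal up to roundoff
    print(np.allclose(Q.T @ Q, np.eye(5)))            # True: Q^T Q = I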


Intuition 2: Repeated QR steps push the matrix toward triangular form

Triangular matrices are special: their eigenvalues are simply their diagonal elements.

The QR algorithm gradually suppresses the below-diagonal elements through repeated orthogonal transformations. Iteration after iteration, Aₖ moves closer to triangular form, converging to the Schur form: a nearly triangular matrix whose diagonal contains the eigenvalues.

This is analogous to how the power method gradually amplifies dominant directions, except now the entire matrix is being reshaped in a way that makes all eigenvalues appear.


Intuition 3: Each QR step is like performing a “global” power iteration

The power method focuses on a single eigenvector. The QR algorithm, by contrast, applies a transformation that simultaneously:

  • accumulates eigenvector (Schur vector) information in the product of the Qₖ,
  • pushes eigenvalues toward the diagonal of Aₖ.

Each iteration amplifies the direction of dominant eigenvectors, but in a matrix-wide fashion. Instead of tracking one direction, it tracks them all, restructuring the matrix accordingly.
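
One way to make this intuition concrete is orthogonal (simultaneous) iteration: run the power method on an entire orthonormal basis at once and re-orthonormalize at each step. In the unshifted case this is mathematically equivalent to the QR algorithm. A sketch, reusing the illustrative matrix from above:

    import numpy as np

    def orthogonal_iteration(A, num_iters=200):
        """Power iteration applied to a whole orthonormal basis at once."""
        Q = np.eye(A.shape[0])
        for _ in range(num_iters):
            Z = A @ Q                  # one power step on every basis vector
            Q, _ = np.linalg.qr(Z)     # re-orthonormalize so directions stay separated
        return Q

    A = np.array([[4.0, 1.0, 0.0],
                  [1.0, 3.0, 1.0],
                  [0.0, 1.0, 2.0]])
    Q = orthogonal_iteration(A)
    print(np.round(Q.T @ A @ Q, 6))    # nearly triangular: the same limit as the QR loop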


Intuition 4: Shifts make the algorithm dramatically faster

In practice, the basic unshifted QR algorithm is rarely used. Instead, we use shifted QR iterations:

Aₖ − μₖ I = Qₖ Rₖ

Aₖ₊₁ = Rₖ Qₖ + μₖ I

where μₖ is a shift chosen to accelerate convergence—often the bottom-right entry of Aₖ.
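
A single shifted step is easy to express in NumPy (a sketch only; real implementations also deflate converged rows and use more robust shift strategies such as the Wilkinson shift):

    import numpy as np

    def shifted_qr_step(Ak):
        """One shifted QR step using the bottom-right entry as the shift."""
        n = Ak.shape[0]
        mu = Ak[-1, -1]                        # simple shift choice
        Q, R = np.linalg.qr(Ak - mu * np.eye(n))
        return R @ Q + mu * np.eye(n)          # still similar to the original matrix

    A = np.array([[4.0, 1.0, 0.0],
                  [1.0, 3.0, 1.0],
                  [0.0, 1.0, 2.0]])
    Ak = A.copy()
    for _ in range(5):
        Ak = shifted_qr_step(Ak)
    print(np.round(Ak, 8))   # the last row/column deflates after only a few steps,
                             # and Ak[-1, -1] has converged to an eigenvalue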

Shifts:

  • speed up convergence from linear to quadratic or better,
  • allow the matrix to split into smaller blocks,
  • make eigenvalues appear rapidly and cleanly.

The result is so efficient that the QR algorithm can compute the full set of eigenvalues of a modest-sized matrix (a few hundred rows and columns) in milliseconds on modern hardware.


Intuition 5: Practical implementations use Hessenberg form

Before applying QR iterations, we usually reduce the matrix to Hessenberg form: a nearly triangular structure that is zero everywhere below the first subdiagonal.

This:

  • reduces computation from O(n³) per iteration to O(n²),
  • preserves eigenvalues,
  • makes QR drastically faster.

Thus, the modern QR algorithm is:

  1. Reduce A → Hessenberg form
  2. Perform shifted QR iterations on the Hessenberg matrix
  3. Extract eigenvalues from the diagonal

Nearly all numerical software follows this blueprint.
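
A sketch of the first step, assuming SciPy is available (scipy.linalg.hessenberg performs the reduction; the random matrix is illustrative):

    import numpy as np
    from scipy.linalg import hessenberg

    rng = np.random.default_rng(0)
    A = rng.standard_normal((5, 5))

    # One-time O(n^3) reduction to upper Hessenberg form by an orthogonal
    # similarity transformation, so the eigenvalues are unchanged.
    H, Q = hessenberg(A, calc_q=True)

    print(np.round(H, 3))                          # zeros below the first subdiagonal
    print(np.allclose(Q @ H @ Q.T, A))             # True: A = Q H Q^T (similarity)
    print(np.sort_complex(np.linalg.eigvals(H)))   # same spectrum as A...
    print(np.sort_complex(np.linalg.eigvals(A)))   # ...up to roundoff

The shifted QR iterations (as in the sketch earlier) then operate on H, where each step costs only O(n²).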


Why does the QR algorithm matter?

The QR algorithm is one of those rare tools that is both theoretically elegant and practically dominant. It:

  • works on any real or complex square matrix,
  • finds all eigenvalues, not just one,
  • is stable and well-understood,
  • forms the backbone of numerical linear algebra software.

Even today, with iterative solvers and specialized GPU methods, the QR algorithm remains the gold standard for dense matrices up to moderate size.


Wrapping up: where does QR fit into the bigger picture?

We have now covered three fundamental approaches to eigenvalue computation:

  • Power method — find the largest eigenvalue.
  • Inverse iteration — find a specific eigenvalue near a shift.
  • QR algorithm — compute all eigenvalues through stable orthogonal transformations.

Each of these ideas appears throughout modern AI, ML, and scientific computing. But perhaps nowhere is their impact more visible than in tools like PCA, spectral clustering, and manifold learning—methods that build directly on eigenvalues and eigenvectors.

With this foundation, we can now explore one of the most widely used eigenvalue applications in the world: dimensionality reduction and spectral analysis.

Let’s continue to 8.4 PCA and spectral methods and see how these concepts shape real data.

2025-10-09

Shohei Shimoda

Here I have organized and written up what I have learned.