7.1 Gram–Schmidt and Modified GS
If you ask someone to “explain QR decomposition from intuition,” the Gram–Schmidt process is almost always where they begin. It mirrors the geometry we instinctively imagine when we hear the word orthogonal — taking a set of messy, overlapping vectors and polishing them into a clean, perpendicular basis.
But although Gram–Schmidt is easy to visualize and elegant on paper, it contains a subtle flaw: it is numerically fragile. In floating-point arithmetic, the way we implement Gram–Schmidt matters just as much as what it computes.
To understand this, let's start with the classical algorithm — the one every textbook introduces first — and then unravel how real computers distort its behavior.
Classical Gram–Schmidt: the geometric picture
Suppose we are given vectors a₁, a₂, …, aₙ. We want to produce an orthonormal set q₁, q₂, …, qₙ such that:
span(a₁, …, aₙ) = span(q₁, …, qₙ)
The idea is beautifully simple:
- Take the first vector a₁ and normalize it.
- Remove from a₂ its projection onto q₁.
- Normalize what remains to get q₂.
- For each new vector aₖ, subtract projections onto all previous qᵢ.
Symbolically:
v₁ = a₁
q₁ = v₁ / ||v₁||
v₂ = a₂ − (q₁ᵀ a₂) q₁
q₂ = v₂ / ||v₂||
v₃ = a₃ − (q₁ᵀ a₃) q₁ − (q₂ᵀ a₃) q₂
q₃ = v₃ / ||v₃||
...
In geometric terms, each vₖ is simply the “new direction” left over after stripping away all previous directions. What could possibly go wrong?
The problem is: computers don’t do geometry — they do floating-point arithmetic. And this makes Gram–Schmidt far more delicate than it appears.
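Before looking at where this goes wrong, here is a minimal NumPy sketch of the classical procedure. The function name and interface are illustrative choices for this article, not a standard library routine:

```python
import numpy as np

def classical_gram_schmidt(A):
    """Classical Gram-Schmidt on the columns of A; returns Q, R with A ≈ Q R."""
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for k in range(n):
        v = A[:, k].copy()
        for i in range(k):
            # Every coefficient is computed from the ORIGINAL column a_k.
            R[i, k] = Q[:, i] @ A[:, k]
            v -= R[i, k] * Q[:, i]
        R[k, k] = np.linalg.norm(v)   # assumes the columns are linearly independent
        Q[:, k] = v / R[k, k]
    return Q, R
```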
Where Classical Gram–Schmidt breaks
The Gram–Schmidt algorithm contains repeated subtraction between nearly parallel vectors. In floating-point arithmetic, subtraction of nearly equal numbers is one of the most dangerous operations. It results in catastrophic cancellation — the dramatic loss of significant digits.
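A one-line experiment shows the effect at the level of individual numbers. Both operands below carry about sixteen significant digits, but their difference keeps only a few of them (output as expected in IEEE double precision):

```python
>>> (1.0 + 1e-13) - 1.0   # the exact answer would be 1e-13
9.992007221626409e-14     # only the first few digits survive the subtraction
```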
Imagine two vectors that are almost linearly dependent:
a₂ ≈ α a₁
The projection (q₁ᵀ a₂) q₁ is then nearly equal to a₂ itself, so the residual v₂ = a₂ − (q₁ᵀ a₂) q₁ is tiny. Computing “large minus almost-the-same-large” leaves us with a result whose most important digits have been lost.
In practice, this means:
- q₂ is not very orthogonal to q₁.
- The errors compound at each step.
- By the time we reach qₖ, orthogonality may be severely degraded.
This is not a small problem — it is a fundamental numerical limitation. For matrices that are nearly rank-deficient or have nearly redundant columns, classical Gram–Schmidt can produce orthonormal vectors that are, in fact, visibly not orthonormal.
This is why classical Gram–Schmidt is mathematically correct but numerically incorrect.
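A classic test case makes this visible. The Läuchli-style matrix below has three columns that differ only at the 1e-8 level; feeding it to the classical_gram_schmidt sketch from above, the computed Q is nowhere near orthonormal. The number in the comment is what one should expect in double precision, a sketch rather than a guaranteed output:

```python
import numpy as np

eps = 1e-8
A = np.array([[1.0, 1.0, 1.0],
              [eps, 0.0, 0.0],
              [0.0, eps, 0.0],
              [0.0, 0.0, eps]])

Q, R = classical_gram_schmidt(A)               # the sketch shown earlier
print(np.linalg.norm(Q.T @ Q - np.eye(3)))     # ≈ 0.7: q₂ and q₃ end up about 60° apart, not 90°
```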
Modified Gram–Schmidt: the simple but powerful fix
Modified Gram–Schmidt (MGS) is not a different algorithm mathematically. It is the same projection-subtraction idea — but implemented in a way that significantly reduces error.
The key difference is the order of operations. Instead of subtracting all projections at once, MGS subtracts each projection step-by-step from the working vector.
Classical version:
vₖ = aₖ − Σ (qᵢᵀ aₖ) qᵢ
Modified version:
vₖ = aₖ
for i = 1..k-1:
rᵢₖ = qᵢᵀ vₖ
vₖ = vₖ − rᵢₖ qᵢ
qₖ = vₖ / ||vₖ||
Why does this matter? Because each coefficient rᵢₖ is computed from the current working vector rather than from the original aₖ, the error left behind by one subtraction is itself largely projected out by the subtractions that follow, so catastrophic cancellation does not snowball in the same way. MGS is not perfectly stable, but it is significantly more robust and is used in many practical numerical algorithms.
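Here is the same sketch in the modified ordering; the only change is that each coefficient is taken from the current working vector v instead of the original column (again an illustrative function name, not a library routine):

```python
import numpy as np

def modified_gram_schmidt(A):
    """Modified Gram-Schmidt on the columns of A; returns Q, R with A ≈ Q R."""
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for k in range(n):
        v = A[:, k].copy()
        for i in range(k):
            # Coefficient from the CURRENT working vector, not the original a_k,
            # so errors left by earlier subtractions are also projected out.
            R[i, k] = Q[:, i] @ v
            v -= R[i, k] * Q[:, i]
        R[k, k] = np.linalg.norm(v)   # assumes the columns are linearly independent
        Q[:, k] = v / R[k, k]
    return Q, R
```

On the Läuchli-style example above, this version brings ||QᵀQ − I|| from roughly 0.7 down to roughly 1e-8 in double precision: not perfect, but orders of magnitude better than the classical ordering.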
In fact, in real applications:
- MGS is often “good enough” for moderately conditioned problems.
- It is dramatically better than classical Gram–Schmidt.
- It is still not as stable as the industry standard, Householder reflections.
Gram–Schmidt in machine learning and statistics
You may wonder: “Do I ever need to hand-code Gram–Schmidt?” Usually, no. But you need to understand it because:
- Many ML/AI algorithms implicitly depend on orthogonalization.
- Noise in datasets often leads to near-linear dependence.
- Regression and least-squares directly depend on orthogonality.
- PCA begins with orthogonalization before SVD enters the picture.
- Krylov methods (iterative solvers) use variants of Gram–Schmidt every iteration.
Understanding why classical GS fails — and why MGS helps — builds the intuition needed for deeper algorithms. It also teaches a core lesson of numerical computation:
A mathematically correct algorithm is not necessarily a numerically correct algorithm.
The strength of Gram–Schmidt lies in its intuitive geometry. Its weakness lies in floating-point arithmetic. Modified Gram–Schmidt patches that weakness, but only up to a point.
To go further — to reach true numerical stability — we need a different idea entirely: reflections.
Where we go next
Both classical and modified Gram–Schmidt are valuable for understanding orthogonality, but modern numerical libraries almost never use them as the backbone of QR decomposition. Instead, they rely on a method that behaves predictably even for extremely ill-conditioned matrices: Householder reflections.
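In NumPy, for instance, numpy.linalg.qr goes through LAPACK routines built on Householder reflections, and on the same nearly dependent example the returned factor is orthonormal essentially to machine precision:

```python
import numpy as np

eps = 1e-8
A = np.array([[1.0, 1.0, 1.0],
              [eps, 0.0, 0.0],
              [0.0, eps, 0.0],
              [0.0, 0.0, eps]])

Q, R = np.linalg.qr(A)                         # Householder-based QR via LAPACK
print(np.linalg.norm(Q.T @ Q - np.eye(3)))     # ~1e-16: orthogonal to machine precision
```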
Let’s now explore Householder transformations — the method that turns orthogonalization into one of the most stable operations in numerical computing.