4.0 Solving Ax = b
Everything we build in numerical computation eventually comes down to one central question:
Given a matrix A and a vector b, how do we solve the equation Ax = b?
It sounds simple—so simple, in fact, that many engineers assume the solution must be straightforward. But in numerical linear algebra, this problem is not just one of many; it is the problem. It sits at the heart of machine learning, optimization, simulations, graphics, control systems, statistical estimation, and every algorithm that manipulates structured data.
Training a model? You're solving Ax = b. Computing least squares? You're solving Ax = b. Simulating a physical system? Ax = b again. Stabilizing a control loop? Same story. Updating embeddings? Same story in disguise.
The entire computational universe keeps circling back to these three symbols—A, x, b—and the ways we can reliably connect them.
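Before moving on, here is the problem in its most concrete form: a minimal sketch, assuming NumPy as the working environment for this chapter's examples.

```python
import numpy as np

# A small 2x2 system: solve Ax = b directly.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])

# np.linalg.solve calls LAPACK's gesv: LU factorization with partial pivoting.
x = np.linalg.solve(A, b)
print(x)                      # [2. 3.]
print(np.allclose(A @ x, b))  # True
```

One line of code, and everything that follows in this chapter is about what happens inside that line.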
Why Linear Systems Matter More Than You Think
On paper, solving a linear system is a deterministic, almost mechanical process. But in real-world computation, this problem is a battlefield where numerical stability, floating-point limits, conditioning, memory layout, and algorithmic structure collide. It is here that many “perfectly correct” mathematical solutions fail dramatically.
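To make that concrete, here is a system that is perfectly solvable on paper yet already hostile in double precision: a sketch using SciPy's hilbert helper, the classic ill-conditioned test matrix.

```python
import numpy as np
from scipy.linalg import hilbert

# The Hilbert matrix is invertible in exact arithmetic, but its condition
# number grows explosively with its size.
n = 12
A = hilbert(n)
x_true = np.ones(n)
b = A @ x_true               # b is constructed so the exact answer is all ones

x = np.linalg.solve(A, b)
print(np.linalg.cond(A))           # ~1e16: at the limit of double precision
print(np.max(np.abs(x - x_true)))  # error many orders above machine epsilon
```

The algebra is flawless; the computed answer is not.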
In Chapter 3, we explored how floating-point arithmetic shapes the behavior of algorithms. In this chapter, we apply that understanding to one of the most essential computational tasks humans have invented.
And the truth is this:
Not all methods for solving Ax = b are created equal.
Some methods look elegant on paper but collapse instantly under floating-point arithmetic. Others appear complicated but remain stable under extreme conditions. And some algorithms only work if we intervene—by pivoting, rearranging, scaling, or reformulating the problem.
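A small illustration of "elegant on paper, worse in practice": the textbook formula x = A⁻¹b versus a factor-and-solve. This is only a sketch; for a well-conditioned random matrix the accuracy gap is modest, but it widens as conditioning worsens, and the explicit inverse is always more expensive.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((500, 500))
b = rng.standard_normal(500)

# Textbook-elegant: form the inverse, then multiply.
x_inv = np.linalg.inv(A) @ b
# Numerically preferred: factor and solve, never forming the inverse.
x_solve = np.linalg.solve(A, b)

# Compare residuals ||Ax - b||; the explicit inverse typically loses accuracy.
print(np.linalg.norm(A @ x_inv - b))
print(np.linalg.norm(A @ x_solve - b))
```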
What This Chapter Covers
This chapter focuses on the foundations, pitfalls, and practical decision-making behind solving linear systems. We will revisit familiar techniques—not to memorize steps, but to understand their numerical behavior.
4.1 Gaussian Elimination Revisited
Most of us learned Gaussian elimination in school: reduce the matrix to upper triangular form, then back-substitute. But what we learned was a symbolic procedure. Implementing that same procedure on a computer reveals hidden fragilities: rounding error propagation, unstable elimination paths, magnitude blow-up, and breakdown without pivoting.
We will rediscover Gaussian elimination through the lens of numerical linear algebra—understanding not just how it works, but how it fails, and what we can do about it.
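To preview what "breakdown without pivoting" looks like, here is a deliberately naive sketch of the textbook procedure, with no safeguards at all:

```python
import numpy as np

def gauss_no_pivot(A, b):
    """Naive Gaussian elimination plus back substitution, no pivoting."""
    A = A.astype(float).copy()
    b = b.astype(float).copy()
    n = len(b)
    for k in range(n - 1):
        for i in range(k + 1, n):
            m = A[i, k] / A[k, k]        # breaks down when A[k, k] is tiny
            A[i, k:] -= m * A[k, k:]
            b[i] -= m * b[k]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - A[i, i + 1:] @ x[i + 1:]) / A[i, i]
    return x

# Classic stress test: a tiny pivot destroys accuracy without a row swap.
A = np.array([[1e-20, 1.0],
              [1.0,   1.0]])
b = np.array([1.0, 2.0])
print(gauss_no_pivot(A, b))    # wrong: roughly [0, 1]
print(np.linalg.solve(A, b))   # correct: roughly [1, 1]
```

Symbolically, both answers come from the same algorithm. Numerically, only one of them survives.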
4.2 Row Operations and Elementary Matrices
Row operations may seem like simple mechanical steps, but they encode the structure of elimination. Each operation corresponds to multiplying by an elementary matrix, which helps us understand why elimination produces the factorization:
PA = LU
This abstraction is crucial. It reveals what the computer is doing internally, why pivoting protects stability, and how elimination transforms both structure and error.
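A quick sketch of both ideas: a single row operation written as an elementary matrix, and the factorization SciPy hands back. Note that SciPy's lu uses the convention A = P L U, so its P is the transpose of the P in PA = LU.

```python
import numpy as np
from scipy.linalg import lu

A = np.array([[2.0, 1.0],
              [6.0, 8.0]])

# One elimination step as an elementary matrix:
# subtract 3 * row 0 from row 1.
E = np.array([[ 1.0, 0.0],
              [-3.0, 1.0]])
print(E @ A)   # [[2. 1.] [0. 5.]]: the entry below the pivot is gone

# The accumulated steps give the factorization. SciPy returns
# (p, l, u) with A = p @ l @ u, i.e. PA = LU for P = p.T.
p, l, u = lu(A)
print(np.allclose(p.T @ A, l @ u))  # True
```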
4.3 Pivoting Strategies
Pivoting is the quiet hero of numerical computation. Without it, Gaussian elimination is a recipe for disaster. With it, elimination becomes one of the most reliable tools we have.
But pivoting is not monolithic. There is:
- Partial pivoting – cheap, effective, widely used
- Complete pivoting – more robust, more expensive
- Scaled pivoting – helpful for poorly scaled matrices
We will examine when pivoting is necessary, when its absence is catastrophic, and why hardware-friendly implementations overwhelmingly standardize on partial pivoting.
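As a sketch of the cheapest of these strategies, here is the naive eliminator from 4.1 with partial pivoting added. The row swap is essentially the entire difference between failure and success on the earlier stress test.

```python
import numpy as np

def gauss_partial_pivot(A, b):
    """Gaussian elimination with partial pivoting: swap in the row with the
    largest-magnitude entry of each column before eliminating."""
    A = A.astype(float).copy()
    b = b.astype(float).copy()
    n = len(b)
    for k in range(n - 1):
        # Partial pivoting: pick the largest |entry| in column k, rows k..n-1.
        p = k + np.argmax(np.abs(A[k:, k]))
        if p != k:
            A[[k, p]] = A[[p, k]]
            b[[k, p]] = b[[p, k]]
        for i in range(k + 1, n):
            m = A[i, k] / A[k, k]
            A[i, k:] -= m * A[k, k:]
            b[i] -= m * b[k]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - A[i, i + 1:] @ x[i + 1:]) / A[i, i]
    return x

# The system that broke the no-pivot version is now handled correctly.
A = np.array([[1e-20, 1.0], [1.0, 1.0]])
b = np.array([1.0, 2.0])
print(gauss_partial_pivot(A, b))  # roughly [1, 1]
```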
4.4 When Elimination Fails
Elimination can fail in predictable ways: division by tiny pivots, explosive growth in intermediate values, catastrophic error amplification, or attempts to invert nearly singular matrices. These failures are not edge cases: they happen in ML training, covariance estimation, PCA pipelines, control loops, and more.
Understanding the failure modes of elimination allows us to recognize when to abandon LU and switch to QR or SVD. In practical engineering, choosing the right solver is as important as knowing how each one works.
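A sketch of what "abandon LU" means in practice: on a nearly singular system, the LU-based solver answers confidently even though the answer itself is fragile, while an SVD-based solver can truncate the troublesome direction. The rcond threshold below is an illustrative choice, not a universal setting.

```python
import numpy as np

# A nearly singular system: the two rows are almost identical.
A = np.array([[1.0, 1.0],
              [1.0, 1.0 + 1e-13]])

print(np.linalg.cond(A))   # ~4e13: errors can be amplified ~13 orders of magnitude

# A perturbation of ~1e-13 in b flips the LU solution completely.
print(np.linalg.solve(A, [2.0, 2.0]))          # ~[2, 0]
print(np.linalg.solve(A, [2.0, 2.0 + 1e-13]))  # ~[1, 1]

# An SVD-based solver can truncate the negligible singular value and return
# the stable minimum-norm solution instead (roughly [1, 1]).
x, _, rank, _ = np.linalg.lstsq(A, [2.0, 2.0], rcond=1e-10)
print(x, rank)   # rank is reported as 1 after truncation
```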
Why This Chapter Really Matters
You might be tempted to think Gaussian elimination is “old math”—something solved long ago. But in reality, it remains one of the most misunderstood—and most commonly misused—algorithms in engineering.
Every time a system produces NaNs, diverges unexpectedly, or becomes hypersensitive to small changes in inputs, there is a high chance that the root cause is a poorly chosen linear solver or an ill-conditioned elimination process.
This chapter is about giving you the ability to diagnose those failures before they occur.
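As a first taste of that diagnostic mindset, here is a hypothetical helper (an illustrative sketch, not a library API) that surfaces the two cheapest warning signs, the condition number and the relative residual, before trusting a solution:

```python
import numpy as np

def solve_with_diagnostics(A, b, cond_warn=1e12):
    """Solve Ax = b, but surface the warning signs first.
    (Illustrative helper, not a library API.)"""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    cond = np.linalg.cond(A)
    if not np.isfinite(cond) or cond > cond_warn:
        # Rule of thumb: roughly log10(cond) of the ~16 decimal digits
        # of double precision are suspect.
        print(f"warning: cond(A) = {cond:.2e}; expect heavy error amplification")
    x = np.linalg.solve(A, b)
    residual = np.linalg.norm(A @ x - b) / max(np.linalg.norm(b), 1e-300)
    print(f"relative residual: {residual:.2e}")
    return x
```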
Transition to 4.1 Gaussian Elimination Revisited
With the stage set, we can now return to the algorithm almost everyone learns early in their education—but few truly understand in its computational form.
To understand how machines solve Ax = b, we must start with the most classic method of all:
4.1 Gaussian Elimination Revisited