2.3 Overflow, Underflow, Loss of Significance

Floating-point numbers live inside a bounded world. Their range is enormous compared to integers, but it is still finite, and its boundaries can be reached. When calculations push against those boundaries, they do not complain; they simply break in ways that can be subtle or catastrophic. Three of the most common boundary failures are overflow, underflow, and loss of significance. Understanding these is essential to writing reliable numerical software, because real systems fail for these reasons every day.


Overflow: When Numbers Grow Too Large

Overflow occurs when a value is so large that it cannot be represented within the floating-point format. Instead of raising an error, IEEE 754 quietly assigns +∞ or -∞.

For float64, the largest finite value is approximately:

1.7976931348623157 × 10³⁰⁸

Any computation that produces a number beyond this range results in infinity, and that threshold is closer than it looks. Exponentials, products of moderately sized numbers, and repeated multiplications inside iterative algorithms can all cross it easily.

For example:

exp(2000) → +∞      (overflow)

From that moment on, your computation is poisoned. Infinities flow through algorithms in unpredictable ways:

  • ∞ + x = ∞
  • ∞ - ∞ = NaN
  • x / ∞ = 0

This can corrupt loss functions, gradients, iterative solvers, or any code with multiplicative steps. Overflow is often the hidden culprit behind a “loss diverged to NaN” message in neural network training.
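A minimal sketch in Python with NumPy (the language and library are chosen here only for illustration) shows how a single overflow poisons everything downstream:

import numpy as np

big = np.exp(np.float64(2000.0))   # overflows: IEEE 754 yields +inf (NumPy emits a warning)
print(big)          # inf
print(big + 1.0)    # inf -- infinity absorbs every finite value
print(big - big)    # nan -- and produces NaN as soon as it meets itself
print(1.0 / big)    # 0.0

(Python's built-in math.exp raises OverflowError instead of returning infinity; NumPy follows the silent IEEE 754 behavior described above.)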

The usual defenses are:

  • rescaling inputs (common in ML),
  • normalizing intermediate results,
  • log-domain computations,
  • using stable formulations (e.g. log-sum-exp).

Overflow is a sign that the computation is too “energetic”—values are expanding faster than the format can contain.
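The last defense on the list, log-sum-exp, is a good example of a stable formulation. The identity log Σ exp(xᵢ) = m + log Σ exp(xᵢ − m) holds for any m; choosing m = max xᵢ guarantees the largest exponent is exactly zero, so nothing overflows. A minimal sketch (the function name and inputs are illustrative):

import numpy as np

def logsumexp(x):
    # log(sum(exp(x))) without ever exponentiating a large number:
    # subtracting the maximum keeps every exponent <= 0.
    x = np.asarray(x, dtype=np.float64)
    m = np.max(x)
    return m + np.log(np.sum(np.exp(x - m)))

scores = np.array([1000.0, 1001.0, 1002.0])
# np.log(np.sum(np.exp(scores))) would return inf, because exp(1000) already overflows.
print(logsumexp(scores))   # ≈ 1002.4076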


Underflow: When Numbers Become Too Small

Underflow is the opposite: values become so small that they collapse to zero. This may sound harmless, but it destroys information in subtle ways.

Near zero, floating-point resolution becomes extremely fine, but still not continuous. The smallest positive float64 is around:

≈ 5 × 10⁻³²⁴

Below that, numbers silently underflow to 0.0. And well before that point, anything smaller than the smallest normal number (≈ 2.2 × 10⁻³⁰⁸) is stored as a denormalized (subnormal) value, a special format that gradually gives up leading bits of precision. Subnormals are still representable, but they carry far fewer significant digits.
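These thresholds are easy to inspect directly. A small sketch in plain Python (again, only for illustration):

import sys

print(sys.float_info.min)   # ≈ 2.2250738585072014e-308, the smallest *normal* float64
print(5e-324)               # the smallest positive subnormal float64
print(5e-324 / 2)           # 0.0 -- anything smaller silently underflows to zero
print(1e-200 * 1e-200)      # 0.0 -- the true value, 1e-400, is far below that limit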

Underflow is common in:

  • probability computations (multiplying many small values),
  • likelihood functions,
  • recursive formulas,
  • Markov models,
  • deep learning softmax operations.

A classic example:

p = very_small_value
log(p) → -∞ if p underflows to zero

Suddenly, the log-likelihood collapses. This is why numerical analysts prefer log probabilities—to avoid catastrophic cascading underflows.
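A minimal sketch of both the failure and the log-domain fix (the probabilities are arbitrary, chosen only to force underflow):

import math

probs = [1e-5] * 100           # 100 independent events, each with probability 1e-5

product = 1.0
for p in probs:
    product *= p
print(product)                 # 0.0 -- the true value, 1e-500, has underflowed

log_likelihood = sum(math.log(p) for p in probs)
print(log_likelihood)          # ≈ -1151.29, comfortably representable

Once the product has collapsed to zero, no post-processing can recover it; the sum of logs never leaves the safe range in the first place.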

The general strategies to avoid underflow mirror those for overflow:

  • normalize intermediate values,
  • stay in the log domain,
  • avoid multiplying long chains of small numbers,
  • use stable formulations (e.g. subtracting max before exponentials).

Loss of Significance: Subtraction That Destroys Information

Of all numerical pitfalls, loss of significance is the most treacherous because it doesn’t create infinities or zeros—it quietly erases meaningful digits. It happens when two nearly equal numbers are subtracted:

(1.23456789 × 10⁵) − (1.23456780 × 10⁵)

This should yield:

9.0 × 10⁻³

But floating-point rounding has already discarded some of the low-order digits of each operand before the subtraction happens. The small difference inherits all of that error: the absolute error stays tiny, but the relative error explodes, so most of the result's significant digits are meaningless, and in extreme cases the value can be off by orders of magnitude or carry the wrong sign.
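A minimal sketch of exactly this subtraction (NumPy appears only to show the float32 case as well):

import numpy as np

a, b = 1.23456789e5, 1.23456780e5
print(a - b)            # close to 0.009, but roughly half of float64's ~16 digits are gone

a32, b32 = np.float32(a), np.float32(b)
print(a32 - b32)        # 0.0078125 -- in float32 the result is off by about 13%

The only digits that survive are the ones in which the operands differed; everything they had in common cancels away, and the rounding errors stored in the discarded digits dominate what remains.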

Loss of significance appears in:

  • finite-difference gradients (f(x+h) - f(x)),
  • root-finding algorithms,
  • variance and covariance formulas,
  • quadratic equation solvers,
  • signal processing pipelines,
  • iterative refinement loops,
  • PCA and SVD preprocessing steps.

Stable formulations replace the dangerous form with algebraically equivalent but numerically safer alternatives. For example:

Bad (unstable):

f(x+h) - f(x)

Better (stable):

use automatic differentiation,
or rewrite the difference symbolically so the cancelling terms are eliminated analytically,
or compute the difference in higher-precision arithmetic.
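The applications list above also mentions quadratic solvers, and they make the idea of an algebraically equivalent but safer rewrite very concrete. The textbook formula (-b ± √(b² − 4ac)) / (2a) cancels catastrophically in the ± branch that opposes the sign of b when b² ≫ 4ac; computing the larger-magnitude root first and recovering the other from the product of roots x₁x₂ = c/a avoids the cancellation entirely. A minimal sketch, assuming real roots and nonzero a:

import math

def stable_quadratic_roots(a, b, c):
    # Compute the root that does NOT suffer cancellation, then use x1 * x2 = c / a.
    d = math.sqrt(b * b - 4.0 * a * c)       # assumes a real discriminant
    q = -0.5 * (b + math.copysign(d, b))     # b and copysign(d, b) share a sign: no cancellation
    return q / a, c / q

a, b, c = 1.0, 1e8, 1.0                      # true roots ≈ -1e8 and -1e-8

naive = (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)
print(naive)                                 # ≈ -7.45e-9: about 25% error from cancellation
print(stable_quadratic_roots(a, b, c))       # both roots accurate to roughly machine precision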

In statistics:

Bad (unstable variance formula):

Var(x) = mean(x²) − mean(x)²

Better:

Welford’s online variance algorithm

The stable version avoids catastrophic cancellation, preserves precision, and works with streaming data.
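A minimal sketch of both formulas, using data whose mean is huge compared to its spread, which is exactly where the naive formula falls apart (the data values are illustrative):

def welford_variance(xs):
    # One pass: track the running mean and the sum of squared deviations from it.
    n, mean, m2 = 0, 0.0, 0.0
    for x in xs:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    return m2 / n                   # population variance

data = [1e9 + v for v in (1.0, 2.0, 3.0)]    # true population variance: 2/3

naive = sum(v * v for v in data) / len(data) - (sum(data) / len(data)) ** 2
print(naive)                        # badly wrong; it can even come out negative
print(welford_variance(data))       # ≈ 0.6667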

Loss of significance is one of the reasons “naively translating formulas to code” often leads to broken numerical behavior. Algorithms must be designed for floating-point arithmetic—not merely ported from math textbooks.


The Big Picture: Boundaries Shape Algorithms

Overflow imposes an upper limit. Underflow imposes a lower limit. Loss of significance imposes a precision limit.

Together, they define the “physical laws” of numerical computation. Every algorithm in this book—LU, QR, SVD, eigenvalue solvers, least squares, iterative methods—must be designed with these constraints in mind.

Numerical computing is not simply mathematics executed on a machine. It is mathematics compressed to fit inside imperfect hardware. How a number is stored determines how it behaves; how a value is represented shapes how it evolves through computation.


Where We Go Next

Now that we've explored how floating-point values can break—by growing too large, shrinking too small, or collapsing through cancellation—we’re ready to move from “what numbers can represent” to “how numbers are actually stored.”

To understand factorization algorithms, memory locality, performance, and even stability, we need to examine how vectors and matrices are laid out in RAM. This shapes the behavior of BLAS routines, the speed of linear algebra, and the design of large-scale numerical systems.

Let’s step into the next topic: 2.4 Vector and Matrix Storage in Memory.

2025-09-10

Shohei Shimoda

Here I have organized and written up what I have learned and know about this topic.