Theory#

Setup#

We import the Python packages used throughout the project.

import numpy as np
import matplotlib.pyplot as plt

from numpy.linalg import svd, matrix_rank
from matplotlib.colors import ListedColormap, BoundaryNorm
from PIL import Image

Singular Value Decomposition (SVD)#

For any real matrix \(A \in \mathbb{R}^{m\times n}\), there exist orthogonal matrices \(U \in \mathbb{R}^{m\times m}\) and \(V \in \mathbb{R}^{n\times n}\), and a diagonal (rectangular) matrix \(\Sigma \in \mathbb{R}^{m\times n}\) such that

\[\begin{equation*} A = U \,\Sigma\, V^T. \end{equation*}\]

The diagonal entries of \(\Sigma\) are the singular values \(\sigma_1 \ge \sigma_2 \ge \cdots \ge 0\). They measure how strongly \(A\) stretches space in certain orthogonal directions.

  • The columns of \(V\) are the right singular vectors (input directions).

  • The columns of \(U\) are the left singular vectors (output directions).

  • \(\Sigma\) contains the stretch factors along those directions.

A convenient mental picture of the decomposition is:

  1. \(V^T\): rotate (or reflect) the input coordinates into a special orthonormal basis,

  2. \(\Sigma\): stretch (and possibly squash) along orthogonal axes,

  3. \(U\): rotate (or reflect) again in the output space.
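This rotate–stretch–rotate picture can be checked numerically. A minimal sketch (the matrix and the test vector below are chosen just for illustration):

```python
import numpy as np
from numpy.linalg import svd

A = np.array([[3.0, 1.0],
              [1.0, 3.0]])

U, s, Vt = svd(A)

# U and V are orthogonal: their transposes are their inverses
print(np.allclose(U.T @ U, np.eye(2)))    # True
print(np.allclose(Vt @ Vt.T, np.eye(2)))  # True

# Applying the three steps in order reproduces Ax for any x
x = np.array([0.6, -0.8])
step1 = Vt @ x      # 1. rotate/reflect the input
step2 = s * step1   # 2. stretch along orthogonal axes
step3 = U @ step2   # 3. rotate/reflect in the output space
print(np.allclose(step3, A @ x))          # True
```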

Computing SVD in NumPy#

A = np.array([[1, 2],
              [-1, 1]], dtype=float)

U, s, Vt = svd(A)           # s is a 1-D array of singular values
Sigma = np.zeros_like(A)    # zero matrix with the same (m, n) shape as A
np.fill_diagonal(Sigma, s)

# Verify reconstruction
A_recon = U @ Sigma @ Vt
np.allclose(A, A_recon)
True

Geometry of SVD in 2D: Unit Circle \(\rightarrow\) Ellipse#

For a full-rank \(2\times2\) matrix \(A\), the image of the unit circle

\[\begin{equation*} S = \{x\in\mathbb{R}^2 \mid \|x\|=1\} \end{equation*}\]

under the linear map \(x \mapsto Ax\) is an ellipse. SVD explains this ellipse precisely:

  • The right singular vectors \(v_1, v_2\) (columns of \(V\)) are orthonormal directions in the input plane.

  • Their images \(Av_1\) and \(Av_2\) are orthogonal in the output plane.

  • The lengths of these images are the singular values: \(\|A v_1\| = \sigma_1\), \(\|A v_2\| = \sigma_2\).

  • After normalizing \(u_i = \frac{1}{\sigma_i}Av_i\), the vectors \(u_1, u_2\) form an orthonormal basis in the output plane (columns of \(U\)).

So, for any unit vector \(x\), writing it in the \(V\)-basis as \(x = V z\) (with \(\|z\|=1\)) gives

\[\begin{equation*} Ax = U\Sigma V^T x = U\Sigma z. \end{equation*}\]

This means \(\Sigma\) maps the unit circle to an axis-aligned ellipse with semi-axes \(\sigma_1\) and \(\sigma_2\), and then \(U\) rotates that ellipse to its final position.
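A quick numerical sketch of this picture, reusing the \(2\times2\) matrix from the NumPy example above: sampling the unit circle densely and mapping it through \(A\), the longest and shortest image vectors should have lengths \(\sigma_1\) and \(\sigma_2\).

```python
import numpy as np
from numpy.linalg import svd

A = np.array([[1.0, 2.0],
              [-1.0, 1.0]])

U, s, Vt = svd(A)

# Sample the unit circle and map it through A
theta = np.linspace(0, 2 * np.pi, 400)
circle = np.vstack([np.cos(theta), np.sin(theta)])  # shape (2, 400)
ellipse = A @ circle

# The semi-axes of the image ellipse are the singular values
lengths = np.linalg.norm(ellipse, axis=0)
print(np.isclose(lengths.max(), s[0], atol=1e-3))  # True
print(np.isclose(lengths.min(), s[1], atol=1e-3))  # True
```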

Singular values and ellipse semi-axes#

  • The largest singular value \(\sigma_1\) equals the maximum stretch of any unit vector.

  • The smallest singular value \(\sigma_2\) equals the minimum stretch (for full rank, \(\sigma_2>0\)).
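These extremal properties can be verified directly; the matrix and sample size below are arbitrary choices. The maximum stretch \(\sigma_1\) is exactly the spectral (operator 2-) norm of \(A\).

```python
import numpy as np
from numpy.linalg import svd

A = np.array([[1.0, 2.0],
              [-1.0, 1.0]])
s = svd(A, compute_uv=False)

# The spectral norm equals the largest singular value
print(np.isclose(np.linalg.norm(A, 2), s[0]))  # True

# Every unit vector is stretched by a factor between sigma_2 and sigma_1
rng = np.random.default_rng(0)
x = rng.normal(size=(2, 1000))
x /= np.linalg.norm(x, axis=0)
lengths = np.linalg.norm(A @ x, axis=0)
print(np.all((lengths >= s[-1] - 1e-12) & (lengths <= s[0] + 1e-12)))  # True
```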

Rank and singular values#

The rank of \(A\) equals the number of non-zero singular values:

\[\begin{equation*} \mathrm{rank}(A) = \#\{i \mid \sigma_i > 0\}. \end{equation*}\]

In practice (floating point), “non-zero” means “larger than a small numerical tolerance.”

A = np.array([[11, 2, 8],
              [ 5,-10, 8],
              [ 3, -6,12],
              [13, -2, 4]], dtype=float)

matrix_rank(A), svd(A, compute_uv=False)
(3, array([24., 12.,  6.]))

Outer-product expansion of \(A\)#

SVD gives a very useful decomposition of \(A\) as a sum of rank-1 matrices:

\[\begin{equation*} A = \sum_{i=1}^{r} \sigma_i\, u_i\, v_i^T, \end{equation*}\]

where \(r=\mathrm{rank}(A)\), \(u_i\) is column \(i\) of \(U\), and \(v_i^T\) is row \(i\) of \(V^T\). Each term \(\sigma_i\,u_i v_i^T\) is a rank-1 “pattern” in \(A\), scaled by \(\sigma_i\).

In computations, the matrix \(u_i v_i^T\) can be formed with np.outer(u_i, v_i).

U, s, Vt = svd(A)
A_from_rank1 = sum(s[i] * np.outer(U[:, i], Vt[i, :]) for i in range(len(s)))
np.allclose(A, A_from_rank1)
True

Low-rank approximation \(A_k\)#

If \(\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r > 0\), then the rank-\(k\) approximation is

\[\begin{equation*} A_k = \sum_{i=1}^{k} \sigma_i\, u_i\, v_i^T, \qquad k \le r. \end{equation*}\]

This keeps only the \(k\) most important rank-1 components and discards smaller singular values, giving a compressed representation. Geometrically, it keeps only the \(k\) dominant orthogonal stretch directions.

A compact implementation:

def svd_approximation(A, k):
    U, s, Vt = svd(A, full_matrices=False)
    A_k = np.zeros_like(A, dtype=float)
    for i in range(k):
        A_k += s[i] * np.outer(U[:, i], Vt[i, :])
    return A_k
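A usage sketch on the \(4\times3\) matrix from the rank example above (the function is repeated here so the snippet runs on its own). Truncating to rank 2 leaves an approximation error whose spectral norm is exactly the discarded singular value \(\sigma_3 = 6\).

```python
import numpy as np
from numpy.linalg import svd, matrix_rank

def svd_approximation(A, k):
    # keep only the k dominant rank-1 components
    U, s, Vt = svd(A, full_matrices=False)
    A_k = np.zeros_like(A, dtype=float)
    for i in range(k):
        A_k += s[i] * np.outer(U[:, i], Vt[i, :])
    return A_k

A = np.array([[11, 2, 8],
              [ 5, -10, 8],
              [ 3, -6, 12],
              [13, -2, 4]], dtype=float)

A_2 = svd_approximation(A, 2)
print(matrix_rank(A_2))  # 2

# The spectral-norm error of the truncation is the first discarded singular value
print(np.isclose(np.linalg.norm(A - A_2, 2), svd(A, compute_uv=False)[2]))  # True
```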

Storage ratio for SVD compression#

Storing a full \(m\times n\) matrix requires \(mn\) numbers.

A rank-\(r\) SVD-style representation stores:

  • \(r\) singular values,

  • \(m r\) numbers for the first \(r\) columns of \(U\),

  • \(n r\) numbers for the first \(r\) columns of \(V\).

So the total number of stored values is \(r(1+m+n)\), and the storage ratio is

\[\begin{equation*} SR(r,m,n)=\frac{r(1+m+n)}{mn}. \end{equation*}\]

  • If \(r \ll \min(m,n)\), then \(SR\) can be far below 1 (strong compression).

  • If \(r\) approaches \(\min(m,n)\), the ratio approaches (and may exceed) 1, so compression becomes ineffective.

def storage_ratio(r, m, n):
    return r * (1 + m + n) / (m * n)
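A quick sanity check with made-up dimensions: a \(1000\times800\) matrix compressed to rank 50 stores only about 11% of the original numbers, while at full rank the "compressed" form is actually larger than the original.

```python
def storage_ratio(r, m, n):
    # numbers stored by rank-r SVD form, relative to the full m*n matrix
    return r * (1 + m + n) / (m * n)

print(storage_ratio(50, 1000, 800))   # 0.1125625
print(storage_ratio(800, 1000, 800))  # 1.801 — worse than storing A directly
```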

Retained information#

A simple way to quantify how much of the matrix “signal” is retained by \(A_k\) is to look at the singular values:

\[\begin{equation*} RI(k,r)=\frac{\sigma_1+\cdots+\sigma_k}{\sigma_1+\cdots+\sigma_r}. \end{equation*}\]

This is a cumulative measure: increasing \(k\) increases \(RI\), with \(RI(r,r)=1\).

def retained_information(A):
    s = svd(A, compute_uv=False)
    return np.cumsum(s / np.sum(s))

def plot_retained_information(A):
    ri = retained_information(A) * 100
    ri = np.insert(ri, 0, 0)
    ks = np.arange(len(ri))
    plt.plot(ks, ri, marker='o')
    plt.xlabel("Number of singular values (k)")
    plt.ylabel("Retained information [%]")
    plt.title("Retained information")
    plt.grid(True, linestyle='--', alpha=0.6)
    plt.show()
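For the \(4\times3\) matrix from the rank example above (singular values 24, 12, 6), the retained information can be computed directly: the first two singular values already account for about 86% of the total.

```python
import numpy as np
from numpy.linalg import svd

A = np.array([[11, 2, 8],
              [ 5, -10, 8],
              [ 3, -6, 12],
              [13, -2, 4]], dtype=float)

s = svd(A, compute_uv=False)   # array([24., 12., 6.])
ri = np.cumsum(s) / np.sum(s)  # cumulative share of the singular-value sum
print(ri)                      # ≈ [0.571, 0.857, 1.0]
```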

SVD for images#

An image can be represented as a matrix of pixel intensities.

Grayscale images#

A grayscale image is a 2D array \(A \in \mathbb{R}^{m\times n}\) with values typically in \([0,255]\) (integers) or normalized to \([0,1]\) (floats). Applying SVD and forming \(A_k\) yields a compressed approximation that often remains visually close to the original for moderate \(k\).

A typical workflow is:

  1. load the image as a matrix \(A\),

  2. compute singular values and inspect their decay,

  3. choose \(k\) based on desired retained information and storage ratio,

  4. reconstruct \(A_k\) and compare visually.

def load_image_matrix(path, max_dim=1000, to_gray=True):
    im = Image.open(path)
    im = im.convert("L" if to_gray else "RGB")

    w, h = im.size
    scale = min(1.0, max_dim / max(w, h))
    if scale < 1.0:
        new_size = (max(1, int(round(w * scale))), max(1, int(round(h * scale))))
        im = im.resize(new_size, Image.LANCZOS)

    arr = np.asarray(im, dtype=np.float64) / 255.0
    return arr
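Step 2 of the workflow, inspecting the decay of the singular values, can be illustrated without an image file by using a synthetic "image". Smooth gradients are low-rank: the array below is a sum of two outer products, so only two singular values are non-negligible.

```python
import numpy as np
from numpy.linalg import svd

# Synthetic 100x150 "image" built from two rank-1 patterns
m, n = 100, 150
col = np.linspace(0, 1, m)
row = np.linspace(0, 1, n)
img = np.outer(col, row) + 0.5 * np.outer(np.sin(2 * np.pi * col),
                                          np.cos(2 * np.pi * row))

s = svd(img, compute_uv=False)
print(np.sum(s > 1e-10))  # 2 — the rest are numerically zero
```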

RGB images#

An RGB image is a 3D array of shape \((m,n,3)\). A standard approach is to apply SVD separately to each channel (R, G, B), approximate each channel by rank \(k\), and then stack them back together.

def svd_rgb_approximation(img, k):
    img = np.asarray(img, dtype=float)
    img_recon = np.zeros_like(img, dtype=float)

    for c in range(3):
        img_recon[:, :, c] = svd_approximation(img[:, :, c], k)

    return np.clip(img_recon, 0.0, 1.0)
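A usage sketch on a random stand-in for an RGB image in \([0,1]\) (both helper functions are repeated so the snippet is self-contained):

```python
import numpy as np
from numpy.linalg import svd, matrix_rank

def svd_approximation(A, k):
    U, s, Vt = svd(A, full_matrices=False)
    A_k = np.zeros_like(A, dtype=float)
    for i in range(k):
        A_k += s[i] * np.outer(U[:, i], Vt[i, :])
    return A_k

def svd_rgb_approximation(img, k):
    img = np.asarray(img, dtype=float)
    img_recon = np.zeros_like(img, dtype=float)
    for c in range(3):
        img_recon[:, :, c] = svd_approximation(img[:, :, c], k)
    return np.clip(img_recon, 0.0, 1.0)

rng = np.random.default_rng(42)
img = rng.uniform(size=(60, 80, 3))  # random (m, n, 3) array in [0, 1]

recon = svd_rgb_approximation(img, 10)
print(recon.shape)                                # (60, 80, 3)
print(recon.min() >= 0.0 and recon.max() <= 1.0)  # True — values stay in range

# Each channel approximation (before clipping) has rank k
print(matrix_rank(svd_approximation(img[:, :, 0], 10)))  # 10
```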