fitpack
Modern Fortran library for curve and surface fitting with splines
Surface Fitting Theory

This page describes the mathematical foundations for bivariate spline fitting in FITPACK: extending the one-dimensional B-spline framework to approximate surfaces \( z = s(x, y) \) defined over a rectangular domain.

Tensor Product Splines

A bivariate spline on the rectangular domain \( [a, b] \times [c, d] \) is defined as a tensor product of univariate B-splines:

\[ s(x, y) = \sum_{i=1}^{n_x - k_x - 1} \sum_{j=1}^{n_y - k_y - 1} c_{ij} \, N_{i, k_x}(x) \, M_{j, k_y}(y) \]

where:

  • \( N_{i, k_x}(x) \) are B-splines of degree \( k_x \) on the knot vector \( \mathbf{t}_x = (t_{x,1}, \ldots, t_{x, n_x}) \),
  • \( M_{j, k_y}(y) \) are B-splines of degree \( k_y \) on the knot vector \( \mathbf{t}_y = (t_{y,1}, \ldots, t_{y, n_y}) \),
  • \( c_{ij} \) are the \( (n_x - k_x - 1)(n_y - k_y - 1) \) unknown coefficients.

The two knot vectors are independent: the number of knots and their placement can differ between the \( x \)- and \( y \)-directions. In FITPACK, both degrees default to \( k_x = k_y = 3 \) (bicubic).

See also
B-Spline Foundations for a detailed treatment of univariate B-spline basis functions and the de Boor–Cox recurrence.

Scattered Data Fitting (surfit)

Given \( m \) scattered observations \( (x_i, y_i, z_i) \) with weights \( w_i > 0 \), the goal is to find a tensor product spline \( s(x, y) \) that approximates the data while remaining smooth.

The Observation Matrix

Each data point \( (x_i, y_i) \) lies in a unique cell of the knot grid, so at most \( k_x + 1 \) basis functions in \( x \) and \( k_y + 1 \) in \( y \) are non-zero there. Letting \( \mu \) and \( \nu \) range over the indices of these non-zero basis functions, the observation matrix \( \mathbf{A} \) has entries:

\[ A_{i, \ell} = N_{\mu, k_x}(x_i) \, M_{\nu, k_y}(y_i), \qquad \ell = (\mu - 1)(n_y - k_y - 1) + \nu \]

where the two-dimensional index pair \( (\mu, \nu) \) is mapped to a single column index \( \ell \). Each row of \( \mathbf{A} \) has at most \( (k_x + 1)(k_y + 1) \) non-zero entries, so the matrix is sparse.
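The flattening \( \ell = (\mu - 1)(n_y - k_y - 1) + \nu \) is exactly the row-major ordering produced by a Kronecker product of the two univariate basis-value vectors. A hedged sketch (the knot vectors are arbitrary example data, not FITPACK output):

```python
import numpy as np
from scipy.interpolate import BSpline

# Arbitrary bicubic knot vectors for illustration
kx = ky = 3
tx = np.array([0., 0., 0., 0., 0.5, 1., 1., 1., 1.])
ty = np.array([0., 0., 0., 0., 0.5, 1., 1., 1., 1.])
ncx, ncy = len(tx) - kx - 1, len(ty) - ky - 1   # 5 basis functions each

def observation_row(xi, yi):
    """Row i of A: products N_mu(xi) * M_nu(yi) for all (mu, nu), flattened
    row-major so that column l = (mu - 1) * ncy + nu (1-based), as in the text."""
    Nx = BSpline.design_matrix([xi], tx, kx).toarray()[0]
    My = BSpline.design_matrix([yi], ty, ky).toarray()[0]
    return np.kron(Nx, My)          # np.kron reproduces exactly that ordering

row = observation_row(0.3, 0.7)
# Sparsity: at most (kx + 1) * (ky + 1) = 16 of the ncx * ncy = 25 entries are non-zero.
```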

Least-Squares Formulation

The weighted least-squares problem is:

\[ \min_{\mathbf{c}} \sum_{i=1}^{m} w_i^2 \left( z_i - s(x_i, y_i) \right)^2 \]

which in matrix form becomes \( \min \| \mathbf{W}(\mathbf{z} - \mathbf{A} \mathbf{c}) \|_2^2 \) with \( \mathbf{W} = \mathrm{diag}(w_1, \ldots, w_m) \).
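Row scaling reduces this to an ordinary least-squares solve. A minimal sketch with a random stand-in for \( \mathbf{A} \) (the sizes and data are assumed for illustration, not produced by FITPACK):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 50, 8                         # toy problem sizes
A = rng.standard_normal((m, n))      # stand-in for the B-spline observation matrix
z = rng.standard_normal(m)           # observations
w = rng.uniform(0.5, 2.0, size=m)    # weights w_i > 0

# min || W (z - A c) ||^2  ==  ordinary LSQ on the row-scaled system W A, W z
c, *_ = np.linalg.lstsq(w[:, None] * A, w * z, rcond=None)

# The solution satisfies the weighted normal equations  A^T W^2 A c = A^T W^2 z,
# so the gradient of the objective vanishes:
gradient = A.T @ (w**2 * (A @ c - z))
```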

Smoothing Formulation

The smoothing spline minimizes a roughness functional subject to a data fidelity constraint:

\[ \min \iint \left[ \left( \frac{\partial^2 s}{\partial x^2} \right)^2 + 2 \left( \frac{\partial^2 s}{\partial x \, \partial y} \right)^2 + \left( \frac{\partial^2 s}{\partial y^2} \right)^2 \right] dx \, dy \quad \text{subject to} \quad \sum_{i=1}^{m} w_i^2 (z_i - s(x_i, y_i))^2 \leq S \]

where \( S \) is the smoothing factor. This is the bivariate analogue of the univariate smoothing spline problem.

Rank Deficiency

When few data points fall within certain cells of the knot grid, the observation matrix \( \mathbf{A} \) can become rank deficient. This is more common in the bivariate case than in one dimension, because adding knots in both directions creates cells that may contain no data at all.

FITPACK handles rank deficiency using Householder QR factorization with column pivoting. When the numerical rank \( r \) of the system is less than the number of unknowns, the algorithm computes a minimum-norm solution among all least-squares solutions.
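NumPy's SVD-based `lstsq` also returns the minimum-norm least-squares solution for rank-deficient systems, so the behaviour FITPACK obtains via pivoted QR can be sketched as follows (the duplicated column is an artificial stand-in for a knot grid with empty cells):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((20, 5))
A[:, 4] = A[:, 3]        # two identical columns: numerical rank 4 < 5 unknowns
z = rng.standard_normal(20)

c_min, _, rank, _ = np.linalg.lstsq(A, z, rcond=None)   # minimum-norm LSQ solution

# Any other least-squares solution differs by a null-space vector and is longer:
null_vec = np.array([0., 0., 0., 1., -1.]) / np.sqrt(2.)   # A @ null_vec = 0
c_other = c_min + 0.7 * null_vec
```

Both `c_min` and `c_other` achieve the same residual; the minimum-norm choice is the one with no null-space component.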

Adaptive Knot Placement

Starting from a minimal knot set, surfit iteratively refines the approximation:

  1. Fit the current knot configuration by solving the least-squares problem.
  2. If the residual sum of squares exceeds \( S \), identify regions with large residuals.
  3. Insert new knots alternately in the \( x \)- and \( y \)-directions, placing each knot where the local residual is largest.
  4. Repeat until \( \sum w_i^2 (z_i - s(x_i, y_i))^2 \leq S \) is satisfied or a maximum knot count is reached.

The alternating strategy ensures balanced refinement across both coordinate directions.
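The effect of this refinement loop can be observed through SciPy's `bisplrep`, which wraps the classic surfit routine: lowering \( S \) forces more knot insertions. The test function and noise level here are arbitrary example choices:

```python
import numpy as np
from scipy.interpolate import bisplrep

rng = np.random.default_rng(2)
m = 400
x, y = rng.uniform(0., 1., m), rng.uniform(0., 1., m)
z = np.sin(2*np.pi*x) * np.cos(2*np.pi*y) + 0.05 * rng.standard_normal(m)

# Large S: the fidelity constraint is loose, so few (or no) interior knots are needed.
tx_loose, ty_loose, *_ = bisplrep(x, y, z, s=float(m))

# Smaller S: the refinement loop keeps inserting knots until the
# weighted sum of squared residuals drops below S.
tx_tight, ty_tight, *_ = bisplrep(x, y, z, s=2.0)
```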

See also
Dierckx, Ch. 5, §5.3 (pp. 85–98)

Gridded Data Fitting (regrid)

When the data lie on a rectangular grid \( z_{ji} = z(y_j, x_i) \) with \( i = 1, \ldots, m_x \) and \( j = 1, \ldots, m_y \), the problem has a Kronecker product structure that can be exploited for dramatic computational savings.

Kronecker Product Structure

On a grid, the observation matrix factors as a Kronecker product:

\[ \mathbf{A}_{\text{grid}} = \mathbf{B}_x \otimes \mathbf{B}_y \]

where \( \mathbf{B}_x \) is the \( m_x \times (n_x - k_x - 1) \) matrix of univariate B-spline values at the \( x \)-grid points, and \( \mathbf{B}_y \) is the \( m_y \times (n_y - k_y - 1) \) matrix at the \( y \)-grid points. The coefficient matrix conceptually satisfies:

\[ \mathbf{C} = \mathbf{B}_x^{-1} \, \mathbf{Z} \, (\mathbf{B}_y^T)^{-1} \]

where \( \mathbf{Z} \) is the \( m_x \times m_y \) matrix of observed values and \( \mathbf{C} \) holds the spline coefficients. This closed form applies literally only in the interpolation case, where \( \mathbf{B}_x \) and \( \mathbf{B}_y \) are square and invertible; in the least-squares case the same structure is exploited by solving a univariate problem along each dimension in turn, never forming the Kronecker matrix explicitly.
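The Kronecker identity and the dimension-by-dimension solve can be verified numerically; here square random matrices stand in for the collocation matrices \( \mathbf{B}_x, \mathbf{B}_y \) (the interpolation case, where they are invertible):

```python
import numpy as np

rng = np.random.default_rng(3)
mx, my = 6, 5
Bx = rng.standard_normal((mx, mx))   # stand-in for B-spline collocation in x
By = rng.standard_normal((my, my))   # stand-in for B-spline collocation in y
Z = rng.standard_normal((mx, my))    # gridded observations

# C = Bx^{-1} Z (By^T)^{-1}, computed dimension by dimension (no Kronecker matrix):
tmp = np.linalg.solve(Bx, Z)         # tmp = Bx^{-1} Z
C = np.linalg.solve(By, tmp.T).T     # C = tmp (By^T)^{-1}

# Row-major vec identity: (Bx (x) By) vec(C) = vec(Bx C By^T)
lhs = np.kron(Bx, By) @ C.ravel()
rhs = (Bx @ C @ By.T).ravel()
```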

Computational Advantage

The factored form means the normal equations can be solved dimension by dimension rather than simultaneously:

Approach | Complexity
Scattered (surfit) | \( \mathcal{O}(m \cdot n_x \cdot n_y) \)
Gridded (regrid) | \( \mathcal{O}(m_x \cdot m_y \cdot (k_x + k_y)) \)

For a 100 × 100 grid with 20 knots in each direction, the gridded approach is roughly two orders of magnitude faster. Always prefer fitpack_grid_surface when data lie on a rectangular grid.

The smoothing framework (roughness minimization subject to a fidelity constraint) is identical to the scattered case; only the numerical solution method differs.
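For comparison, SciPy's `RectBivariateSpline` wraps the same classic regrid-based gridded path, so the speed of the dimension-by-dimension solve can be exercised directly; the grid and test function below are arbitrary:

```python
import numpy as np
from scipy.interpolate import RectBivariateSpline

x = np.linspace(0., 1., 100)
y = np.linspace(0., 1., 100)
Z = np.sin(2*np.pi*x)[:, None] * np.cos(2*np.pi*y)[None, :]   # Z[i, j] = z(x_i, y_j)

# Bicubic fit of a 100 x 100 grid; s = 0 requests interpolation. Only univariate
# banded systems are solved along each dimension, so this is essentially instant.
spline = RectBivariateSpline(x, y, Z, kx=3, ky=3, s=0.)
zi = spline(0.25, 0.4)[0, 0]
```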

See also
Dierckx, Ch. 5, §5.4 (pp. 98–103)

The Smoothing Norm

For bivariate splines, the roughness measure used in the smoothing formulation is the thin-plate energy functional:

\[ R[s] = \iint \left[ \left( \frac{\partial^2 s}{\partial x^2} \right)^2 + 2 \left( \frac{\partial^2 s}{\partial x \, \partial y} \right)^2 + \left( \frac{\partial^2 s}{\partial y^2} \right)^2 \right] dx \, dy \]

This is the natural extension of \( \int (s'')^2 \, dx \) to two dimensions. It penalizes curvature equally in all directions and is invariant under rotation of the coordinate axes. The cross-derivative term \( (\partial^2 s / \partial x \, \partial y)^2 \) ensures that diagonal oscillations are penalized as well.

For a tensor product spline, this integral can be expressed as a quadratic form in the coefficients \( c_{ij} \), involving integrals of products of B-spline second derivatives. FITPACK computes these integrals analytically using the knot vectors.
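Concretely, with Gram matrices \( G^{(p)} \) of B-spline derivative products in each direction, the quadratic form reads \( R[s] = \mathbf{c}^T \big( G_x^{(2)} \otimes G_y^{(0)} + 2\, G_x^{(1)} \otimes G_y^{(1)} + G_x^{(0)} \otimes G_y^{(2)} \big) \mathbf{c} \). FITPACK evaluates the integrals analytically; the sketch below approximates them by trapezoid quadrature on example knot vectors:

```python
import numpy as np
from scipy.interpolate import BSpline

k = 3
tx = np.array([0., 0., 0., 0., 0.5, 1., 1., 1., 1.])
ty = np.array([0., 0., 0., 0., 0.5, 1., 1., 1., 1.])
ncx, ncy = len(tx) - k - 1, len(ty) - k - 1

def basis_values(t, deriv, pts):
    """Values of all B-splines on `t` (or their deriv-th derivative) at `pts`."""
    spl = BSpline(t, np.eye(len(t) - k - 1), k)
    if deriv:
        spl = spl.derivative(deriv)
    return spl(pts)                                 # shape (len(pts), n_basis)

def gram(t, deriv, npts=4001):
    """G[a, b] ~= integral of B_a^(deriv) * B_b^(deriv), by the trapezoid rule."""
    x = np.linspace(t[k], t[-k - 1], npts)
    V = basis_values(t, deriv, x)
    w = np.full(npts, x[1] - x[0]); w[[0, -1]] *= 0.5   # trapezoid weights
    return V.T @ (w[:, None] * V)

G0x, G1x, G2x = (gram(tx, d) for d in range(3))
G0y, G1y, G2y = (gram(ty, d) for d in range(3))
P = np.kron(G2x, G0y) + 2. * np.kron(G1x, G1y) + np.kron(G0x, G2y)

rng = np.random.default_rng(6)
C = rng.standard_normal((ncx, ncy))                 # arbitrary coefficients
R_quadratic_form = C.ravel() @ P @ C.ravel()        # R[s] = c^T P c
```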

Choosing the Smoothing Factor

The smoothing factor \( S \) controls the balance between fidelity and smoothness:

  • **\( S = 0 \)**: Interpolation. The spline passes through every data point (requires enough knots to satisfy the interpolation conditions).
  • **Small \( S \)**: Close fit to the data, but the surface may exhibit spurious oscillations.
  • **Large \( S \)**: Very smooth surface that may underfit the data.

Practical Guidelines

For scattered data with unit weights (\( w_i = 1 \)), a reasonable starting point is:

\[ S \approx m \]

where \( m \) is the number of data points. This follows from the expectation that, for a good fit with normally distributed residuals, the sum of squared residuals should be \( \mathcal{O}(m) \).

For gridded data with unit weights, the analogous rule is:

\[ S \approx m_x \cdot m_y \]

After fitting, inspect the mean squared error via the mse() method to assess fit quality. If the residual is too large, decrease \( S \); if the surface oscillates, increase \( S \).

When data have non-unit weights, interpret \( S \) relative to the weighted residual. Setting \( w_i = 1 / \sigma_i \) (where \( \sigma_i \) is the measurement standard deviation) makes the residual sum an approximate \( \chi^2 \) statistic, and \( S \approx m \) corresponds to a reduced \( \chi^2 \) of unity.
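The \( \chi^2 \) interpretation can be checked with SciPy's `bisplrep`/`bisplev` (wrappers of the classic surfit routines); the noise level and test function are arbitrary example choices:

```python
import numpy as np
from scipy.interpolate import bisplrep, bisplev

rng = np.random.default_rng(4)
m, sigma = 500, 0.1
x, y = rng.uniform(0., 1., m), rng.uniform(0., 1., m)
z = np.sin(2*np.pi*x) * np.cos(2*np.pi*y) + sigma * rng.standard_normal(m)

# w_i = 1/sigma_i makes the weighted residual sum a chi^2 statistic,
# so S = m targets a reduced chi^2 near one.
tck = bisplrep(x, y, z, w=np.full(m, 1. / sigma), s=float(m))
fitted = np.array([bisplev(xi, yi, tck) for xi, yi in zip(x, y)])
chi2 = np.sum(((z - fitted) / sigma)**2)
reduced_chi2 = chi2 / m
```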

Parametric Surfaces

For surfaces embedded in \( \mathbb{R}^d \) and parameterized by \( (u, v) \), each coordinate component is represented as a separate bivariate spline sharing common knot vectors:

\[ s_l(u, v) = \sum_{i=1}^{n_u - k_u - 1} \sum_{j=1}^{n_v - k_v - 1} c_{ij}^{(l)} \, N_{i, k_u}(u) \, M_{j, k_v}(v), \qquad l = 1, \ldots, d \]

All \( d \) components share the same knot vectors \( \mathbf{t}_u \) and \( \mathbf{t}_v \), and the same spline degrees \( k_u \) and \( k_v \). Only the coefficient arrays \( c_{ij}^{(l)} \) differ between components.

Periodicity

Either parameter direction (or both) may be periodic:

  • **Periodic in \( u \)**: enforces \( s_l^{(p)}(u_{\min}, v) = s_l^{(p)}(u_{\max}, v) \) for \( p = 0, 1, \ldots, k_u - 1 \), where \( s_l^{(p)} \) denotes the \( p \)-th partial derivative with respect to \( u \).
  • **Periodic in \( v \)**: enforces \( s_l^{(p)}(u, v_{\min}) = s_l^{(p)}(u, v_{\max}) \) for \( p = 0, 1, \ldots, k_v - 1 \), with derivatives taken with respect to \( v \).

Periodicity is useful for closed surfaces such as cylinders and tori. The FITPACK routine parsur handles the general case.

See also
Dierckx, Ch. 10, §10.2 (pp. 241–254)

Summary

Problem | FITPACK Type | Core Routine | Key Advantage
Scattered \( (x_i, y_i, z_i) \) | fitpack_surface | surfit | Handles arbitrary point distributions
Gridded \( z(y_j, x_i) \) | fitpack_grid_surface | regrid | Kronecker structure, much faster
Parametric \( \mathbf{r}(u, v) \) | fitpack_parametric_surface | parsur | Surfaces in \( \mathbb{R}^d \)
See also
Scattered Surface Fitting Tutorial for code examples and practical usage
Polar Domain Fitting Tutorial for polar and spherical domain fitting
Book Reference Index for a complete mapping of routines to book chapters