fitpack

Modern Fortran library for curve and surface fitting with splines
This page describes the mathematical foundations of the curve fitting algorithms in FITPACK: given data points \( (x_i, y_i) \) for \( i = 1, \ldots, m \), find a spline \( s(x) \) in the B-spline space \( S(k, \mathbf{t}) \) that represents the data well. The library supports several formulations of this problem, from simple least-squares with user-supplied knots to fully automatic smoothing with adaptive knot placement.
Given \( m \) data points \( (x_i, y_i) \) with strictly increasing abscissae
\[ x_1 < x_2 < \cdots < x_m \]
and optional positive weights \( w_i \), the goal is to find a spline \( s \in S(k, \mathbf{t}) \) of degree \( k \) on a knot vector \( \mathbf{t} \) that approximates the data. Three common formulations arise: least-squares fitting on a fixed, user-supplied knot vector; smoothing, where the knots are placed automatically; and interpolation, the limiting case \( S = 0 \) of smoothing.
Suppose a knot vector \( \mathbf{t} = (t_1, \ldots, t_{n+k+1}) \) is given. The spline has the B-spline representation
\[ s(x) = \sum_{j=1}^{n} c_j \, B_j(x; k, \mathbf{t}) \]
where \( n \) is the number of B-spline basis functions and \( c_j \) are the unknown coefficients. The least-squares problem is to minimize
\[ \sum_{i=1}^{m} w_i^2 \left( y_i - \sum_{j=1}^{n} c_j \, B_j(x_i) \right)^2 \]
over all coefficient vectors \( \mathbf{c} \in \mathbb{R}^n \).
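To make the basis concrete, here is a minimal pure-Python sketch of the Cox-de Boor recursion for evaluating \( B_j(x; k, \mathbf{t}) \). This is an illustration only, not FITPACK's Fortran implementation, and the cubic knot vector below is a made-up example.

```python
def bspline_basis(j, k, t, x):
    """Value of the j-th B-spline of degree k on knots t at x (Cox-de Boor recursion)."""
    if k == 0:
        return 1.0 if t[j] <= x < t[j + 1] else 0.0
    value = 0.0
    left = t[j + k] - t[j]            # width of the left support interval
    if left > 0.0:
        value += (x - t[j]) / left * bspline_basis(j, k - 1, t, x)
    right = t[j + k + 1] - t[j + 1]   # width of the right support interval
    if right > 0.0:
        value += (t[j + k + 1] - x) / right * bspline_basis(j + 1, k - 1, t, x)
    return value

# Cubic (k = 3) example knot vector with full multiplicity at the boundaries.
t = [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 3.0, 3.0, 3.0, 3.0]
k = 3
n = len(t) - k - 1                    # number of basis functions (here 6)
values = [bspline_basis(j, k, t, 1.5) for j in range(n)]
```

At any point inside the knot range the basis functions sum to one (partition of unity), and at most \( k+1 \) of them are nonzero, which is precisely the source of the banded structure discussed below.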
Setting the gradient to zero yields the normal equations \( A^T W^2 A \, \mathbf{c} = A^T W^2 \mathbf{y} \), where \( A_{ij} = B_j(x_i) \) is the \( m \times n \) collocation matrix and \( W = \mathrm{diag}(w_i) \). Because each B-spline \( B_j \) is nonzero on at most \( k+1 \) knot spans, the matrix \( A \) has at most \( k+1 \) nonzero entries per row. The resulting normal equations matrix \( A^T W^2 A \) is symmetric, positive semi-definite, and banded: \( (A^T W^2 A)_{jl} = 0 \) whenever \( |j - l| > k \), so it has at most \( 2k+1 \) nonzero diagonals.
Rather than forming and solving the normal equations directly (which squares the condition number), FITPACK solves the least-squares problem by computing a QR factorization of the weighted matrix \( W A \) using Givens rotations. Each data point \( (x_i, y_i) \) contributes one row to the system; Givens rotations annihilate entries below the diagonal one at a time, maintaining the banded structure throughout.
This approach has several advantages:

- the condition number of the weighted matrix \( W A \) is not squared, unlike with the normal equations;
- the rows can be processed sequentially, one data point at a time, so the full collocation matrix never needs to be stored;
- the rotations preserve the banded structure, keeping the factorization cost proportional to \( m (k+1)^2 \) rather than \( m n^2 \).
The back-substitution step solves the upper triangular system \( R \mathbf{c} = Q^T W \mathbf{y} \) for the coefficients.
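The row-by-row process can be sketched in a few lines. The following pure-Python version is an illustrative dense implementation (FITPACK's Fortran code stores only the band of \( R \)), with weights assumed already folded into the matrix and right-hand side, applied to a hypothetical small fitting problem.

```python
import math

def givens_lsq(A, y):
    """Solve min ||A c - y||_2 by merging one row at a time with Givens rotations."""
    m, n = len(A), len(A[0])
    R = [[0.0] * n for _ in range(n)]   # upper triangular factor, built incrementally
    z = [0.0] * n                        # first n entries of the rotated rhs Q^T y
    for i in range(m):
        row, rhs = list(A[i]), y[i]
        for j in range(n):
            if row[j] == 0.0:
                continue                 # banded rows skip most columns
            r = math.hypot(R[j][j], row[j])
            c, s = R[j][j] / r, row[j] / r
            for l in range(j, n):        # rotate the incoming row into row j of R
                R[j][l], row[l] = c * R[j][l] + s * row[l], -s * R[j][l] + c * row[l]
            z[j], rhs = c * z[j] + s * rhs, -s * z[j] + c * rhs
    # back-substitution: solve R c = z
    coef = [0.0] * n
    for j in reversed(range(n)):
        coef[j] = (z[j] - sum(R[j][l] * coef[l] for l in range(j + 1, n))) / R[j][j]
    return coef

# Hypothetical example: fit a straight line c0 + c1*x to exact data y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
A = [[1.0, x] for x in xs]
y = [1.0 + 2.0 * x for x in xs]
coef = givens_lsq(A, y)
```

Because each row is absorbed and then discarded, the method needs only the triangular factor in memory, which is what makes the sequential formulation attractive for large data sets.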
Instead of fixing the knot vector a priori, the smoothing formulation treats the number and positions of knots as unknowns. The problem is to find a spline \( s \in S(k, \mathbf{t}) \) that minimizes a roughness measure subject to a constraint on the residual:
\[ \min_{s \in S(k, \mathbf{t})} \int_{x_1}^{x_m} \left[ s''(x) \right]^2 \, dx \quad \text{subject to} \quad \sum_{i=1}^{m} w_i^2 \left( y_i - s(x_i) \right)^2 \leq S \]
Here \( S \geq 0 \) is the smoothing parameter that controls the trade-off:

- \( S = 0 \) forces interpolation: the spline must pass through every data point;
- small values of \( S \) allow little residual and produce a close, possibly wiggly fit with many knots;
- large values of \( S \) permit larger residuals and yield smoother splines with fewer knots; for sufficiently large \( S \) the fit degenerates to the weighted least-squares straight line, whose roughness integral is zero.
For a given \( S \), the theoretical solution of this problem is a natural smoothing spline with knots at all the data sites (a natural cubic spline for the second-derivative roughness measure above). FITPACK approximates this solution using a spline of degree \( k \) on a much smaller knot set, determined adaptively.
FITPACK's curfit routine implements an iterative strategy to determine both the number and positions of knots:

Knot addition. The algorithm starts from the minimal knot set (no interior knots, i.e. the least-squares polynomial of degree \( k \)) and, as long as the weighted residual sum of squares exceeds \( S \), inserts additional knots in the intervals that contribute most to the residual, refitting after each insertion.

Smoothing parameter update. Once enough knots are available, the algorithm uses rational interpolation (Dierckx's fprati) to refine the Lagrange multiplier \( p \) that connects the unconstrained minimization
\[ \eta(p) = p \sum_{i=1}^{m} w_i^2 (y_i - s_p(x_i))^2 + (1 - p) \int [s_p''(x)]^2 \, dx \]
to the constrained formulation. This avoids solving many full least-squares problems and typically converges in a few iterations.
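The rational update step can be illustrated as follows: given three trial values of \( p \) and the corresponding constraint residuals, fit a rational function \( r(p) = (a + b\,p)/(1 + c\,p) \) through the three points and take its root as the next iterate. This pure-Python sketch captures the idea behind fprati; it is not a transcription of the Fortran routine.

```python
def rational_root(p, f):
    """Root of r(p) = (a + b*p) / (1 + c*p) interpolating the points (p[i], f[i]).

    The interpolation conditions read a + b*p_i - c*p_i*f_i = f_i; they are solved
    for (a, b, c) by Cramer's rule, and the next iterate is the zero of the
    numerator, p = -a / b.
    """
    def det3(M):
        return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
                - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
                + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

    M = [[1.0, p[i], -p[i] * f[i]] for i in range(3)]
    d = det3(M)
    a = det3([[f[i], p[i], -p[i] * f[i]] for i in range(3)]) / d
    b = det3([[1.0, f[i], -p[i] * f[i]] for i in range(3)]) / d
    return -a / b

# Example with an exactly rational residual curve f(p) = (2 - p) / (1 + p),
# whose root p = 2 is recovered from any three samples.
p_next = rational_root([0.0, 1.0, 3.0], [2.0, 0.5, -0.25])
```

When the residual really does behave like a rational function of \( p \), as it does near the solution, this update converges much faster than bisection.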
The procedure is designed to produce a spline with as few knots as possible while satisfying the smoothing constraint \( f_p \leq S \), where \( f_p = \sum_{i=1}^{m} w_i^2 (y_i - s_p(x_i))^2 \) is the weighted residual sum of squares of the fit obtained with parameter \( p \).
Selecting an appropriate value of \( S \) is the most important practical decision in spline smoothing. The following guidelines apply:
Known noise variance. If the data satisfy \( y_i = g(x_i) + \varepsilon_i \) where \( \varepsilon_i \) are independent errors with variance \( \sigma^2 \) and the weights are \( w_i = 1 \), the expected value of the residual sum of squares for the true function is \( m \sigma^2 \). A natural choice is
\[ S \approx m \, \sigma^2 \]
More generally, for non-uniform weights, \( S \approx \sum w_i^2 \sigma_i^2 \).
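As a worked example of this heuristic, with made-up numbers:

```python
# Hypothetical measurement setup: m samples with known noise standard deviation.
m = 200
sigma = 0.05
S_uniform = m * sigma ** 2            # S ~ m * sigma^2 for unit weights

# General weighted form S ~ sum_i w_i^2 * sigma_i^2. With the statistically
# natural choice w_i = 1 / sigma_i, every term reduces to 1 and S ~ m.
sigmas = [0.05] * 150 + [0.10] * 50
weights = [1.0 / s for s in sigmas]
S_weighted = sum(w ** 2 * s ** 2 for w, s in zip(weights, sigmas))
```

With inverse-standard-deviation weights the rule reduces to \( S \approx m \), which is a convenient starting point when per-point error bars are available.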
Trial and error. Inspect the residual of the fit, for example the mean squared error reported by curve%mse(). If the residual is much larger than the expected noise level, decrease \( S \); if the spline oscillates visibly, increase \( S \). A practical workflow is to try a sequence of \( S \) values (e.g., on a logarithmic scale) and select the one that gives the best balance between smoothness and fidelity, or use cross-validation techniques.
When the data represent a function that is periodic on \( [x_1, x_m] \) (for example, angular measurements as a function of time), the fitted spline should satisfy periodicity constraints:
\[ s^{(j)}(x_1) = s^{(j)}(x_m), \quad j = 0, 1, \ldots, k-1 \]
These \( k \) conditions ensure that the spline and its first \( k-1 \) derivatives match at the endpoints, producing a globally smooth periodic function.
In terms of the B-spline representation, periodicity is enforced by wrapping the knot vector and tying the first and last \( k \) coefficients:
\[ c_j = c_{n - k + j}, \quad j = 1, \ldots, k \]
This reduces the number of free coefficients from \( n \) to \( n - k \). The modified least-squares problem is solved with the same Givens rotation approach.
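The coefficient tying is simple to state in code. This hypothetical helper (not part of the library API) expands the \( n - k \) free coefficients of a periodic spline of degree \( k \) into the full coefficient vector:

```python
def wrap_periodic_coefficients(free, k):
    """Expand the n - k free coefficients into the full periodic vector of length n.

    The last k coefficients repeat the first k, enforcing c_j = c_{n-k+j}.
    """
    return list(free) + list(free[:k])

# Hypothetical cubic example (k = 3): 5 free coefficients -> n = 8 in total.
c = wrap_periodic_coefficients([1.0, 2.0, 3.0, 4.0, 5.0], k=3)
```

In the actual solver the wrapped coefficients never appear explicitly; the constraint is folded into the observation matrix so that the Givens rotations act on the reduced unknowns only.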
FITPACK's percur routine implements automatic-knot periodic curve fitting. The corresponding type is fitpack_periodic_curve.
A parametric curve in \( \mathbb{R}^d \) is defined by \( d \) coordinate functions of a common parameter \( u \):
\[ \mathbf{x}(u) = \bigl( s_1(u), \, s_2(u), \, \ldots, \, s_d(u) \bigr) \]
Given data points \( \mathbf{x}_i \in \mathbb{R}^d \) for \( i = 1, \ldots, m \), the first step is to assign parameter values \( u_i \).
The default choice is cumulative chord length:
\[ u_1 = 0, \quad u_{i+1} = u_i + \| \mathbf{x}_{i+1} - \mathbf{x}_i \|, \quad i = 1, \ldots, m-1 \]
This makes the parameter increments proportional to the chordal distances between successive data points, so the parameterization approximates arc length and closely spaced points are not spread over disproportionately large parameter intervals.
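The default parameterization is easy to reproduce; a minimal pure-Python sketch for points in \( \mathbb{R}^d \):

```python
import math

def chord_length_parameters(points):
    """Cumulative chord-length parameter values u_i for a list of points in R^d."""
    u = [0.0]
    for a, b in zip(points, points[1:]):
        u.append(u[-1] + math.dist(a, b))  # u_{i+1} = u_i + ||x_{i+1} - x_i||
    return u

# Example: three points in the plane with segment lengths 5 (a 3-4-5 step) and 1.
u = chord_length_parameters([(0.0, 0.0), (3.0, 4.0), (3.0, 5.0)])
```

Some applications instead normalize the parameters to \( [0, 1] \) by dividing through by the total chord length; either convention works as long as it is used consistently for fitting and evaluation.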
All \( d \) component splines \( s_j(u) \) share the same knot vector \( \mathbf{t} \) and degree \( k \). The smoothing problem becomes: minimize
\[ \sum_{j=1}^{d} \int \left[ s_j''(u) \right]^2 \, du \quad \text{subject to} \quad \sum_{i=1}^{m} \sum_{j=1}^{d} w_i^2 \left( x_{ij} - s_j(u_i) \right)^2 \leq S \]
FITPACK's parcur routine handles this by solving \( d \) coupled least-squares problems with a common knot vector.
For closed parametric curves, the spline must satisfy periodic boundary conditions in every component:
\[ s_j^{(l)}(u_1) = s_j^{(l)}(u_m), \quad l = 0, 1, \ldots, k-1, \quad j = 1, \ldots, d \]
This is implemented by clocur (type fitpack_closed_curve).
In some applications, the tangent vector or higher derivatives at the curve endpoints are prescribed. FITPACK's concur routine supports derivative constraints of the form:
\[ s_j^{(l)}(u_1) = \alpha_{jl}, \quad s_j^{(l)}(u_m) = \beta_{jl} \]
for selected orders \( l \) and components \( j \), implemented by fitpack_constrained_curve.
In many applications the fitted curve must be shape-preserving: convex data should produce a convex spline, concave data a concave spline. FITPACK supports local convexity constraints via the second derivative.
A spline \( s(x) \) of degree \( k \geq 2 \) is convex on an interval if and only if \( s''(x) \geq 0 \) there. For B-splines, a sufficient condition is that the second-order differences of the coefficients are nonnegative:

\[ \Delta^2 c_j = c_{j+2} - 2 c_{j+1} + c_j \geq 0 \]

for all \( j \) whose corresponding B-splines overlap the interval (this is the uniform-knot form; for a general knot vector the differences are divided by the corresponding knot spacings). Concavity requires \( \Delta^2 c_j \leq 0 \).
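A small sketch of the coefficient test, in the simplified uniform-knot form and using the flag convention of the table below (these helper names are illustrative, not library routines):

```python
def second_differences(c):
    """Second differences Delta^2 c_j = c[j+2] - 2*c[j+1] + c[j] (uniform-knot case)."""
    return [c[j + 2] - 2.0 * c[j + 1] + c[j] for j in range(len(c) - 2)]

def satisfies_flag(c, flag):
    """Check a shape flag against the coefficient differences:
    flag = -1: convex (all Delta^2 c_j >= 0), +1: concave (all <= 0), 0: free."""
    d2 = second_differences(c)
    if flag == -1:
        return all(d >= 0.0 for d in d2)
    if flag == +1:
        return all(d <= 0.0 for d in d2)
    return True

# Coefficients sampled from a convex shape (x^2 at integer points).
convex_ok = satisfies_flag([0.0, 1.0, 4.0, 9.0], flag=-1)
concave_bad = satisfies_flag([0.0, 1.0, 4.0, 9.0], flag=+1)
```

Because each check is a linear inequality in the coefficients, the constrained fit below remains a quadratic program.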
FITPACK allows the user to specify convexity constraints independently for each data interval via a flag vector \( v_i \):
| Flag value | Meaning |
|---|---|
| \( +1 \) | The spline must be concave on \( [x_i, x_{i+1}] \) |
| \( -1 \) | The spline must be convex on \( [x_i, x_{i+1}] \) |
| \( 0 \) | No constraint (the spline is free) |
The constrained fitting problem becomes a quadratic program (QP): minimize the roughness integral (or weighted residual) subject to the linear inequality constraints on \( \Delta^2 c_j \) and the smoothing constraint \( f_p \leq S \). FITPACK solves this QP iteratively, combining the adaptive knot strategy with active-set management for the convexity constraints.
Two routines are provided:
- concon (type fitpack_convex_curve): automatic knot placement with convexity constraints. The algorithm adds knots as in curfit but enforces the shape constraints at each step.
- cocosp: fixed knot vector with convexity constraints. The user supplies the knots and the routine solves the constrained least-squares problem.

| Formulation | Routine | Type | Key parameter |
|---|---|---|---|
| Automatic smoothing | curfit | fitpack_curve | Smoothing factor \( S \) |
| Interpolation | curfit | fitpack_curve | \( S = 0 \) |
| Fixed-knot least-squares | curfit | fitpack_curve | User knot vector |
| Periodic smoothing | percur | fitpack_periodic_curve | Smoothing factor \( S \) |
| Open parametric | parcur | fitpack_parametric_curve | Smoothing factor \( S \) |
| Closed parametric | clocur | fitpack_closed_curve | Smoothing factor \( S \) |
| Endpoint-constrained | concur | fitpack_constrained_curve | Derivative values |
| Convex (auto knots) | concon | fitpack_convex_curve | Convexity flags \( v_i \) |
| Convex (fixed knots) | cocosp | — | Convexity flags \( v_i \) |