nlsq.least_squares module

This module contains the core least squares solver implementation.

Generic interface for least-squares minimization.

nlsq.core.least_squares.jacobian_mode_selector(n_params, n_residuals, mode='auto')[source]

Select Jacobian automatic differentiation mode based on problem dimensions.

Automatically chooses between forward-mode (jacfwd) and reverse-mode (jacrev) automatic differentiation based on the Jacobian shape to minimize computational cost.

Parameters:
  • n_params (int) – Number of parameters (columns in Jacobian)

  • n_residuals (int) – Number of residuals (rows in Jacobian)

  • mode ({'auto', 'fwd', 'rev'}, optional) – Jacobian mode selection. Default is ‘auto’. - ‘auto’: Automatically select based on problem dimensions - ‘fwd’: Force forward-mode AD (jacfwd) - ‘rev’: Force reverse-mode AD (jacrev)

Returns:

  • selected_mode (str) – Selected mode (‘fwd’ or ‘rev’)

  • rationale (str) – Human-readable explanation of the selection

Raises:

ValueError – If mode is not one of ‘auto’, ‘fwd’, ‘rev’

Return type:

tuple[str, str]

Notes

Selection heuristic for ‘auto’ mode:

  • Use jacrev when n_params > n_residuals (tall Jacobian, more params than residuals)

    • Reverse-mode is O(n_residuals) operations

    • Forward-mode would be O(n_params) operations

  • Use jacfwd when n_params <= n_residuals (wide Jacobian, more residuals than params)

    • Forward-mode is O(n_params) operations

    • Reverse-mode would be O(n_residuals) operations

For high-parameter problems (e.g., 1000 params, 100 residuals), jacrev can be 10-100x faster than jacfwd.

Examples

>>> from nlsq.core.least_squares import jacobian_mode_selector
>>> # Tall Jacobian (many parameters, few residuals)
>>> mode, rationale = jacobian_mode_selector(1000, 100, mode='auto')
>>> print(mode, rationale)
rev jacrev (1000 params > 100 residuals)
>>> # Wide Jacobian (few parameters, many residuals)
>>> mode, rationale = jacobian_mode_selector(100, 1000, mode='auto')
>>> print(mode, rationale)
fwd jacfwd (100 params <= 1000 residuals)
>>> # Manual override
>>> mode, rationale = jacobian_mode_selector(1000, 100, mode='fwd')
>>> print(mode, rationale)
fwd explicit override: fwd
nlsq.core.least_squares.prepare_bounds(bounds, n)[source]

Prepare bounds for optimization.

This function prepares the bounds for the optimization by ensuring that they are both 1-D arrays of length n. If either bound is a scalar, it is resized to an array of length n.

Parameters:
  • bounds (Tuple[np.ndarray, np.ndarray]) – The lower and upper bounds for the optimization.

  • n (int) – The length of the bounds arrays.

Returns:

The prepared lower and upper bounds arrays.

Return type:

Tuple[np.ndarray, np.ndarray]

nlsq.core.least_squares.check_tolerance(ftol, xtol, gtol, method)[source]

Check and prepare tolerance values for optimization.

This function checks the tolerance values for the optimization and prepares them for use. If any of the tolerances is None, it is set to 0. If any of the tolerances is lower than the machine epsilon, a warning is issued and the tolerance is set to the machine epsilon. If all tolerances are lower than the machine epsilon, a ValueError is raised.

Parameters:
  • ftol (float) – The tolerance for the optimization function value.

  • xtol (float) – The tolerance for the optimization variable values.

  • gtol (float) – The tolerance for the optimization gradient values.

  • method (str) – The name of the optimization method.

Returns:

The prepared tolerance values.

Return type:

Tuple[float, float, float]

nlsq.core.least_squares.check_x_scale(x_scale, x0)[source]

Check and prepare the x_scale parameter for optimization.

This function checks and prepares the x_scale parameter for the optimization. x_scale can either be ‘jac’ or an array_like with positive numbers. If it’s ‘jac’ the jacobian is used as the scaling.

Parameters:
  • x_scale (str | Sequence[float] | np.ndarray) – The scaling for the optimization variables.

  • x0 (np.ndarray) – The initial guess for the optimization variables.

Returns:

The prepared x_scale parameter.

Return type:

str | np.ndarray

class nlsq.core.least_squares.AutoDiffJacobian[source]

Bases: object

Wraps the residual fit function such that automatic differentiation is performed.

Supports both forward-mode (jacfwd) and reverse-mode (jacrev) automatic differentiation. This needs to be a class since we need to maintain in memory three different versions of the Jacobian for different sigma/covariance cases.

create_ad_jacobian(func, num_args, masked=True, mode='fwd')[source]

Creates a function that returns the autodiff jacobian of the residual fit function. The Jacobian of the residual fit function is equivalent to the Jacobian of the fit function.

Parameters:
  • func (Callable) – The function to take the jacobian of.

  • num_args (int) – The number of arguments the function takes.

  • masked (bool, optional) – Whether to use a masked jacobian, by default True

  • mode (str, optional) – Jacobian mode (‘fwd’ or ‘rev’), by default ‘fwd’

Returns:

The function that returns the autodiff jacobian of the given function.

Return type:

Callable

class nlsq.core.least_squares.LeastSquares(enable_stability=False, enable_diagnostics=False, max_jacobian_elements_for_svd=10000000)[source]

Bases: object

Core least squares optimization engine with JAX acceleration.

This class implements the main optimization algorithms for nonlinear least squares problems, including Trust Region Reflective (TRF) and Levenberg-Marquardt (LM). It handles automatic differentiation, bound constraints, loss functions, and uncertainty propagation.

The class maintains separate automatic differentiation instances for different sigma configurations (no sigma, 1D sigma, 2D covariance matrix) to optimize compilation and execution performance.

trf

Trust Region Reflective algorithm implementation

Type:

TrustRegionReflective

ls

JIT-compiled loss function implementations

Type:

LossFunctionsJIT

logger

Internal logger for debugging and performance tracking

Type:

Logger

f

Current objective function being optimized

Type:

callable

jac

Current Jacobian function (None for automatic differentiation)

Type:

callable or None

adjn

Automatic differentiation instance for unweighted problems

Type:

AutoDiffJacobian

adj1d

Automatic differentiation instance for 1D sigma weighting

Type:

AutoDiffJacobian

adj2d

Automatic differentiation instance for 2D covariance matrix weighting

Type:

AutoDiffJacobian

least_squares : Main optimization method
__init__(enable_stability=False, enable_diagnostics=False, max_jacobian_elements_for_svd=10000000)[source]

Initialize LeastSquares with optimization algorithms and autodiff instances.

Sets up the Trust Region Reflective solver, loss functions, and separate automatic differentiation instances for different weighting schemes to maximize JAX compilation efficiency.

Parameters:
  • enable_stability (bool, default False) – Enable numerical stability checks and fixes

  • enable_diagnostics (bool, default False) – Enable optimization diagnostics collection

  • max_jacobian_elements_for_svd (int, default 10_000_000) – Maximum Jacobian size (m × n elements) for SVD computation during stability checks. SVD is skipped for larger Jacobians.

least_squares(fun, x0, jac=None, bounds=(-inf, inf), method='trf', ftol=1e-08, xtol=1e-08, gtol=1e-08, x_scale=1.0, loss='linear', f_scale=1.0, diff_step=None, tr_solver=None, tr_options=None, jac_sparsity=None, max_nfev=None, verbose=0, jacobian_mode=None, xdata=None, ydata=None, data_mask=None, transform=None, timeit=False, callback=None, args=(), kwargs=None, prepared_bounds=None, **timeout_kwargs)[source]

Solve nonlinear least squares problem using JAX-accelerated algorithms.

This method orchestrates the optimization process by calling focused helper methods for each major step: validation, function setup, initial evaluation, stability checks, and optimization execution.

Parameters:
  • fun (callable) – Residual function. Must use jax.numpy operations.

  • x0 (array_like) – Initial parameter guess.

  • jac (callable or None, optional) – Jacobian function. If None, uses JAX autodiff.

  • bounds (2-tuple, optional) – Parameter bounds as (lower, upper).

  • method (str, optional) – Optimization algorithm (‘trf’).

  • ftol (float, optional) – Convergence tolerances for function, parameters, and gradient.

  • xtol (float, optional) – Convergence tolerances for function, parameters, and gradient.

  • gtol (float, optional) – Convergence tolerances for function, parameters, and gradient.

  • x_scale (str or array_like, optional) – Parameter scaling (‘jac’ for automatic).

  • loss (str or callable, optional) – Robust loss function (‘linear’, ‘huber’, ‘soft_l1’, etc.).

  • f_scale (float, optional) – Scale parameter for robust loss functions.

  • max_nfev (int, optional) – Maximum function evaluations.

  • verbose (int, optional) – Verbosity level (0, 1, or 2).

  • jacobian_mode ({'auto', 'fwd', 'rev'}, optional) – Jacobian automatic differentiation mode. If None, uses configuration from environment variable, config file, or auto-default. Default is None. - ‘auto’: Automatically select based on problem dimensions - ‘fwd’: Force forward-mode AD (jacfwd) - ‘rev’: Force reverse-mode AD (jacrev)

  • xdata (array_like, optional) – Data for curve fitting applications.

  • ydata (array_like, optional) – Data for curve fitting applications.

  • data_mask (array_like, optional) – Boolean mask for data exclusion.

  • transform (array_like, optional) – Transformation matrix for weighted fitting.

  • timeit (bool, optional) – Enable detailed timing analysis.

  • callback (callable or None, optional) – Callback function called after each optimization iteration with signature callback(iteration, cost, params, info). Useful for monitoring optimization progress, logging, or implementing custom stopping criteria. If None (default), no callback is invoked.

  • args (tuple, optional) – Additional arguments for objective function.

  • kwargs (dict, optional) – Additional optimization parameters.

  • prepared_bounds (tuple[np.ndarray, np.ndarray] or None, optional) – Pre-prepared bounds as (lb, ub). If provided, skips prepare_bounds call. This optimization avoids redundant bounds preparation when caller has already called prepare_bounds. Default is None.

Returns:

result – Optimization result with solution, convergence info, and statistics.

Return type:

OptimizeResult

autdiff_jac(jac, mode='fwd')[source]

We do this for all three sigma transformed functions such that if sigma is changed from none to 1D to covariance sigma then no retracing is needed.

Parameters:
  • jac (None) – Passed in to maintain compatibility with the user defined Jacobian function.

  • mode (str, optional) – Jacobian mode (‘fwd’ or ‘rev’), by default ‘fwd’

update_function(func)[source]

Wraps the given fit function to be a residual function using the data. The wrapped function is in a JAX JIT compatible format which is purely functional. This requires that both the data mask and the uncertainty transform are passed to the function. Even for the case where the data mask is all True and the uncertainty transform is None we still need to pass these arguments to the function due JAX’s functional nature.

Parameters:

func (Callable) – The fit function to wrap.

Return type:

None

wrap_jac(jac)[source]

Wraps an user defined Jacobian function to allow for data masking and uncertainty transforms. The wrapped function is in a JAX JIT compatible format which is purely functional. This requires that both the data mask and the uncertainty transform are passed to the function.

Using an analytical Jacobian of the fit function is equivalent to the Jacobian of the residual function.

Also note that the analytical Jacobian doesn’t require the independent ydata, but we still need to pass it to the function to maintain compatibility with autdiff version which does require the ydata.

Parameters:

jac (Callable) – The Jacobian function to wrap.

Returns:

The masked Jacobian of the function evaluated at args with respect to the data.

Return type:

jnp.ndarray