nlsq.sparse_jacobian module

Sparse Jacobian support for large-scale optimization.

This module provides sparse matrix support for Jacobian computations, enabling efficient handling of problems with 20M+ data points.

class nlsq.core.sparse_jacobian.SparseJacobianComputer(sparsity_threshold=0.01)[source]

Bases: object

Compute and manage sparse Jacobians for large-scale problems.

For many curve fitting problems, the Jacobian has a sparse structure where each data point only depends on a subset of parameters. This class exploits that structure to reduce memory usage by 10-100x.

__init__(sparsity_threshold=0.01)[source]

Initialize sparse Jacobian computer.

Parameters:

sparsity_threshold (float) – Elements with absolute value below this threshold are considered zero. Default is 0.01 which works well for most problems.

detect_sparsity_pattern(func, x0, xdata_sample, n_samples=100)[source]

Detect sparsity pattern of Jacobian from sample evaluations.

Parameters:
  • func (Callable) – Function to evaluate

  • x0 (np.ndarray) – Initial parameter values

  • xdata_sample (np.ndarray or list) – Sample of x data points. Can be a single array for 1D problems, a list of arrays [X, Y] for 2D problems, or a 2D array with shape (k, N) for multi-dimensional coordinates.

  • n_samples (int) – Number of samples to use for pattern detection

Returns:

  • pattern (np.ndarray) – Boolean array indicating non-zero elements

  • sparsity (float) – Fraction of zero elements

Return type:

tuple[ndarray, float]

compute_sparse_jacobian(jac_func, x, xdata, ydata, data_mask=None, chunk_size=10000, func=None)[source]

Compute Jacobian in sparse format with chunking.

Parameters:
  • jac_func (Callable) – Jacobian function

  • x (np.ndarray) – Current parameter values

  • xdata (np.ndarray) – Independent variable data

  • ydata (np.ndarray) – Dependent variable data

  • data_mask (np.ndarray, optional) – Mask for valid data points

  • chunk_size (int) – Size of chunks for computation

  • func (Callable, optional) – Original function for finite difference fallback

Returns:

J_sparse – Sparse Jacobian matrix

Return type:

csr_matrix

sparse_matrix_vector_product(J_sparse, v)[source]

Efficient sparse matrix-vector product.

Parameters:
  • J_sparse (csr_matrix) – Sparse Jacobian matrix

  • v (np.ndarray) – Vector to multiply

Returns:

result – J @ v

Return type:

np.ndarray

sparse_normal_equations(J_sparse, f)[source]

Set up normal equations with sparse Jacobian.

Solves (J^T @ J) @ p = -J^T @ f without forming J^T @ J explicitly.

Parameters:
  • J_sparse (csr_matrix) – Sparse Jacobian matrix

  • f (np.ndarray) – Residual vector

Returns:

  • matvec (callable) – Function that computes (J^T @ J) @ v

  • rhs (np.ndarray) – Right-hand side -J^T @ f

Return type:

tuple[Callable, ndarray]

estimate_memory_usage(n_data, n_params, sparsity=0.99)[source]

Estimate memory usage for sparse vs dense Jacobian.

Parameters:
  • n_data (int) – Number of data points

  • n_params (int) – Number of parameters

  • sparsity (float) – Fraction of zero elements (0-1)

Returns:

memory_info – Memory usage estimates in GB

Return type:

dict

class nlsq.core.sparse_jacobian.SparseOptimizer(sparsity_threshold=0.01, min_sparsity=0.9, auto_detect=True)[source]

Bases: object

Optimizer that uses sparse Jacobians for large-scale problems.

This optimizer automatically detects when sparse Jacobians would be beneficial and switches to sparse computations transparently.

__init__(sparsity_threshold=0.01, min_sparsity=0.9, auto_detect=True)[source]

Initialize sparse optimizer.

Parameters:
  • sparsity_threshold (float) – Threshold for considering elements as zero

  • min_sparsity (float) – Minimum sparsity level to use sparse methods

  • auto_detect (bool) – Automatically detect and use sparsity

should_use_sparse(n_data, n_params, force_check=False)[source]

Determine if sparse methods should be used.

Parameters:
  • n_data (int) – Number of data points

  • n_params (int) – Number of parameters

  • force_check (bool) – Force sparsity detection even if auto_detect is False

Returns:

use_sparse – Whether to use sparse methods

Return type:

bool

optimize_with_sparsity(func, x0, xdata, ydata, **kwargs)[source]

Optimize using sparse Jacobian methods.

Parameters:
  • func (Callable) – Objective function

  • x0 (np.ndarray) – Initial parameters

  • xdata (np.ndarray) – Independent variable data

  • ydata (np.ndarray) – Dependent variable data

  • **kwargs (Any) – Additional optimization parameters

Returns:

result – Optimization result

Return type:

dict

nlsq.core.sparse_jacobian.detect_jacobian_sparsity(func, x0, xdata_sample, threshold=0.01)[source]

Detect and analyze Jacobian sparsity for a given problem.

Parameters:
  • func (Callable) – Objective function

  • x0 (np.ndarray) – Initial parameters

  • xdata_sample (np.ndarray or list) – Sample of x data. Can be a single array for 1D problems, a list of arrays [X, Y] for 2D problems, or a 2D array with shape (k, N) for multi-dimensional coordinates.

  • threshold (float) – Threshold for zero elements

Returns:

  • sparsity (float) – Fraction of zero elements

  • info (dict) – Additional sparsity information

Return type:

tuple[float, dict]

nlsq.core.sparse_jacobian.detect_sparsity_at_p0(func, p0, xdata, n_residuals, threshold=0.01, sample_size=100)[source]

Detect sparsity at p0 initialization for automatic sparse solver selection.

This function computes the Jacobian at the initial parameter guess p0 and calculates the sparsity ratio. The result is cached to avoid recomputation.

Parameters:
  • func (Callable) – Model function f(x, *params) -> residuals

  • p0 (np.ndarray) – Initial parameter guess

  • xdata (np.ndarray) – Independent variable data

  • n_residuals (int) – Number of residuals (data points)

  • threshold (float, optional) – Threshold for considering elements as zero (default: 0.01)

  • sample_size (int, optional) – Number of data points to sample for detection (default: 100) Using a sample speeds up detection for large datasets

Returns:

  • sparsity_ratio (float) – Fraction of zero elements in Jacobian (0.0 = dense, 1.0 = completely sparse) Example: 0.9 means 90% of Jacobian elements are zero

  • is_sparse (bool) – Whether the problem is considered sparse (sparsity_ratio > 0.5)

Return type:

tuple[float, bool]

Notes

Detection strategy: - Samples up to sample_size data points for efficiency - Uses finite differences to compute Jacobian at p0 - Considers elements with |J[i,j]| < threshold as zero - Caches result to avoid repeated computation

For auto-selection to activate sparse solver, both conditions must be met: - sparsity_ratio > 0.5 (more than 50% zeros) - n_residuals > 10000 (problem size threshold)

Examples

>>> def model(x, a, b):
...     # Each parameter affects different data regions (sparse)
...     return jnp.where(x < 0.5, a, b)
>>> p0 = np.array([1.0, 2.0])
>>> xdata = np.linspace(0, 1, 1000)
>>> sparsity_ratio, is_sparse = detect_sparsity_at_p0(
...     model, p0, xdata, n_residuals=1000
... )
>>> print(f"Sparsity: {sparsity_ratio:.1%}, Is sparse: {is_sparse}")
Sparsity: 50.0%, Is sparse: True

Overview

The sparse_jacobian module provides automatic detection and exploitation of sparse Jacobian patterns for improved computational efficiency.

Key Features (Task Group 6)

  • Sparsity Detection: Automatic detection of sparse Jacobian patterns (>70% zeros)

  • Auto-Selection: Triggers sparse-aware optimizations when beneficial

  • Phase 1 Infrastructure: Detection and activation logic in place

  • Diagnostic Access: Full visibility into sparsity detection results

Performance Benefits

  • Reduced memory usage for sparse problems

  • Faster Jacobian-vector products

  • Better scaling for large parameter spaces

  • Automatic activation when sparsity >10%

Classes

class nlsq.sparse_jacobian.SparseJacobianComputer(sparsity_threshold=0.01)[source]

Bases: object

Compute and manage sparse Jacobians for large-scale problems.

For many curve fitting problems, the Jacobian has a sparse structure where each data point only depends on a subset of parameters. This class exploits that structure to reduce memory usage by 10-100x.

__init__(sparsity_threshold=0.01)[source]

Initialize sparse Jacobian computer.

Parameters:

sparsity_threshold (float) – Elements with absolute value below this threshold are considered zero. Default is 0.01 which works well for most problems.

detect_sparsity_pattern(func, x0, xdata_sample, n_samples=100)[source]

Detect sparsity pattern of Jacobian from sample evaluations.

Parameters:
  • func (Callable) – Function to evaluate

  • x0 (np.ndarray) – Initial parameter values

  • xdata_sample (np.ndarray or list) – Sample of x data points. Can be a single array for 1D problems, a list of arrays [X, Y] for 2D problems, or a 2D array with shape (k, N) for multi-dimensional coordinates.

  • n_samples (int) – Number of samples to use for pattern detection

Returns:

  • pattern (np.ndarray) – Boolean array indicating non-zero elements

  • sparsity (float) – Fraction of zero elements

Return type:

tuple[ndarray, float]

compute_sparse_jacobian(jac_func, x, xdata, ydata, data_mask=None, chunk_size=10000, func=None)[source]

Compute Jacobian in sparse format with chunking.

Parameters:
  • jac_func (Callable) – Jacobian function

  • x (np.ndarray) – Current parameter values

  • xdata (np.ndarray) – Independent variable data

  • ydata (np.ndarray) – Dependent variable data

  • data_mask (np.ndarray, optional) – Mask for valid data points

  • chunk_size (int) – Size of chunks for computation

  • func (Callable, optional) – Original function for finite difference fallback

Returns:

J_sparse – Sparse Jacobian matrix

Return type:

csr_matrix

sparse_matrix_vector_product(J_sparse, v)[source]

Efficient sparse matrix-vector product.

Parameters:
  • J_sparse (csr_matrix) – Sparse Jacobian matrix

  • v (np.ndarray) – Vector to multiply

Returns:

result – J @ v

Return type:

np.ndarray

sparse_normal_equations(J_sparse, f)[source]

Set up normal equations with sparse Jacobian.

Solves (J^T @ J) @ p = -J^T @ f without forming J^T @ J explicitly.

Parameters:
  • J_sparse (csr_matrix) – Sparse Jacobian matrix

  • f (np.ndarray) – Residual vector

Returns:

  • matvec (callable) – Function that computes (J^T @ J) @ v

  • rhs (np.ndarray) – Right-hand side -J^T @ f

Return type:

tuple[Callable, ndarray]

estimate_memory_usage(n_data, n_params, sparsity=0.99)[source]

Estimate memory usage for sparse vs dense Jacobian.

Parameters:
  • n_data (int) – Number of data points

  • n_params (int) – Number of parameters

  • sparsity (float) – Fraction of zero elements (0-1)

Returns:

memory_info – Memory usage estimates in GB

Return type:

dict

Example Usage

from nlsq.sparse_jacobian import SparseJacobianComputer
import jax.numpy as jnp

# Create sparse Jacobian computer
computer = SparseJacobianComputer(sparsity_threshold=0.01)


# Define model function
def model(x, a, b):
    return a * jnp.exp(-b * x)


# Detect sparsity pattern
p0 = jnp.array([1.0, 0.5])
x_sample = jnp.linspace(0, 10, 100)
pattern, sparsity = computer.detect_sparsity_pattern(model, p0, x_sample)

# Check if sparse optimizations should be used
if sparsity > 0.1:  # More than 10% sparse
    print(f"Jacobian is {sparsity:.1%} sparse")
    print("Sparse optimizations will be enabled")

Sparsity Detection

The module automatically analyzes Jacobian patterns:

# Analyze sparsity for different model functions
def multi_component_model(x, *params):
    # Model with multiple independent components
    # Often results in block-diagonal sparse Jacobian
    pass


computer = SparseJacobianComputer()
pattern, sparsity = computer.detect_sparsity_pattern(
    multi_component_model, p0, x_sample
)

# Get detailed diagnostics
diagnostics = computer.get_diagnostics()
print(f"Non-zero elements: {diagnostics['nnz']}")
print(f"Sparsity pattern: {diagnostics['pattern']}")

See Also