nlsq.validators module

Input validation for NLSQ optimization functions.

This module provides comprehensive input validation to catch errors early and provide helpful error messages to users.

class nlsq.utils.validators.InputValidator(fast_mode=True)[source]

Bases: object

Comprehensive input validation for curve fitting functions.

__init__(fast_mode=True)[source]

Initialize the input validator.

Parameters:

fast_mode (bool, default True) – If True, skip expensive validation checks for better performance. If False, perform all validation checks.

validate_curve_fit_inputs(f, xdata, ydata, p0=None, bounds=None, sigma=None, absolute_sigma=True, check_finite=True)[source]

Validate inputs for curve_fit function.

This method orchestrates the validation pipeline by calling focused helper methods for each validation step.

Parameters:
  • f (callable) – Model function to fit

  • xdata (array_like) – Independent variable data

  • ydata (array_like) – Dependent variable data

  • p0 (array_like, optional) – Initial parameter guess

  • bounds (tuple, optional) – Parameter bounds

  • sigma (array_like, optional) – Uncertainties in ydata

  • absolute_sigma (bool) – Whether sigma is absolute or relative

  • check_finite (bool) – Whether to check for finite values

Returns:

  • errors (list) – List of error messages (empty if no errors)

  • warnings (list) – List of warning messages

  • xdata_clean (np.ndarray) – Cleaned and validated xdata

  • ydata_clean (np.ndarray) – Cleaned and validated ydata

Return type:

tuple[list[str], list[str], ndarray, ndarray]

validate_security_constraints(n_points, n_params, bounds=None, p0=None)[source]

Validate security constraints to prevent DoS and numerical issues.

This method combines all security-focused validation checks.

Parameters:
  • n_points (int) – Number of data points

  • n_params (int) – Number of parameters

  • bounds (tuple | None, optional) – Parameter bounds

  • p0 (array_like | None, optional) – Initial parameter guess

Returns:

  • errors (list) – List of critical error messages

  • warnings (list) – List of warning messages

Return type:

tuple[list[str], list[str]]

validate_least_squares_inputs(fun, x0, bounds=None, method='trf', ftol=1e-08, xtol=1e-08, gtol=1e-08, max_nfev=None)[source]

Validate inputs for least_squares function.

This method orchestrates the validation pipeline by calling focused helper methods for each validation step.

Parameters:
  • fun (callable) – Residual function

  • x0 (array_like) – Initial parameter guess

  • bounds (tuple, optional) – Parameter bounds

  • method (str) – Optimization method

  • ftol (float) – Function tolerance

  • xtol (float) – Parameter tolerance

  • gtol (float) – Gradient tolerance

  • max_nfev (int, optional) – Maximum function evaluations

Returns:

  • errors (list) – List of error messages

  • warnings (list) – List of warning messages

  • x0_clean (np.ndarray) – Cleaned initial guess

Return type:

tuple[list[str], list[str], ndarray]

nlsq.utils.validators.validate_inputs(validation_type='curve_fit')[source]

Decorator for automatic input validation.

Parameters:

validation_type (str) – Type of validation to perform (‘curve_fit’ or ‘least_squares’)

Returns:

decorator – Decorator function that validates inputs

Return type:

function
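For orientation, here is a minimal sketch of how a validation decorator of this kind can be structured. It is not the nlsq implementation: `validate_inputs_sketch` and `_basic_checks` are hypothetical names, only the 'curve_fit' case is handled, and the checks are reduced to a callable test and a length comparison.

```python
import functools


def _basic_checks(f, xdata, ydata):
    """Hypothetical stand-in for the real validation pipeline."""
    errors = []
    if not callable(f):
        errors.append("f must be callable")
    if len(xdata) != len(ydata):
        errors.append(
            f"xdata ({len(xdata)} points) and ydata ({len(ydata)} points) "
            "must have same length"
        )
    return errors


def validate_inputs_sketch(validation_type="curve_fit"):
    """Return a decorator that validates (f, xdata, ydata) before the call."""
    if validation_type != "curve_fit":
        raise ValueError(f"Unsupported validation_type: {validation_type!r}")

    def decorator(func):
        @functools.wraps(func)  # keep the wrapped function's name and docstring
        def wrapper(f, xdata, ydata, *args, **kwargs):
            errors = _basic_checks(f, xdata, ydata)
            if errors:
                # Fail before the wrapped function does any work
                raise ValueError("; ".join(errors))
            return func(f, xdata, ydata, *args, **kwargs)

        return wrapper

    return decorator


@validate_inputs_sketch("curve_fit")
def count_points(f, xdata, ydata):
    """Toy function used only to exercise the decorator."""
    return len(xdata)
```

Calling `count_points(abs, [1, 2, 3], [4, 5])` raises `ValueError` before the function body runs, which is the behavior the decorator is meant to provide.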

Overview

The nlsq.validators module provides comprehensive input validation for curve fitting and optimization functions. It catches common errors early, provides helpful error messages, and ensures data quality before expensive optimization operations begin.

New in version 0.1.1: Complete validation system with fast mode and extensive checks.

New in version 0.3.1: Added security-focused validation (array size limits, bounds checking, parameter validation).

Key Features

  • Comprehensive input validation for all curve fitting parameters

  • Security validation (v0.3.1): Array size limits, bounds numeric range, parameter values

  • Early error detection with clear, actionable error messages

  • Data quality checks for outliers, duplicates, and degenerate cases

  • Fast mode to skip expensive checks for performance-critical code

  • Function signature analysis to detect parameter mismatches

  • Automatic type conversion with warnings

  • Bounds validation including initial guess checking

  • Decorator support for automatic validation

Classes

InputValidator([fast_mode])

Comprehensive input validation for curve fitting functions.

Functions

validate_inputs([validation_type])

Decorator for automatic input validation.

Usage Examples

Basic Validation for curve_fit

Validate inputs before curve fitting:

from nlsq.utils.validators import InputValidator
import numpy as np

# Create validator
validator = InputValidator(fast_mode=False)


# Define model and data
def model(x, a, b):
    return a * np.exp(-b * x)


x = np.linspace(0, 10, 100)
y = 2.5 * np.exp(-0.5 * x) + np.random.normal(0, 0.1, 100)
p0 = [2.0, 0.4]

# Validate inputs
errors, warnings, x_clean, y_clean = validator.validate_curve_fit_inputs(
    model, x, y, p0=p0, bounds=([0, 0], [10, 10])
)

if errors:
    print("Validation errors:")
    for error in errors:
        print(f"  - {error}")
else:
    print("Validation passed!")

if warnings:
    print("Warnings:")
    for warning in warnings:
        print(f"  - {warning}")

Fast Mode Validation

Skip expensive checks for performance:

from nlsq.utils.validators import InputValidator

# Fast mode skips function callable tests and data quality checks
fast_validator = InputValidator(fast_mode=True)

# model, x, y, and p0 are reused from the previous example
errors, warnings, x_clean, y_clean = fast_validator.validate_curve_fit_inputs(
    model, x, y, p0=p0
)

# Much faster, suitable for production code with trusted inputs

Decorator-Based Validation

Automatically validate function inputs:

from nlsq.utils.validators import validate_inputs
import numpy as np


@validate_inputs(validation_type="curve_fit")
def my_curve_fit(f, xdata, ydata, p0=None, **kwargs):
    """Custom curve fit with automatic validation."""
    # Inputs are automatically validated before this code runs
    # Invalid inputs raise ValueError with detailed message
    # xdata and ydata are converted to numpy arrays

    # Your fitting logic here (optimize is a placeholder for your own routine)
    return optimize(f, xdata, ydata, p0, **kwargs)


# Use it - validation happens automatically
try:
    result = my_curve_fit(model, x, y, p0=[1.0, 0.5])
except ValueError as e:
    print(f"Validation failed: {e}")

Validation for least_squares

Validate least squares inputs:

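from nlsq.utils.validators import InputValidator
import numpy as np

validator = InputValidator()

# Example data used by the residual function
x_data = np.linspace(0, 10, 100)
y_data = 2.5 * np.exp(-0.5 * x_data)


# Define residual function
def residual(params):
    a, b = params
    return y_data - (a * np.exp(-b * x_data))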


x0 = np.array([2.0, 0.5])
bounds = ([0, 0], [10, 10])

# Validate
errors, warnings, x0_clean = validator.validate_least_squares_inputs(
    residual,
    x0,
    bounds=bounds,
    method="trf",
    ftol=1e-8,
    xtol=1e-8,
    gtol=1e-8,
    max_nfev=1000,
)

if errors:
    raise ValueError(f"Validation failed: {'; '.join(errors)}")

Detailed Validation Checks

See what validation detects:

from nlsq.utils.validators import InputValidator
import numpy as np

validator = InputValidator(fast_mode=False)

# Problematic data
x = np.array([1.0, 2.0, np.nan, 4.0])  # Contains NaN
y = np.array([1.0, 2.0, 3.0])  # Wrong length
p0 = [1.0, 2.0, 3.0]  # Wrong number of params


def model(x, a, b):
    return a * x + b


errors, warnings, _, _ = validator.validate_curve_fit_inputs(model, x, y, p0=p0)

# Errors will include:
# - "xdata contains 1 NaN or Inf values"
# - "xdata (4 points) and ydata (3 points) must have same length"
# - "Initial guess p0 has 3 parameters, but function expects 2"

Data Quality Warnings

Detect potential data quality issues:

from nlsq.utils.validators import InputValidator
import numpy as np

validator = InputValidator(fast_mode=False)

# Data with quality issues
x = np.array([1, 2, 3, 3, 3, 4, 5, 100])  # Duplicates and outlier
y = np.array([1, 2, 3, 3.1, 2.9, 4, 5, 200])  # Outlier


def linear(x, a, b):
    return a * x + b


errors, warnings, _, _ = validator.validate_curve_fit_inputs(
    linear, x, y, p0=[1.0, 0.0]
)

# Warnings will include:
# - "xdata contains 3 duplicate values"
# - "ydata may contain 1 outliers - consider using robust loss function"

Bounds Validation

Check parameter bounds:

from nlsq.utils.validators import InputValidator
import numpy as np

validator = InputValidator()

x = np.linspace(0, 10, 50)
y = 2 * x + 1 + np.random.randn(50) * 0.5


def linear(x, a, b):
    return a * x + b


# Invalid bounds
bad_bounds = ([0, 0], [0, 10])  # Lower >= Upper for first param
p0 = [2.0, 1.0]

errors, warnings, _, _ = validator.validate_curve_fit_inputs(
    linear, x, y, p0=p0, bounds=bad_bounds
)

# Errors: "Lower bounds must be less than upper bounds"

# Initial guess outside bounds
bounds = ([0, 0], [10, 10])
p0_out = [-1.0, 1.0]  # First param outside bounds

errors, warnings, _, _ = validator.validate_curve_fit_inputs(
    linear, x, y, p0=p0_out, bounds=bounds
)

# Warnings: "Initial guess p0 is outside bounds"

Sigma Validation

Validate uncertainty parameters:

from nlsq.utils.validators import InputValidator
import numpy as np

validator = InputValidator()

x = np.linspace(0, 10, 50)
y = 2 * x + 1 + np.random.randn(50) * 0.5

# Invalid sigma - wrong shape
sigma_bad = np.ones(40)  # Wrong length

errors, warnings, _, _ = validator.validate_curve_fit_inputs(
    lambda x, a, b: a * x + b, x, y, p0=[1, 0], sigma=sigma_bad
)

# Error: "sigma must have same shape as ydata"

# Invalid sigma - negative values
sigma_neg = np.ones(50)
sigma_neg[10] = -0.5

errors, warnings, _, _ = validator.validate_curve_fit_inputs(
    lambda x, a, b: a * x + b, x, y, p0=[1, 0], sigma=sigma_neg
)

# Error: "sigma values must be positive"

Degenerate Data Detection

Detect problematic data patterns:

from nlsq.utils.validators import InputValidator
import numpy as np

# Degenerate-data checks are data quality checks, which fast mode skips
validator = InputValidator(fast_mode=False)

# All x values identical
x_const = np.ones(100)
y = np.random.randn(100)

errors, warnings, _, _ = validator.validate_curve_fit_inputs(
    lambda x, a, b: a * x + b, x_const, y, p0=[1, 0]
)

# Error: "All x values are identical - cannot fit"

# Very small range
x_small_range = np.linspace(1.0, 1.000000001, 100)  # Range ~ 1e-9

errors, warnings, _, _ = validator.validate_curve_fit_inputs(
    lambda x, a, b: a * x + b, x_small_range, y, p0=[1, 0]
)

# Warning: "x data range is very small (1.00e-09) - consider rescaling"

Tolerance Validation

Validate convergence tolerances:

from nlsq.utils.validators import InputValidator

validator = InputValidator()


def residual(x):
    return x**2


# Very small tolerances
errors, warnings, x0_clean = validator.validate_least_squares_inputs(
    residual,
    x0=[1.0],
    ftol=1e-16,  # Too small
    xtol=1e-17,  # Too small
    gtol=1e-8,
)

# Warnings:
# - "ftol=1e-16 is very small, may not converge"
# - "xtol=1e-17 is very small, may not converge"

Custom Validation Pipeline

Build custom validation logic:

from nlsq.utils.validators import InputValidator
import numpy as np


class CustomValidator(InputValidator):
    """Extended validator with custom checks."""

    def validate_my_data(self, x, y, **kwargs):
        """Custom validation pipeline."""
        # Use parent class methods
        errors, warnings, x, y = self.validate_curve_fit_inputs(
            kwargs["f"], x, y, p0=kwargs.get("p0"), bounds=kwargs.get("bounds")
        )

        # Add custom checks
        if np.std(y) < 0.01:
            warnings.append(
                "y data has very low variance - may indicate measurement issue"
            )

        if len(x) < 10:
            errors.append("Need at least 10 data points for reliable fit")

        return errors, warnings, x, y


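# Use it (x_data, y_data, model_func, and initial_guess stand for your own
# data, model function, and starting values)
custom_validator = CustomValidator()
errors, warnings, x, y = custom_validator.validate_my_data(
    x_data, y_data, f=model_func, p0=initial_guess
)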

Security Validation (v0.3.1)

The validator includes security-focused checks to prevent resource exhaustion and detect malformed inputs early:

from nlsq.utils.validators import InputValidator

validator = InputValidator()

# Security checks run automatically and early in the validation pipeline
errors, warnings, x, y = validator.validate_curve_fit_inputs(model, x, y, p0=p0)

# Or call the method directly with the problem dimensions
errors, warnings = validator.validate_security_constraints(
    n_points=len(x), n_params=len(p0), p0=p0
)

Array Size Limits:

  • Maximum 10 billion (10^10) data points

  • Maximum 100 billion (10^11) Jacobian elements

  • Detects integer overflow in size calculations

  • Memory estimation warnings (>10GB, >100GB)

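# Example: Excessive data size detection
# Pass the point count directly; allocating 15 billion float64 values
# would itself need about 120 GB of memory
errors, warnings = validator.validate_security_constraints(
    n_points=15_000_000_000, n_params=2
)
# Error: "Data size exceeds maximum allowed (10B points)"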
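The limits listed above amount to a few integer comparisons. The sketch below shows one way such a check could look. It is an illustration, not the nlsq code: `check_problem_size` is a hypothetical name, the Jacobian error message is invented for the example, and the memory estimate assumes a dense float64 Jacobian.

```python
MAX_DATA_POINTS = 10**10        # limit quoted in this section
MAX_JACOBIAN_ELEMENTS = 10**11  # limit quoted in this section
BYTES_PER_FLOAT64 = 8


def check_problem_size(n_points, n_params):
    """Return (errors, warnings) for a problem of the given dimensions."""
    errors, warnings = [], []

    if n_points > MAX_DATA_POINTS:
        errors.append("Data size exceeds maximum allowed (10B points)")

    # Python integers have arbitrary precision, so this product cannot overflow
    jac_elements = n_points * n_params
    if jac_elements > MAX_JACOBIAN_ELEMENTS:
        # Hypothetical message, not taken from the library
        errors.append("Jacobian size exceeds maximum allowed (100B elements)")

    # Assumption: a dense float64 Jacobian dominates memory use
    jac_gb = jac_elements * BYTES_PER_FLOAT64 / 1e9
    if jac_gb > 100:
        warnings.append(f"Estimated Jacobian memory is {jac_gb:.0f} GB (>100GB)")
    elif jac_gb > 10:
        warnings.append(f"Estimated Jacobian memory is {jac_gb:.0f} GB (>10GB)")

    return errors, warnings
```

Because only the dimensions are inspected, a check like this costs almost nothing and can run before any array is converted or copied.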

Bounds Numeric Range:

  • Warns for extreme bound values (absolute value > 1e100)

  • Errors for NaN in bounds

# Example: Extreme bounds detection
bounds = ([0, 0], [1e200, 10])  # Extremely large upper bound
# Warning: "Bounds contain extreme values (>1e100)"

Parameter Value Validation:

  • Warns for extreme initial parameters (abs(p0) > 1e50)

  • Errors for NaN/Inf in initial parameters

# Example: Invalid parameter detection
p0 = [1.0, np.nan, 0.5]
# Error: "Initial parameters contain NaN/Inf values"

Early Fail-Fast:

Security validation runs before expensive operations to fail fast on malformed input that could cause denial-of-service or numerical instability.

Validation Check Reference

The validator performs these checks:

Array Validation:

  • Convert to numpy arrays

  • Check dimensions (at least 1D)

  • Validate array lengths match

  • Check for tuple xdata (multi-dimensional fitting)

Finite Values:

  • Detect NaN values

  • Detect Inf values

  • Report counts of bad values
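As a rough illustration of the finite-value check, the sketch below counts NaN and Inf entries in a plain sequence and formats a message in the style shown in the usage examples. `check_finite_values` is a hypothetical helper; the library works on numpy arrays and may word its messages differently.

```python
import math


def check_finite_values(name, values):
    """Return a list with one error message if any value is NaN or Inf."""
    n_bad = sum(1 for v in values if not math.isfinite(v))
    if n_bad:
        return [f"{name} contains {n_bad} NaN or Inf values"]
    return []
```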

Data Shapes:

  • Minimum 2 data points

  • More data points than parameters

  • Matching xdata/ydata lengths

Initial Guess (p0):

  • Correct number of parameters

  • All finite values

  • Length matches function signature
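Checking p0 against the function signature can be done with the standard `inspect` module. The following sketch assumes the model takes the independent variable as its first positional argument and one argument per fit parameter after it. `count_fit_parameters` and `check_p0_length` are hypothetical helpers, not nlsq functions.

```python
import inspect


def count_fit_parameters(f):
    """Count positional parameters after the first one (the x data).

    Returns None when the count cannot be determined, for example when
    the function accepts *args.
    """
    params = list(inspect.signature(f).parameters.values())
    if any(p.kind is inspect.Parameter.VAR_POSITIONAL for p in params):
        return None
    positional = [
        p
        for p in params
        if p.kind
        in (
            inspect.Parameter.POSITIONAL_ONLY,
            inspect.Parameter.POSITIONAL_OR_KEYWORD,
        )
    ]
    return max(len(positional) - 1, 0)


def check_p0_length(f, p0):
    """Return an error list if len(p0) disagrees with the signature of f."""
    expected = count_fit_parameters(f)
    if expected is not None and len(p0) != expected:
        return [
            f"Initial guess p0 has {len(p0)} parameters, "
            f"but function expects {expected}"
        ]
    return []


def model(x, a, b):
    return a * x + b
```

With this `model`, `check_p0_length(model, [1.0, 2.0, 3.0])` reports a mismatch of 3 against 2, while a function declared with `*args` is skipped because its parameter count is not knowable from the signature.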

Bounds:

  • 2-tuple format (lower, upper)

  • Correct lengths

  • Lower < Upper for all parameters

  • Initial guess within bounds

  • Method compatibility (LM doesn’t support bounds)
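The bounds checks listed above can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions: `check_bounds` is a hypothetical name, the format and NaN messages are invented, the method compatibility check is omitted, and an initial guess outside the bounds is treated as a warning, as in the Bounds Validation example.

```python
import math


def check_bounds(bounds, p0=None):
    """Return (errors, warnings) for a (lower, upper) bounds pair."""
    errors, warnings = [], []

    if len(bounds) != 2:
        return ["bounds must be a 2-tuple (lower, upper)"], warnings
    lower, upper = bounds
    if len(lower) != len(upper):
        return ["Lower and upper bounds must have the same length"], warnings

    if any(math.isnan(v) for v in (*lower, *upper)):
        errors.append("Bounds contain NaN")
    if any(lo >= hi for lo, hi in zip(lower, upper)):
        errors.append("Lower bounds must be less than upper bounds")

    # Only compare p0 with the bounds when the lengths agree
    if p0 is not None and len(p0) == len(lower):
        if any(not (lo <= p <= hi) for p, lo, hi in zip(p0, lower, upper)):
            warnings.append("Initial guess p0 is outside bounds")

    return errors, warnings
```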

Sigma:

  • Correct shape (matches ydata)

  • All positive values

  • All finite values

Tolerances:

  • All positive (ftol, xtol, gtol)

  • Not too small (<1e-15)
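A tolerance check along these lines is short enough to show in full. The sketch below is an assumption-laden illustration: `check_tolerances` is a hypothetical name, the "must be positive" message is invented, and the 1e-15 threshold is taken from the list above. That threshold sits just above float64 machine epsilon (about 2.2e-16), below which relative changes cannot be resolved.

```python
def check_tolerances(ftol=1e-8, xtol=1e-8, gtol=1e-8, min_tol=1e-15):
    """Return (errors, warnings) for the three convergence tolerances."""
    errors, warnings = [], []
    for name, value in (("ftol", ftol), ("xtol", xtol), ("gtol", gtol)):
        if value <= 0:
            errors.append(f"{name} must be positive")
        elif value < min_tol:
            warnings.append(f"{name}={value} is very small, may not converge")
    return errors, warnings
```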

Method:

  • Valid method name (‘trf’, ‘lm’, ‘dogbox’)

Function Callable (unless fast_mode=True):

  • Function can be evaluated

  • Output shape matches expected

Data Quality (unless fast_mode=True):

  • Check for duplicate x values

  • Detect potential outliers (3 IQR rule)

  • Degenerate x values (all identical, tiny range)

  • Degenerate y values (all identical, tiny range)
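To make the outlier rule concrete, the sketch below flags values lying more than 3 interquartile ranges outside the first or third quartile. This is one common reading of a "3 IQR rule" and is an assumption about the exact fences the library uses. `count_iqr_outliers` is a hypothetical helper built on the standard `statistics` module.

```python
import statistics


def count_iqr_outliers(values, k=3.0):
    """Count values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return sum(1 for v in values if v < lower or v > upper)


# ydata from the Data Quality Warnings example
y = [1, 2, 3, 3.1, 2.9, 4, 5, 200]
n_outliers = count_iqr_outliers(y)
```

For this data the quartiles are 2.675 and 4.25, so the fences are -2.05 and 8.975 and only the value 200 is counted.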

Performance Impact

Fast mode (fast_mode=True):

  • Skips: Function callable test, data quality checks

  • Speedup: ~30-50% faster validation

  • Use for: Production code with trusted inputs

Full mode (fast_mode=False):

  • All checks enabled

  • Recommended for: Interactive use, debugging, untrusted inputs

Typical validation overhead:

  • Small datasets (<1000 points): <1ms

  • Medium datasets (1000-100K points): 1-10ms

  • Large datasets (>100K points): 10-100ms

Recommendation: Use fast mode in tight loops, full mode for user-facing APIs

Integration Examples

With curve_fit

import warnings

from nlsq import curve_fit
from nlsq.utils.validators import InputValidator


def safe_curve_fit(f, xdata, ydata, **kwargs):
    """curve_fit with validation."""
    validator = InputValidator(fast_mode=kwargs.pop("fast_validation", False))

    # Use a distinct name so the list does not shadow the warnings module
    errors, warning_msgs, x, y = validator.validate_curve_fit_inputs(
        f,
        xdata,
        ydata,
        p0=kwargs.get("p0"),
        bounds=kwargs.get("bounds"),
        sigma=kwargs.get("sigma"),
    )

    if errors:
        raise ValueError(f"Validation failed: {'; '.join(errors)}")

    # Surface validation warnings through the standard warnings module
    for msg in warning_msgs:
        warnings.warn(msg)

    return curve_fit(f, x, y, **kwargs)

With least_squares

import warnings

from nlsq import LeastSquares
from nlsq.utils.validators import InputValidator


def safe_least_squares(fun, x0, **kwargs):
    """least_squares with validation."""
    validator = InputValidator()

    # Use a distinct name so the list does not shadow the warnings module
    errors, warning_msgs, x0 = validator.validate_least_squares_inputs(
        fun,
        x0,
        bounds=kwargs.get("bounds"),
        method=kwargs.get("method", "trf"),
        ftol=kwargs.get("ftol", 1e-8),
        xtol=kwargs.get("xtol", 1e-8),
        gtol=kwargs.get("gtol", 1e-8),
        max_nfev=kwargs.get("max_nfev"),
    )

    if errors:
        raise ValueError(f"Validation failed: {'; '.join(errors)}")

    # Surface validation warnings through the standard warnings module
    for msg in warning_msgs:
        warnings.warn(msg)

    ls = LeastSquares()
    return ls.least_squares(fun, x0, **kwargs)

See Also