nlsq.core.orchestration
Orchestration components for CurveFit decomposition.
This package contains components extracted from the CurveFit God class, each handling a single responsibility:
DataPreprocessor: Input validation, array conversion, data padding
OptimizationSelector: Method selection, bounds preparation, initial guess
CovarianceComputer: Covariance via SVD, sigma transformation
StreamingCoordinator: Memory analysis, streaming strategy selection
These components are used internally by CurveFit and controlled via feature flags.
Submodules
nlsq.core.orchestration.data_preprocessor
DataPreprocessor component for CurveFit decomposition.
Handles input validation, array conversion, data masking, and padding for curve fitting operations. This component is extracted from the CurveFit class as part of the God class decomposition.
Reference: specs/017-curve-fit-decomposition/spec.md FR-001
- class nlsq.core.orchestration.data_preprocessor.DataPreprocessor
Bases: object
Preprocessor for curve fitting input data.
Handles:
1. Input validation (type checking, finiteness)
2. Array conversion (NumPy/list to JAX)
3. Length consistency checking
4. Data masking for invalid points
5. NaN/Inf handling via nan_policy
Example
>>> preprocessor = DataPreprocessor()
>>> data = preprocessor.preprocess(
...     f=my_model,
...     xdata=x_values,
...     ydata=y_values,
...     sigma=uncertainties,
...     check_finite=True,
... )
>>> print(f"Valid points: {data.n_points}")
- preprocess(f, xdata, ydata, *, sigma=None, absolute_sigma=False, check_finite=True, nan_policy='raise', stability_check=False)
Validate and preprocess input data for curve fitting.
- Parameters:
f (Callable[..., ArrayLike]) – Model function to fit (used for parameter count detection)
xdata (ArrayLike) – Independent variable data
ydata (ArrayLike) – Dependent variable data (observations)
sigma (ArrayLike | None) – Uncertainty/weights for observations
absolute_sigma (bool) – If True, sigma is absolute; else relative
check_finite (bool) – If True, raise on NaN/Inf values
nan_policy (str) – How to handle NaN: ‘raise’, ‘omit’, or ‘propagate’
stability_check (bool) – If True, run additional stability checks
- Returns:
PreprocessedData with validated, converted arrays
- Raises:
ValueError – If inputs are invalid (wrong shape, non-finite, etc.)
TypeError – If inputs have wrong types
- Return type:
PreprocessedData
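The nan_policy options listed above can be illustrated with a minimal sketch. It uses NumPy rather than JAX, and the helper name `apply_nan_policy` and its return values are hypothetical; this is not the nlsq implementation, only one plausible reading of 'raise', 'omit', and 'propagate'.

```python
import numpy as np


def apply_nan_policy(xdata, ydata, nan_policy="raise"):
    """Hypothetical helper: handle non-finite points in paired 1D data."""
    x = np.asarray(xdata, dtype=float)
    y = np.asarray(ydata, dtype=float)
    if x.shape != y.shape:
        raise ValueError(f"xdata and ydata shapes differ: {x.shape} vs {y.shape}")

    # True where both coordinates are finite (neither NaN nor +/-Inf).
    mask = np.isfinite(x) & np.isfinite(y)

    if nan_policy == "raise":
        if not mask.all():
            raise ValueError("input contains NaN or Inf values")
        return x, y, mask
    if nan_policy == "omit":
        # Drop invalid points; the returned mask covers only what remains.
        return x[mask], y[mask], np.ones(int(mask.sum()), dtype=bool)
    if nan_policy == "propagate":
        # Leave the data untouched; the mask tells the caller what is invalid.
        return x, y, mask
    raise ValueError(f"unknown nan_policy: {nan_policy!r}")


x_clean, y_clean, valid = apply_nan_policy(
    [0.0, 1.0, 2.0], [1.0, float("nan"), 3.0], nan_policy="omit"
)
```

With 'omit', the point whose ydata is NaN is dropped from both arrays, so the two outputs stay the same length.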
- validate_sigma(sigma, ydata_shape)
Validate and convert sigma to appropriate format.
Public interface matching DataPreprocessorProtocol.
- Parameters:
- Returns:
Validated numpy array or None
- Raises:
ValueError – If sigma shape is incompatible with ydata
- Return type:
np.ndarray | None
nlsq.core.orchestration.optimization_selector
OptimizationSelector component for CurveFit decomposition.
Handles parameter detection, method selection, bounds preparation, and solver configuration for curve fitting operations.
Reference: specs/017-curve-fit-decomposition/spec.md FR-002
- nlsq.core.orchestration.optimization_selector.prepare_bounds(bounds, n)
Prepare bounds for optimization.
- class nlsq.core.orchestration.optimization_selector.OptimizationSelector
Bases: object
Selector for optimization method and configuration.
Handles:
1. Parameter count detection from function signature
2. Method selection based on bounds and problem type
3. Bounds validation and preparation
4. Initial guess generation if not provided
5. Solver configuration validation
Example
>>> selector = OptimizationSelector()
>>> config = selector.select(
...     f=my_model,
...     xdata=x_values,
...     ydata=y_values,
...     bounds=([0, 0], [10, 10]),
... )
>>> print(f"Method: {config.method}, Params: {config.n_params}")
- select(f, xdata, ydata, *, p0=None, bounds=None, method=None, jac=None, tr_solver=None, x_scale=1.0, ftol=1e-08, xtol=1e-08, gtol=1e-08, max_nfev=None)
Select optimization method and prepare configuration.
- Parameters:
f (Callable[..., ArrayLike]) – Model function to fit
xdata (ArrayLike) – Independent variable data
ydata (ArrayLike) – Dependent variable data
p0 (ArrayLike | None) – Initial parameter guess (auto-detected if None)
bounds (tuple[ArrayLike, ArrayLike] | None) – Parameter bounds as (lower, upper)
method (str | None) – Optimization method (‘trf’, ‘lm’, ‘dogbox’, or None for auto)
jac (str | Callable | None) – Jacobian computation method
tr_solver (str | None) – Trust region solver (‘exact’, ‘lsmr’, or None for auto)
ftol (float) – Function tolerance
xtol (float) – Parameter tolerance
gtol (float) – Gradient tolerance
max_nfev (int | None) – Maximum function evaluations (auto if None)
- Returns:
OptimizationConfig with all settings resolved
- Raises:
ValueError – If configuration is invalid
- Return type:
OptimizationConfig
- detect_parameter_count(f, xdata)
Detect number of parameters from function signature.
Uses inspection of function signature to determine parameter count.
- Parameters:
f (Callable[..., ArrayLike]) – Model function to analyze
xdata (ArrayLike) – Sample data (currently unused; reserved for future probing)
- Returns:
Number of parameters (excluding x)
- Raises:
ValueError – If parameter count cannot be determined
- Return type:
int
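Signature-based detection of this kind can be sketched with the standard-library `inspect` module. The helper `count_fit_parameters` is a hypothetical name, and treating variadic signatures as an error is an assumption about how the "cannot be determined" case might arise; the actual method may behave differently.

```python
import inspect


def count_fit_parameters(f):
    """Hypothetical helper: count model parameters, excluding the leading x."""
    params = list(inspect.signature(f).parameters.values())
    kinds = {p.kind for p in params}
    # *args / **kwargs make the count ambiguous from the signature alone.
    if inspect.Parameter.VAR_POSITIONAL in kinds or inspect.Parameter.VAR_KEYWORD in kinds:
        raise ValueError("cannot determine parameter count for a variadic function")
    if len(params) < 2:
        raise ValueError("model must accept x plus at least one parameter")
    # The first argument is the independent variable, the rest are fit parameters.
    return len(params) - 1


def linear(x, slope, intercept):
    return slope * x + intercept


n_params = count_fit_parameters(linear)
```

Here `linear` takes x plus two fit parameters, so the helper returns 2.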
nlsq.core.orchestration.covariance_computer
CovarianceComputer component for CurveFit decomposition.
Handles covariance matrix computation via SVD, sigma transformation, and condition number estimation.
Reference: specs/017-curve-fit-decomposition/spec.md FR-003
- class nlsq.core.orchestration.covariance_computer.CovarianceComputer
Bases: object
Computer for parameter covariance from optimization results.
Handles:
1. Jacobian-based covariance via SVD
2. Sigma transformation (1D and 2D)
3. Absolute vs relative sigma handling
4. Singularity detection and handling
Example
>>> computer = CovarianceComputer()
>>> result = computer.compute(
...     result=optimize_result,
...     n_data=100,
...     sigma=uncertainties,
...     absolute_sigma=True,
... )
>>> print(f"Parameter errors: {result.perr}")
- compute(result, n_data, *, sigma=None, absolute_sigma=False, full_output=False)
Compute parameter covariance from optimization result.
Uses the Jacobian at the solution to compute covariance via:
pcov = (J^T @ J)^(-1) * s_sq
where s_sq is the residual variance.
- Parameters:
result (OptimizeResult) – OptimizeResult from LeastSquares
n_data (int) – Number of data points
sigma (jax.Array | None) – Observation uncertainties/weights
absolute_sigma (bool) – If True, sigma is absolute uncertainty
full_output (bool) – If True, include additional diagnostics
- Returns:
CovarianceResult with covariance matrix and metadata
- Raises:
ValueError – If Jacobian is unavailable or invalid
- Return type:
CovarianceResult
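The formula pcov = (J^T @ J)^(-1) * s_sq can be evaluated through an SVD without forming J^T @ J explicitly. The following is a sketch in NumPy, not the nlsq code: the helper name, the rank-deficiency threshold, and the choice to skip the s_sq scaling when absolute_sigma is True (with J assumed to be already sigma-weighted) are all assumptions made for illustration.

```python
import numpy as np


def covariance_from_jacobian(jac, residuals, absolute_sigma=False):
    """Hypothetical helper: pcov = (J^T J)^(-1) * s_sq computed via SVD."""
    jac = np.asarray(jac, dtype=float)
    residuals = np.asarray(residuals, dtype=float)
    n_data, n_params = jac.shape

    # Thin SVD: J = U @ diag(s) @ Vt, so (J^T J)^(-1) = V @ diag(1/s^2) @ Vt.
    _, s, vt = np.linalg.svd(jac, full_matrices=False)

    # Assumed threshold: singular values this small mean a singular problem.
    threshold = np.finfo(float).eps * max(jac.shape) * s[0]
    if np.any(s <= threshold):
        raise ValueError("Jacobian is rank deficient; covariance is undefined")

    pcov = (vt.T / s**2) @ vt

    if not absolute_sigma:
        # Scale by the residual variance (assumes n_data > n_params).
        s_sq = np.sum(residuals**2) / (n_data - n_params)
        pcov = pcov * s_sq
    return pcov


# Straight-line model y = a * x + b, whose Jacobian columns are [x, 1].
x = np.array([0.0, 1.0, 2.0, 3.0])
jac = np.column_stack([x, np.ones_like(x)])
residuals = np.array([0.1, -0.2, 0.2, -0.1])
pcov = covariance_from_jacobian(jac, residuals)
perr = np.sqrt(np.diag(pcov))  # one-sigma parameter errors
```

Working from the singular values also gives a direct handle on singularity detection, since a near-zero singular value signals that some parameter combination is not constrained by the data.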
- create_sigma_transform(sigma, n_data)
Create sigma transformation function.
Handles both 1D (diagonal) and 2D (full covariance) sigma.
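One common way to treat the two sigma shapes is to divide by sigma in the 1D case and to whiten with a Cholesky factor in the 2D case. The sketch below uses NumPy and a hypothetical helper name; whether nlsq uses exactly this construction is an assumption.

```python
import numpy as np


def make_sigma_transform(sigma, n_data):
    """Hypothetical helper: return a function that whitens residuals."""
    sigma = np.asarray(sigma, dtype=float)
    if sigma.shape == (n_data,):
        # 1D: independent errors, so divide each residual by its sigma.
        return lambda r: np.asarray(r, dtype=float) / sigma
    if sigma.shape == (n_data, n_data):
        # 2D: full covariance C = L @ L.T; whitening solves L @ z = r.
        try:
            lower = np.linalg.cholesky(sigma)
        except np.linalg.LinAlgError as exc:
            raise ValueError("2D sigma must be positive definite") from exc
        return lambda r: np.linalg.solve(lower, np.asarray(r, dtype=float))
    raise ValueError(f"sigma has incompatible shape {sigma.shape}")


# A diagonal covariance with variances 4 and 16 matches 1D sigma of 2 and 4.
transform_1d = make_sigma_transform([2.0, 4.0], n_data=2)
transform_2d = make_sigma_transform(np.diag([4.0, 16.0]), n_data=2)
whitened_1d = transform_1d([2.0, 4.0])
whitened_2d = transform_2d([2.0, 4.0])
```

Both transforms give the same whitened residuals here, because a diagonal covariance matrix is equivalent to independent per-point uncertainties.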
- compute_condition_number(jacobian)
Compute condition number of Jacobian.
Uses singular values: cond = max(s) / min(s)
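The ratio cond = max(s) / min(s) can be sketched in a few lines of NumPy. The helper name and the choice to return infinity for an exactly zero singular value are assumptions for illustration.

```python
import numpy as np


def condition_number(jacobian):
    """Hypothetical helper: ratio of largest to smallest singular value."""
    jacobian = np.asarray(jacobian, dtype=float)
    # compute_uv=False returns only the singular values, sorted descending.
    s = np.linalg.svd(jacobian, compute_uv=False)
    smallest = s[-1]
    if smallest == 0.0:
        # Exactly singular: report an infinite condition number.
        return float("inf")
    return float(s[0] / smallest)


cond = condition_number(np.diag([10.0, 2.0]))
```

For this diagonal matrix the singular values are 10 and 2, so the ratio is 5. Large values indicate that some parameter directions are poorly determined by the data.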
- setup_sigma_transform(sigma, ydata, data_mask, len_diff, m)
Setup sigma transformation for weighted least squares.
This is the legacy interface matching CurveFit._setup_sigma_transform.
- Parameters:
- Returns:
Transformation array for sigma or None
- Raises:
ValueError – If sigma has incorrect shape or is not positive definite
- Return type:
jax.Array | None
nlsq.core.orchestration.streaming_coordinator
StreamingCoordinator component for CurveFit decomposition.
Handles memory analysis, streaming strategy selection, and configuration for large-scale curve fitting operations.
Reference: specs/017-curve-fit-decomposition/spec.md FR-004
- class nlsq.core.orchestration.streaming_coordinator.StreamingCoordinator(safety_factor=0.75)
Bases: object
Coordinator for streaming strategy selection.
Handles:
1. Memory estimation for dataset + Jacobian
2. Available memory detection
3. Strategy selection based on memory pressure
4. Configuration of chunked/hybrid strategies
Example
>>> coordinator = StreamingCoordinator()
>>> decision = coordinator.decide(
...     xdata=x_array,
...     ydata=y_array,
...     n_params=5,
... )
>>> if decision.strategy == "hybrid":
...     config = decision.hybrid_config
...     # Use hybrid streaming optimizer
- __init__(safety_factor=0.75)
Initialize StreamingCoordinator.
- Parameters:
safety_factor (float) – Memory safety factor (0.75 means use 75% of available)
- decide(xdata, ydata, n_params, *, workflow='auto', memory_limit_mb=None, force_streaming=False)
Decide on streaming strategy for the dataset.
Analyzes memory requirements and available resources to select the optimal execution strategy.
- Parameters:
xdata (jax.Array) – Independent variable data
ydata (jax.Array) – Dependent variable data
n_params (int) – Number of parameters
workflow (str) – Workflow hint (‘auto’, ‘streaming’, ‘hybrid’, ‘normal’)
memory_limit_mb (float | None) – Override for memory limit detection
force_streaming (bool) – If True, always use streaming
- Returns:
StreamingDecision with strategy and configuration
- Raises:
MemoryError – If dataset too large even for streaming
- Return type:
StreamingDecision
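A decision of this kind might compare the estimated requirement against a budget of safety_factor times the available memory. The sketch below is purely illustrative: the helper name, the returned strategy strings, and the 4x cutoff between hybrid and streaming are assumptions, not values taken from nlsq.

```python
def choose_strategy(required_mb, available_mb, safety_factor=0.75, force_streaming=False):
    """Hypothetical helper: choose an execution strategy from memory pressure."""
    if not 0.0 < safety_factor <= 1.0:
        raise ValueError("safety_factor must be in (0, 1]")
    # Only a fraction of the available memory is treated as usable.
    budget_mb = available_mb * safety_factor
    if force_streaming:
        return "streaming"
    if required_mb <= budget_mb:
        return "normal"
    # Illustrative cutoff only: moderate overshoot goes hybrid, the rest streams.
    if required_mb <= 4 * budget_mb:
        return "hybrid"
    return "streaming"


fits_in_memory = choose_strategy(required_mb=500, available_mb=1000)
moderate_overshoot = choose_strategy(required_mb=2000, available_mb=1000)
large_overshoot = choose_strategy(required_mb=5000, available_mb=1000)
```

With 1000 MB available and the default safety factor, the budget is 750 MB, so the three calls resolve to the normal, hybrid, and streaming branches respectively.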
- estimate_memory(n_data, n_params, dtype_bytes=8)
Estimate memory requirement in MB.
Accounts for:
- Data arrays (x, y, residuals)
- Jacobian matrix (n_data x n_params)
- Working arrays for optimization
- JAX compilation overhead
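The terms listed above suggest a simple back-of-the-envelope estimate. In this sketch the helper name and the `overhead_factor` multiplier, which lumps working arrays and JAX compilation overhead into a single number, are assumptions; the real method may weight these terms differently.

```python
def estimate_memory_mb(n_data, n_params, dtype_bytes=8, overhead_factor=1.5):
    """Hypothetical helper: rough memory estimate in MB for one fit."""
    # Three data-sized arrays: x, y, and residuals.
    data_bytes = 3 * n_data * dtype_bytes
    # Dense Jacobian with one row per data point and one column per parameter.
    jacobian_bytes = n_data * n_params * dtype_bytes
    # Assumed multiplier covering working arrays and compilation overhead.
    total_bytes = (data_bytes + jacobian_bytes) * overhead_factor
    return total_bytes / (1024**2)


baseline_mb = estimate_memory_mb(1_048_576, 5, overhead_factor=1.0)
padded_mb = estimate_memory_mb(1_048_576, 5)
```

For 1,048,576 float64 points and 5 parameters, the data and Jacobian take (3 + 5) * 8 bytes per point, which is 64 MB before overhead and 96 MB with the assumed 1.5x multiplier.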
- get_available_memory()
Get available system memory in MB.
The value is detected once and cached for the lifetime of the coordinator (one streaming decision is made per fit).
- Returns:
Available memory in MB
- Return type: