nlsq.streaming.phases¶

Streaming optimization phase classes for large-scale curve fitting.

This subpackage contains the phase classes extracted from the AdaptiveHybridStreamingOptimizer, enabling modular streaming optimization workflows. The implementations are organized by optimization phase:

Phase 0: Setup and normalization
Phase 1: L-BFGS warmup (WarmupPhase)
Phase 2: Gauss-Newton streaming optimization (GaussNewtonPhase)
Phase 3: Finalization and denormalization

The orchestrator coordinates phase transitions based on convergence criteria.

class nlsq.streaming.phases.CheckpointManager(config)[source]¶

Bases: object

Manages checkpoint save/load operations for streaming optimizer.

This class encapsulates all checkpoint I/O logic, including:

- HDF5 file format versioning
- Optimizer state serialization
- Tournament and normalizer state handling
- Guard clause-based validation

Parameters:: config (HybridStreamingConfig) – Configuration for streaming optimization.

config¶

Configuration object.

Type:: HybridStreamingConfig

version¶

Checkpoint format version (currently “3.0”).

Type:: str

VERSION = '3.0'¶

__init__(config)[source]¶

Initialize CheckpointManager.

Parameters:: config (HybridStreamingConfig) – Configuration for streaming optimization.

save(checkpoint_path, state)[source]¶

Save checkpoint with phase-specific state to HDF5 file.

Parameters:

checkpoint_path (str or Path) – Path to checkpoint file (.h5).
state (CheckpointState) – Optimizer state to save.

Notes

Checkpoint format version 3.0 includes:

- current_phase: Current phase number
- normalized_params: Parameters in normalized space
- phase1_optimizer_state: Optax L-BFGS state
- phase2_jtj_accumulator: Accumulated J^T J matrix
- phase2_jtr_accumulator: Accumulated J^T r vector
- best_params_global: Best parameters found globally
- best_cost_global: Best cost value globally
- phase_history: Complete phase history

load(checkpoint_path, global_config=None)[source]¶

Load checkpoint and restore phase-specific state.

Parameters:

checkpoint_path (str or Path) – Path to checkpoint file (.h5).
global_config (GlobalOptimizationConfig | None) – Configuration for tournament reconstruction.

Returns:

Restored optimizer state.

Return type:

CheckpointState

Raises:

FileNotFoundError – If checkpoint file does not exist.
ValueError – If checkpoint version is incompatible.
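The guard-clause style of validation described for load can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's implementation: it stores the state as JSON rather than HDF5 so that it runs with the standard library alone, and the names save_checkpoint, load_checkpoint, and SUPPORTED_VERSION are hypothetical.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical constant mirroring CheckpointManager.VERSION.
SUPPORTED_VERSION = "3.0"


def save_checkpoint(checkpoint_path, state):
    """Write a version-tagged checkpoint (JSON stand-in for HDF5)."""
    payload = {"version": SUPPORTED_VERSION, "state": state}
    Path(checkpoint_path).write_text(json.dumps(payload))


def load_checkpoint(checkpoint_path):
    """Load a checkpoint, rejecting bad input early with guard clauses."""
    path = Path(checkpoint_path)
    # Guard 1: the file must exist.
    if not path.exists():
        raise FileNotFoundError(f"Checkpoint not found: {path}")
    payload = json.loads(path.read_text())
    # Guard 2: the stored format version must match the supported one.
    version = payload.get("version")
    if version != SUPPORTED_VERSION:
        raise ValueError(f"Incompatible checkpoint version: {version!r}")
    return payload["state"]


# Round-trip a small state dictionary through a temporary file.
with tempfile.TemporaryDirectory() as tmp_dir:
    checkpoint_file = Path(tmp_dir) / "checkpoint.json"
    save_checkpoint(checkpoint_file, {"current_phase": 2, "best_cost_global": 0.5})
    restored = load_checkpoint(checkpoint_file)
```

Placing each check before any state is reconstructed means a missing or incompatible file fails fast with the exception types listed above.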

class nlsq.streaming.phases.CheckpointState(current_phase, normalized_params, phase1_optimizer_state, phase2_JTJ_accumulator, phase2_JTr_accumulator, best_params_global, best_cost_global, phase_history, normalizer, tournament_selector, multistart_candidates)[source]¶

Bases: object

State container for checkpoint save/load operations.

This dataclass captures all optimizer state that needs to be persisted across checkpoint boundaries.

current_phase¶

Current optimization phase (0-3).

Type:: int

normalized_params¶

Parameters in normalized space.

Type:: Array | None

phase1_optimizer_state¶

Optax L-BFGS optimizer state.

Type:: Any | None

phase2_JTJ_accumulator¶

Accumulated J^T J matrix for Phase 2.

Type:: Array | None

phase2_JTr_accumulator¶

Accumulated J^T r vector for Phase 2.

Type:: Array | None

best_params_global¶

Best parameters found globally.

Type:: Array | None

best_cost_global¶

Best cost value globally.

Type:: float

phase_history¶

Complete phase history.

Type:: list[dict[str, Any]]

normalizer¶

Parameter normalizer (strategy, scales, offsets).

Type:: ParameterNormalizer | None

tournament_selector¶

Tournament selector for multi-start optimization.

Type:: TournamentSelector | None

multistart_candidates¶

Multi-start candidate parameters.

Type:: Array | None

current_phase: int¶

normalized_params: Array | None¶

phase1_optimizer_state: Any | None¶

phase2_JTJ_accumulator: Array | None¶

phase2_JTr_accumulator: Array | None¶

best_params_global: Array | None¶

best_cost_global: float¶

phase_history: list[dict[str, Any]]¶

normalizer: ParameterNormalizer | None¶

tournament_selector: TournamentSelector | None¶

multistart_candidates: Array | None¶

__init__(current_phase, normalized_params, phase1_optimizer_state, phase2_JTJ_accumulator, phase2_JTr_accumulator, best_params_global, best_cost_global, phase_history, normalizer, tournament_selector, multistart_candidates)¶

class nlsq.streaming.phases.GNResult(params, cost, iterations, converged, jacobian=None, cov=None)[source]¶

Bases: object

Result from streaming Gauss-Newton phase.

params¶

Final optimized parameters.

Type:: Array

cost¶

Final cost value.

Type:: float

iterations¶

Number of GN iterations.

Type:: int

converged¶

Whether GN converged.

Type:: bool

jacobian¶

Final Jacobian matrix (optional).

Type:: Array | None

cov¶

Parameter covariance matrix (optional).

Type:: Array | None

params: Array¶

cost: float¶

iterations: int¶

converged: bool¶

jacobian: Array | None¶

cov: Array | None¶

__init__(params, cost, iterations, converged, jacobian=None, cov=None)¶

class nlsq.streaming.phases.GaussNewtonPhase(config, normalized_model, normalized_bounds=None)[source]¶

Bases: object

Phase 2: Streaming Gauss-Newton with implicit J^T J.

This class encapsulates the streaming Gauss-Newton logic for large dataset optimization. It streams over the full dataset in chunks, accumulates J^T J and J^T r, and then solves the normal equations.

Parameters:

config (HybridStreamingConfig) – Configuration for streaming optimization.
normalized_model (NormalizedModelWrapper) – Model wrapper operating in normalized parameter space.
normalized_bounds (tuple of array_like or None) – Parameter bounds in normalized space.

config¶

Configuration object.

Type:: HybridStreamingConfig

normalized_model¶

Normalized model wrapper.

Type:: NormalizedModelWrapper

normalized_bounds¶

Bounds in normalized space.

Type:: tuple of array_like or None

Notes

The Gauss-Newton method iteratively solves:

(J^T J) @ step = J^T r

where J is the Jacobian and r is the residual vector.
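To illustrate the chunked accumulation, the sketch below sums J^T J and J^T r over chunks and then solves the normal equations once. It assumes the sign convention r = y - f(x; p) with J the Jacobian of the model f, so the update is p + step. The linear model, the helper names, and the 2x2 solver are illustrative choices and not part of nlsq; a linear model is used because a single Gauss-Newton step is then exact.

```python
def model(x, params):
    """Illustrative linear model f(x; a, b) = a * x + b."""
    a, b = params
    return a * x + b


def jacobian_row(x, params):
    """Derivatives of the model with respect to (a, b) at one point."""
    return [x, 1.0]


def accumulate(chunks, params):
    """Sum J^T J and J^T r over all chunks without storing the full J."""
    n = len(params)
    jtj = [[0.0] * n for _ in range(n)]
    jtr = [0.0] * n
    for xs, ys in chunks:
        for x, y in zip(xs, ys):
            row = jacobian_row(x, params)
            r = y - model(x, params)  # assumed convention: r = y - f
            for i in range(n):
                jtr[i] += row[i] * r
                for j in range(n):
                    jtj[i][j] += row[i] * row[j]
    return jtj, jtr


def solve_2x2(a, b):
    """Solve a 2x2 linear system a @ x = b with Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [
        (b[0] * a[1][1] - a[0][1] * b[1]) / det,
        (a[0][0] * b[1] - a[1][0] * b[0]) / det,
    ]


# Noiseless data from y = 3x + 1, split into three chunks.
xs = [float(i) for i in range(8)]
ys = [3.0 * x + 1.0 for x in xs]
chunks = [(xs[0:3], ys[0:3]), (xs[3:6], ys[3:6]), (xs[6:8], ys[6:8])]

params = [0.0, 0.0]
jtj, jtr = accumulate(chunks, params)
step = solve_2x2(jtj, jtr)  # (J^T J) @ step = J^T r
params = [p + s for p, s in zip(params, step)]
```

Only the small accumulators (n x n and n entries for n parameters) are kept in memory, which is what makes the approach suitable for datasets that do not fit in memory at once.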

__init__(config, normalized_model, normalized_bounds=None)[source]¶

Initialize GaussNewtonPhase.

Parameters:

config (HybridStreamingConfig) – Configuration for streaming optimization.
normalized_model (NormalizedModelWrapper) – Model wrapper operating in normalized parameter space.
normalized_bounds (tuple of array_like or None) – Parameter bounds in normalized space.

phase2_JTJ_accumulator: Array | None¶

phase2_JTr_accumulator: Array | None¶

set_jacobian_fn(fn)[source]¶

Set pre-compiled Jacobian function.

set_cost_fn(fn)[source]¶

Set pre-compiled cost function.

run(data_source, initial_params, phase_history, best_tracker)[source]¶

Run Phase 2 streaming Gauss-Newton optimization.

Parameters:

data_source (tuple of array_like) – Full dataset as (x_data, y_data).
initial_params (array_like) – Starting parameters in normalized space (from Phase 1).
phase_history (list) – Phase history list to append records to.
best_tracker (dict) – Dictionary tracking best_params_global and best_cost_global.

Returns:

result – Phase 2 optimization result with keys:

- ‘final_params’: Final parameters in normalized space
- ‘best_params’: Best parameters found
- ‘best_cost’: Best cost achieved
- ‘final_cost’: Final cost value
- ‘iterations’: Number of Gauss-Newton iterations
- ‘convergence_reason’: Why optimization stopped
- ‘gradient_norm’: Final gradient norm
- ‘JTJ_final’: Final accumulated J^T J matrix (for Phase 3)
- ‘residual_sum_sq’: Total residual sum of squares (for Phase 3)
- ‘gn_result’: GNResult dataclass instance

Return type:

dict
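One way the best_tracker dictionary could be maintained is sketched below. Only the keys best_params_global and best_cost_global come from the parameter description above; the helper update_best_tracker and the sample values are hypothetical.

```python
def update_best_tracker(best_tracker, params, cost):
    """Record params and cost if cost improves on the tracked best."""
    if cost < best_tracker["best_cost_global"]:
        best_tracker["best_params_global"] = list(params)
        best_tracker["best_cost_global"] = cost
        return True
    return False


# Start with no known best, then feed in three candidate results.
best_tracker = {"best_params_global": None, "best_cost_global": float("inf")}
update_best_tracker(best_tracker, [1.0, 2.0], 4.0)  # improves on infinity
update_best_tracker(best_tracker, [1.5, 2.5], 9.0)  # worse, ignored
update_best_tracker(best_tracker, [0.9, 2.1], 1.0)  # improves again
```

Because the same dictionary is passed to each phase, the best result survives even when a later phase ends at a worse point.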

class nlsq.streaming.phases.PhaseOrchestrator(config)[source]¶

Bases: object

Orchestrates the multi-phase streaming optimization workflow.

The orchestrator coordinates:

- Phase 0: Setup (normalization, validation)
- Phase 1: L-BFGS warmup (WarmupPhase)
- Phase 2: Streaming Gauss-Newton (GaussNewtonPhase)
- Phase 3: Finalization (denormalization, covariance)

Parameters:: config (HybridStreamingConfig) – Configuration for streaming optimization.

config¶

Configuration object.

Type:: HybridStreamingConfig

warmup_phase¶

The warmup phase handler (lazy initialized).

Type:: WarmupPhase or None

gn_phase¶

The Gauss-Newton phase handler (lazy initialized).

Type:: GaussNewtonPhase or None

phase_history¶

Records of phase transitions.

Type:: list

__init__(config)[source]¶

Initialize PhaseOrchestrator.

Parameters:: config (HybridStreamingConfig) – Configuration for streaming optimization.

initialize_phases(normalized_model, normalized_bounds=None)[source]¶

Initialize phase handlers.

Parameters:

normalized_model (NormalizedModelWrapper) – Model wrapper operating in normalized parameter space.
normalized_bounds (tuple of array_like or None) – Parameter bounds in normalized space.

run(data_source, initial_params, normalizer=None)[source]¶

Run the full multi-phase optimization workflow.

Parameters:

data_source (tuple of array_like) – Data as (x_data, y_data).
initial_params (array_like) – Initial parameters in normalized space.
normalizer (ParameterNormalizer or None) – For denormalization in Phase 3.

Returns:

result – Complete optimization result with keys:

- ‘final_params’: Final parameters in original space
- ‘normalized_params’: Final parameters in normalized space
- ‘best_cost’: Best cost achieved
- ‘warmup_result’: WarmupResult from Phase 1
- ‘gn_result’: GNResult from Phase 2
- ‘phase_history’: List of phase records
- ‘JTJ_final’: Final J^T J matrix
- ‘residual_sum_sq’: Final residual sum of squares

Return type:

dict

get_phase_history()[source]¶

Get the phase transition history.

Returns:: phase_history – List of phase transition records.
Return type:: list

get_best_params()[source]¶

Get the best parameters found across all phases.

Returns:: best_params – Best parameters in normalized space.
Return type:: array_like or None

get_best_cost()[source]¶

Get the best cost found across all phases.

Returns:: best_cost – Best cost value.
Return type:: float

class nlsq.streaming.phases.PhaseOrchestratorResult(params, normalized_params, cost, warmup_result, gn_result, phase_history, total_time)[source]¶

Bases: object

Complete result from phase orchestration.

params¶

Final optimized parameters in original space.

Type:: Array

normalized_params¶

Final parameters in normalized space.

Type:: Array

cost¶

Final cost value.

Type:: float

warmup_result¶

Result from Phase 1 warmup.

Type:: WarmupResult | None

gn_result¶

Result from Phase 2 Gauss-Newton.

Type:: GNResult | None

phase_history¶

List of phase transition records.

Type:: list[dict[str, Any]]

total_time¶

Total optimization time.

Type:: float

params: Array¶

normalized_params: Array¶

cost: float¶

warmup_result: WarmupResult | None¶

gn_result: GNResult | None¶

phase_history: list[dict[str, Any]]¶

total_time: float¶

__init__(params, normalized_params, cost, warmup_result, gn_result, phase_history, total_time)¶

class nlsq.streaming.phases.WarmupPhase(config, normalized_model)[source]¶

Bases: object

Phase 1: L-BFGS warmup for initial convergence.

This class encapsulates the L-BFGS warmup logic that provides warm-started parameters for the streaming Gauss-Newton phase.

L-BFGS provides 5-10x faster convergence to the basin of attraction compared to first-order warmup by using approximate second-order (Hessian) information.

Parameters:

config (HybridStreamingConfig) – Configuration for streaming optimization.
normalized_model (NormalizedModelWrapper) – Model wrapper operating in normalized parameter space.

config¶

Configuration object.

Type:: HybridStreamingConfig

normalized_model¶

Normalized model wrapper.

Type:: NormalizedModelWrapper

Notes

The 4-Layer Defense Strategy is implemented to prevent warmup divergence:

- Layer 1: Warm start detection (skip if already near optimum)
- Layer 2: Adaptive initial step size based on relative loss
- Layer 3: Cost-increase guard (abort if loss increases beyond tolerance)
- Layer 4: Trust region constraint (clip update magnitude)
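The four layers can be pictured as small independent checks, as in the sketch below. Every function name, threshold, and default value here is a hypothetical placeholder chosen for illustration; the actual criteria are controlled by HybridStreamingConfig and may differ.

```python
import math


def should_skip_warmup(relative_loss, threshold=0.01):
    """Layer 1: skip warmup when the start is already near the optimum."""
    return relative_loss < threshold


def initial_step_size(relative_loss, small=1e-3, large=1e-1, cutoff=0.1):
    """Layer 2: take smaller first steps when the relative loss is low."""
    return small if relative_loss < cutoff else large


def cost_increased_too_much(initial_cost, current_cost, tolerance=0.05):
    """Layer 3: signal an abort if the cost grew beyond the tolerance."""
    return current_cost > initial_cost * (1.0 + tolerance)


def clip_update(update, max_norm=0.1):
    """Layer 4: rescale the update so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(u * u for u in update))
    if norm <= max_norm:
        return list(update)
    scale = max_norm / norm
    return [u * scale for u in update]


# An update of norm 5 is scaled back to norm 1; a small one is untouched.
clipped = clip_update([3.0, 4.0], max_norm=1.0)
unchanged = clip_update([0.01, 0.02], max_norm=1.0)
```

Layers 1 and 2 act before the first iteration, while Layers 3 and 4 act on every iteration, so a bad step is both bounded in size and detected if it still raises the cost.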

__init__(config, normalized_model)[source]¶

Initialize WarmupPhase.

Parameters:

config (HybridStreamingConfig) – Configuration for streaming optimization.
normalized_model (NormalizedModelWrapper) – Model wrapper operating in normalized parameter space.

set_residual_weights(weights)[source]¶

Set per-group residual weights for weighted least squares.

Parameters:: weights (array_like or None) – Per-group weights for weighted MSE computation.

run(data_source, initial_params, phase_history, best_tracker)[source]¶

Run Phase 1 L-BFGS warmup.

Parameters:

data_source (tuple of array_like) – Data source as (x_data, y_data).
initial_params (array_like) – Initial parameters in normalized space.
phase_history (list) – Phase history list to append records to.
best_tracker (dict) – Dictionary tracking best_params_global and best_cost_global.

Returns:

result – Phase 1 result with keys:

- ‘final_params’: Final parameters in normalized space
- ‘best_params’: Best parameters found during warmup
- ‘best_loss’: Best loss value
- ‘final_loss’: Final loss value
- ‘iterations’: Number of iterations performed
- ‘switch_reason’: Reason for switching to Phase 2
- ‘warmup_result’: WarmupResult dataclass instance

Return type:

dict

class nlsq.streaming.phases.WarmupResult(params, cost, iterations, converged, cost_history)[source]¶

Bases: object

Result from L-BFGS warmup phase.

params¶

Optimized parameters after warmup.

Type:: Array

cost¶

Final cost after warmup.

Type:: float

iterations¶

Number of warmup iterations performed.

Type:: int

converged¶

Whether warmup converged.

Type:: bool

cost_history¶

Cost at each iteration.

Type:: list[float]

params: Array¶

cost: float¶

iterations: int¶

converged: bool¶

cost_history: list[float]¶

__init__(params, cost, iterations, converged, cost_history)¶

Phase Classes Overview¶

WarmupPhase¶

L-BFGS warmup phase for initial parameter optimization. See WarmupPhase and WarmupResult.

GaussNewtonPhase¶

Streaming Gauss-Newton phase for refined optimization. See GaussNewtonPhase and GNResult.

PhaseOrchestrator¶

Coordinates the multi-phase optimization workflow. See PhaseOrchestrator and PhaseOrchestratorResult.

CheckpointManager¶

Manages checkpoint save/restore for crash recovery. See CheckpointManager and CheckpointState.