nlsq.streaming.phases

Streaming optimization phase classes for large-scale curve fitting.

This subpackage contains the phase classes extracted from the AdaptiveHybridStreamingOptimizer, enabling modular streaming optimization workflows. The implementations are organized by optimization phase:

  • Phase 0: Setup and normalization

  • Phase 1: L-BFGS warmup (WarmupPhase)

  • Phase 2: Gauss-Newton streaming optimization (GaussNewtonPhase)

  • Phase 3: Finalization and denormalization

The orchestrator coordinates phase transitions based on convergence criteria.

class nlsq.streaming.phases.CheckpointManager(config)[source]

Bases: object

Manages checkpoint save/load operations for the streaming optimizer.

This class encapsulates all checkpoint I/O logic, including:

  • HDF5 file format versioning

  • Optimizer state serialization

  • Tournament and normalizer state handling

  • Guard clause-based validation

Parameters:

config (HybridStreamingConfig) – Configuration for streaming optimization.

config

Configuration object.

Type:

HybridStreamingConfig

version

Checkpoint format version (currently “3.0”).

Type:

str

VERSION = '3.0'
__init__(config)[source]

Initialize CheckpointManager.

Parameters:

config (HybridStreamingConfig) – Configuration for streaming optimization.

save(checkpoint_path, state)[source]

Save checkpoint with phase-specific state to HDF5 file.

Parameters:
  • checkpoint_path (str or Path) – Path to checkpoint file (.h5).

  • state (CheckpointState) – Optimizer state to save.

Notes

Checkpoint format version 3.0 includes:

  • current_phase: Current phase number

  • normalized_params: Parameters in normalized space

  • phase1_optimizer_state: Optax L-BFGS state

  • phase2_jtj_accumulator: Accumulated J^T J matrix

  • phase2_jtr_accumulator: Accumulated J^T r vector

  • best_params_global: Best parameters found globally

  • best_cost_global: Best cost value globally

  • phase_history: Complete phase history
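The versioning and guard-clause validation described for CheckpointManager can be illustrated with a minimal sketch. This is not the library's implementation: JSON stands in for HDF5 so the example needs only the standard library, and the names save_checkpoint, load_checkpoint and FORMAT_VERSION are hypothetical. Only the version string "3.0" and the field names come from the documentation above.

```python
import json
import os
import tempfile

FORMAT_VERSION = "3.0"  # version string documented for the checkpoint format


def save_checkpoint(path, state):
    """Write the state together with a format version tag (JSON stand-in for HDF5)."""
    payload = {"version": FORMAT_VERSION, "state": state}
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(payload, fh)


def load_checkpoint(path):
    """Load a checkpoint, rejecting invalid input early via guard clauses."""
    if not os.path.exists(path):
        raise FileNotFoundError(f"checkpoint not found: {path}")
    with open(path, encoding="utf-8") as fh:
        payload = json.load(fh)
    if payload.get("version") != FORMAT_VERSION:
        raise ValueError(f"unsupported checkpoint version: {payload.get('version')!r}")
    state = payload.get("state")
    if not isinstance(state, dict) or "current_phase" not in state:
        raise ValueError("checkpoint is missing 'current_phase'")
    return state


# Round trip through a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "checkpoint.json")
    save_checkpoint(path, {"current_phase": 2, "best_cost_global": 0.125})
    restored = load_checkpoint(path)
```

Each guard clause fails fast with a specific error, so the happy path at the end of load_checkpoint stays unnested.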

load(checkpoint_path, global_config=None)[source]

Load checkpoint and restore phase-specific state.

Parameters:
  • checkpoint_path (str or Path) – Path to checkpoint file (.h5).

  • global_config (GlobalOptimizationConfig | None) – Configuration for tournament reconstruction.

Returns:

Restored optimizer state.

Return type:

CheckpointState

Raises:
class nlsq.streaming.phases.CheckpointState(current_phase, normalized_params, phase1_optimizer_state, phase2_JTJ_accumulator, phase2_JTr_accumulator, best_params_global, best_cost_global, phase_history, normalizer, tournament_selector, multistart_candidates)[source]

Bases: object

State container for checkpoint save/load operations.

This dataclass captures all optimizer state that needs to be persisted across checkpoint boundaries.

current_phase

Current optimization phase (0-3).

Type:

int

normalized_params

Parameters in normalized space.

Type:

Array | None

phase1_optimizer_state

Optax L-BFGS optimizer state.

Type:

Any | None

phase2_JTJ_accumulator

Accumulated J^T J matrix for Phase 2.

Type:

Array | None

phase2_JTr_accumulator

Accumulated J^T r vector for Phase 2.

Type:

Array | None

best_params_global

Best parameters found globally.

Type:

Array | None

best_cost_global

Best cost value globally.

Type:

float

phase_history

Complete phase history.

Type:

list[dict[str, Any]]

normalizer

Parameter normalizer (strategy, scales, offsets).

Type:

ParameterNormalizer | None

tournament_selector

Tournament selector for multi-start optimization.

Type:

TournamentSelector | None

multistart_candidates

Multi-start candidate parameters.

Type:

Array | None

current_phase: int
normalized_params: Array | None
phase1_optimizer_state: Any | None
phase2_JTJ_accumulator: Array | None
phase2_JTr_accumulator: Array | None
best_params_global: Array | None
best_cost_global: float
phase_history: list[dict[str, Any]]
normalizer: ParameterNormalizer | None
tournament_selector: TournamentSelector | None
multistart_candidates: Array | None
__init__(current_phase, normalized_params, phase1_optimizer_state, phase2_JTJ_accumulator, phase2_JTr_accumulator, best_params_global, best_cost_global, phase_history, normalizer, tournament_selector, multistart_candidates)
class nlsq.streaming.phases.GNResult(params, cost, iterations, converged, jacobian=None, cov=None)[source]

Bases: object

Result from streaming Gauss-Newton phase.

params

Final optimized parameters.

Type:

Array

cost

Final cost value.

Type:

float

iterations

Number of GN iterations.

Type:

int

converged

Whether GN converged.

Type:

bool

jacobian

Final Jacobian matrix (optional).

Type:

Array | None

cov

Parameter covariance matrix (optional).

Type:

Array | None

params: Array
cost: float
iterations: int
converged: bool
jacobian: Array | None
cov: Array | None
__init__(params, cost, iterations, converged, jacobian=None, cov=None)
class nlsq.streaming.phases.GaussNewtonPhase(config, normalized_model, normalized_bounds=None)[source]

Bases: object

Phase 2: Streaming Gauss-Newton with implicit J^T J.

This class encapsulates the streaming Gauss-Newton logic for large dataset optimization. On each iteration it streams over the full dataset in chunks, accumulates J^T J and J^T r, and then solves the normal equations.

Parameters:
  • config (HybridStreamingConfig) – Configuration for streaming optimization.

  • normalized_model (NormalizedModelWrapper) – Model wrapper operating in normalized parameter space.

  • normalized_bounds (tuple of array_like or None) – Parameter bounds in normalized space.

config

Configuration object.

Type:

HybridStreamingConfig

normalized_model

Normalized model wrapper.

Type:

NormalizedModelWrapper

normalized_bounds

Bounds in normalized space.

Type:

tuple of array_like or None

Notes

The Gauss-Newton method iteratively solves:

(J^T J) @ step = J^T r

where J is the Jacobian and r is the residual vector.
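The chunked accumulation can be sketched in plain Python for a two-parameter model. This is an illustration under stated assumptions, not the library's JAX code: the model a * exp(-b * x), the chunk size, and the function names are all hypothetical; the residual is taken as data minus model and J as the Jacobian of the model, so the step is added to the parameters; and the 2x2 system is solved with Cramer's rule to avoid dependencies.

```python
import math


def model(x, a, b):
    return a * math.exp(-b * x)


def accumulate_chunk(xs, ys, a, b, jtj, jtr):
    """Add one chunk's contribution to J^T J (2x2) and J^T r (length 2)."""
    for x, y in zip(xs, ys):
        e = math.exp(-b * x)
        j0, j1 = e, -a * x * e  # d(model)/da, d(model)/db
        r = y - a * e           # residual = data - model
        jtj[0][0] += j0 * j0
        jtj[0][1] += j0 * j1
        jtj[1][1] += j1 * j1
        jtr[0] += j0 * r
        jtr[1] += j1 * r
    jtj[1][0] = jtj[0][1]       # keep the matrix symmetric


def gauss_newton_step(x_data, y_data, a, b, chunk_size=16):
    """One iteration: stream over chunks, then solve (J^T J) step = J^T r."""
    jtj = [[0.0, 0.0], [0.0, 0.0]]
    jtr = [0.0, 0.0]
    for start in range(0, len(x_data), chunk_size):
        stop = start + chunk_size
        accumulate_chunk(x_data[start:stop], y_data[start:stop], a, b, jtj, jtr)
    det = jtj[0][0] * jtj[1][1] - jtj[0][1] * jtj[1][0]
    da = (jtr[0] * jtj[1][1] - jtj[0][1] * jtr[1]) / det
    db = (jtj[0][0] * jtr[1] - jtj[1][0] * jtr[0]) / det
    return a + da, b + db


# Noise-free synthetic data generated with a = 2.0, b = 0.5.
x_data = [0.1 * i for i in range(50)]
y_data = [model(x, 2.0, 0.5) for x in x_data]

a, b = 1.5, 0.4  # starting point, standing in for the warmup output
for _ in range(10):
    a, b = gauss_newton_step(x_data, y_data, a, b)
```

Only the 2x2 matrix and the length-2 vector persist between chunks, so memory use is independent of the dataset size; the full Jacobian is never stored.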

__init__(config, normalized_model, normalized_bounds=None)[source]

Initialize GaussNewtonPhase.

Parameters:
  • config (HybridStreamingConfig) – Configuration for streaming optimization.

  • normalized_model (NormalizedModelWrapper) – Model wrapper operating in normalized parameter space.

  • normalized_bounds (tuple of array_like or None) – Parameter bounds in normalized space.

phase2_JTJ_accumulator: Array | None
phase2_JTr_accumulator: Array | None
set_jacobian_fn(fn)[source]

Set pre-compiled Jacobian function.

set_cost_fn(fn)[source]

Set pre-compiled cost function.

run(data_source, initial_params, phase_history, best_tracker)[source]

Run Phase 2 streaming Gauss-Newton optimization.

Parameters:
  • data_source (tuple of array_like) – Full dataset as (x_data, y_data).

  • initial_params (array_like) – Starting parameters in normalized space (from Phase 1).

  • phase_history (list) – Phase history list to append records to.

  • best_tracker (dict) – Dictionary tracking best_params_global and best_cost_global.

Returns:

result – Phase 2 optimization result with keys:

  • ‘final_params’: Final parameters in normalized space

  • ‘best_params’: Best parameters found

  • ‘best_cost’: Best cost achieved

  • ‘final_cost’: Final cost value

  • ‘iterations’: Number of Gauss-Newton iterations

  • ‘convergence_reason’: Why optimization stopped

  • ‘gradient_norm’: Final gradient norm

  • ‘JTJ_final’: Final accumulated J^T J matrix (for Phase 3)

  • ‘residual_sum_sq’: Total residual sum of squares (for Phase 3)

  • ‘gn_result’: GNResult dataclass instance

Return type:

dict
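The ‘JTJ_final’ and ‘residual_sum_sq’ entries are marked as inputs for Phase 3, which handles covariance. The textbook least-squares estimate built from these two quantities is cov = s^2 (J^T J)^-1 with s^2 = RSS / (n - p). Whether the library uses exactly this formula is an assumption; the sketch below shows the textbook version for two parameters, and the function name is hypothetical.

```python
def covariance_from_normal_matrix(jtj, residual_sum_sq, n_points):
    """Textbook estimate: cov = RSS / (n - p) * inverse(J^T J), for p = 2."""
    n_params = 2
    dof = n_points - n_params
    if dof <= 0:
        raise ValueError("need more data points than parameters")
    (a, b), (c, d) = jtj
    det = a * d - b * c
    if det == 0.0:
        raise ValueError("J^T J is singular; covariance is undefined")
    s2 = residual_sum_sq / dof  # residual variance estimate
    inverse = [[d / det, -b / det], [-c / det, a / det]]
    return [[s2 * v for v in row] for row in inverse]


cov = covariance_from_normal_matrix([[2.0, 0.0], [0.0, 4.0]], 8.0, 10)
```

If J^T J was accumulated in normalized parameter space, the resulting covariance would also be in normalized space and would need rescaling before being reported in original units.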

class nlsq.streaming.phases.PhaseOrchestrator(config)[source]

Bases: object

Orchestrates the multi-phase streaming optimization workflow.

The orchestrator coordinates:

  • Phase 0: Setup (normalization, validation)

  • Phase 1: L-BFGS warmup (WarmupPhase)

  • Phase 2: Streaming Gauss-Newton (GaussNewtonPhase)

  • Phase 3: Finalization (denormalization, covariance)

Parameters:

config (HybridStreamingConfig) – Configuration for streaming optimization.

config

Configuration object.

Type:

HybridStreamingConfig

warmup_phase

The warmup phase handler (lazily initialized).

Type:

WarmupPhase or None

gn_phase

The Gauss-Newton phase handler (lazily initialized).

Type:

GaussNewtonPhase or None

phase_history

Records of phase transitions.

Type:

list

__init__(config)[source]

Initialize PhaseOrchestrator.

Parameters:

config (HybridStreamingConfig) – Configuration for streaming optimization.

initialize_phases(normalized_model, normalized_bounds=None)[source]

Initialize phase handlers.

Parameters:
  • normalized_model (NormalizedModelWrapper) – Model wrapper operating in normalized parameter space.

  • normalized_bounds (tuple of array_like or None) – Parameter bounds in normalized space.

run(data_source, initial_params, normalizer=None)[source]

Run the full multi-phase optimization workflow.

Parameters:
  • data_source (tuple of array_like) – Data as (x_data, y_data).

  • initial_params (array_like) – Initial parameters in normalized space.

  • normalizer (ParameterNormalizer or None) – For denormalization in Phase 3.

Returns:

result – Complete optimization result with keys:

  • ‘final_params’: Final parameters in original space

  • ‘normalized_params’: Final parameters in normalized space

  • ‘best_cost’: Best cost achieved

  • ‘warmup_result’: WarmupResult from Phase 1

  • ‘gn_result’: GNResult from Phase 2

  • ‘phase_history’: List of phase records

  • ‘JTJ_final’: Final J^T J matrix

  • ‘residual_sum_sq’: Final residual sum of squares

Return type:

dict

get_phase_history()[source]

Get the phase transition history.

Returns:

phase_history – List of phase transition records.

Return type:

list

get_best_params()[source]

Get the best parameters found across all phases.

Returns:

best_params – Best parameters in normalized space.

Return type:

array_like or None

get_best_cost()[source]

Get the best cost found across all phases.

Returns:

best_cost – Best cost value.

Return type:

float

class nlsq.streaming.phases.PhaseOrchestratorResult(params, normalized_params, cost, warmup_result, gn_result, phase_history, total_time)[source]

Bases: object

Complete result from phase orchestration.

params

Final optimized parameters in original space.

Type:

Array

normalized_params

Final parameters in normalized space.

Type:

Array

cost

Final cost value.

Type:

float

warmup_result

Result from Phase 1 warmup.

Type:

WarmupResult | None

gn_result

Result from Phase 2 Gauss-Newton.

Type:

GNResult | None

phase_history

List of phase transition records.

Type:

list[dict[str, Any]]

total_time

Total optimization time.

Type:

float

params: Array
normalized_params: Array
cost: float
warmup_result: WarmupResult | None
gn_result: GNResult | None
phase_history: list[dict[str, Any]]
total_time: float
__init__(params, normalized_params, cost, warmup_result, gn_result, phase_history, total_time)
class nlsq.streaming.phases.WarmupPhase(config, normalized_model)[source]

Bases: object

Phase 1: L-BFGS warmup for initial convergence.

This class encapsulates the L-BFGS warmup logic that provides warm-started parameters for the streaming Gauss-Newton phase.

Compared to a first-order warmup, L-BFGS converges to the basin of attraction 5-10x faster because it uses approximate second-order (Hessian) information.

Parameters:
config

Configuration object.

Type:

HybridStreamingConfig

normalized_model

Normalized model wrapper.

Type:

NormalizedModelWrapper

Notes

The 4-Layer Defense Strategy is implemented to prevent warmup divergence:

  • Layer 1: Warm start detection (skip if already near optimum)

  • Layer 2: Adaptive initial step size based on relative loss

  • Layer 3: Cost-increase guard (abort if loss increases beyond tolerance)

  • Layer 4: Trust region constraint (clip update magnitude)
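Layers 3 and 4 can be illustrated with a minimal sketch. The function names, the relative-tolerance form of the cost check, and the Euclidean-norm clipping rule are assumptions made for illustration; the actual thresholds and configuration fields used by WarmupPhase are not documented here.

```python
import math


def cost_guard_tripped(initial_cost, current_cost, tolerance):
    """Layer 3 (sketch): True if the loss rose beyond the allowed relative tolerance."""
    return current_cost > initial_cost * (1.0 + tolerance)


def clip_step(step, max_norm):
    """Layer 4 (sketch): rescale the update so its Euclidean norm is at most max_norm."""
    norm = math.sqrt(sum(s * s for s in step))
    if norm <= max_norm or norm == 0.0:
        return list(step)
    scale = max_norm / norm
    return [s * scale for s in step]


clipped = clip_step([3.0, 4.0], 1.0)              # norm 5 is rescaled to norm 1
small = clip_step([0.1, 0.2], 1.0)                # already inside the region, unchanged
abort = cost_guard_tripped(1.0, 1.3, 0.2)         # 30% increase against a 20% tolerance
keep_going = cost_guard_tripped(1.0, 1.1, 0.2)    # 10% increase is tolerated
```

Clipping preserves the direction of the update and only limits its length, while the cost guard gives the warmup an exit condition if the loss starts to climb.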

__init__(config, normalized_model)[source]

Initialize WarmupPhase.

Parameters:
set_residual_weights(weights)[source]

Set per-group residual weights for weighted least squares.

Parameters:

weights (array_like or None) – Per-group weights for weighted MSE computation.

run(data_source, initial_params, phase_history, best_tracker)[source]

Run Phase 1 L-BFGS warmup.

Parameters:
  • data_source (tuple of array_like) – Data source as (x_data, y_data).

  • initial_params (array_like) – Initial parameters in normalized space.

  • phase_history (list) – Phase history list to append records to.

  • best_tracker (dict) – Dictionary tracking best_params_global and best_cost_global.

Returns:

result – Phase 1 result with keys:

  • ‘final_params’: Final parameters in normalized space

  • ‘best_params’: Best parameters found during warmup

  • ‘best_loss’: Best loss value

  • ‘final_loss’: Final loss value

  • ‘iterations’: Number of iterations performed

  • ‘switch_reason’: Reason for switching to Phase 2

  • ‘warmup_result’: WarmupResult dataclass instance

Return type:

dict

class nlsq.streaming.phases.WarmupResult(params, cost, iterations, converged, cost_history)[source]

Bases: object

Result from L-BFGS warmup phase.

params

Optimized parameters after warmup.

Type:

Array

cost

Final cost after warmup.

Type:

float

iterations

Number of warmup iterations performed.

Type:

int

converged

Whether warmup converged.

Type:

bool

cost_history

Cost at each iteration.

Type:

list[float]

params: Array
cost: float
iterations: int
converged: bool
cost_history: list[float]
__init__(params, cost, iterations, converged, cost_history)

Phase Classes Overview

WarmupPhase

L-BFGS warmup phase for initial parameter optimization. See WarmupPhase and WarmupResult.

GaussNewtonPhase

Streaming Gauss-Newton phase for refined optimization. See GaussNewtonPhase and GNResult.

PhaseOrchestrator

Coordinates the multi-phase optimization workflow. See PhaseOrchestrator and PhaseOrchestratorResult.

CheckpointManager

Manages checkpoint save/restore for crash recovery. See CheckpointManager and CheckpointState.