nlsq.adaptive_hybrid_streaming module

Adaptive Hybrid Streaming Optimizer with Parameter Normalization.

This module implements a four-phase hybrid optimizer that solves three fundamental issues in streaming optimization:

  1. Weak gradient signals from parameter scale imbalance (via normalization)

  2. Slow convergence near optimum (via Gauss-Newton)

  3. Crude covariance estimation (via exact J^T J accumulation)

The optimizer operates in four phases:

  • Phase 0: Parameter normalization setup

  • Phase 1: L-BFGS warmup with adaptive switching

  • Phase 2: Streaming Gauss-Newton with exact J^T J accumulation

  • Phase 3: Denormalization and covariance transform

This implementation focuses on Phase 0 setup logic and phase tracking infrastructure.

class nlsq.streaming.adaptive_hybrid.AdaptiveHybridStreamingOptimizer(config=None)[source]

Bases: object

Adaptive hybrid streaming optimizer with four-phase optimization.

This optimizer combines parameter normalization, L-BFGS warmup, streaming Gauss-Newton, and exact covariance computation to provide:

  • Fast convergence for parameters with different scales

  • Accurate uncertainty estimates on large datasets

  • Memory-efficient streaming for unlimited dataset sizes

  • Production-ready fault tolerance

The optimization proceeds through four phases:

  • Phase 0: Setup parameter normalization and bounds transformation

  • Phase 1: L-BFGS warmup with adaptive switching to Phase 2

  • Phase 2: Streaming Gauss-Newton with exact J^T J accumulation

  • Phase 3: Denormalize parameters and transform covariance matrix

Parameters:

config (HybridStreamingConfig, optional) – Configuration for all phases of optimization. If None, uses default configuration. See HybridStreamingConfig for details.

config

Configuration object controlling all phases

Type:

HybridStreamingConfig

current_phase

Current optimization phase (0, 1, 2, or 3)

Type:

int

phase_history

History of phase transitions with timing information

Type:

list

phase_start_time

Start time of current phase (seconds since epoch)

Type:

float or None

normalized_params

Current parameters in normalized space

Type:

jax.Array or None

normalizer

Parameter normalizer instance (created in Phase 0)

Type:

ParameterNormalizer or None

normalized_model

Wrapped model function operating in normalized space

Type:

NormalizedModelWrapper or None

normalized_bounds

Bounds transformed to normalized space

Type:

tuple of jax.Array or None

normalization_jacobian

Denormalization Jacobian for covariance transform

Type:

jax.Array or None

Examples

Basic usage with default configuration:

>>> from nlsq import AdaptiveHybridStreamingOptimizer, HybridStreamingConfig
>>> import jax.numpy as jnp
>>> config = HybridStreamingConfig()
>>> optimizer = AdaptiveHybridStreamingOptimizer(config)

With bounds-based normalization:

>>> config = HybridStreamingConfig(
...     normalize=True,
...     normalization_strategy='bounds'
... )
>>> optimizer = AdaptiveHybridStreamingOptimizer(config)

With custom warmup settings:

>>> config = HybridStreamingConfig(
...     warmup_iterations=300,
...     lbfgs_initial_step_size=0.5,
...     gauss_newton_tol=1e-10
... )
>>> optimizer = AdaptiveHybridStreamingOptimizer(config)

See also

HybridStreamingConfig

Configuration for all phases

ParameterNormalizer

Parameter normalization implementation

curve_fit

High-level interface with method='hybrid_streaming'

Notes

Based on Adaptive Hybrid Streaming Optimizer specification: agent-os/specs/2025-12-18-adaptive-hybrid-streaming-optimizer/spec.md

__init__(config=None)[source]

Initialize adaptive hybrid streaming optimizer.

Parameters:

config (HybridStreamingConfig, optional) – Configuration for all phases. If None, uses default configuration.

set_residual_weights(weights)[source]

Set residual weights for weighted least squares optimization.

This method allows updating weights during optimization, for example when weights need to be recomputed based on current parameter estimates.

Parameters:

weights (np.ndarray) – Per-group weights of shape (n_groups,). Higher weights give more importance to residuals in that group. The group index for each data point is determined by the first column of x_data.

Notes

Weights must be positive. The weighted MSE is computed as:

wMSE = sum(w[group_idx] * residuals^2) / sum(w[group_idx])
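
As a minimal sketch of the formula above (not the library's implementation), the following NumPy function computes the group-weighted MSE. The function name weighted_mse is hypothetical, and it assumes the group indices are integers already extracted from the first column of x_data.

```python
import numpy as np


def weighted_mse(residuals, group_idx, weights):
    """Group-weighted MSE: sum(w[g] * r^2) / sum(w[g])."""
    weights = np.asarray(weights, dtype=float)
    if np.any(weights <= 0):
        raise ValueError("weights must be positive")
    w = weights[np.asarray(group_idx, dtype=int)]  # per-point weight
    r = np.asarray(residuals, dtype=float)
    return float(np.sum(w * r**2) / np.sum(w))


# Two groups; group 1 counts three times as much as group 0.
residuals = np.array([1.0, 1.0, 2.0, 2.0])
group_idx = np.array([0, 0, 1, 1])
wmse = weighted_mse(residuals, group_idx, weights=[1.0, 3.0])
plain_mse = weighted_mse(residuals, group_idx, weights=[1.0, 1.0])
```

With equal weights the expression reduces to the ordinary mean of squared residuals, which is a quick sanity check for any weighting scheme.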

clear_cache()[source]

Release cached padded arrays to free memory.

Call this after optimization completes or when reusing the optimizer with different data. The cache is automatically invalidated when data identity changes, but explicit clearing frees memory sooner.

fit(data_source, func, p0, bounds=None, sigma=None, absolute_sigma=False, callback=None, verbose=1)[source]

Fit model parameters using four-phase hybrid optimization.

This method orchestrates all four phases:

  • Phase 0: Setup normalization

  • Phase 1: L-BFGS warmup

  • Phase 2: Streaming Gauss-Newton

  • Phase 3: Denormalization and covariance

Parameters:
  • data_source (various types) – Data source for optimization. Can be a tuple of arrays (x_data, y_data), a generator yielding (x_batch, y_batch), or an HDF5 file path with datasets

  • func (Callable) – Model function with signature: func(x, *params) -> predictions

  • p0 (array_like) – Initial parameter guess of shape (n_params,)

  • bounds (tuple of array_like, optional) – Parameter bounds as (lb, ub)

  • sigma (array_like, optional) – Uncertainties in y_data for weighted least squares

  • absolute_sigma (bool, default=False) – If True, sigma is used in absolute sense (pcov not scaled)

  • callback (Callable, optional) – Callback with signature callback(params, loss, iteration), called every config.callback_frequency iterations

  • verbose (int, default=1) – Verbosity level (0=silent, 1=progress, 2=debug)

Returns:

result – Optimization result dictionary with keys:

  • 'x': Optimized parameters in original space

  • 'success': Boolean indicating success

  • 'message': Status message

  • 'fun': Final residuals

  • 'pcov': Covariance matrix (Phase 3)

  • 'perr': Standard errors (Phase 3)

  • 'streaming_diagnostics': Phase information, timing, etc.

Return type:

dict

Notes

The result dictionary is compatible with scipy.optimize.curve_fit and can be used interchangeably.
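
The generator form of data_source can be illustrated with a small helper that slices in-memory arrays into (x_batch, y_batch) pairs. This is a sketch only: iter_batches is a hypothetical name, and whether fit consumes such a generator once or several times is not stated here.

```python
import numpy as np


def iter_batches(x, y, batch_size):
    """Yield consecutive (x_batch, y_batch) slices of the input arrays."""
    for start in range(0, len(x), batch_size):
        stop = start + batch_size
        yield x[start:stop], y[start:stop]


x = np.linspace(0.0, 10.0, 10)
y = 2.5 * np.exp(-0.3 * x) + 0.5
batches = list(iter_batches(x, y, batch_size=4))
batch_sizes = [len(xb) for xb, _ in batches]  # last batch may be shorter
```

The last batch is allowed to be shorter than batch_size, so no data points are dropped.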

property phase_status: dict[str, Any]

Get current phase status and history.

Returns:

status – Phase status dictionary with keys:

  • 'current_phase': Current phase number

  • 'phase_name': Name of current phase

  • 'phase_history': List of completed phases with timing

  • 'total_phases': Total number of phases (4)

Return type:

dict

Examples

>>> config = HybridStreamingConfig()
>>> optimizer = AdaptiveHybridStreamingOptimizer(config)
>>> status = optimizer.phase_status
>>> print(status['current_phase'])
0
>>> print(status['phase_name'])
Phase 0: Normalization Setup
class nlsq.streaming.adaptive_hybrid.DefenseLayerTelemetry[source]

Bases: object

Telemetry for monitoring 4-layer defense strategy activations.

Tracks when each defense layer is triggered during warmup to help with production monitoring and tuning. This class maintains thread-safe statistics that can be queried or exported for monitoring dashboards.

The 4 layers tracked are:
  • Layer 1: Warm start detection (skips warmup)

  • Layer 2: Adaptive step size selection (refinement/careful/exploration)

  • Layer 3: Cost-increase guard (aborts warmup if loss increases)

  • Layer 4: Step clipping (limits update magnitude)

layer1_warm_start_triggers

Count of warm start detection activations (warmup skipped)

Type:

int

layer2_lr_mode_counts

Counts per LR mode: {"refinement": n, "careful": m, "exploration": k}

Type:

dict[str, int]

layer3_cost_guard_triggers

Count of cost-increase guard aborts

Type:

int

layer4_clip_triggers

Count of step clipping activations

Type:

int

total_warmup_calls

Total number of warmup phase executions

Type:

int

__init__()[source]

Initialize telemetry with zeroed counters.

reset()[source]

Reset all telemetry counters to zero.

record_warmup_start()[source]

Record start of a warmup phase.

record_layer1_trigger(relative_loss, threshold)[source]

Record Layer 1 warm start detection trigger.

Parameters:
  • relative_loss (float) – Relative loss that triggered warm start

  • threshold (float) – Warm start threshold that the relative loss was compared against

record_layer2_lr_mode(mode, relative_loss)[source]

Record Layer 2 adaptive LR mode selection.

Parameters:
  • mode (str) – Selected LR mode: "refinement", "careful", "exploration", or "fixed"

  • relative_loss (float) – Relative loss that determined the mode

record_layer3_trigger(cost_ratio, tolerance, iteration)[source]

Record Layer 3 cost-increase guard trigger.

Parameters:
  • cost_ratio (float) – Cost increase ratio that triggered the guard

  • tolerance (float) – Tolerance threshold that was exceeded

  • iteration (int) – Iteration number when triggered

record_layer4_clip(original_norm, max_norm)[source]

Record Layer 4 step clipping activation.

Parameters:
  • original_norm (float) – Original update norm before clipping

  • max_norm (float) – Maximum allowed norm (clipping threshold)

record_lbfgs_history_fill(iteration)[source]

Record L-BFGS history buffer fill event.

Called when the L-BFGS history buffer becomes fully populated, signaling transition from cold start to full L-BFGS mode.

Parameters:

iteration (int) – Iteration number when history buffer filled

record_lbfgs_line_search_failure(iteration, reason='')[source]

Record L-BFGS line search failure event.

Called when the L-BFGS line search fails to find an acceptable step.

Parameters:
  • iteration (int) – Iteration number when line search failed

  • reason (str, optional) – Reason for line search failure

get_trigger_rates()[source]

Get trigger rates as percentage of total warmup calls.

Returns:

Trigger rates for each layer as percentages (0-100)

Return type:

dict[str, float]
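
To make "percentage of total warmup calls" concrete, here is a simplified, hypothetical stand-in for the telemetry class. MiniTelemetry is not part of nlsq; it tracks only two layers, and returning 0.0 when no warmup has been recorded is an assumption. The rate key names follow the ones used in the usage examples of this page.

```python
import threading


class MiniTelemetry:
    """Hypothetical, simplified stand-in for DefenseLayerTelemetry."""

    def __init__(self):
        self._lock = threading.Lock()  # guards all counters
        self.total_warmup_calls = 0
        self.layer1_warm_start_triggers = 0
        self.layer3_cost_guard_triggers = 0

    def record_warmup_start(self):
        with self._lock:
            self.total_warmup_calls += 1

    def record_layer1_trigger(self):
        with self._lock:
            self.layer1_warm_start_triggers += 1

    def record_layer3_trigger(self):
        with self._lock:
            self.layer3_cost_guard_triggers += 1

    def get_trigger_rates(self):
        """Return trigger counts as percentages (0-100) of warmup calls."""
        with self._lock:
            total = self.total_warmup_calls
            if total == 0:  # assumption: report 0.0 before any warmup
                return {"layer1_warm_start_rate": 0.0,
                        "layer3_cost_guard_rate": 0.0}
            return {
                "layer1_warm_start_rate":
                    100.0 * self.layer1_warm_start_triggers / total,
                "layer3_cost_guard_rate":
                    100.0 * self.layer3_cost_guard_triggers / total,
            }


telemetry = MiniTelemetry()
for _ in range(4):
    telemetry.record_warmup_start()
telemetry.record_layer1_trigger()
telemetry.record_layer3_trigger()
telemetry.record_layer3_trigger()
rates = telemetry.get_trigger_rates()
```

Holding one lock around every counter update and every read is the simplest way to keep the reported rates consistent when several fits run in different threads.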

get_summary()[source]

Get summary statistics for all defense layers.

Returns:

Summary with counts and rates for each layer

Return type:

dict

get_recent_events(n=10)[source]

Get most recent N events.

Parameters:

n (int) – Number of recent events to return

Returns:

Most recent events

Return type:

list[dict]

export_metrics()[source]

Export metrics in a format suitable for monitoring systems.

Returns:

Metrics with consistent naming for Prometheus/Grafana/etc.

Return type:

dict

nlsq.streaming.adaptive_hybrid.get_defense_telemetry()[source]

Get global defense layer telemetry instance.

Returns:

Global telemetry instance (created on first call)

Return type:

DefenseLayerTelemetry

nlsq.streaming.adaptive_hybrid.reset_defense_telemetry()[source]

Reset global defense layer telemetry.

Overview

The nlsq.adaptive_hybrid_streaming module implements a four-phase hybrid optimizer that solves three fundamental issues in streaming optimization:

  1. Weak gradient signals from parameter scale imbalance (via normalization)

  2. Slow convergence near optimum (via Gauss-Newton)

  3. Crude covariance estimation (via exact J^T J accumulation)

New in version 0.3.0: Complete adaptive hybrid streaming optimizer.

Key Features

  • Four-phase optimization: Automatic phase transitions for optimal convergence

  • Parameter normalization: Address gradient signal weakness from scale imbalance

  • L-BFGS warmup: Quasi-Newton optimization with 4-layer divergence protection

  • Streaming Gauss-Newton: Second-order convergence with streaming J^T J

  • Exact covariance: Production-quality uncertainty estimates

  • Fault tolerance: Checkpointing, validation, and automatic recovery

  • Multi-device support: GPU/TPU parallelism for large datasets

  • Defense telemetry: Production monitoring of warmup protection layers

New in version 0.3.6: 4-Layer Defense Strategy for warmup divergence prevention.

Optimization Phases

Phase 0: Normalization Setup

Sets up parameter normalization to address gradient signal weakness:

  • Determines normalization strategy (bounds-based, p0-based, or none)

  • Creates ParameterNormalizer with scales and offsets

  • Wraps user model for transparent normalized parameter space

  • Transforms bounds to normalized space

  • Stores normalization Jacobian for Phase 3
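
The steps above can be sketched with a simple affine, bounds-based scheme. This is an illustration under stated assumptions, not the ParameterNormalizer implementation: it assumes scales are the bound widths and offsets are the lower bounds, and the helper names normalize, denormalize, and wrap_model are hypothetical.

```python
import numpy as np

# Hypothetical bounds: amplitude [1, 10], decay [0.1, 1.0], offset [0, 2]
lb = np.array([1.0, 0.1, 0.0])
ub = np.array([10.0, 1.0, 2.0])

scales = ub - lb   # per-parameter scale (assumed: bound width)
offsets = lb       # per-parameter offset (assumed: lower bound)


def normalize(p):
    return (p - offsets) / scales


def denormalize(z):
    return scales * z + offsets


def wrap_model(func):
    """Return a model that accepts parameters in normalized space."""
    def normalized_func(x, *z):
        return func(x, *denormalize(np.asarray(z)))
    return normalized_func


def exponential(x, a, b, c):
    return a * np.exp(-b * x) + c


p0 = np.array([2.0, 0.5, 0.3])
z0 = normalize(p0)                                   # all entries in [0, 1]
normalized_bounds = (normalize(lb), normalize(ub))   # (zeros, ones)
jacobian = np.diag(scales)  # d(denormalize)/dz, kept for the covariance step

x = np.linspace(0.0, 10.0, 5)
same = np.allclose(wrap_model(exponential)(x, *z0), exponential(x, *p0))
```

Because every normalized parameter now lives on a comparable scale, a unit change in any coordinate has a comparable effect on the step size seen by the optimizer.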

Phase 1: L-BFGS Warmup with 4-Layer Defense

First-order optimization with adaptive switching and divergence protection:

  • Uses Optax L-BFGS optimizer with line search and configurable step sizes

  • 4-Layer Defense Strategy (new in 0.3.6):

    1. Warm Start Detection: Skip warmup if initial loss already low

    2. Adaptive Step Size: Scale step size based on initial loss quality

    3. Cost-Increase Guard: Abort if loss increases beyond tolerance

    4. Step Clipping: Limit update magnitude for stability

  • Monitors loss plateau and gradient norm for switch criteria

  • Builds momentum and explores parameter space

  • Switches to Phase 2 when ready for fine-tuning
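
Layers 3 and 4 can be illustrated with a toy warmup loop. This is a sketch under assumptions: it uses plain gradient descent instead of L-BFGS to stay self-contained, the guard condition "loss / initial_loss > 1 + tolerance" is one plausible reading of "increases beyond tolerance", and clip_step and cost_guard_triggered are hypothetical names.

```python
import numpy as np


def clip_step(update, max_norm):
    """Layer 4 sketch: rescale the update if its norm exceeds max_norm."""
    norm = np.linalg.norm(update)
    if norm > max_norm:
        return update * (max_norm / norm), True
    return update, False


def cost_guard_triggered(loss, initial_loss, tolerance):
    """Layer 3 sketch: True if loss grew past initial_loss * (1 + tolerance)."""
    return loss / initial_loss > 1.0 + tolerance


target = np.array([1.0, -2.0])


def loss_fn(p):
    return float(np.sum((p - target) ** 2))


def grad_fn(p):
    return 2.0 * (p - target)


params = np.array([10.0, 10.0])
initial_loss = loss_fn(params)
n_clipped = 0
aborted = False
for iteration in range(100):
    update = -0.1 * grad_fn(params)  # plain gradient step, not L-BFGS
    update, clipped = clip_step(update, max_norm=1.0)
    n_clipped += int(clipped)
    params = params + update
    if cost_guard_triggered(loss_fn(params), initial_loss, tolerance=0.05):
        aborted = True  # stop warmup instead of diverging further
        break
```

In this toy problem the early steps are clipped because the starting point is far from the minimum, while the guard never fires because the loss decreases at every step.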

See also

How Curve Fitting Works for complete optimization strategy documentation.

Phase 2: Streaming Gauss-Newton

Second-order optimization with exact J^T J accumulation:

  • Streams data in chunks for memory efficiency

  • Accumulates exact J^T J matrix (not stochastic approximation)

  • Trust region step control with Levenberg-Marquardt regularization

  • Converges quickly near the optimum (approaching a quadratic rate when the residuals at the solution are small)

  • Provides production-quality parameter estimates
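
The chunked accumulation described above can be sketched in NumPy. The code sums J^T J and J^T r over chunks so the full Jacobian is never stored, then takes one damped step. It is an illustration only: the analytic Jacobian, the chunk size, and the damping value are assumptions, and the library's trust-region logic is not reproduced.

```python
import numpy as np


def model(x, a, b, c):
    return a * np.exp(-b * x) + c


def jacobian(x, a, b, c):
    """Analytic Jacobian of the model with respect to (a, b, c)."""
    e = np.exp(-b * x)
    return np.stack([e, -a * x * e, np.ones_like(x)], axis=1)


def accumulate_normal_equations(x, y, params, chunk_size):
    """Stream over chunks, summing J^T J and J^T r."""
    n = len(params)
    jtj = np.zeros((n, n))
    jtr = np.zeros(n)
    for start in range(0, len(x), chunk_size):
        xc = x[start:start + chunk_size]
        yc = y[start:start + chunk_size]
        J = jacobian(xc, *params)       # shape (chunk, n_params)
        r = yc - model(xc, *params)     # residuals for this chunk
        jtj += J.T @ J
        jtr += J.T @ r
    return jtj, jtr


x = np.linspace(0.0, 10.0, 1000)
y = model(x, 2.5, 0.3, 0.5)
params = np.array([2.0, 0.5, 0.3])

jtj, jtr = accumulate_normal_equations(x, y, params, chunk_size=128)
damping = 1e-3  # Levenberg-Marquardt regularization (hypothetical value)
step = np.linalg.solve(jtj + damping * np.eye(3), jtr)
new_params = params + step
```

The accumulated matrices are identical, up to floating-point rounding, to those computed from the full Jacobian, which is what "exact" means here in contrast to a stochastic mini-batch estimate.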

Phase 3: Denormalization and Covariance

Finalizes optimization and computes uncertainties:

  • Denormalizes parameters to original space

  • Transforms covariance matrix using normalization Jacobian

  • Returns final result with parameter uncertainties
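
For an affine normalization p = scales * z + offsets, the denormalization Jacobian is diag(scales) and the covariance transforms as J @ C @ J.T. The sketch below assumes that affine form; all numeric values are hypothetical.

```python
import numpy as np

scales = np.array([9.0, 0.9, 2.0])    # hypothetical normalization scales
offsets = np.array([1.0, 0.1, 0.0])   # hypothetical normalization offsets
J = np.diag(scales)                   # Jacobian of p = scales * z + offsets

z_opt = np.array([0.2, 0.25, 0.3])    # hypothetical optimum, normalized space
cov_z = np.array([                    # hypothetical covariance, normalized space
    [4.0e-4, 1.0e-5, 0.0],
    [1.0e-5, 9.0e-4, 2.0e-5],
    [0.0, 2.0e-5, 1.0e-4],
])

p_opt = scales * z_opt + offsets      # denormalized parameters
pcov = J @ cov_z @ J.T                # covariance in original space
perr = np.sqrt(np.diag(pcov))         # standard errors
```

With a diagonal Jacobian each standard error is simply rescaled by the corresponding scale factor, while the correlations between parameters are unchanged.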

Classes

AdaptiveHybridStreamingOptimizer([config])

Adaptive hybrid streaming optimizer with four-phase optimization.

DefenseLayerTelemetry()

Telemetry for monitoring 4-layer defense strategy activations.

Functions

get_defense_telemetry()

Get global defense layer telemetry instance.

reset_defense_telemetry()

Reset global defense layer telemetry.

Usage Examples

Basic Usage

Create an optimizer with default configuration:

from nlsq import AdaptiveHybridStreamingOptimizer, HybridStreamingConfig
import jax
import jax.numpy as jnp

# Default configuration
config = HybridStreamingConfig()
optimizer = AdaptiveHybridStreamingOptimizer(config)


# Define model
def exponential(x, a, b, c):
    return a * jnp.exp(-b * x) + c


# Prepare data: 1M points with Gaussian noise
key = jax.random.PRNGKey(0)
x_data = jnp.linspace(0, 10, 1000000)
noise = 0.1 * jax.random.normal(key, (1000000,))
y_data = 2.5 * jnp.exp(-0.3 * x_data) + 0.5 + noise

# Fit, passing the data as an (x_data, y_data) tuple per the fit() signature
result = optimizer.fit(
    (x_data, y_data),
    exponential,
    p0=[2.0, 0.5, 0.3],
)

With Bounds-Based Normalization

Use parameter bounds for normalization:

config = HybridStreamingConfig(
    normalize=True,
    normalization_strategy="bounds",
)
optimizer = AdaptiveHybridStreamingOptimizer(config)

# Parameters: amplitude [1, 10], decay [0.1, 1.0], offset [0, 2]
bounds = (jnp.array([1.0, 0.1, 0.0]), jnp.array([10.0, 1.0, 2.0]))

result = optimizer.fit(
    (x_data, y_data),
    exponential,
    p0=[2.0, 0.5, 0.3],
    bounds=bounds,
)

With Aggressive Configuration

Fast convergence with looser tolerances:

config = HybridStreamingConfig.aggressive()
optimizer = AdaptiveHybridStreamingOptimizer(config)

# Faster warmup, larger chunks, earlier switching
result = optimizer.fit((x_data, y_data), exponential, p0=[2.0, 0.5, 0.3])

With Conservative Configuration

Higher quality with tighter tolerances:

config = HybridStreamingConfig.conservative()
optimizer = AdaptiveHybridStreamingOptimizer(config)

# Tighter tolerance, more Gauss-Newton iterations
result = optimizer.fit((x_data, y_data), exponential, p0=[2.0, 0.5, 0.3])

Memory-Optimized Configuration

Minimize memory footprint:

config = HybridStreamingConfig.memory_optimized()
optimizer = AdaptiveHybridStreamingOptimizer(config)

# Smaller chunks, float32, frequent checkpoints
result = optimizer.fit((x_data, y_data), exponential, p0=[2.0, 0.5, 0.3])

Custom Switching Criteria

Control when Phase 1 switches to Phase 2:

config = HybridStreamingConfig(
    warmup_iterations=300,
    max_warmup_iterations=800,
    loss_plateau_threshold=1e-5,  # Tighter plateau detection
    gradient_norm_threshold=1e-4,  # Lower gradient threshold
    active_switching_criteria=["plateau", "gradient"],  # Remove max_iter
)
optimizer = AdaptiveHybridStreamingOptimizer(config)
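
How such criteria might combine can be sketched in plain Python. The exact tests used by the optimizer are not given here, so this is an assumption: "plateau" is read as a relative loss change below loss_plateau_threshold, "gradient" as a gradient norm below gradient_norm_threshold, and "max_iter" as reaching max_warmup_iterations. The function should_switch is hypothetical.

```python
def should_switch(prev_loss, loss, grad_norm, iteration, *,
                  loss_plateau_threshold=1e-5,
                  gradient_norm_threshold=1e-4,
                  max_warmup_iterations=800,
                  active_criteria=("plateau", "gradient")):
    """Return the name of the first active criterion that fires, else None."""
    rel_change = abs(prev_loss - loss) / max(abs(prev_loss), 1e-30)
    checks = {
        "plateau": rel_change <= loss_plateau_threshold,
        "gradient": grad_norm <= gradient_norm_threshold,
        "max_iter": iteration >= max_warmup_iterations,
    }
    for name in active_criteria:
        if checks[name]:
            return name
    return None


still_warming = should_switch(1.0, 0.9, grad_norm=0.5, iteration=10)
plateaued = should_switch(1.0, 0.999999, grad_norm=0.5, iteration=10)
small_grad = should_switch(1.0, 0.9, grad_norm=1e-6, iteration=10)
ignored_cap = should_switch(1.0, 0.9, grad_norm=0.5, iteration=900)
```

With "max_iter" left out of the active criteria, as in the configuration above, reaching the iteration cap alone does not cause a switch in this sketch.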

With Fault Tolerance

Enable checkpointing for long optimizations:

config = HybridStreamingConfig(
    enable_checkpoints=True,
    checkpoint_frequency=50,
    checkpoint_dir="/path/to/checkpoints",
    validate_numerics=True,
    enable_fault_tolerance=True,
    max_retries_per_batch=3,
)
optimizer = AdaptiveHybridStreamingOptimizer(config)

# Resume from checkpoint
config_resume = HybridStreamingConfig(resume_from_checkpoint="/path/to/checkpoint.pkl")

Phase Tracking

Monitor optimization progress:

optimizer = AdaptiveHybridStreamingOptimizer(config)

# After fitting
print(f"Final phase: {optimizer.current_phase}")
print(f"Phase history: {optimizer.phase_history}")

# Each entry in phase_history:
# {'phase': 0, 'start_time': 1234567890.0, 'duration': 0.5}
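
Given entries in the format shown above, a timing summary takes a few lines. The entries for phases 1 and 2 below are made-up values for illustration only.

```python
phase_history = [
    {"phase": 0, "start_time": 1234567890.0, "duration": 0.5},
    {"phase": 1, "start_time": 1234567890.5, "duration": 12.0},  # made up
    {"phase": 2, "start_time": 1234567902.5, "duration": 7.5},   # made up
]

# Total wall time and each phase's share of it
total = sum(entry["duration"] for entry in phase_history)
share = {entry["phase"]: entry["duration"] / total for entry in phase_history}

# Phase that took the longest
slowest = max(phase_history, key=lambda entry: entry["duration"])["phase"]
```

A summary like this shows quickly whether warmup or Gauss-Newton dominates the run time, which helps when choosing between configuration presets.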

Defense Layer Telemetry

New in version 0.3.6: Monitor defense layer activations.

Track when defense layers trigger across multiple fits:

from nlsq import get_defense_telemetry, reset_defense_telemetry

# Reset telemetry counters
reset_defense_telemetry()

# Run multiple fits; `datasets` is an iterable of (x_data, y_data) pairs
for x_data, y_data in datasets:
    result = optimizer.fit((x_data, y_data), exponential, p0=[2.0, 0.5, 0.3])

# Get telemetry summary
telemetry = get_defense_telemetry()
print(telemetry.get_summary())

# Get activation rates
rates = telemetry.get_trigger_rates()
print(f"Warm start rate: {rates['layer1_warm_start_rate']:.1f}%")
print(f"Cost guard rate: {rates['layer3_cost_guard_rate']:.1f}%")

# Export for Prometheus/Grafana
metrics = telemetry.export_metrics()

Defense Layer Presets

New in version 0.3.6: Sensitivity presets for defense layers.

# Strict defense for near-optimal scenarios
config = HybridStreamingConfig.defense_strict()

# Relaxed defense for exploration
config = HybridStreamingConfig.defense_relaxed()

# Disable all defense layers
config = HybridStreamingConfig.defense_disabled()

# Scientific computing optimized
config = HybridStreamingConfig.scientific_default()

Architecture

Optimization Flow

Input: model, x_data, y_data, p0, bounds
       │
       ▼
┌──────────────────────────────────────┐
│  Phase 0: Normalization Setup        │
│  - Create ParameterNormalizer        │
│  - Wrap model function               │
│  - Transform bounds                  │
│  - Store normalization Jacobian      │
└──────────────────────────────────────┘
       │
       ▼
┌──────────────────────────────────────┐
│  Phase 1: L-BFGS Warmup              │
│  - Optax L-BFGS optimizer            │
│  - Monitor loss/gradient             │
│  - Check switching criteria          │
└──────────────────────────────────────┘
       │ (switch when criteria met)
       ▼
┌──────────────────────────────────────┐
│  Phase 2: Streaming Gauss-Newton     │
│  - Stream data in chunks             │
│  - Accumulate exact J^T J            │
│  - Trust region optimization         │
└──────────────────────────────────────┘
       │ (converged)
       ▼
┌──────────────────────────────────────┐
│  Phase 3: Denormalization            │
│  - Denormalize parameters            │
│  - Transform covariance: J @ C @ J.T │
│  - Return final result               │
└──────────────────────────────────────┘
       │
       ▼
Output: OptimizeResult with popt, pcov

Key Attributes

Attribute                 Description
------------------------  -------------------------------------------
current_phase             Current optimization phase (0, 1, 2, or 3)
phase_history             List of phase transitions with timing
normalizer                ParameterNormalizer instance
normalized_model          Wrapped model for normalized space
normalized_bounds         Bounds in normalized space
normalization_jacobian    Jacobian for covariance transform
best_params_global        Best parameters found (for fault tolerance)
best_cost_global          Best cost found (for fault tolerance)

Performance Characteristics

Convergence Speed

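  • Phase 1 (L-BFGS): Quasi-Newton convergence from gradient information only, robust to initialization

  • Phase 2 (Gauss-Newton): Fast local convergence near the optimum, approaching a quadratic rate when the residuals at the solution are small

  • Overall: Intended to be faster than either phase used on its own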

Memory Usage

  • Streaming: Fixed memory footprint regardless of dataset size

  • J^T J accumulation: O(n_params^2) memory

  • Chunks: Configurable via chunk_size
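
A short back-of-the-envelope calculation shows why the footprint stays fixed. The sizes below assume float64 storage and a hypothetical chunk size of 10,000 points.

```python
n_points = 1_000_000
n_params = 3
chunk_size = 10_000   # hypothetical chunk size
bytes_per_float = 8   # float64

# Full Jacobian, never materialized when streaming
full_jacobian_bytes = n_points * n_params * bytes_per_float

# Per-chunk Jacobian, reused for every chunk
chunk_jacobian_bytes = chunk_size * n_params * bytes_per_float

# Accumulated J^T J, independent of the number of data points
jtj_bytes = n_params * n_params * bytes_per_float
```

Only the per-chunk Jacobian and the small J^T J matrix need to be held at once, so growing n_points changes the run time but not these two sizes.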

Throughput

  • CPU: 50,000 - 200,000 samples/second

  • GPU: 500,000 - 2,000,000 samples/second

  • Depends on model complexity

When to Use

Use Adaptive Hybrid Streaming when:

  • Parameters span many orders of magnitude (gradient imbalance)

  • Dataset is large (100K+ points)

  • Need production-quality uncertainty estimates

  • Standard optimizers converge slowly

  • Memory is limited relative to dataset size

Use standard curve_fit when:

  • Dataset fits in memory

  • Parameters have similar scales

  • Simple models with good initialization

  • Don’t need streaming

See Also