nlsq.diagnostics.GradientMonitor

class nlsq.diagnostics.GradientMonitor(config)

Bases: object

Monitor gradient health during optimization iterations.

This class tracks gradient behavior to detect potential optimization issues such as vanishing gradients, gradient imbalance between parameters, and gradient stagnation. It uses memory-efficient algorithms so that memory usage stays bounded at roughly 1 KB, regardless of iteration count.

Parameters:

config (DiagnosticsConfig) – Configuration containing thresholds and settings for gradient monitoring.

config

Configuration for the monitor.

Type:

DiagnosticsConfig

iteration_count

Total number of iterations recorded.

Type:

int

Examples

>>> from nlsq.diagnostics import DiagnosticsConfig
>>> from nlsq.diagnostics.gradient_health import GradientMonitor
>>> import numpy as np
>>>
>>> config = DiagnosticsConfig()
>>> monitor = GradientMonitor(config)
>>>
>>> # Record gradients during optimization
>>> for i in range(100):
...     gradient = np.array([0.1, 0.08, 0.12]) / (i + 1)
...     monitor.record_gradient(gradient, cost=1.0 / (i + 1))
>>>
>>> report = monitor.get_report()
>>> print(report.health_status)
HealthStatus.HEALTHY

Integration with curve_fit callback:

>>> from nlsq import curve_fit
>>> from nlsq.diagnostics import DiagnosticsConfig, GradientMonitor
>>>
>>> config = DiagnosticsConfig()
>>> monitor = GradientMonitor(config)
>>> callback = monitor.create_callback()
>>>
>>> # Use in curve_fit (gradient is estimated from Jacobian)
>>> # result = curve_fit(model, x, y, p0=p0, callback=callback)
>>> # report = monitor.get_report()

Notes

Memory efficiency is achieved through:

  1. Sliding window: Stores only the last N gradient norms (default 100), using a deque with maxlen for O(1) append/pop.

  2. Welford’s algorithm: Computes running mean and variance in O(1) space per parameter, without storing individual values.

The total memory footprint is approximately:

  • Sliding window: window_size * 8 bytes (floats)

  • Per-parameter stats: 3 * n_params * 8 bytes (mean, M2, count)

  • Overhead: ~100 bytes for scalars and bookkeeping

For a window size of 100 and 10 parameters, these terms sum to 800 + 240 + 100 = 1140 bytes, i.e. roughly 1 KB.
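The sliding-window behavior described above can be illustrated with the standard library alone. This is a minimal sketch, not the monitor's actual code: the name norm_window and the stand-in norm values are assumptions made for illustration.

```python
from collections import deque

WINDOW_SIZE = 100  # assumed to match the default window size noted above

# A deque with maxlen drops its oldest entry automatically on append,
# so its length never exceeds WINDOW_SIZE no matter how long the run is.
norm_window = deque(maxlen=WINDOW_SIZE)

for i in range(1000):
    norm_window.append(1.0 / (i + 1))  # stand-in for a gradient norm

window_len = len(norm_window)  # bounded by WINDOW_SIZE
oldest = norm_window[0]        # value appended at i = 900
newest = norm_window[-1]       # value appended at i = 999
```

After 1000 appends only the most recent 100 values remain, which is what keeps the window's storage independent of the iteration count.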

__init__(config)

Initialize the gradient monitor.

Parameters:

config (DiagnosticsConfig) – Configuration containing monitoring thresholds.

config
iteration_count: int
record_gradient(gradient, cost)

Record a gradient observation from an optimization iteration.

Parameters:
  • gradient (np.ndarray or Sequence[float]) – The gradient vector (partial derivatives w.r.t. each parameter).

  • cost (float) – The current cost/loss value at this iteration.

Raises:

ValueError – If gradient is empty.

Notes

This method uses Welford’s online algorithm to update running statistics for per-parameter gradient magnitudes. This allows computing mean and variance without storing individual values, achieving O(1) memory per parameter.

The algorithm maintains:

  • mean: Running mean of absolute gradient values

  • M2: Sum of squared differences from the mean

The sample variance is computed as M2 / (n - 1) when needed, where n is the number of recorded observations.
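The update rule can be sketched in plain Python as follows. The class RunningStats and its attribute names are hypothetical and only illustrate Welford's algorithm applied to absolute gradient values; the monitor's internal implementation may differ.

```python
class RunningStats:
    """Hypothetical helper: Welford running mean/variance per parameter."""

    def __init__(self, n_params):
        self.count = 0
        self.mean = [0.0] * n_params  # running mean of |gradient_j|
        self.m2 = [0.0] * n_params    # sum of squared differences from the mean

    def update(self, gradient):
        """Fold one gradient vector into the running statistics."""
        self.count += 1
        for j, g in enumerate(gradient):
            x = abs(g)
            delta = x - self.mean[j]
            self.mean[j] += delta / self.count
            # Uses the already-updated mean, as Welford's algorithm requires.
            self.m2[j] += delta * (x - self.mean[j])

    def variance(self):
        """Sample variance M2 / (n - 1); zero until two observations exist."""
        if self.count < 2:
            return [0.0] * len(self.mean)
        return [m2 / (self.count - 1) for m2 in self.m2]


stats = RunningStats(n_params=2)
for grad in ([1.0, -2.0], [3.0, -4.0], [5.0, -6.0]):
    stats.update(grad)
```

Only count, mean, and M2 are stored per parameter, so memory does not grow with the number of recorded gradients.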

create_callback(user_callback=None)

Create a callback function for integration with curve_fit/TRF.

This method creates a callback compatible with NLSQ’s optimization callbacks. The callback extracts gradient information from the optimization state and records it in the monitor.

Parameters:

user_callback (callable, optional) – An optional user callback to chain with the gradient monitor. It is called after gradient recording, with the same arguments.

Returns:

A callback function compatible with curve_fit’s callback parameter.

Return type:

callable

Examples

>>> from nlsq import curve_fit
>>> from nlsq.diagnostics import DiagnosticsConfig, GradientMonitor
>>>
>>> monitor = GradientMonitor(DiagnosticsConfig())
>>> callback = monitor.create_callback()
>>>
>>> # result = curve_fit(model, x, y, p0=p0, callback=callback)
>>> # report = monitor.get_report()

Notes

The callback receives iteration information including:

  • iteration: Current iteration number

  • cost: Current cost value

  • params: Current parameter values

  • info: Dictionary with gradient_norm, nfev, step_norm, etc.

When the gradient is not directly available, we estimate it from changes in parameters and cost, or use gradient_norm from info.
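The chaining behavior can be sketched without NLSQ itself. Everything here is an assumption made for illustration: the factory make_chained_callback, the record_norm hook, and the positional argument order (iteration, cost, params, info) are hypothetical and only mirror the fields listed above.

```python
def make_chained_callback(record_norm, user_callback=None):
    """Hypothetical stand-in for the chaining described above."""

    def callback(iteration, cost, params, info):
        # Use the gradient norm reported by the optimizer when it is present.
        grad_norm = info.get("gradient_norm")
        if grad_norm is not None:
            record_norm(iteration, grad_norm, cost)
        # Then forward the same arguments to the user's callback, if any.
        if user_callback is not None:
            return user_callback(iteration, cost, params, info)
        return None

    return callback


recorded = []
seen_by_user = []
cb = make_chained_callback(
    lambda it, norm, cost: recorded.append((it, norm, cost)),
    user_callback=lambda it, cost, params, info: seen_by_user.append(it),
)
cb(0, 2.5, [1.0, 2.0], {"gradient_norm": 0.3, "nfev": 1})
cb(1, 1.2, [1.1, 1.9], {"nfev": 2})  # no gradient_norm, so nothing is recorded
```

Recording happens before the user callback runs, which matches the ordering stated for user_callback. This sketch omits the fallback that estimates the gradient from parameter and cost changes.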

get_report()

Generate a gradient health report from recorded observations.

Returns:

Report containing gradient health metrics and any detected issues.

Return type:

GradientHealthReport

Notes

The report includes:

  • Overall health score (0-1, higher is healthier)

  • Mean and final gradient norms

  • Per-parameter mean and variance of gradient magnitudes

  • Detection of vanishing gradients, imbalance, and stagnation

  • List of ModelHealthIssue objects for any detected problems
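As a rough illustration of the three detections listed above, the sketch below applies simple threshold tests to a window of gradient norms and per-parameter mean magnitudes. The function check_gradient_health, its criteria, and all threshold values are hypothetical and are not taken from NLSQ or DiagnosticsConfig.

```python
def check_gradient_health(norms, param_means,
                          vanishing_tol=1e-10,
                          imbalance_ratio=1e6,
                          stagnation_tol=1e-3):
    """Hypothetical checks; thresholds are illustrative, not NLSQ defaults."""
    issues = []
    # Vanishing: the most recent gradient norm is numerically negligible.
    if norms and norms[-1] < vanishing_tol:
        issues.append("vanishing")
    # Imbalance: per-parameter mean magnitudes differ by many orders.
    positive = [m for m in param_means if m > 0.0]
    if positive and max(positive) / min(positive) > imbalance_ratio:
        issues.append("imbalance")
    # Stagnation: norms in the window barely change relative to their size.
    if len(norms) >= 2 and max(norms) - min(norms) <= stagnation_tol * max(norms):
        issues.append("stagnation")
    return issues


healthy = check_gradient_health([1.0, 0.5, 0.1], [0.4, 0.5])
unhealthy = check_gradient_health([0.5, 0.5, 0.5], [1.0, 1e-9])
```

In the second call the norms never move and one parameter's mean gradient is nine orders of magnitude smaller than the other's, so both the imbalance and stagnation checks fire.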

reset()

Reset the monitor to its initial state.

Clears all recorded gradients and statistics. Useful when starting a new optimization run.