nlsq.streaming.telemetry¶

Telemetry and monitoring for the defense layer strategy.

Added in version 1.2.0: Extracted from nlsq.streaming.adaptive_hybrid for modularity.

This module provides telemetry infrastructure for the 4-layer defense strategy used in adaptive hybrid streaming optimization. It tracks activation counts, timing, and effectiveness metrics for each defense layer.

Defense Layers¶

The telemetry system monitors four defense layers:

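Layer 1 - Warm Start: Counts warm start detections, where initial parameters are already close to optimal and warmup is skipped
Layer 2 - Adaptive Step Size: Counts which step size mode is selected (refinement, careful, or exploration)
Layer 3 - Cost Guard: Counts warmup aborts triggered by a cost increase
Layer 4 - Step Clipping: Counts step clipping events that limit update magnitude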

Classes¶

DefenseLayerTelemetry¶

Main telemetry class that collects and reports on defense layer activity.

Module Contents¶

Telemetry for monitoring defense layer activations.

This module provides telemetry tracking for the 4-layer defense strategy used during L-BFGS warmup in the adaptive hybrid streaming optimizer.

class nlsq.streaming.telemetry.DefenseLayerTelemetry[source]¶

Bases: object

Telemetry for monitoring 4-layer defense strategy activations.

Tracks when each defense layer is triggered during warmup to help with production monitoring and tuning. This class maintains thread-safe statistics that can be queried or exported for monitoring dashboards.

The 4 layers tracked are:

Layer 1: Warm start detection (skips warmup)
Layer 2: Adaptive step size selection (refinement/careful/exploration)
Layer 3: Cost-increase guard (aborts warmup if loss increases)
Layer 4: Step clipping (limits update magnitude)

layer1_warm_start_triggers¶

Count of warm start detection activations (warmup skipped)

Type: int

layer2_lr_mode_counts¶

Counts per LR mode: {"refinement": n, "careful": m, "exploration": k}

Type: dict[str, int]

layer3_cost_guard_triggers¶

Count of cost-increase guard aborts

Type: int

layer4_clip_triggers¶

Count of step clipping activations

Type: int

total_warmup_calls¶

Total number of warmup phase executions

Type: int
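The counters listed above are described as thread-safe. A minimal sketch of how such counters could be guarded with a lock follows. `CounterSketch` is a hypothetical stand-in written for illustration, not the nlsq implementation, and its handling of modes outside the three documented keys is an assumption.

```python
import threading


class CounterSketch:
    """Hypothetical stand-in mirroring the documented counter attributes."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.reset()

    def reset(self) -> None:
        # Zero every counter named in the attribute list.
        with self._lock:
            self.layer1_warm_start_triggers = 0
            self.layer2_lr_mode_counts = {
                "refinement": 0,
                "careful": 0,
                "exploration": 0,
            }
            self.layer3_cost_guard_triggers = 0
            self.layer4_clip_triggers = 0
            self.total_warmup_calls = 0

    def record_warmup_start(self) -> None:
        # The lock makes the read-modify-write increment atomic.
        with self._lock:
            self.total_warmup_calls += 1

    def record_layer2_lr_mode(self, mode: str) -> None:
        # Assumption: unlisted modes get their own key on first use.
        with self._lock:
            current = self.layer2_lr_mode_counts.get(mode, 0)
            self.layer2_lr_mode_counts[mode] = current + 1


counters = CounterSketch()
threads = [
    threading.Thread(target=counters.record_warmup_start) for _ in range(8)
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
counters.record_layer2_lr_mode("careful")
```

Holding one lock for every update keeps the sketch simple; a real implementation may use finer-grained locking.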

__init__()[source]¶

Initialize telemetry with zeroed counters.

reset()[source]¶

Reset all telemetry counters to zero.

record_warmup_start()[source]¶

Record start of a warmup phase.

record_layer1_trigger(relative_loss, threshold)[source]¶

Record Layer 1 warm start detection trigger.

Parameters:

relative_loss (float) – Relative loss that triggered warm start
threshold (float) – Warm start threshold that the relative loss was compared against

record_layer2_lr_mode(mode, relative_loss)[source]¶

Record Layer 2 adaptive LR mode selection.

Parameters:

mode (str) – Selected LR mode: “refinement”, “careful”, “exploration”, or “fixed”
relative_loss (float) – Relative loss that determined the mode

record_layer3_trigger(cost_ratio, tolerance, iteration)[source]¶

Record Layer 3 cost-increase guard trigger.

Parameters:

cost_ratio (float) – Cost increase ratio that triggered the guard
tolerance (float) – Tolerance threshold that was exceeded
iteration (int) – Iteration number when triggered
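To make the `cost_ratio` and `tolerance` parameters concrete, here is a sketch of one plausible guard check. It assumes that the ratio is the current cost divided by the initial cost and that the guard fires when the ratio exceeds `1 + tolerance`; the page does not state the exact rule, and `cost_guard_triggered` is a hypothetical helper.

```python
def cost_guard_triggered(
    initial_cost: float, current_cost: float, tolerance: float
) -> tuple[bool, float]:
    """Return (triggered, cost_ratio) under the assumed rule."""
    if initial_cost <= 0:
        raise ValueError("initial_cost must be positive")
    # Assumed definition: ratio of current cost to the starting cost.
    cost_ratio = current_cost / initial_cost
    # Assumed rule: fire when the increase exceeds the tolerance.
    return cost_ratio > 1.0 + tolerance, cost_ratio


fired, ratio = cost_guard_triggered(2.0, 2.5, tolerance=0.05)
quiet, small_ratio = cost_guard_triggered(2.0, 2.05, tolerance=0.05)
```

Under these assumptions, the values `ratio`, `0.05`, and the current iteration number are what a caller would pass to `record_layer3_trigger`.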

record_layer4_clip(original_norm, max_norm)[source]¶

Record Layer 4 step clipping activation.

Parameters:

original_norm (float) – Original update norm before clipping
max_norm (float) – Maximum allowed norm (clipping threshold)
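The relationship between `original_norm` and `max_norm` can be illustrated with a small norm-clipping sketch. It assumes a Euclidean norm and uniform rescaling, which is a common way to clip updates but is not confirmed by this page; `clip_update` is a hypothetical helper.

```python
import math


def clip_update(
    update: list[float], max_norm: float
) -> tuple[list[float], float, bool]:
    """Rescale `update` so its Euclidean norm does not exceed `max_norm`."""
    original_norm = math.sqrt(sum(x * x for x in update))
    if original_norm <= max_norm:
        # Within the limit: nothing to clip, nothing to record.
        return update, original_norm, False
    # Shrink every component by the same factor to preserve direction.
    scale = max_norm / original_norm
    return [x * scale for x in update], original_norm, True


clipped, original_norm, was_clipped = clip_update([3.0, 4.0], max_norm=1.0)
```

When `was_clipped` is true, `original_norm` and `max_norm` are the two values a caller would report to `record_layer4_clip`.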

record_lbfgs_history_fill(iteration)[source]¶

Record L-BFGS history buffer fill event.

Called when the L-BFGS history buffer becomes fully populated, signaling transition from cold start to full L-BFGS mode.

Parameters:

iteration (int) – Iteration number when history buffer filled

record_lbfgs_line_search_failure(iteration, reason='')[source]¶

Record L-BFGS line search failure event.

Called when the L-BFGS line search fails to find an acceptable step.

Parameters:

iteration (int) – Iteration number when line search failed
reason (str, optional) – Reason for line search failure

get_trigger_rates()[source]¶

Get trigger rates as percentages of total warmup calls.

Returns: Trigger rates for each layer as percentages (0-100)
Return type: dict[str, float]
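The rate calculation described above can be sketched as a plain function. The key names and the choice to return 0.0 when no warmup has run are assumptions; `trigger_rates` is a hypothetical helper, not part of nlsq.

```python
def trigger_rates(
    counts: dict[str, int], total_warmup_calls: int
) -> dict[str, float]:
    """Convert raw trigger counts into percentages of warmup calls."""
    if total_warmup_calls <= 0:
        # Assumed guard: avoid dividing by zero before any warmup has run.
        return {name: 0.0 for name in counts}
    return {
        name: 100.0 * count / total_warmup_calls
        for name, count in counts.items()
    }


rates = trigger_rates(
    {"layer1": 2, "layer3": 1, "layer4": 5}, total_warmup_calls=10
)
```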

get_summary()[source]¶

Get summary statistics for all defense layers.

Returns: Summary with counts and rates for each layer
Return type: dict

get_recent_events(n=10)[source]¶

Get most recent N events.

Parameters:

n (int) – Number of recent events to return

Returns: Most recent events
Return type: list[dict]
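A bounded event log is one way to support a "most recent N events" query without unbounded memory growth. The sketch below uses `collections.deque` with a maximum length; the class name, the `max_events` limit, and the event dictionary layout are all hypothetical.

```python
from collections import deque


class EventLogSketch:
    """Hypothetical bounded log; the oldest events are dropped first."""

    def __init__(self, max_events: int = 100) -> None:
        self._events = deque(maxlen=max_events)

    def record(self, event_type: str, **data) -> None:
        # Store each event as a plain dict so it is easy to export.
        self._events.append({"type": event_type, **data})

    def get_recent_events(self, n: int = 10) -> list[dict]:
        if n <= 0:
            return []
        # Slicing a list copy returns the last n events, oldest first.
        return list(self._events)[-n:]


log = EventLogSketch(max_events=3)
for iteration in range(5):
    log.record("layer4_clip", iteration=iteration)
recent = log.get_recent_events(2)
```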

export_metrics()[source]¶

Export metrics in a format suitable for monitoring systems.

Returns: Metrics with consistent naming for Prometheus/Grafana/etc.
Return type: dict
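Monitoring systems such as Prometheus generally expect flat metric names with numeric values. The sketch below shows one way a nested summary could be flattened into that shape. The `nlsq_defense` prefix, the summary keys, and `export_metrics_sketch` itself are hypothetical; the names actually returned by `export_metrics()` are not listed on this page.

```python
def export_metrics_sketch(
    summary: dict, prefix: str = "nlsq_defense"
) -> dict[str, float]:
    """Flatten a nested summary into underscore-joined metric names."""
    flat: dict[str, float] = {}

    def walk(node: dict, path: str) -> None:
        for key, value in node.items():
            name = f"{path}_{key}"
            if isinstance(value, dict):
                walk(value, name)  # Descend into nested sections.
            else:
                flat[name] = float(value)

    walk(summary, prefix)
    return flat


metrics = export_metrics_sketch(
    {"layer1": {"count": 2, "rate": 20.0}, "total_warmup_calls": 10}
)
```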

nlsq.streaming.telemetry.get_defense_telemetry()[source]¶

Get global defense layer telemetry instance.

Returns: Global telemetry instance (created on first call)
Return type: DefenseLayerTelemetry

nlsq.streaming.telemetry.reset_defense_telemetry()[source]¶

Reset global defense layer telemetry.
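The "created on first call" behavior of the global accessor is a lazy singleton. A minimal sketch of that pattern follows, using hypothetical names throughout. It assumes the reset function zeroes the existing instance in place rather than replacing it, which this page does not specify.

```python
import threading


class TelemetrySketch:
    """Hypothetical stand-in with a single counter."""

    def __init__(self) -> None:
        self.total_warmup_calls = 0

    def reset(self) -> None:
        self.total_warmup_calls = 0


# Holder for the shared instance; filled in on first access.
_state = {"instance": None}
_state_lock = threading.Lock()


def get_telemetry_sketch() -> TelemetrySketch:
    # The lock prevents two threads from creating separate instances.
    with _state_lock:
        if _state["instance"] is None:
            _state["instance"] = TelemetrySketch()
        return _state["instance"]


def reset_telemetry_sketch() -> None:
    # Assumption: reset zeroes counters in place, keeping the same object.
    get_telemetry_sketch().reset()


first = get_telemetry_sketch()
first.total_warmup_calls += 1
second = get_telemetry_sketch()
reset_telemetry_sketch()
```

Because every caller receives the same object, counts recorded in one part of a program are visible wherever the accessor is called.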

Usage Example¶

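from nlsq.streaming.telemetry import DefenseLayerTelemetry

# Create a telemetry instance with zeroed counters
telemetry = DefenseLayerTelemetry()

# First warmup: Layer 1 detects a warm start (illustrative values)
telemetry.record_warmup_start()
telemetry.record_layer1_trigger(relative_loss=0.005, threshold=0.01)

# Second warmup: Layer 3 aborts after a cost increase (illustrative values)
telemetry.record_warmup_start()
telemetry.record_layer3_trigger(cost_ratio=1.25, tolerance=0.05, iteration=3)

# Read the documented counter attributes
print(f"Layer 1 triggers: {telemetry.layer1_warm_start_triggers}")
print(f"Layer 3 triggers: {telemetry.layer3_cost_guard_triggers}")

# Get the summary report with counts and rates for each layer
report = telemetry.get_summary()
print(report)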
