nlsq.core.profiler

Performance profiling for Trust Region Reflective optimization.

Added in version 1.2.0: Extracted from nlsq.core.trf for modularity.

This module provides profiling infrastructure for the TRF optimizer, enabling detailed performance analysis of optimization runs.

Classes

TRFProfiler

A profiler that records timing information for each phase of TRF optimization:

  • Jacobian computation time

  • SVD computation time

  • Trust region step computation time

  • Function evaluation time

  • Total iteration time

NullProfiler

A no-op profiler implementation for production use when profiling overhead is not desired. Implements the same interface as TRFProfiler but does nothing.

Module Contents

Profiling utilities for Trust Region Reflective optimization.

This module provides profiling classes for timing TRF algorithm operations, enabling performance analysis and optimization tuning.

class nlsq.core.profiler.NullProfiler

Bases: object

Null-object profiler with negligible overhead.

Provides the same interface as TRFProfiler but does nothing, so profiling can be switched on or off without a measurable performance impact.

time_operation(operation, jax_result)

No-op timing: returns the result unchanged.

time_conversion(operation, start_time)

No-op conversion timing.

get_timing_data()

Returns empty timing data.
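The null-object idea described above can be sketched in a few lines. This is an illustration only, not the library's code: SketchNullProfiler and solve_step are hypothetical names, and returning an empty dict for "empty timing data" is an assumption.

```python
class SketchNullProfiler:
    """Hypothetical no-op profiler mirroring the three documented methods."""

    def time_operation(self, operation, jax_result):
        # Pass the result straight through; nothing is timed or stored.
        return jax_result

    def time_conversion(self, operation, start_time):
        # Nothing is recorded.
        return None

    def get_timing_data(self):
        # Assumption: "empty timing data" is modelled here as an empty dict.
        return {}


def solve_step(x, profiler):
    """Toy computation that reports to whichever profiler it is given."""
    value = x * x
    # The caller never branches on "is profiling enabled?".
    return profiler.time_operation("fun", value)


null = SketchNullProfiler()
print(solve_step(3.0, null))    # 9.0
print(null.get_timing_data())   # {}
```

Because the calling code only relies on the shared interface, swapping in a recording profiler would not require any change to solve_step.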

class nlsq.core.profiler.TRFProfiler

Bases: object

Profiler for timing TRF algorithm operations.

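Records detailed timing information for each operation in the TRF algorithm. Because JAX dispatches work asynchronously, results are synchronized via block_until_ready() so that the timings reflect completed device (e.g. GPU) work rather than dispatch alone.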

This enables performance analysis without duplicating the entire algorithm.
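To see why synchronization matters for timing, the sketch below contrasts a naive measurement with a synchronized one. It does not use JAX itself: FakeAsyncArray is a hypothetical stand-in whose block_until_ready() simulates waiting for a device, so the example runs with the standard library alone. With real JAX arrays, block_until_ready() plays the same role.

```python
import time


class FakeAsyncArray:
    """Hypothetical stand-in for an array whose computation is still in flight."""

    def __init__(self, value, delay):
        self.value = value
        self._delay = delay

    def block_until_ready(self):
        # Simulate waiting for the device to finish the pending work.
        time.sleep(self._delay)
        return self


def launch():
    """Returns immediately, like an asynchronously dispatched computation."""
    return FakeAsyncArray(42, delay=0.05)


# Naive timing: the clock stops as soon as the call returns.
start = time.perf_counter()
pending = launch()
naive = time.perf_counter() - start

# Synchronized timing: the clock stops only once the result is ready.
start = time.perf_counter()
ready = launch().block_until_ready()
synced = time.perf_counter() - start

print(f"naive:  {naive:.4f}s")
print(f"synced: {synced:.4f}s")
```

The naive measurement captures little more than the cost of issuing the call, while the synchronized one includes the simulated compute time.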

ftimes

Function evaluation times.

Type:

list[float]

jtimes

Jacobian evaluation times.

Type:

list[float]

svd_times

SVD computation times.

Type:

list[float]

ctimes

Cost computation times (JAX).

Type:

list[float]

gtimes

Gradient computation times (JAX).

Type:

list[float]

gtimes2

Gradient norm computation times.

Type:

list[float]

ptimes

Parameter update times.

Type:

list[float]

svd_ctimes

SVD conversion times (JAX → NumPy).

Type:

list[float]

g_ctimes

Gradient conversion times (JAX → NumPy).

Type:

list[float]

c_ctimes

Cost conversion times (JAX → NumPy).

Type:

list[float]

p_ctimes

Parameter conversion times (JAX → NumPy).

Type:

list[float]

__init__()

Initialize the profiler with empty timing lists.

time_operation(operation, jax_result)

Time a JAX operation with GPU synchronization.

Parameters:
  • operation (str) – Operation name ('fun', 'jac', 'svd', 'cost', 'grad', etc.)

  • jax_result – JAX array result to synchronize

Returns:

The synchronized result (same as input)

Return type:

Same type as jax_result
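One way a method with this signature could work is sketched below. This is a guess at the mechanism, not the library's source: the class name SketchProfiler and the mapping from operation names to attribute lists are assumptions made for illustration, and only five of the documented attributes are modelled.

```python
import time


class SketchProfiler:
    """Hypothetical recording profiler, loosely following the documented API."""

    # Assumed mapping from operation name to the list that stores its timings.
    _TARGETS = {
        "fun": "ftimes",
        "jac": "jtimes",
        "svd": "svd_times",
        "cost": "ctimes",
        "grad": "gtimes",
    }

    def __init__(self):
        for attr in self._TARGETS.values():
            setattr(self, attr, [])

    def time_operation(self, operation, jax_result):
        start = time.time()
        # Wait for the result if it supports synchronization; plain Python
        # values (used in this sketch) are passed through as they are.
        sync = getattr(jax_result, "block_until_ready", None)
        result = sync() if sync is not None else jax_result
        getattr(self, self._TARGETS[operation]).append(time.time() - start)
        return result


profiler = SketchProfiler()
jacobian = profiler.time_operation("jac", [[1.0, 0.0], [0.0, 1.0]])
print(len(profiler.jtimes))  # 1
```

Returning the synchronized result lets the call wrap an expression in place, so the surrounding algorithm does not need to be restructured.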

time_conversion(operation, start_time)

Record timing for JAX → NumPy conversion.

Parameters:
  • operation (str) – Conversion operation ('svd_convert', 'grad_convert', 'cost_convert', 'param_convert')

  • start_time (float) – Start time from time.time()
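Unlike time_operation, this method expects the caller to capture the start time before the conversion. The calling pattern might look like the sketch below, where ConversionSketch is a hypothetical stand-in class, the link between 'svd_convert' and svd_ctimes is an assumption, and a tuple-to-list copy stands in for the JAX-to-NumPy conversion.

```python
import time


class ConversionSketch:
    """Hypothetical profiler that records one kind of conversion timing."""

    def __init__(self):
        self.svd_ctimes = []

    def time_conversion(self, operation, start_time):
        elapsed = time.time() - start_time
        # Assumed routing: 'svd_convert' timings are stored in svd_ctimes.
        if operation == "svd_convert":
            self.svd_ctimes.append(elapsed)


profiler = ConversionSketch()
device_values = (3.0, 2.0, 1.0)       # stand-in for a device array

start = time.time()                   # 1. capture the start time
host_values = list(device_values)     # 2. perform the conversion
profiler.time_conversion("svd_convert", start)  # 3. record the elapsed time

print(host_values)               # [3.0, 2.0, 1.0]
print(len(profiler.svd_ctimes))  # 1
```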

get_timing_data()

Get all recorded timing data.

Returns:

Dictionary containing all timing arrays

Return type:

dict[str, list[float]]
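A dictionary of this shape is easy to reduce to per-operation totals and means. The helper below, summarize, is hypothetical, and the keys in the example data are assumed to match the attribute names; the actual keys returned by get_timing_data() are not stated here.

```python
from statistics import mean


def summarize(timing_data):
    """Reduce a dict[str, list[float]] to call counts, totals and means."""
    summary = {}
    for name, samples in timing_data.items():
        if not samples:
            continue  # skip operations that were never timed
        summary[name] = {
            "calls": len(samples),
            "total": sum(samples),
            "mean": mean(samples),
        }
    return summary


# Made-up sample data; key names are an assumption.
example = {"jtimes": [0.25, 0.75], "svd_times": [0.5], "ptimes": []}
report = summarize(example)

for name, stats in report.items():
    print(f"{name}: {stats['calls']} calls, {stats['total']:.3f}s total")
```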

Usage Example

from nlsq.core.profiler import TRFProfiler, NullProfiler
from nlsq.core.trf import TrustRegionReflective

# Enable profiling
profiler = TRFProfiler()
optimizer = TrustRegionReflective(profiler=profiler)

# Run optimization
result = optimizer.optimize(...)

# Summarize the recorded timings (each attribute is a list of per-call durations)
print(f"Jacobian time: {sum(profiler.jtimes):.3f}s")
print(f"SVD time: {sum(profiler.svd_times):.3f}s")

See Also