nlsq.workflow
=============

Memory-based workflow system for automatic optimization strategy selection.

.. versionchanged:: 0.5.5
   The tier-based workflow system was replaced with a unified memory-based approach.
   ``MemoryBudgetSelector`` replaces ``auto_select_workflow()``, and strategy selection
   is now driven entirely by memory budget computation.

Overview
--------

The workflow module provides:

* **MemoryBudget**: Dataclass for computing and storing memory estimates
* **MemoryBudgetSelector**: Automatic strategy selection based on memory analysis
* **OptimizationGoal**: Optimization objectives (FAST, ROBUST, GLOBAL, MEMORY_EFFICIENT, QUALITY)
* **calculate_adaptive_tolerances**: Dataset-size-aware tolerance computation

Quick Start
-----------

.. code-block:: python

   from nlsq import fit, curve_fit
   from nlsq.core.workflow import MemoryBudget, MemoryBudgetSelector, OptimizationGoal
   import jax.numpy as jnp
   import numpy as np


   def model(x, a, b, c):
       return a * jnp.exp(-b * x) + c


   x = np.linspace(0, 10, 1_000_000)
   y = 2.0 * np.exp(-0.5 * x) + 0.3 + np.random.normal(0, 0.05, len(x))

   # Automatic selection via fit() (recommended)
   result = fit(model, x, y, p0=[1, 1, 0], workflow="auto")

   # Automatic selection via curve_fit()
   popt, pcov = curve_fit(model, x, y, p0=[1, 1, 0], method="auto")

   # Direct use of MemoryBudgetSelector
   selector = MemoryBudgetSelector(safety_factor=0.75)
   strategy, config = selector.select(
       n_points=len(x),
       n_params=3,
       memory_limit_gb=16.0,  # Optional override
   )
   print(f"Selected strategy: {strategy}")

   # Inspect memory budget
   budget = MemoryBudget.compute(n_points=len(x), n_params=3)
   print(f"Peak memory: {budget.peak_gb:.2f} GB")
   print(f"Fits in memory: {budget.fits_in_memory}")

Memory Budget Classes
---------------------

MemoryBudget
~~~~~~~~~~~~

.. autoclass:: nlsq.core.workflow.MemoryBudget
   :members:
   :undoc-members:
   :show-inheritance:
   :no-index:

**Fields:**

.. list-table::
   :header-rows: 1
   :widths: 20 80

   * - Field
     - Description
   * - ``available_gb``
     - Total available memory (CPU or GPU) in GB
   * - ``threshold_gb``
     - Safe threshold (available × safety_factor)
   * - ``data_gb``
     - Estimated memory for data arrays (x, y)
   * - ``jacobian_gb``
     - Estimated memory for Jacobian matrix
   * - ``peak_gb``
     - Total peak memory estimate

**Computed Properties:**

* ``fits_in_memory``: True if peak_gb <= threshold_gb
* ``data_fits``: True if data_gb <= threshold_gb

MemoryBudgetSelector
~~~~~~~~~~~~~~~~~~~~

.. autoclass:: nlsq.core.workflow.MemoryBudgetSelector
   :members:
   :undoc-members:
   :show-inheritance:
   :no-index:

**Strategy Selection Logic:**

.. code-block:: text

   if data_gb > threshold_gb:
       return "streaming"  # Data too large for memory
   elif peak_gb > threshold_gb:
       return "chunked"    # Jacobian too large, chunk the computation
   else:
       return "standard"   # Everything fits, use direct curve_fit()

Enumerations
------------

OptimizationGoal
~~~~~~~~~~~~~~~~

.. autoclass:: nlsq.core.workflow.OptimizationGoal
   :members:
   :undoc-members:
   :show-inheritance:
   :no-index:

.. list-table::
   :header-rows: 1
   :widths: 20 80

   * - Goal
     - Description
   * - **FAST**
     - Prioritize speed. Uses one tier looser tolerances, skips multi-start.
   * - **ROBUST**
     - Standard tolerances with multi-start for better global optimum.
   * - **GLOBAL**
     - Synonym for ROBUST. Emphasizes global optimization.
   * - **MEMORY_EFFICIENT**
     - Minimize memory usage with standard tolerances.
   * - **QUALITY**
     - Highest precision. Uses one tier tighter tolerances, enables multi-start.

Named Workflow Presets
----------------------

The ``fit()`` function accepts named presets via the ``workflow`` parameter:

.. list-table::
   :header-rows: 1
   :widths: 20 15 15 50

   * - Preset
     - Strategy
     - Tolerance
     - Description
   * - ``"auto"``
     - Memory-based
     - Adaptive
     - Automatic selection based on memory budget
   * - ``"standard"``
     - standard
     - 1e-8
     - Default curve_fit() behavior, no multi-start
   * - ``"quality"``
     - standard
     - 1e-10
     - Highest precision with 20-point multi-start
   * - ``"fast"``
     - standard
     - 1e-6
     - Speed-optimized, no multi-start
   * - ``"large_robust"``
     - chunked
     - 1e-8
     - Chunked processing with 10-point multi-start
   * - ``"streaming"``
     - streaming
     - 1e-7
     - AdaptiveHybridStreamingOptimizer for huge datasets
   * - ``"hpc_distributed"``
     - streaming
     - 1e-6
     - Multi-GPU/node HPC configuration with checkpointing

**Usage:**

.. code-block:: python

   from nlsq import fit

   # Use automatic memory-based selection
   result = fit(model, x, y, p0=[1, 1, 0], workflow="auto")

   # Use a named preset
   result = fit(model, x, y, p0=[1, 1, 0], workflow="quality")

   # Override memory detection
   result = fit(model, x, y, p0=[1, 1, 0], workflow="auto", memory_limit_gb=8.0)

Adaptive Tolerances
-------------------

The workflow system uses adaptive tolerances based on dataset size:

.. list-table::
   :header-rows: 1
   :widths: 20 25 20 35

   * - Dataset Size
     - Points
     - Default Tolerance
     - Notes
   * - TINY
     - < 1,000
     - 1e-12
     - Maximum precision
   * - SMALL
     - 1,000 - 10,000
     - 1e-10
     - High precision
   * - MEDIUM
     - 10,000 - 100,000
     - 1e-9
     - Balanced
   * - LARGE
     - 100,000 - 1,000,000
     - 1e-8
     - Standard (NLSQ default)
   * - VERY_LARGE
     - 1M - 10M
     - 1e-7
     - Reduced precision
   * - HUGE
     - 10M - 100M
     - 1e-6
     - Streaming mode
   * - MASSIVE
     - > 100M
     - 1e-5
     - Streaming with checkpoints

**Goal-Based Adjustments:**

* ``QUALITY``: Uses one tier tighter tolerances
* ``FAST``: Uses one tier looser tolerances
* ``ROBUST``/``GLOBAL``/``MEMORY_EFFICIENT``: Uses standard tolerances

.. code-block:: python

   from nlsq.core.workflow import calculate_adaptive_tolerances, OptimizationGoal

   # 5M points with QUALITY goal
   tols = calculate_adaptive_tolerances(5_000_000, goal=OptimizationGoal.QUALITY)
   print(tols)  # {'gtol': 1e-08, 'ftol': 1e-08, 'xtol': 1e-08}

   # 5M points with FAST goal
   tols = calculate_adaptive_tolerances(5_000_000, goal=OptimizationGoal.FAST)
   print(tols)  # {'gtol': 1e-06, 'ftol': 1e-06, 'xtol': 1e-06}

Memory Estimation Details
-------------------------

The system estimates memory requirements for each component:

.. list-table::
   :header-rows: 1
   :widths: 20 40 40

   * - Component
     - Formula
     - Example (10M pts, 10 params)
   * - Data (x, y)
     - n × (features + 1) × 8
     - 160 MB
   * - Jacobian
     - n × p × 8
     - 800 MB
   * - J\ :sup:`T`\ J
     - p² × 8
     - 0.8 KB
   * - SVD working
     - ~0.3 × jacobian
     - 240 MB
   * - **Peak**
     - data + 1.3×J + solver
     - **~1.3 GB**

The Jacobian matrix dominates memory usage for most problems.

Utility Functions
-----------------

calculate_adaptive_tolerances
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autofunction:: nlsq.core.workflow.calculate_adaptive_tolerances
   :no-index:

create_checkpoint_directory
~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autofunction:: nlsq.core.workflow.create_checkpoint_directory
   :no-index:

Module Contents
---------------

.. automodule:: nlsq.core.workflow
   :members:
   :undoc-members:
   :show-inheritance:
   :no-index:
   :exclude-members: MemoryBudget, MemoryBudgetSelector, OptimizationGoal, calculate_adaptive_tolerances, create_checkpoint_directory

See Also
--------

- :doc:`/explanation/workflows` - Workflow system overview
- :doc:`/howto/common_workflows` - Common workflow patterns
- :doc:`/reference/configuration` - Configuration reference
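
As a worked illustration of the memory estimation formulas and the strategy selection rules documented on this page, the following standalone sketch reproduces the arithmetic in plain Python. It does not call NLSQ. ``BudgetSketch``, ``estimate_budget`` and ``select_strategy`` are hypothetical names introduced only for this example, the ``safety_factor=0.75`` default is an assumption borrowed from the Quick Start, and only the J\ :sup:`T`\ J term of the solver overhead is modeled, so the peak it reports may be somewhat lower than the library's own estimate.

```python
from dataclasses import dataclass

BYTES_PER_FLOAT64 = 8
GB = 1e9  # decimal gigabytes, consistent with the MB figures in the table


@dataclass(frozen=True)
class BudgetSketch:
    """Hypothetical stand-in for illustration; not nlsq's MemoryBudget."""

    threshold_gb: float
    data_gb: float
    jacobian_gb: float
    peak_gb: float


def estimate_budget(n_points, n_params, available_gb,
                    safety_factor=0.75, n_features=1):
    """Apply the documented per-component formulas (float64 storage)."""
    data_gb = n_points * (n_features + 1) * BYTES_PER_FLOAT64 / GB
    jacobian_gb = n_points * n_params * BYTES_PER_FLOAT64 / GB
    jtj_gb = n_params**2 * BYTES_PER_FLOAT64 / GB
    # Peak = data + 1.3 x Jacobian + solver; the solver term here is
    # only J^T J, which is an assumption of this sketch.
    peak_gb = data_gb + 1.3 * jacobian_gb + jtj_gb
    threshold_gb = available_gb * safety_factor
    return BudgetSketch(threshold_gb, data_gb, jacobian_gb, peak_gb)


def select_strategy(budget):
    """Mirror the documented three-way selection rule."""
    if budget.data_gb > budget.threshold_gb:
        return "streaming"  # data alone exceeds the safe threshold
    if budget.peak_gb > budget.threshold_gb:
        return "chunked"  # data fits, but the Jacobian pushes peak over
    return "standard"  # everything fits


# 10M points, 10 parameters, at three different memory limits
for available_gb in (16.0, 1.0, 0.1):
    budget = estimate_budget(10_000_000, 10, available_gb)
    print(f"{available_gb:>5} GB available -> {select_strategy(budget)}")
```

Lowering the available memory moves the same problem from ``standard`` to ``chunked`` to ``streaming``, which is the behavior the selection rule is designed to produce.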