nlsq.global_optimization

Multi-start optimization with Latin Hypercube Sampling (LHS) and quasi-random samplers for global search in nonlinear least squares fitting.

This module provides tools for exploring parameter space to find global optima, which is particularly useful for problems with multiple local minima.

Key Features

  • Latin Hypercube Sampling (LHS): Stratified random sampling for better coverage

  • Quasi-random sequences: Sobol and Halton for deterministic, low-discrepancy sampling

  • Multi-start orchestration: Evaluate multiple starting points and select best

  • Tournament selection: Memory-efficient selection for streaming datasets

  • Preset configurations: ‘fast’, ‘robust’, ‘global’, ‘thorough’, ‘streaming’

Quick Start

Basic multi-start optimization:

from nlsq import fit, curve_fit
from nlsq.global_optimization import MultiStartOrchestrator
import jax.numpy as jnp
import numpy as np


def model(x, a, b, c):
    return a * jnp.exp(-b * x) + c


x = np.linspace(0, 5, 100)
y = 3.0 * np.exp(-0.5 * x) + 1.0 + np.random.normal(0, 0.1, 100)

# Method 1: Use fit() with preset
popt, pcov = fit(model, x, y, preset="robust", bounds=([0, 0, 0], [10, 5, 10]))

# Method 2: Use MultiStartOrchestrator directly
orchestrator = MultiStartOrchestrator.from_preset("global")
result = orchestrator.fit(model, x, y, bounds=([0, 0, 0], [10, 5, 10]))

Configuration

class nlsq.global_optimization.GlobalOptimizationConfig(n_starts=10, sampler='lhs', center_on_p0=True, scale_factor=1.0, elimination_rounds=3, elimination_fraction=0.5, batches_per_round=50, _preset=None)[source]

Bases: object

Configuration for multi-start global optimization.

This configuration class controls all aspects of multi-start optimization with Latin Hypercube Sampling or other quasi-random samplers.

Parameters:
  • n_starts (int, default=10) – Number of starting points to generate. Set to 0 to disable multi-start.

  • sampler ({'lhs', 'sobol', 'halton'}, default='lhs') –

    Sampling strategy for generating starting points:

    • ’lhs’: Latin Hypercube Sampling (recommended, stratified random)

    • ’sobol’: Sobol quasi-random sequence (deterministic, low-discrepancy)

    • ’halton’: Halton quasi-random sequence (deterministic, prime bases)

  • center_on_p0 (bool, default=True) – Whether to center starting points around the initial parameter guess (p0). When True, samples are generated in a region around p0 rather than uniformly in the full parameter bounds.

  • scale_factor (float, default=1.0) – Scale factor for exploration region when center_on_p0=True. Multiplier for the exploration range around p0. Smaller values (0.5) = tighter exploration around p0. Larger values (2.0) = wider exploration.

  • elimination_rounds (int, default=3) – Number of tournament elimination rounds for large datasets. Each round eliminates a fraction of candidates based on loss.

  • elimination_fraction (float, default=0.5) – Fraction of candidates to eliminate in each tournament round. Must be in (0, 1). Default 0.5 eliminates half in each round.

  • batches_per_round (int, default=50) – Number of data batches to use for evaluation in each tournament round. More batches = more reliable selection but slower.

Examples

>>> config = GlobalOptimizationConfig(n_starts=20, sampler='sobol')
>>> config.n_starts
20
>>> config = GlobalOptimizationConfig.from_preset('global')
>>> config.n_starts
20

Notes

  • When n_starts=0, multi-start is disabled and standard single-start optimization is used.

  • Tournament selection (elimination_rounds > 0) is designed for streaming datasets where evaluating all candidates on the full dataset is impractical.

  • LHS provides better coverage guarantees than Sobol/Halton for small N, while Sobol/Halton are deterministic and may be preferred for reproducibility.

See also

MultiStartOrchestrator

Orchestrates multi-start optimization

TournamentSelector

Implements tournament selection for large datasets

n_starts: int
sampler: Literal['lhs', 'sobol', 'halton']
center_on_p0: bool
scale_factor: float
elimination_rounds: int
elimination_fraction: float
batches_per_round: int
__post_init__()[source]

Validate configuration after initialization.

classmethod from_preset(preset_name)[source]

Create configuration from a named preset.

Parameters:

preset_name (str) – Name of the preset. One of: ‘fast’, ‘robust’, ‘global’, ‘thorough’, ‘streaming’.

Returns:

Configuration instance with preset values.

Return type:

GlobalOptimizationConfig

Raises:

ValueError – If preset_name is not a known preset.

Examples

>>> config = GlobalOptimizationConfig.from_preset('robust')
>>> config.n_starts
5
>>> config = GlobalOptimizationConfig.from_preset('global')
>>> config.n_starts
20
property is_multistart_enabled: bool

Whether multi-start optimization is enabled.

Returns:

True if n_starts > 0, False otherwise.

Return type:

bool

property preset: str | None

The preset name if this config was created from a preset.

Returns:

Preset name (‘fast’, ‘robust’, etc.) or None if custom.

Return type:

str or None

to_dict()[source]

Serialize configuration to a dictionary.

Returns:

Dictionary representation suitable for JSON serialization or checkpoint saving.

Return type:

dict

Examples

>>> config = GlobalOptimizationConfig(n_starts=20)
>>> d = config.to_dict()
>>> d['n_starts']
20
classmethod from_dict(d)[source]

Deserialize configuration from a dictionary.

Parameters:

d (dict) – Dictionary with configuration values.

Returns:

Configuration instance.

Return type:

GlobalOptimizationConfig

Examples

>>> d = {'n_starts': 20, 'sampler': 'sobol'}
>>> config = GlobalOptimizationConfig.from_dict(d)
>>> config.n_starts
20
with_overrides(**kwargs)[source]

Create a new config with specified overrides.

Parameters:

**kwargs (Any) – Configuration fields to override.

Returns:

New configuration with overrides applied.

Return type:

GlobalOptimizationConfig

Examples

>>> config = GlobalOptimizationConfig.from_preset('robust')
>>> config2 = config.with_overrides(n_starts=10)
>>> config2.n_starts
10
__init__(n_starts=10, sampler='lhs', center_on_p0=True, scale_factor=1.0, elimination_rounds=3, elimination_fraction=0.5, batches_per_round=50, _preset=None)
nlsq.global_optimization.PRESETS = {'fast': {...}, 'robust': {...}, 'global': {...}, 'thorough': {...}, 'streaming': {...}}

Preset configurations for common use cases:

  • fast: n_starts=0, multi-start disabled for maximum speed

  • robust: n_starts=5, light multi-start for robustness

  • global: n_starts=20, thorough global search

  • thorough: n_starts=50, exhaustive search

  • streaming: n_starts=10, tournament selection for large datasets

Multi-Start Orchestrator

class nlsq.global_optimization.MultiStartOrchestrator(config=None, curve_fit_instance=None)[source]

Bases: object

Orchestrator for multi-start optimization with LHS sampling.

This class generates multiple starting points using Latin Hypercube Sampling (or other quasi-random samplers), evaluates each starting point by running a full optimization, and selects the best result based on minimum loss.

Parameters:
  • config (GlobalOptimizationConfig, optional) – Configuration for multi-start optimization. If None, uses default config.

  • curve_fit_instance (CurveFit, optional) – Custom CurveFit instance to use for optimization. If None, creates a new one.

config

Configuration settings for multi-start optimization.

Type:

GlobalOptimizationConfig

curve_fit

CurveFit instance used for running optimizations.

Type:

CurveFit

logger

Logger for multi-start operations.

Type:

NLSQLogger

Examples

>>> from nlsq.global_optimization import MultiStartOrchestrator, GlobalOptimizationConfig
>>>
>>> # Basic usage
>>> orchestrator = MultiStartOrchestrator()
>>> result = orchestrator.fit(model, x, y, bounds=([0, 0], [10, 10]))
>>>
>>> # With custom config
>>> config = GlobalOptimizationConfig(n_starts=20, sampler='sobol')
>>> orchestrator = MultiStartOrchestrator(config=config)
>>>
>>> # Using presets
>>> orchestrator = MultiStartOrchestrator.from_preset('global')
__init__(config=None, curve_fit_instance=None)[source]

Initialize MultiStartOrchestrator.

Parameters:
  • config (GlobalOptimizationConfig, optional) – Configuration for multi-start optimization.

  • curve_fit_instance (CurveFit, optional) – Custom CurveFit instance for optimization.

classmethod from_preset(preset_name, curve_fit_instance=None)[source]

Create orchestrator from a named preset.

Parameters:
  • preset_name (str) – Name of the preset. One of: ‘fast’, ‘robust’, ‘global’, ‘thorough’, ‘streaming’.

  • curve_fit_instance (CurveFit, optional) – Custom CurveFit instance for optimization.

Returns:

Orchestrator instance configured with preset values.

Return type:

MultiStartOrchestrator

Raises:

ValueError – If preset_name is not a known preset.

Examples

>>> orchestrator = MultiStartOrchestrator.from_preset('robust')
>>> print(orchestrator.config.n_starts)
5
evaluate_starting_points(f, xdata, ydata, starting_points, bounds, **kwargs)[source]

Evaluate starting points in parallel using ThreadPoolExecutor.

Each starting point is evaluated with an isolated CurveFit instance to avoid thread-safety issues with mutable per-fit state in LeastSquares. JIT-compiled functions and the compilation cache are shared safely across threads.

Parameters:
  • f (Callable) – Model function f(x, *params) -> y.

  • xdata (np.ndarray) – Independent variable data.

  • ydata (np.ndarray) – Dependent variable data.

  • starting_points (np.ndarray) – Array of shape (n_starts, n_params) with starting points.

  • bounds (tuple) – Tuple of (lower_bounds, upper_bounds) for parameters.

  • **kwargs (Any) – Additional arguments passed to curve_fit.

Returns:

  • results (list[tuple[np.ndarray, float, CurveFitResult | None]]) – List of (params, loss, result) tuples sorted by loss (ascending).

  • diagnostics (dict[str, Any]) – Parallel diagnostics with keys parallel, n_workers, wall_time_sec.

Return type:

tuple[list[tuple[ndarray, float, CurveFitResult | None]], dict[str, Any]]

fit(f, xdata, ydata, p0=None, bounds=None, **kwargs)[source]

Run multi-start optimization and return best result.

This is the main entry point for multi-start optimization. It: 1. Infers bounds if not provided 2. Estimates p0 if not provided and center_on_p0=True 3. Generates LHS starting points 4. Evaluates all starting points 5. Returns the result with the minimum loss

Parameters:
  • f (Callable) – Model function f(x, *params) -> y.

  • xdata (np.ndarray) – Independent variable data.

  • ydata (np.ndarray) – Dependent variable data.

  • p0 (np.ndarray, optional) – Initial parameter guess. Used for centering if center_on_p0=True.

  • bounds (tuple, optional) – Tuple of (lower_bounds, upper_bounds) for parameters. If None, bounds are inferred from data.

  • kwargs (dict) – Additional arguments passed to curve_fit.

Returns:

Best optimization result with multi-start diagnostics.

Return type:

CurveFitResult

Examples

>>> orchestrator = MultiStartOrchestrator.from_preset('robust')
>>> result = orchestrator.fit(model, x, y, bounds=([0, 0], [10, 10]))
>>> print(f"Best params: {result.popt}")
>>> print(f"Multi-start diagnostics: {result.multistart_diagnostics}")

Sampling Functions

Latin Hypercube Sampling

nlsq.global_optimization.latin_hypercube_sample(n_samples, n_dims, rng_key=None)[source]

Generate Latin Hypercube samples in the unit hypercube.

Latin Hypercube Sampling divides each dimension into n_samples equal strata and ensures exactly one sample falls in each stratum for each dimension. This provides better coverage than random sampling.

Parameters:
  • n_samples (int) – Number of samples to generate.

  • n_dims (int) – Number of dimensions.

  • rng_key (jax.Array, optional) – JAX random key. If None, a default key is created.

Returns:

Array of shape (n_samples, n_dims) with values in [0, 1).

Return type:

jax.Array

Notes

The stratification property ensures that when samples are sorted in any dimension, sample i falls in the stratum [i/n, (i+1)/n) for that dimension.

Examples

>>> import jax
>>> samples = latin_hypercube_sample(10, 3, rng_key=jax.random.PRNGKey(42))
>>> samples.shape
(10, 3)
>>> # Verify stratification
>>> import numpy as np
>>> sorted_dim0 = np.sort(np.asarray(samples[:, 0]))
>>> for i, s in enumerate(sorted_dim0):
...     assert i / 10 <= s < (i + 1) / 10

Sobol Sequence

nlsq.global_optimization.sobol_sample(n_samples, n_dims, skip=0)[source]

Generate Sobol quasi-random samples in the unit hypercube.

Sobol sequences are deterministic low-discrepancy sequences that provide excellent coverage of the parameter space.

Parameters:
  • n_samples (int) – Number of samples to generate.

  • n_dims (int) – Number of dimensions. Must be <= 21.

  • skip (int, default=0) – Number of initial samples to skip (for different starting points).

Returns:

Array of shape (n_samples, n_dims) with values in [0, 1].

Return type:

jax.Array

Notes

This implementation uses a Gray code approach for efficient generation. For high dimensions (>21), consider using scipy.stats.qmc.Sobol.

Examples

>>> samples = sobol_sample(16, 4)
>>> samples.shape
(16, 4)

Halton Sequence

nlsq.global_optimization.halton_sample(n_samples, n_dims, skip=0)[source]

Generate Halton quasi-random samples in the unit hypercube.

Halton sequences use different prime number bases for each dimension to create low-discrepancy samples.

Parameters:
  • n_samples (int) – Number of samples to generate.

  • n_dims (int) – Number of dimensions. Must be <= 20 (number of available prime bases).

  • skip (int, default=0) – Number of initial samples to skip (for different starting points).

Returns:

Array of shape (n_samples, n_dims) with values in [0, 1).

Return type:

jax.Array

Notes

Each dimension uses a different prime base: 2, 3, 5, 7, 11, … The Halton sequence in base b for index n is computed by reflecting the base-b representation of n about the decimal point.

Examples

>>> samples = halton_sample(20, 5)
>>> samples.shape
(20, 5)

Utility Functions

nlsq.global_optimization.scale_samples_to_bounds(samples, lb, ub)[source]

Transform samples from [0, 1] to parameter bounds [lb, ub].

Parameters:
  • samples (jax.Array) – Array of shape (n_samples, n_dims) with values in [0, 1].

  • lb (jax.Array) – Lower bounds for each dimension, shape (n_dims,).

  • ub (jax.Array) – Upper bounds for each dimension, shape (n_dims,).

Returns:

Scaled samples with values in [lb, ub].

Return type:

jax.Array

Examples

>>> import jax.numpy as jnp
>>> samples = jnp.array([[0.0, 0.5], [1.0, 0.0]])
>>> lb = jnp.array([0.0, -10.0])
>>> ub = jnp.array([10.0, 10.0])
>>> scaled = scale_samples_to_bounds(samples, lb, ub)
>>> # First sample: [0, 0] in [0,1] -> [0, 0] in scaled space
>>> # Second sample: [1, 0] in [0,1] -> [10, -10] in scaled space
nlsq.global_optimization.center_samples_around_p0(samples, p0, scale_factor, lb, ub)[source]

Center samples around p0 with specified scale factor.

Instead of sampling uniformly in [lb, ub], this creates samples in a region centered at p0. The region width is scale_factor * (ub - lb).

Parameters:
  • samples (jax.Array) – Array of shape (n_samples, n_dims) with values in [0, 1].

  • p0 (jax.Array) – Center point (initial parameter guess), shape (n_dims,).

  • scale_factor (float) – Width of exploration region as fraction of (ub - lb). 0.5 means explore within 50% of the range centered at p0.

  • lb (jax.Array) – Lower bounds for each dimension, shape (n_dims,).

  • ub (jax.Array) – Upper bounds for each dimension, shape (n_dims,).

Returns:

Centered samples, clipped to stay within [lb, ub].

Return type:

jax.Array

Notes

The exploration region is: - Half-width = scale_factor * (ub - lb) / 2 - Region = [p0 - half_width, p0 + half_width] - Clipped to [lb, ub] to ensure valid samples

Examples

>>> import jax.numpy as jnp
>>> samples = jnp.array([[0.0], [0.5], [1.0]])
>>> p0 = jnp.array([5.0])
>>> lb = jnp.array([0.0])
>>> ub = jnp.array([10.0])
>>> centered = center_samples_around_p0(samples, p0, 0.5, lb, ub)
>>> # With scale_factor=0.5, half-width = 0.5 * 10 / 2 = 2.5
>>> # Region = [2.5, 7.5]
>>> # Sample at 0.5 maps to center (5.0)
nlsq.global_optimization.get_sampler(sampler_type)[source]

Get a sampler function by type name.

Factory function that returns the appropriate sampling function based on the sampler type string.

Parameters:

sampler_type (str) – Type of sampler. One of: ‘lhs’, ‘sobol’, ‘halton’ (case-insensitive).

Returns:

Sampler function with signature (n_samples, n_dims, **kwargs) -> jax.Array.

Return type:

Callable

Raises:

ValueError – If sampler_type is not recognized.

Examples

>>> sampler = get_sampler('lhs')
>>> samples = sampler(10, 3)
>>> samples.shape
(10, 3)
>>> sampler = get_sampler('sobol')
>>> samples = sampler(16, 4)

CMA-ES Optimizer

Evolution strategy optimizer for multi-scale parameter problems:

class nlsq.global_optimization.CMAESOptimizer(config=None)[source]

Bases: object

CMA-ES global optimizer with NLSQ refinement using evosax.

Uses evosax’s CMA-ES implementation for gradient-free global optimization, followed by NLSQ Trust Region Reflective refinement for proper parameter covariance estimation.

Parameters:

config (CMAESConfig | None, optional) – Configuration for CMA-ES optimization. If None, uses default config.

config

Configuration for CMA-ES optimization.

Type:

CMAESConfig

Examples

>>> from nlsq.global_optimization import CMAESOptimizer, CMAESConfig
>>> import jax.numpy as jnp
>>>
>>> def model(x, a, b):
...     return a * jnp.exp(-b * x)
>>>
>>> x = jnp.linspace(0, 5, 100)
>>> y = 2.5 * jnp.exp(-0.5 * x)
>>> bounds = ([0.1, 0.01], [10.0, 2.0])
>>>
>>> optimizer = CMAESOptimizer()
>>> result = optimizer.fit(model, x, y, bounds=bounds)
>>> print(f"Optimal params: {result['popt']}")
__init__(config=None)[source]

Initialize CMAESOptimizer.

Parameters:

config (CMAESConfig | None, optional) – Configuration for CMA-ES optimization. If None, uses default config (BIPOP enabled, 100 generations, 9 max restarts).

classmethod from_preset(preset_name)[source]

Create optimizer from a named preset.

Parameters:

preset_name (str) – Name of the preset. One of ‘cmaes-fast’, ‘cmaes’, ‘cmaes-global’.

Returns:

Optimizer configured with the specified preset.

Return type:

CMAESOptimizer

Examples

>>> optimizer = CMAESOptimizer.from_preset('cmaes-fast')
>>> optimizer.config.max_generations
50
fit(f, xdata, ydata, p0=None, bounds=None, sigma=None, **kwargs)[source]

Run CMA-ES global optimization followed by NLSQ refinement.

Parameters:
  • f (Callable) – Model function f(x, *params) -> y.

  • xdata (ArrayLike) – Independent variable data.

  • ydata (ArrayLike) – Dependent variable data.

  • p0 (ArrayLike | None, optional) – Initial parameter guess. If None, uses center of bounds.

  • bounds (tuple[ArrayLike, ArrayLike] | None) – Lower and upper bounds for parameters. Required for CMA-ES.

  • sigma (ArrayLike | None, optional) – Standard deviation of ydata for weighted residuals.

  • **kwargs (Any) – Additional keyword arguments (passed to NLSQ refinement).

Returns:

Result dictionary containing: - popt: Optimal parameters - pcov: Parameter covariance matrix (from NLSQ refinement) - Additional fields from NLSQ result

Return type:

dict[str, Any]

Raises:

ValueError – If bounds are not provided (required for CMA-ES).

class nlsq.global_optimization.CMAESConfig(popsize=None, max_generations=100, sigma=0.5, tol_fun=1e-08, tol_x=1e-08, restart_strategy='bipop', max_restarts=9, population_batch_size=None, data_chunk_size=None, refine_with_nlsq=True, seed=None)[source]

Bases: object

Configuration for CMA-ES global optimization.

popsize

Population size. If None, uses CMA-ES default: int(4 + 3 * log(n)).

Type:

int | None

max_generations

Maximum number of generations before stopping. Default: 100.

Type:

int

sigma

Initial step size (standard deviation). Default: 0.5.

Type:

float

tol_fun

Function value tolerance for convergence. Default: 1e-8.

Type:

float

tol_x

Parameter tolerance for convergence. Default: 1e-8.

Type:

float

restart_strategy

Restart strategy. Default: ‘bipop’.

Type:

Literal[‘none’, ‘bipop’]

max_restarts

Maximum restart attempts for BIPOP. Default: 9.

Type:

int

population_batch_size

Batch size for population evaluation. If None, evaluates all candidates in parallel. Set to smaller values (e.g., 4) to reduce memory usage.

Type:

int | None

data_chunk_size

Chunk size for data streaming. If None, processes full dataset at once. Set to smaller values (e.g., 50000) for datasets >10M points to avoid OOM. Must be >= 1024 for numerical stability.

Type:

int | None

refine_with_nlsq

Whether to refine best solution with NLSQ TRF. Default: True.

Type:

bool

seed

Random seed for reproducibility. If None, uses random seed.

Type:

int | None

Examples

>>> config = CMAESConfig(popsize=32, max_generations=200)
>>> config = CMAESConfig.from_preset('cmaes-global')

Memory-efficient configuration for large datasets:

>>> config = CMAESConfig(
...     population_batch_size=4,  # Evaluate 4 candidates at a time
...     data_chunk_size=50000,    # Process 50K points per chunk
... )
popsize: int | None
max_generations: int
sigma: float
tol_fun: float
tol_x: float
restart_strategy: Literal['none', 'bipop']
max_restarts: int
population_batch_size: int | None
data_chunk_size: int | None
refine_with_nlsq: bool
seed: int | None
__post_init__()[source]

Validate configuration after initialization.

classmethod from_preset(preset_name)[source]

Create a CMAESConfig from a named preset.

Parameters:

preset_name (str) – Name of the preset. One of ‘cmaes-fast’, ‘cmaes’, ‘cmaes-global’.

Returns:

Configuration for the specified preset.

Return type:

CMAESConfig

Raises:

ValueError – If preset_name is not recognized.

Examples

>>> config = CMAESConfig.from_preset('cmaes-fast')
>>> config.max_generations
50
__init__(popsize=None, max_generations=100, sigma=0.5, tol_fun=1e-08, tol_x=1e-08, restart_strategy='bipop', max_restarts=9, population_batch_size=None, data_chunk_size=None, refine_with_nlsq=True, seed=None)
class nlsq.global_optimization.CMAESDiagnostics(total_generations=0, total_restarts=0, final_sigma=0.0, best_fitness=inf, fitness_history=<factory>, restart_history=<factory>, convergence_reason='not_converged', nlsq_refinement=False, wall_time=0.0)[source]

Bases: object

Diagnostic information from CMA-ES optimization.

total_generations

Total number of CMA-ES generations across all restarts.

Type:

int

total_restarts

Number of BIPOP restarts performed.

Type:

int

final_sigma

Final step size (standard deviation) at convergence.

Type:

float

best_fitness

Best fitness value found (negative SSR, higher is better).

Type:

float

fitness_history

History of best fitness values per generation.

Type:

list[float]

restart_history

Information about each restart (popsize, generations, reason).

Type:

list[dict[str, Any]]

convergence_reason

Reason for convergence or termination.

Type:

str

nlsq_refinement

Whether NLSQ refinement was applied.

Type:

bool

wall_time

Total wall-clock time in seconds.

Type:

float

Examples

>>> diagnostics = CMAESDiagnostics(
...     total_generations=150,
...     total_restarts=3,
...     final_sigma=0.01,
...     best_fitness=-1e-10,
...     convergence_reason="fitness_tolerance",
... )
>>> print(diagnostics.summary())
total_generations: int
total_restarts: int
final_sigma: float
best_fitness: float
fitness_history: list[float]
restart_history: list[dict[str, Any]]
convergence_reason: str
nlsq_refinement: bool
wall_time: float
summary()[source]

Generate a human-readable summary of the diagnostics.

Returns:

Multi-line summary string.

Return type:

str

to_dict()[source]

Convert diagnostics to a dictionary.

Returns:

Dictionary representation for serialization.

Return type:

dict

classmethod from_dict(d)[source]

Create diagnostics from a dictionary.

Parameters:

d (dict) – Dictionary with diagnostics values.

Returns:

Diagnostics instance.

Return type:

CMAESDiagnostics

get_fitness_improvement()[source]

Calculate fitness improvement from first to last generation.

Returns:

Relative fitness improvement, or None if not enough history.

Return type:

float | None

get_convergence_rate()[source]

Compute per-generation convergence rate.

Returns:

Array of per-generation fitness changes, or None if not enough history.

Return type:

NDArray | None

__init__(total_generations=0, total_restarts=0, final_sigma=0.0, best_fitness=inf, fitness_history=<factory>, restart_history=<factory>, convergence_reason='not_converged', nlsq_refinement=False, wall_time=0.0)
class nlsq.global_optimization.BIPOPRestarter(base_popsize, n_params, max_restarts=9, min_fitness_spread=1e-12)[source]

Bases: object

BIPOP restart manager for CMA-ES optimization.

Manages alternating large/small population restarts following the BIPOP strategy. Large populations explore broadly while small populations exploit local regions more intensively.

Parameters:
  • base_popsize (int) – Base population size (typically 4 + floor(3 * ln(n))).

  • n_params (int) – Number of parameters being optimized.

  • max_restarts (int, optional) – Maximum number of restarts before giving up. Default is 9.

  • min_fitness_spread (float, optional) – Minimum fitness spread threshold for stagnation detection. Default is 1e-12.

restart_count

Number of restarts performed so far.

Type:

int

exhausted

True if max_restarts has been reached.

Type:

bool

best_solution

Best solution found across all restarts.

Type:

jax.Array | None

best_fitness

Best fitness found across all restarts.

Type:

float

Examples

>>> restarter = BIPOPRestarter(base_popsize=8, n_params=3)
>>> popsize = restarter.get_next_popsize()
>>> # ... run CMA-ES with popsize ...
>>> if restarter.check_stagnation(fitness_spread=1e-15):
...     restarter.register_restart()
...     popsize = restarter.get_next_popsize()
base_popsize: int
n_params: int
max_restarts: int = 9
min_fitness_spread: float = 1e-12
restart_count: int = 0
property exhausted: bool

Whether maximum restarts have been reached.

property best_solution: jax.Array | None

Best solution found across all restarts.

property best_fitness: float

Best fitness found across all restarts.

get_next_popsize()[source]

Get population size for the next run.

Returns:

Population size to use for next CMA-ES run. Large runs use 2x base_popsize, small runs use base_popsize/2 to base_popsize.

Return type:

int

register_restart()[source]

Register that a restart has occurred.

Call this after a restart to update internal state.

check_stagnation(fitness_spread)[source]

Check if optimization has stagnated.

Parameters:

fitness_spread (float) – Difference between max and min fitness in current population.

Returns:

True if stagnation detected (fitness spread below threshold).

Return type:

bool

update_best(solution, fitness)[source]

Update best solution if the new one is better.

Parameters:
  • solution (jax.Array) – Candidate solution.

  • fitness (float) – Fitness value (CMA-ES maximizes, so higher is better).

get_best()[source]

Get the best solution found across all restarts.

Returns:

Best solution and its fitness, or (None, -inf) if none found.

Return type:

tuple[jax.Array | None, float]

reset()[source]

Reset the restarter to initial state.

__init__(base_popsize, n_params, max_restarts=9, min_fitness_spread=1e-12)

Method Selection

Automatic selection between CMA-ES and Multi-Start:

class nlsq.global_optimization.MethodSelector(scale_threshold=1000.0)[source]

Bases: object

Select optimization method based on problem characteristics.

The selector analyzes the parameter bounds to compute a scale ratio, which indicates how many orders of magnitude separate the parameter scales. CMA-ES is preferred for multi-scale problems (high scale ratio) when evosax is available.

Parameters:

scale_threshold (float, optional) – Threshold for scale ratio above which CMA-ES is preferred. Default is 1000.0 (3 orders of magnitude).

scale_threshold

The scale ratio threshold for CMA-ES preference.

Type:

float

Examples

>>> from nlsq.global_optimization import MethodSelector
>>> import numpy as np
>>>
>>> selector = MethodSelector()
>>> lower = np.array([1e2, 1e-5, 0.5])
>>> upper = np.array([1e6, 1e-1, 3.0])
>>>
>>> method = selector.select('auto', lower, upper)
>>> print(f"Selected method: {method}")
__init__(scale_threshold=1000.0)[source]

Initialize MethodSelector.

Parameters:

scale_threshold (float, optional) – Threshold for scale ratio above which CMA-ES is preferred.

compute_scale_ratio(lower_bounds, upper_bounds)[source]

Compute the scale ratio from parameter bounds.

The scale ratio is the ratio of the largest parameter range to the smallest, indicating how many orders of magnitude separate the parameter scales.

Parameters:
  • lower_bounds (ArrayLike) – Lower bounds for parameters.

  • upper_bounds (ArrayLike) – Upper bounds for parameters.

Returns:

Scale ratio (>= 1.0). Higher values indicate more diverse scales.

Return type:

float

select(requested_method, lower_bounds, upper_bounds)[source]

Select the optimization method to use.

Parameters:
  • requested_method ({'cmaes', 'multi-start', 'auto'} | None) – Requested method. If ‘auto’ or None, selection is based on scale ratio and evosax availability.

  • lower_bounds (ArrayLike) – Lower bounds for parameters.

  • upper_bounds (ArrayLike) – Upper bounds for parameters.

Returns:

The selected optimization method.

Return type:

Literal[‘cmaes’, ‘multi-start’]

Notes

Selection logic:

  1. If ‘cmaes’ requested and evosax available -> ‘cmaes’

  2. If ‘cmaes’ requested but evosax unavailable -> ‘multi-start’ (with warning)

  3. If ‘multi-start’ requested -> ‘multi-start’

  4. If ‘auto’ or None:

    • If scale_ratio > threshold and evosax available -> ‘cmaes’

    • Otherwise -> ‘multi-start’

nlsq.global_optimization.is_evosax_available()[source]

Check if evosax is available for import.

Uses lazy import pattern to avoid import overhead until needed. Caches the result for subsequent calls.

Returns:

True if evosax can be imported successfully, False otherwise.

Return type:

bool

CMA-ES Memory Helpers

Helper functions for managing memory with CMA-ES on large datasets:

nlsq.global_optimization.estimate_cmaes_memory_gb(n_data, popsize, population_batch_size=None, data_chunk_size=None)[source]

Estimate peak GPU memory usage for CMA-ES in GB.

Parameters:
  • n_data (int) – Number of data points.

  • popsize (int) – Population size.

  • population_batch_size (int | None, optional) – Batch size for population evaluation. If None, uses full popsize.

  • data_chunk_size (int | None, optional) – Chunk size for data streaming. If None, uses full dataset.

Returns:

Estimated peak memory usage in GB.

Return type:

float

Examples

>>> estimate_cmaes_memory_gb(10_000_000, popsize=16)
1.1920928955078125
>>> estimate_cmaes_memory_gb(10_000_000, popsize=16, population_batch_size=4)
0.298023223876953
>>> estimate_cmaes_memory_gb(10_000_000, popsize=16, data_chunk_size=50000)
0.005960464477539062
nlsq.global_optimization.auto_configure_cmaes_memory(n_data, popsize, available_memory_gb=8.0, safety_factor=0.7)[source]

Auto-configure batch sizes to fit in available memory.

Parameters:
  • n_data (int) – Number of data points.

  • popsize (int) – Population size.

  • available_memory_gb (float, optional) – Available GPU/CPU memory in GB. Default: 8.0.

  • safety_factor (float, optional) – Safety factor for memory allocation (0-1). Default: 0.7.

Returns:

(population_batch_size, data_chunk_size) configuration. None means no batching/chunking needed for that dimension.

Return type:

tuple[int | None, int | None]

Examples

>>> auto_configure_cmaes_memory(1_000_000, popsize=16, available_memory_gb=8.0)
(None, None)  # No batching needed
>>> auto_configure_cmaes_memory(100_000_000, popsize=16, available_memory_gb=8.0)
(4, None)  # Population batching only
>>> auto_configure_cmaes_memory(100_000_000, popsize=16, available_memory_gb=1.0)
(1, 65536)  # Both population batching and data chunking
nlsq.global_optimization.compute_default_popsize(n_params)[source]

Compute default CMA-ES population size.

Uses the standard CMA-ES formula: int(4 + 3 * log(n))

Parameters:

n_params (int) – Number of parameters being optimized.

Returns:

Default population size, minimum 4.

Return type:

int

Examples

>>> compute_default_popsize(5)
8
>>> compute_default_popsize(20)
12

Tournament Selection

For large/streaming datasets where evaluating all candidates on the full dataset is impractical:

class nlsq.global_optimization.TournamentSelector(candidates, config)[source]

Bases: object

Tournament selector for progressive elimination in multi-start optimization.

This class implements tournament-style progressive elimination for selecting the best starting points when optimizing on large/streaming datasets. Instead of evaluating all candidates on the full dataset, candidates are evaluated on streaming batches and the worst performers are eliminated each round.

The tournament proceeds as: - Round 1: N candidates -> Keep top (1 - elimination_fraction) * N - Round 2: Survivors -> Keep top (1 - elimination_fraction) * survivors - … - Final: Return top M candidates

Parameters:
  • candidates (np.ndarray) – Array of shape (n_candidates, n_params) containing candidate starting points.

  • config (GlobalOptimizationConfig) – Configuration controlling tournament parameters: - elimination_rounds: Number of elimination rounds - elimination_fraction: Fraction to eliminate each round (default 0.5) - batches_per_round: Number of batches to evaluate per round

n_candidates

Total number of candidates.

Type:

int

n_params

Number of parameters per candidate.

Type:

int

survival_mask

Boolean mask indicating which candidates are still alive.

Type:

np.ndarray

cumulative_losses

Accumulated loss for each candidate (inf for eliminated).

Type:

np.ndarray

current_round

Current tournament round (0-indexed).

Type:

int

round_history

History of each round with statistics.

Type:

list

Examples

>>> import numpy as np
>>> from nlsq.global_optimization import TournamentSelector, GlobalOptimizationConfig
>>>
>>> candidates = np.random.randn(16, 3)
>>> config = GlobalOptimizationConfig(
...     n_starts=16,
...     elimination_rounds=3,
...     elimination_fraction=0.5,
...     batches_per_round=10,
... )
>>> selector = TournamentSelector(candidates, config)
>>> print(f"Starting with {selector.n_candidates} candidates")
Starting with 16 candidates

Notes

Tournament selection is particularly effective for: - Large datasets where full evaluation is expensive - Streaming datasets that don’t fit in memory - High-dimensional parameter spaces with many local minima

The elimination_fraction parameter controls the aggressiveness of pruning: - 0.5 (default): Eliminate half each round (log2(N) rounds to get to 1) - 0.25: Eliminate 25% each round (slower, more conservative) - 0.75: Eliminate 75% each round (faster, more aggressive)

__init__(candidates, config)[source]

Initialize tournament selector.

Parameters:
  • candidates (np.ndarray) – Array of shape (n_candidates, n_params) with candidate starting points.

  • config (GlobalOptimizationConfig) – Configuration for tournament parameters.

property n_survivors: int

Number of currently surviving candidates.

run_tournament(data_batch_iterator, model, top_m=1)[source]

Run full tournament selection.

Executes all elimination rounds and returns the top M surviving candidates.

Parameters:
  • data_batch_iterator (Iterator) – Iterator yielding (x_batch, y_batch) tuples of data.

  • model (Callable) – Model function with signature model(x, *params) -> predictions.

  • top_m (int, default=1) – Number of top candidates to return.

Returns:

List of top M candidate parameter arrays, sorted by loss (best first).

Return type:

list[np.ndarray]

Notes

The iterator is consumed during tournament execution. Ensure it yields enough batches: elimination_rounds * batches_per_round.

If the iterator runs out of batches before completing all rounds, the tournament will return the best candidates found so far.

get_top_candidates(top_m=1)[source]

Get the top M candidates by cumulative loss.

Parameters:

top_m (int, default=1) – Number of top candidates to return.

Returns:

List of top candidate parameter arrays, sorted by loss (best first).

Return type:

list[np.ndarray]

to_checkpoint()[source]

Serialize tournament state to a checkpoint dictionary.

Returns:

Checkpoint state that can be serialized and saved.

Return type:

dict

Examples

>>> from nlsq.utils.safe_serialize import safe_dumps
>>> checkpoint = selector.to_checkpoint()
>>> with open('tournament_checkpoint.json', 'w') as f:
...     f.write(safe_dumps(checkpoint))
classmethod from_checkpoint(checkpoint, config)[source]

Restore tournament selector from checkpoint.

Parameters:
Returns:

Restored tournament selector.

Return type:

TournamentSelector

Examples

>>> from nlsq.utils.safe_serialize import safe_loads
>>> with open('tournament_checkpoint.json') as f:
...     checkpoint = safe_loads(f.read())
>>> selector = TournamentSelector.from_checkpoint(checkpoint, config)
get_diagnostics()[source]

Get tournament diagnostics.

Returns:

Dictionary with tournament statistics and history.

Return type:

dict

Integration with NLSQ

Multi-start optimization is integrated with the existing NLSQ infrastructure:

  • Small datasets (<1M points): Full multi-start on complete data

  • Medium datasets (1M-100M points): Full multi-start, then chunked fit

  • Large datasets (>100M points): Tournament selection during streaming warmup

Usage Examples

Using curve_fit with multi-start

from nlsq import curve_fit

popt, pcov = curve_fit(
    model,
    x,
    y,
    p0=[1, 1, 1],
    bounds=([0, 0, 0], [10, 10, 10]),
    multistart=True,
    n_starts=10,
    sampler="lhs",
    center_on_p0=True,
)

Custom configuration

from nlsq.global_optimization import GlobalOptimizationConfig, MultiStartOrchestrator

config = GlobalOptimizationConfig(
    n_starts=30,
    sampler="sobol",
    center_on_p0=True,
    scale_factor=0.5,  # Tighter exploration around p0
)

orchestrator = MultiStartOrchestrator(config=config)
result = orchestrator.fit(model, x, y, bounds=bounds)

Tournament selection for streaming

from nlsq.global_optimization import (
    TournamentSelector,
    GlobalOptimizationConfig,
    latin_hypercube_sample,
)

# Generate candidates
candidates = latin_hypercube_sample(20, 3)

# Configure tournament
config = GlobalOptimizationConfig(
    n_starts=20,
    elimination_rounds=3,
    elimination_fraction=0.5,
    batches_per_round=50,
)

selector = TournamentSelector(candidates, config)


def data_generator():
    for _ in range(200):
        x_batch = np.random.randn(100)
        y_batch = model(x_batch, *true_params)
        yield x_batch, y_batch


best_candidates = selector.run_tournament(data_generator(), model, top_m=1)

Interactive Notebooks

Hands-on tutorials for global optimization:

See Also