Global Optimization
===================

This reference covers NLSQ's global optimization capabilities, including
multi-start optimization and CMA-ES (Covariance Matrix Adaptation Evolution
Strategy).

.. contents:: Contents
   :local:
   :depth: 2

Overview
--------

NLSQ provides two main approaches for global optimization:

1. **Multi-start optimization**: Run multiple local optimizations from
   different starting points using Latin Hypercube Sampling or other
   quasi-random samplers.

2. **CMA-ES (Evolution Strategy)**: A gradient-free evolutionary algorithm
   that adapts the search covariance matrix, particularly effective for
   multi-scale parameter problems.

Installation
------------

Multi-start optimization works out of the box. For CMA-ES, install the
optional ``evosax`` dependency:

.. code-block:: bash

   pip install nlsq

CMA-ES Global Optimization
--------------------------

CMA-ES is recommended when:

- Parameters span many orders of magnitude (>1000x scale ratio)
- The fitness landscape has multiple local minima
- Gradient information is unreliable
- You want robust convergence without sensitivity to initialization

Basic Usage
^^^^^^^^^^^

.. code-block:: python

   from nlsq.global_optimization import CMAESOptimizer, CMAESConfig
   import jax.numpy as jnp


   # Define model
   def model(x, a, b):
       return a * jnp.exp(-b * x)


   # Generate data
   x = jnp.linspace(0, 5, 100)
   y = 2.5 * jnp.exp(-0.5 * x)

   # Bounds are required for CMA-ES
   bounds = ([0.1, 0.01], [10.0, 2.0])

   # Create optimizer (uses default BIPOP configuration)
   optimizer = CMAESOptimizer()

   # Run optimization
   result = optimizer.fit(model, x, y, bounds=bounds)

   print(f"Optimal parameters: {result['popt']}")
   print(f"Parameter covariance: {result['pcov']}")

Using Presets
^^^^^^^^^^^^^

Three presets are available for common use cases:

.. code-block:: python

   # Fast preset: no restarts, 50 generations
   optimizer = CMAESOptimizer.from_preset("cmaes-fast")

   # Standard preset: BIPOP with 9 restarts, 100 generations
   optimizer = CMAESOptimizer.from_preset("cmaes")

   # Global preset: BIPOP with 9 restarts, 200 generations, 2x population
   optimizer = CMAESOptimizer.from_preset("cmaes-global")

Custom Configuration
^^^^^^^^^^^^^^^^^^^^

For fine-grained control, create a custom ``CMAESConfig``:

.. code-block:: python

   from nlsq.global_optimization import CMAESConfig, CMAESOptimizer

   config = CMAESConfig(
       popsize=32,  # Population size (None = auto)
       max_generations=150,  # Max generations per run
       sigma=0.3,  # Initial step size
       tol_fun=1e-10,  # Fitness tolerance
       tol_x=1e-10,  # Parameter tolerance
       restart_strategy="bipop",  # 'none' or 'bipop'
       max_restarts=5,  # Max BIPOP restarts
       refine_with_nlsq=True,  # Refine with Trust Region
       seed=42,  # For reproducibility
   )

   optimizer = CMAESOptimizer(config=config)

Memory Management for Large Datasets
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

CMA-ES can encounter out-of-memory (OOM) issues with large datasets (>10M points)
because each fitness evaluation processes the full dataset across all population
members. NLSQ provides two strategies to manage memory:

**Population Batching**: Evaluates population members in smaller groups instead
of all at once.

.. code-block:: python

   config = CMAESConfig(
       population_batch_size=4,  # Evaluate 4 candidates at a time
   )

**Data Streaming**: Processes the dataset in chunks, accumulating the sum of
squared residuals across chunks.

.. code-block:: python

   config = CMAESConfig(
       data_chunk_size=50000,  # Process 50K points per chunk
   )

**Combined Configuration** for maximum memory efficiency:

.. code-block:: python

   from nlsq.global_optimization import CMAESConfig, CMAESOptimizer

   # For 100M+ point datasets
   config = CMAESConfig(
       population_batch_size=4,  # Batch population evaluation
       data_chunk_size=50000,  # Stream data in 50K chunks
       max_generations=100,
   )

   optimizer = CMAESOptimizer(config=config)

Memory Estimation
"""""""""""""""""

Use the helper functions to estimate and auto-configure memory usage:

.. code-block:: python

   from nlsq.global_optimization import (
       estimate_cmaes_memory_gb,
       auto_configure_cmaes_memory,
   )

   # Estimate memory for a configuration
   memory_gb = estimate_cmaes_memory_gb(
       n_data=100_000_000,
       popsize=16,
       population_batch_size=4,
       data_chunk_size=50000,
   )
   print(f"Estimated memory: {memory_gb:.3f} GB")  # ~0.005 GB

   # Auto-configure based on available memory
   pop_batch, data_chunk = auto_configure_cmaes_memory(
       n_data=100_000_000,
       popsize=16,
       available_memory_gb=4.0,  # Target GPU memory
   )
   config = CMAESConfig(
       population_batch_size=pop_batch,
       data_chunk_size=data_chunk,
   )

Memory comparison for 100M points with popsize=16:

.. list-table::
   :header-rows: 1
   :widths: 40 30 30

   * - Configuration
     - Peak Memory
     - Reduction
   * - No batching (default)
     - 12.8 GB
     - --
   * - ``population_batch_size=4``
     - 3.2 GB
     - 75%
   * - ``data_chunk_size=50000``
     - ~18 MB
     - 99.9%
   * - Both combined
     - ~5 MB
     - 99.96%

Integration with curve_fit
^^^^^^^^^^^^^^^^^^^^^^^^^^

CMA-ES can be used directly through ``curve_fit`` with the ``method`` parameter:

.. code-block:: python

   from nlsq import curve_fit

   result = curve_fit(
       model,
       x,
       y,
       bounds=bounds,
       method="cmaes",  # Explicitly request CMA-ES
   )

   # Or with custom config
   from nlsq.global_optimization import CMAESConfig

   config = CMAESConfig(max_generations=200, seed=42)
   result = curve_fit(
       model,
       x,
       y,
       bounds=bounds,
       method="cmaes",
       cmaes_config=config,
   )

Auto Method Selection
^^^^^^^^^^^^^^^^^^^^^

Use ``method="auto"`` to let NLSQ choose based on the problem:

.. code-block:: python

   from nlsq import curve_fit

   # NLSQ checks scale ratio and evosax availability
   result = curve_fit(model, x, y, bounds=bounds, method="auto")

The ``MethodSelector`` class handles the logic:

- If scale ratio > 1000x and evosax available: CMA-ES
- Otherwise: multi-start optimization

Diagnostics
^^^^^^^^^^^

CMA-ES returns detailed diagnostics:

.. code-block:: python

   result = optimizer.fit(model, x, y, bounds=bounds)
   diag = result["cmaes_diagnostics"]

   print(f"Total generations: {diag['total_generations']}")
   print(f"Total restarts: {diag['total_restarts']}")
   print(f"Final sigma: {diag['final_sigma']}")
   print(f"Best fitness: {diag['best_fitness']}")
   print(f"Convergence reason: {diag['convergence_reason']}")
   print(f"Wall time: {diag['wall_time']}s")

The ``CMAESDiagnostics`` class provides analysis methods:

.. code-block:: python

   from nlsq.global_optimization import CMAESDiagnostics

   diag = CMAESDiagnostics.from_dict(result["cmaes_diagnostics"])
   print(diag.summary())
   print(f"Fitness improvement: {diag.get_fitness_improvement()}")

BIPOP Restart Strategy
^^^^^^^^^^^^^^^^^^^^^^

BIPOP (Bi-Population) alternates between large and small population runs:

- **Large population**: More exploration, broader search
- **Small population**: More exploitation, faster convergence

The ``BIPOPRestarter`` class manages this:

.. code-block:: python

   from nlsq.global_optimization import BIPOPRestarter

   restarter = BIPOPRestarter(
       base_popsize=16,
       n_params=3,
       max_restarts=9,
       min_fitness_spread=1e-12,
   )

   while not restarter.exhausted:
       popsize = restarter.get_next_popsize()  # Alternates large/small
       # ... run CMA-ES ...
       restarter.register_restart()

Multi-Start Optimization
------------------------

For problems where CMA-ES is not needed, multi-start optimization provides
robust global search using quasi-random sampling.

See the ``MultiStartOrchestrator`` API for details.

API Reference
-------------

Configuration
^^^^^^^^^^^^^

.. autoclass:: nlsq.global_optimization.CMAESConfig
   :members:
   :undoc-members:

Optimizer
^^^^^^^^^

.. autoclass:: nlsq.global_optimization.CMAESOptimizer
   :members:
   :undoc-members:

Diagnostics
^^^^^^^^^^^

.. autoclass:: nlsq.global_optimization.CMAESDiagnostics
   :members:
   :undoc-members:

Restart Strategy
^^^^^^^^^^^^^^^^

.. autoclass:: nlsq.global_optimization.BIPOPRestarter
   :members:
   :undoc-members:

Method Selection
^^^^^^^^^^^^^^^^

.. autoclass:: nlsq.global_optimization.MethodSelector
   :members:
   :undoc-members:

Memory Helpers
^^^^^^^^^^^^^^

.. autofunction:: nlsq.global_optimization.estimate_cmaes_memory_gb

.. autofunction:: nlsq.global_optimization.auto_configure_cmaes_memory