Notebook Utilities API Reference¶
The notebook_utils package provides extensible notebook transformation utilities following the Strategy and Chain of Responsibility design patterns.
Module Overview¶
The package is organized into several modules:
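| Module | Purpose |
|---|---|
| notebook_utils.types | TypedDict definitions for type safety |
| notebook_utils.cells | Cell manipulation utilities |
| notebook_utils.core | Robust I/O operations with validation |
| notebook_utils.transformations.base | Abstract base class for transformers |
| notebook_utils.transformations.matplotlib | Matplotlib inline configuration |
| notebook_utils.transformations.imports | IPython.display import injection |
| notebook_utils.transformations.plt_show | plt.show() replacement logic |
| notebook_utils.pipeline | Transformation pipeline orchestration |
| notebook_utils.tracking | Incremental processing with checksums |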
Type Definitions¶
- class notebook_utils.types.NotebookCell¶
TypedDict defining the structure of a Jupyter notebook cell.
Attributes:
- cell_type: Literal['code', 'markdown', 'raw']¶
Type of the cell
Example:
    cell: NotebookCell = {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": ["import numpy as np"],
    }
- class notebook_utils.types.NotebookStats¶
TypedDict for tracking transformation statistics.
Attributes:
Cell Utilities¶
- notebook_utils.cells.has_matplotlib_magic(cells: list[NotebookCell]) -> bool¶
Check if notebook already has %matplotlib inline magic.
- Parameters:
cells – List of notebook cells
- Returns:
True if magic is present
Example:
    if not has_matplotlib_magic(cells):
        # Add magic
        pass
- notebook_utils.cells.has_ipython_display_import(cells: list[NotebookCell]) -> bool¶
Check if notebook imports display from IPython.display.
- Parameters:
cells – List of notebook cells
- Returns:
True if import is present
- notebook_utils.cells.uses_display(cells: list[NotebookCell]) -> bool¶
Check if notebook uses the display() function.
- Parameters:
cells – List of notebook cells
- Returns:
True if display() is called
- notebook_utils.cells.find_first_code_cell_index(cells: list[NotebookCell]) -> int¶
Find index of first code cell in notebook.
- Parameters:
cells – List of notebook cells
- Returns:
Index of first code cell, or 0 if no code cells
- notebook_utils.cells.find_cell_with_pattern(cells: list[NotebookCell], pattern: str) -> int | None¶
Find first cell containing a pattern.
- Parameters:
cells – List of notebook cells
pattern – String pattern to search for
- Returns:
Index of first matching cell, or None
Example:
    idx = find_cell_with_pattern(cells, "%matplotlib inline")
    if idx is not None:
        # Insert after matplotlib magic
        insert_idx = idx + 1
- notebook_utils.cells.create_matplotlib_config_cell() -> NotebookCell¶
Create a cell with matplotlib inline configuration.
- Returns:
NotebookCell with %matplotlib inline magic
- notebook_utils.cells.create_ipython_display_import_cell() -> NotebookCell¶
Create a cell with IPython.display import.
- Returns:
NotebookCell with display import statement
- notebook_utils.cells.cell_contains_pattern(cells: list[NotebookCell], pattern: str, cell_type: str = 'code') -> bool¶
Check if any cell contains a pattern.
- Parameters:
cells – List of notebook cells
pattern – String pattern to search for
cell_type – Type of cells to search (default: “code”)
- Returns:
True if pattern found
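The pattern helpers above can be pictured as a linear scan over the cell list. The following is a minimal sketch, not the package's implementation: it assumes `source` is a list of strings (as in the NotebookCell example) and that matching is plain substring containment. The `_sketch` function names and the `Cell` alias are hypothetical stand-ins.

```python
from __future__ import annotations

from typing import Any

Cell = dict[str, Any]  # hypothetical stand-in for notebook_utils.types.NotebookCell


def find_cell_with_pattern_sketch(cells: list[Cell], pattern: str) -> int | None:
    """Return the index of the first cell whose source contains `pattern`."""
    for idx, cell in enumerate(cells):
        # Assumption: `source` is a list of lines, so join before searching.
        if pattern in "".join(cell.get("source", [])):
            return idx
    return None


def cell_contains_pattern_sketch(
    cells: list[Cell], pattern: str, cell_type: str = "code"
) -> bool:
    """Return True if any cell of the given type contains `pattern`."""
    return any(
        cell.get("cell_type") == cell_type
        and pattern in "".join(cell.get("source", []))
        for cell in cells
    )


cells = [
    {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
    {
        "cell_type": "code",
        "metadata": {},
        "source": ["%matplotlib inline\n", "import numpy as np"],
    },
]
print(find_cell_with_pattern_sketch(cells, "%matplotlib inline"))
```

Whether the real helpers use substring matching or regular expressions is not stated in this reference, so treat the matching rule here as an assumption.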
Core I/O Operations¶
Exception Classes¶
- exception notebook_utils.core.NotebookError¶
Base exception for notebook operations.
- exception notebook_utils.core.NotebookReadError¶
Raised when notebook cannot be read.
- exception notebook_utils.core.NotebookWriteError¶
Raised when notebook cannot be written.
- exception notebook_utils.core.NotebookValidationError¶
Raised when notebook structure is invalid.
Functions¶
- notebook_utils.core.validate_notebook_structure(notebook: dict) -> None¶
Validate notebook has required structure.
- Parameters:
notebook – Notebook dictionary
- Raises:
NotebookValidationError – If structure is invalid
Validates:
nbformat field exists
cells field exists and is a list
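The two checks listed above could look roughly like this. It is a sketch under stated assumptions: the exception class is a local stand-in for `notebook_utils.core.NotebookValidationError`, the function name is hypothetical, and the error messages are invented for illustration.

```python
class NotebookValidationError(Exception):
    """Local stand-in for notebook_utils.core.NotebookValidationError."""


def validate_notebook_structure_sketch(notebook: dict) -> None:
    """Raise NotebookValidationError unless both documented checks pass."""
    if "nbformat" not in notebook:
        raise NotebookValidationError("missing 'nbformat' field")
    if "cells" not in notebook:
        raise NotebookValidationError("missing 'cells' field")
    if not isinstance(notebook["cells"], list):
        raise NotebookValidationError("'cells' must be a list")


# A minimal valid notebook passes silently.
validate_notebook_structure_sketch({"nbformat": 4, "cells": []})
```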
- notebook_utils.core.read_notebook(path: Path) -> dict | None¶
Read and validate notebook from file with comprehensive error handling.
- Parameters:
path – Path to notebook file
- Returns:
Notebook dictionary, or None on error
Error Handling:
Logs file not found errors
Logs JSON decode errors
Logs validation errors
Returns None on any error
Example:
    from pathlib import Path

    from notebook_utils.core import read_notebook

    notebook = read_notebook(Path("example.ipynb"))
    if notebook is None:
        print("Failed to read notebook")
- notebook_utils.core.write_notebook(path: Path, notebook: dict, backup: bool = False) -> bool¶
Write notebook to file with atomic write pattern.
- Parameters:
path – Path to notebook file
notebook – Notebook dictionary
backup – Create .bak file before writing
- Returns:
True if successful, False otherwise
Atomic Write Pattern:
Write to temporary file
Validate write succeeded
Move temporary file to target (atomic operation)
Example:
    success = write_notebook(
        Path("example.ipynb"),
        notebook,
        backup=True,  # Creates example.ipynb.bak
    )
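For readers unfamiliar with the atomic write pattern described above, here is one way the three steps can be sketched with the standard library. This is an illustration, not the package's code: the function name, the JSON formatting options, and the choice of `json.load` as the "validate" step are all assumptions.

```python
import json
import os
import shutil
import tempfile
from pathlib import Path


def write_notebook_sketch(path: Path, notebook: dict, backup: bool = False) -> bool:
    """Write `notebook` to `path` via a temporary file; return True on success."""
    try:
        if backup and path.exists():
            # example.ipynb -> example.ipynb.bak
            shutil.copy2(path, path.with_name(path.name + ".bak"))
        # Step 1: write to a temp file in the same directory, so the final
        # rename does not cross filesystems.
        fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
        try:
            with os.fdopen(fd, "w", encoding="utf-8") as handle:
                json.dump(notebook, handle, indent=1)
                handle.write("\n")
            # Step 2: validate the write by parsing the temp file back.
            with open(tmp_name, encoding="utf-8") as handle:
                json.load(handle)
            # Step 3: move into place; os.replace is atomic on one filesystem.
            os.replace(tmp_name, path)
        except BaseException:
            # Never leave a stray temp file behind on failure.
            if os.path.exists(tmp_name):
                os.unlink(tmp_name)
            raise
        return True
    except (OSError, TypeError, ValueError):
        return False
```

Because the target is only ever replaced by a fully written and re-parsed file, a failure partway through leaves the previous notebook untouched.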
Transformation Base Class¶
- class notebook_utils.transformations.base.NotebookTransformer¶
Abstract base class for notebook transformations (Strategy pattern).
Each transformer should be:
Stateless: Can be reused across multiple notebooks
Idempotent: Running twice produces same result as running once
Pure: Works only on the notebook cells it is given and has no other side effects
- abstractmethod transform(cells: list[NotebookCell]) -> tuple[list[NotebookCell], dict[str, int]]¶
Transform notebook cells.
- Parameters:
cells – List of notebook cells to transform
- Returns:
Tuple of (modified_cells, stats_dict)
Note: Should return NEW list, not mutate input cells
- abstractmethod name() -> str¶
Return unique transformation name.
- Returns:
Transformation identifier (e.g., “matplotlib_inline”)
- abstractmethod description() -> str¶
Return human-readable transformation description.
- Returns:
Description of what this transformation does
- should_apply(cells: list[NotebookCell]) -> bool¶
Check if transformation should be applied.
- Parameters:
cells – Notebook cells to check
- Returns:
True if transformation should run
Default: Returns True. Override to skip when not needed.
- validate_result(original: list[NotebookCell], transformed: list[NotebookCell]) -> bool¶
Validate transformation result.
- Parameters:
original – Original cells before transformation
transformed – Cells after transformation
- Returns:
True if valid
- Raises:
ValueError – If validation fails
Default: Checks result is a list. Override for custom validation.
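To make the contract above concrete, here is a sketch of a custom transformer. So that the block runs on its own, it defines a local stand-in base class that mirrors the documented interface; with the package installed you would presumably subclass `notebook_utils.transformations.base.NotebookTransformer` instead. `StripTrailingWhitespaceTransformer` and its `lines_changed` stat are hypothetical.

```python
from __future__ import annotations

import copy
from abc import ABC, abstractmethod
from typing import Any

Cell = dict[str, Any]  # hypothetical stand-in for NotebookCell


class NotebookTransformer(ABC):
    """Local stand-in mirroring the documented abstract interface."""

    @abstractmethod
    def transform(self, cells: list[Cell]) -> tuple[list[Cell], dict[str, int]]: ...

    @abstractmethod
    def name(self) -> str: ...

    @abstractmethod
    def description(self) -> str: ...

    def should_apply(self, cells: list[Cell]) -> bool:
        return True


class StripTrailingWhitespaceTransformer(NotebookTransformer):
    """Hypothetical transformer: strips trailing whitespace in code cells."""

    def name(self) -> str:
        return "strip_trailing_whitespace"

    def description(self) -> str:
        return "Remove trailing whitespace from code cell source lines"

    def transform(self, cells: list[Cell]) -> tuple[list[Cell], dict[str, int]]:
        new_cells = copy.deepcopy(cells)  # return a NEW list, input stays intact
        lines_changed = 0
        for cell in new_cells:
            if cell.get("cell_type") != "code":
                continue
            cleaned = []
            for line in cell.get("source", []):
                newline = "\n" if line.endswith("\n") else ""
                stripped = line.rstrip() + newline
                if stripped != line:
                    lines_changed += 1
                cleaned.append(stripped)
            cell["source"] = cleaned
        return new_cells, {"lines_changed": lines_changed}
```

The transformer holds no state, copies its input rather than mutating it, and reports zero changes on a second run, which matches the stateless, pure, and idempotent expectations listed above.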
Transformation Implementations¶
MatplotlibInlineTransformer¶
- class notebook_utils.transformations.matplotlib.MatplotlibInlineTransformer¶
Bases: NotebookTransformer
Adds %matplotlib inline magic before first code cell.
Ensures: Notebooks have matplotlib inline backend configured for proper display.
- transform(cells: list[NotebookCell]) -> tuple[list[NotebookCell], dict[str, int]]¶
Add %matplotlib inline magic if not present.
- Parameters:
cells – Notebook cells
- Returns:
Tuple of (modified_cells, {“magic_added”: count})
- should_apply(cells: list[NotebookCell]) -> bool¶
- Returns:
True if magic not already present
Example:
    transformer = MatplotlibInlineTransformer()
    result, stats = transformer.transform(cells)
    print(f"Added {stats['magic_added']} magic(s)")
IPythonDisplayImportTransformer¶
- class notebook_utils.transformations.imports.IPythonDisplayImportTransformer¶
Bases: NotebookTransformer
Adds from IPython.display import display when display() is used.
Prevents: NameError when notebooks use display() without importing it.
- transform(cells: list[NotebookCell]) -> tuple[list[NotebookCell], dict[str, int]]¶
Add IPython.display import if needed.
- Parameters:
cells – Notebook cells
- Returns:
Tuple of (modified_cells, {“import_added”: count})
- should_apply(cells: list[NotebookCell]) -> bool¶
- Returns:
True if display() used and import not present
PltShowReplacementTransformer¶
- notebook_utils.transformations.plt_show.find_figure_variable(source: list[str], show_line_idx: int) -> str¶
Find figure variable name by looking backwards from plt.show().
- Parameters:
source – Source code lines
show_line_idx – Index of line containing plt.show()
- Returns:
Figure variable name or “plt.gcf()” as fallback
Looks for patterns:
fig = plt.figure()
fig, ax = plt.subplots()
- notebook_utils.transformations.plt_show.replace_plt_show(source: list[str]) -> tuple[list[str], int]¶
Replace plt.show() with display pattern using context-aware logic.
- Parameters:
source – Source code lines
- Returns:
Tuple of (modified_source, num_replacements)
Replacement Pattern:
    # Before
    plt.show()

    # After
    plt.tight_layout()
    display(fig)
    plt.close(fig)
Skips:
Comments (lines starting with #)
String literals (inside quotes)
Complex statements (not standalone plt.show())
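The replacement and skip rules above can be approximated with two regular expressions. This is a simplified sketch, not the package's logic: the `_sketch` names and both patterns are assumptions, and it only handles lines that consist solely of `plt.show()`. That naturally skips comments, quoted occurrences on the same line, and compound statements, but it would not detect a bare `plt.show()` line inside a multi-line string.

```python
from __future__ import annotations

import re

# A line that is nothing but an (optionally indented) plt.show() call.
SHOW_RE = re.compile(r"^([ \t]*)plt\.show\(\)\s*$")
# "fig = plt.figure(...)" or "fig, ax = plt.subplots(...)".
FIG_RE = re.compile(r"^\s*(\w+)\s*(?:,\s*\w+\s*)?=\s*plt\.(?:figure|subplots)\(")


def find_figure_variable_sketch(source: list[str], show_line_idx: int) -> str:
    """Scan backwards from the plt.show() line for a figure assignment."""
    for line in reversed(source[:show_line_idx]):
        match = FIG_RE.match(line)
        if match:
            return match.group(1)
    return "plt.gcf()"  # documented fallback


def replace_plt_show_sketch(source: list[str]) -> tuple[list[str], int]:
    """Return (new_source, replacement_count) without mutating `source`."""
    result: list[str] = []
    replacements = 0
    for idx, line in enumerate(source):
        match = SHOW_RE.match(line)
        if match is None:
            result.append(line)
            continue
        indent = match.group(1)
        fig = find_figure_variable_sketch(source, idx)
        newline = "\n" if line.endswith("\n") else ""
        result.append(f"{indent}plt.tight_layout()\n")
        result.append(f"{indent}display({fig})\n")
        result.append(f"{indent}plt.close({fig}){newline}")
        replacements += 1
    return result, replacements
```

Preserving the original indentation keeps the replacement valid when `plt.show()` sits inside a loop or function body.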
- class notebook_utils.transformations.plt_show.PltShowReplacementTransformer¶
Bases: NotebookTransformer
Replaces plt.show() calls with display/close pattern.
Improves:
Layout with plt.tight_layout()
Display with explicit display()
Memory management with plt.close()
- transform(cells: list[NotebookCell]) -> tuple[list[NotebookCell], dict[str, int]]¶
Replace plt.show() in all code cells.
- Parameters:
cells – Notebook cells
- Returns:
Tuple of (modified_cells, {“replacements”: count, “cells_modified”: count})
Pipeline Orchestration¶
- class notebook_utils.pipeline.TransformationPipeline¶
Composes multiple transformations with rollback support (Chain of Responsibility pattern).
Provides:
Atomic commit semantics
Automatic rollback on errors
Validation of all transformations
Statistics collection
- __init__(transformers: list[NotebookTransformer])¶
Initialize pipeline with transformers.
- Parameters:
transformers – List of transformers to apply in order
- run(notebook_path: Path, backup: bool = False, dry_run: bool = False) -> dict[str, dict]¶
Run all transformations with atomic commit.
- Parameters:
notebook_path – Path to notebook file
backup – Create .bak file before writing
dry_run – Don’t write changes, just return stats
- Returns:
Dictionary mapping transformer names to their stats
- Raises:
Exception – If any transformation fails (with rollback)
Example:
    from pathlib import Path

    from notebook_utils.pipeline import TransformationPipeline
    from notebook_utils.transformations import (
        MatplotlibInlineTransformer,
        IPythonDisplayImportTransformer,
    )

    pipeline = TransformationPipeline(
        [
            MatplotlibInlineTransformer(),
            IPythonDisplayImportTransformer(),
        ]
    )
    stats = pipeline.run(Path("example.ipynb"), backup=True, dry_run=False)
    print(stats)
    # {
    #     "matplotlib_inline": {"magic_added": 1},
    #     "ipython_display_import": {"import_added": 1}
    # }
- get_transformers() -> list[NotebookTransformer]¶
Get list of transformers in pipeline.
- Returns:
List of transformer instances (copy)
- add_transformer(transformer: NotebookTransformer) -> None¶
Add a transformer to the end of the pipeline.
- Parameters:
transformer – Transformer instance to add
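One way to picture the pipeline's atomic commit and rollback semantics is to run every transformer against a working copy and hand back the result only if all of them succeed. The sketch below works purely in memory and is not the package's implementation: `run_transformers_sketch` and both example transformer classes are hypothetical, and file handling is left out.

```python
from __future__ import annotations

import copy
from typing import Any

Cell = dict[str, Any]  # hypothetical stand-in for NotebookCell


def run_transformers_sketch(
    cells: list[Cell], transformers: list[Any]
) -> tuple[list[Cell], dict[str, dict]]:
    """Apply transformers in order; any exception leaves `cells` untouched."""
    working = copy.deepcopy(cells)  # all changes happen on a private copy
    all_stats: dict[str, dict] = {}
    for transformer in transformers:
        if not transformer.should_apply(working):
            continue
        transformed, stats = transformer.transform(working)
        transformer.validate_result(working, transformed)  # may raise ValueError
        working = transformed
        all_stats[transformer.name()] = stats
    return working, all_stats  # "commit": the caller adopts the result


class PrependMagicSketch:
    """Hypothetical transformer: puts a %matplotlib inline cell first."""

    def name(self) -> str:
        return "prepend_magic"

    def should_apply(self, cells: list[Cell]) -> bool:
        return not any(
            "%matplotlib inline" in "".join(c.get("source", [])) for c in cells
        )

    def transform(self, cells: list[Cell]) -> tuple[list[Cell], dict[str, int]]:
        magic = {"cell_type": "code", "metadata": {}, "source": ["%matplotlib inline"]}
        return [magic, *cells], {"magic_added": 1}

    def validate_result(self, original: list[Cell], transformed: Any) -> bool:
        if not isinstance(transformed, list):
            raise ValueError("transform() must return a list of cells")
        return True


class AlwaysFailsSketch(PrependMagicSketch):
    """Hypothetical transformer used to demonstrate rollback."""

    def name(self) -> str:
        return "always_fails"

    def should_apply(self, cells: list[Cell]) -> bool:
        return True

    def transform(self, cells: list[Cell]) -> tuple[list[Cell], dict[str, int]]:
        raise RuntimeError("boom")
```

If `AlwaysFailsSketch` is placed anywhere in the list, the exception propagates and the caller still holds the original cells, which is the in-memory equivalent of a rollback.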
Incremental Processing¶
- class notebook_utils.tracking.ProcessingTracker¶
Tracks processed notebooks to enable incremental updates.
Uses: SHA-256 checksums to detect changes. Stores state in .notebook_transforms.json.
- __init__(state_file: Path = None)¶
Initialize tracker with state file.
- Parameters:
state_file – Path to state file (default: .notebook_transforms.json in current directory)
- needs_processing(notebook_path: Path, transformations: list[str]) -> bool¶
Check if notebook needs processing.
- Parameters:
notebook_path – Path to notebook
transformations – List of transformation names to apply
- Returns:
True if notebook should be processed
Returns True if:
Notebook is new (not in state)
File content changed (different checksum)
Transformation set changed
- mark_processed(notebook_path: Path, transformations: list[str], stats: dict = None) -> None¶
Mark notebook as processed.
- Parameters:
notebook_path – Path to notebook
transformations – List of transformation names applied
stats – Optional statistics from processing
Updates:
Checksum (SHA-256 of file content)
Transformations list (sorted)
Last processed timestamp
Processing statistics
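The checksum comparison behind `needs_processing` and `mark_processed` can be sketched with `hashlib`. The state layout below (a dict keyed by path string with `checksum` and `transformations` entries) is an assumption for illustration, as are the function names; timestamps, statistics, and persistence to the JSON state file are omitted.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def file_checksum(path: Path) -> str:
    """Return the SHA-256 hex digest of the file's bytes."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so large notebooks are not loaded at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_processing_sketch(
    state: dict, path: Path, transformations: list[str]
) -> bool:
    """Mirror the three documented conditions for reprocessing."""
    entry = state.get(str(path))
    if entry is None:
        return True  # notebook is new
    if entry["checksum"] != file_checksum(path):
        return True  # file content changed
    return entry["transformations"] != sorted(transformations)  # set changed


def mark_processed_sketch(
    state: dict, path: Path, transformations: list[str]
) -> None:
    """Record the current checksum and the sorted transformation names."""
    state[str(path)] = {
        "checksum": file_checksum(path),
        "transformations": sorted(transformations),
    }
```

Storing the transformation names sorted makes the comparison independent of the order in which callers list them.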
- get_stats() -> dict¶
Get statistics about tracked notebooks.
- Returns:
Dictionary with tracking statistics
Example:
    tracker = ProcessingTracker()
    stats = tracker.get_stats()
    print(stats)
    # {
    #     "total_tracked": 42,
    #     "state_file": "/path/to/.notebook_transforms.json",
    #     "state_file_exists": True
    # }
Example Usage:
    from pathlib import Path

    from notebook_utils.tracking import ProcessingTracker
    from notebook_utils.pipeline import TransformationPipeline
    from notebook_utils.transformations import MatplotlibInlineTransformer

    # Initialize
    tracker = ProcessingTracker()
    pipeline = TransformationPipeline([MatplotlibInlineTransformer()])
    transform_names = ["matplotlib_inline"]

    # Process only if needed
    notebook_path = Path("example.ipynb")
    if tracker.needs_processing(notebook_path, transform_names):
        stats = pipeline.run(notebook_path)
        tracker.mark_processed(notebook_path, transform_names, stats)
    else:
        print("Notebook already up-to-date")
See Also¶
Notebook Configuration Utilities - User Guide
CI/CD Documentation - CI/CD Integration
Index - General Index