Notebook Utilities API Reference

The notebook_utils package provides extensible notebook transformation utilities following the Strategy and Chain of Responsibility design patterns.

Module Overview

The package is organized into several modules:

Module                                       Purpose
notebook_utils.types                         TypedDict definitions for type safety
notebook_utils.cells                         Cell manipulation utilities
notebook_utils.core                          Robust I/O operations with validation
notebook_utils.transformations.base          Abstract base class for transformers
notebook_utils.transformations.matplotlib    Matplotlib inline configuration
notebook_utils.transformations.imports       IPython.display import injection
notebook_utils.transformations.plt_show      plt.show() replacement logic
notebook_utils.pipeline                      Transformation pipeline orchestration
notebook_utils.tracking                      Incremental processing with checksums

Type Definitions

class notebook_utils.types.NotebookCell

TypedDict defining the structure of a Jupyter notebook cell.

Attributes:

cell_type: Literal['code', 'markdown', 'raw']

Type of the cell

execution_count: int | None

Execution count for code cells

metadata: dict

Cell metadata dictionary

outputs: list

List of cell outputs (for code cells)

source: str | list[str]

Cell source code or markdown content

Example:

cell: NotebookCell = {
    "cell_type": "code",
    "execution_count": None,
    "metadata": {},
    "outputs": [],
    "source": ["import numpy as np"],
}

class notebook_utils.types.NotebookStats

TypedDict for tracking transformation statistics.

Attributes:

matplotlib_magic_added: int

Number of %matplotlib inline magics added

ipython_display_import_added: int

Number of IPython.display imports added

plt_show_replaced: int

Number of plt.show() calls replaced

cells_modified: int

Total number of cells modified

Cell Utilities

notebook_utils.cells.has_matplotlib_magic(cells: list[NotebookCell]) -> bool

Check if notebook already has %matplotlib inline magic.

Parameters:

cells – List of notebook cells

Returns:

True if magic is present

Example:

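from notebook_utils.cells import has_matplotlib_magic

# `cells` is the notebook's list of cell dictionaries
if not has_matplotlib_magic(cells):
    # Add the %matplotlib inline magic here
    pass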

notebook_utils.cells.has_ipython_display_import(cells: list[NotebookCell]) -> bool

Check if notebook imports display from IPython.display.

Parameters:

cells – List of notebook cells

Returns:

True if import is present

notebook_utils.cells.uses_display(cells: list[NotebookCell]) -> bool

Check if notebook uses the display() function.

Parameters:

cells – List of notebook cells

Returns:

True if display() is called

notebook_utils.cells.find_first_code_cell_index(cells: list[NotebookCell]) -> int

Find index of first code cell in notebook.

Parameters:

cells – List of notebook cells

Returns:

Index of the first code cell, or 0 if the notebook has no code cells

notebook_utils.cells.find_cell_with_pattern(cells: list[NotebookCell], pattern: str) -> int | None

Find first cell containing a pattern.

Parameters:
  • cells – List of notebook cells

  • pattern – String pattern to search for

Returns:

Index of first matching cell, or None

Example:

from notebook_utils.cells import find_cell_with_pattern

idx = find_cell_with_pattern(cells, "%matplotlib inline")
if idx is not None:
    # Insert the new cell right after the matplotlib magic
    insert_idx = idx + 1
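
The lookup itself is not shown in this reference. A minimal sketch of how such a search might work, assuming a plain substring match and a `source` field that is either a string or a list of lines (the helper names `source_text` and `find_cell_with_pattern_sketch` are hypothetical, not part of the package):

```python
from typing import Optional


def source_text(cell: dict) -> str:
    """Join a cell's source into one string (it may be a str or a list of lines)."""
    source = cell.get("source", "")
    return source if isinstance(source, str) else "".join(source)


def find_cell_with_pattern_sketch(cells: list[dict], pattern: str) -> Optional[int]:
    """Hypothetical stand-in: index of the first cell whose source contains `pattern`."""
    for index, cell in enumerate(cells):
        # Assumption: plain substring match, not a regular expression.
        if pattern in source_text(cell):
            return index
    return None


cells = [
    {"cell_type": "markdown", "metadata": {}, "source": "# Plots"},
    {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": ["%matplotlib inline\n", "import numpy as np"],
    },
]

magic_index = find_cell_with_pattern_sketch(cells, "%matplotlib inline")
missing_index = find_cell_with_pattern_sketch(cells, "plt.show()")
```

Returning None rather than -1 for "not found" keeps the result from being mistaken for a valid list index.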

notebook_utils.cells.create_matplotlib_config_cell() -> NotebookCell

Create a cell with matplotlib inline configuration.

Returns:

NotebookCell with %matplotlib inline magic

notebook_utils.cells.create_ipython_display_import_cell() -> NotebookCell

Create a cell with IPython.display import.

Returns:

NotebookCell with display import statement

notebook_utils.cells.cell_contains_pattern(cells: list[NotebookCell], pattern: str, cell_type: str = 'code') -> bool

Check if any cell contains a pattern.

Parameters:
  • cells – List of notebook cells

  • pattern – String pattern to search for

  • cell_type – Type of cells to search (default: “code”)

Returns:

True if pattern found

Core I/O Operations

Exception Classes

exception notebook_utils.core.NotebookError

Base exception for notebook operations.

exception notebook_utils.core.NotebookReadError

Raised when notebook cannot be read.

exception notebook_utils.core.NotebookWriteError

Raised when notebook cannot be written.

exception notebook_utils.core.NotebookValidationError

Raised when notebook structure is invalid.

Functions

notebook_utils.core.validate_notebook_structure(notebook: dict) -> None

Validate notebook has required structure.

Parameters:

notebook – Notebook dictionary

Raises:

NotebookValidationError – If structure is invalid

Validates:

  • nbformat field exists

  • cells field exists and is a list
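
The two checks above can be sketched as follows. This is an illustration under assumptions, not the package's code: `NotebookValidationErrorSketch` stands in for `NotebookValidationError`, and the error messages and the `is_valid` helper are invented for the demo.

```python
class NotebookValidationErrorSketch(Exception):
    """Stand-in for notebook_utils.core.NotebookValidationError."""


def validate_notebook_structure_sketch(notebook: dict) -> None:
    """Raise if either documented check fails; return None otherwise."""
    if "nbformat" not in notebook:
        raise NotebookValidationErrorSketch("missing 'nbformat' field")
    if "cells" not in notebook:
        raise NotebookValidationErrorSketch("missing 'cells' field")
    if not isinstance(notebook["cells"], list):
        raise NotebookValidationErrorSketch("'cells' must be a list")


def is_valid(notebook: dict) -> bool:
    """Demo helper: turn the exception into a boolean."""
    try:
        validate_notebook_structure_sketch(notebook)
    except NotebookValidationErrorSketch:
        return False
    return True


valid = is_valid({"nbformat": 4, "cells": []})
no_format = is_valid({"cells": []})
bad_cells = is_valid({"nbformat": 4, "cells": "not a list"})
```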

notebook_utils.core.read_notebook(path: Path) -> dict | None

Read and validate notebook from file with comprehensive error handling.

Parameters:

path – Path to notebook file

Returns:

Notebook dictionary, or None on error

Error Handling:

  • Logs file not found errors

  • Logs JSON decode errors

  • Logs validation errors

  • Returns None on any error

Example:

from pathlib import Path

from notebook_utils.core import read_notebook

notebook = read_notebook(Path("example.ipynb"))
if notebook is None:
    # read_notebook has already logged the cause
    print("Failed to read notebook")
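
The log-and-return-None behaviour listed under Error Handling might look roughly like the sketch below. The function name, logger name, and log messages are assumptions, and the structural check simply repeats the two conditions documented for validate_notebook_structure.

```python
import json
import logging
import tempfile
from pathlib import Path
from typing import Optional

logger = logging.getLogger("notebook_utils_sketch")  # hypothetical logger name


def read_notebook_sketch(path: Path) -> Optional[dict]:
    """Hypothetical stand-in: log each failure mode and return None instead of raising."""
    try:
        notebook = json.loads(path.read_text(encoding="utf-8"))
    except FileNotFoundError:
        logger.error("Notebook not found: %s", path)
        return None
    except json.JSONDecodeError as exc:
        logger.error("Invalid JSON in %s: %s", path, exc)
        return None
    except (OSError, UnicodeDecodeError) as exc:
        logger.error("Could not read %s: %s", path, exc)
        return None
    # Structural validation: nbformat must exist and cells must be a list.
    if (
        not isinstance(notebook, dict)
        or "nbformat" not in notebook
        or not isinstance(notebook.get("cells"), list)
    ):
        logger.error("Invalid notebook structure: %s", path)
        return None
    return notebook


with tempfile.TemporaryDirectory() as workdir:
    good = Path(workdir) / "good.ipynb"
    good.write_text('{"nbformat": 4, "cells": []}', encoding="utf-8")
    broken = Path(workdir) / "broken.ipynb"
    broken.write_text("{not json", encoding="utf-8")

    loaded = read_notebook_sketch(good)
    from_broken = read_notebook_sketch(broken)
    from_missing = read_notebook_sketch(Path(workdir) / "missing.ipynb")
```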

notebook_utils.core.write_notebook(path: Path, notebook: dict, backup: bool = False) -> bool

Write notebook to file with atomic write pattern.

Parameters:
  • path – Path to notebook file

  • notebook – Notebook dictionary

  • backup – Create .bak file before writing

Returns:

True if successful, False otherwise

Atomic Write Pattern:

  1. Write to temporary file

  2. Validate write succeeded

  3. Move temporary file to target (atomic operation)

Example:

from pathlib import Path
from notebook_utils.core import write_notebook

success = write_notebook(
    Path("example.ipynb"), notebook, backup=True  # Creates example.ipynb.bak
)
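
The three steps listed under Atomic Write Pattern can be sketched with the standard library alone. This is a simplified illustration, not the package's implementation: `write_notebook_atomic` is a hypothetical name, the backup option is omitted, and "validate" is interpreted here as re-reading the temporary file as JSON.

```python
import json
import os
import tempfile
from pathlib import Path


def write_notebook_atomic(path: Path, notebook: dict) -> bool:
    """Write JSON to a temp file, verify it, then swap it into place."""
    tmp_name = None
    try:
        # 1. Write to a temporary file in the target directory, so the final
        #    rename never has to cross a filesystem boundary.
        with tempfile.NamedTemporaryFile(
            "w", encoding="utf-8", dir=path.parent, suffix=".tmp", delete=False
        ) as tmp:
            tmp_name = tmp.name
            json.dump(notebook, tmp, indent=1)
            tmp.flush()
            os.fsync(tmp.fileno())
        # 2. Validate the write by parsing the temporary file back.
        with open(tmp_name, encoding="utf-8") as check:
            json.load(check)
        # 3. Replace the target in a single step.
        os.replace(tmp_name, path)
        return True
    except (OSError, TypeError, ValueError):
        # Clean up the temporary file; the original target is left untouched.
        if tmp_name is not None and os.path.exists(tmp_name):
            os.remove(tmp_name)
        return False


with tempfile.TemporaryDirectory() as workdir:
    target = Path(workdir) / "example.ipynb"
    ok = write_notebook_atomic(target, {"nbformat": 4, "cells": []})
    round_trip = json.loads(target.read_text(encoding="utf-8"))
    leftovers = [p.name for p in Path(workdir).iterdir() if p.name != target.name]
    not_serializable = write_notebook_atomic(target, {"cells": {1, 2}})
    after_failure = json.loads(target.read_text(encoding="utf-8"))
```

Because the target is only touched by the final `os.replace`, a failure in step 1 or 2 leaves the previous file contents in place.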

Transformation Base Class

class notebook_utils.transformations.base.NotebookTransformer

Abstract base class for notebook transformations (Strategy pattern).

Each transformer should be:

  • Stateless: Can be reused across multiple notebooks

  • Idempotent: Running twice produces the same result as running once

  • Pure: Operates only on the notebook cells it is given, with no other side effects

abstractmethod transform(cells: list[NotebookCell]) -> tuple[list[NotebookCell], dict[str, int]]

Transform notebook cells.

Parameters:

cells – List of notebook cells to transform

Returns:

Tuple of (modified_cells, stats_dict)

Note: Implementations should return a new list rather than mutating the input cells.

abstractmethod name() -> str

Return unique transformation name.

Returns:

Transformation identifier (e.g., “matplotlib_inline”)

abstractmethod description() -> str

Return human-readable transformation description.

Returns:

Description of what this transformation does

should_apply(cells: list[NotebookCell]) -> bool

Check if transformation should be applied.

Parameters:

cells – Notebook cells to check

Returns:

True if transformation should run

Default: Returns True. Override to skip when not needed.

validate_result(original: list[NotebookCell], transformed: list[NotebookCell]) -> bool

Validate transformation result.

Parameters:
  • original – Original cells before transformation

  • transformed – Cells after transformation

Returns:

True if valid

Raises:

ValueError – If validation fails

Default: Checks that the result is a list. Override for custom validation.
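
To illustrate the contract, here is a sketch of a custom transformer. `TransformerSketch` is a hypothetical stand-in for NotebookTransformer, reduced to the documented methods so the example runs on its own. `ClearOutputsTransformer` is an invented example and not part of the package.

```python
import copy
from abc import ABC, abstractmethod


class TransformerSketch(ABC):
    """Hypothetical stand-in for NotebookTransformer (documented methods only)."""

    @abstractmethod
    def transform(self, cells: list[dict]) -> tuple[list[dict], dict[str, int]]: ...

    @abstractmethod
    def name(self) -> str: ...

    @abstractmethod
    def description(self) -> str: ...

    def should_apply(self, cells: list[dict]) -> bool:
        return True  # documented default


class ClearOutputsTransformer(TransformerSketch):
    """Illustrative transformer: empties the stored outputs of code cells."""

    def name(self) -> str:
        return "clear_outputs"

    def description(self) -> str:
        return "Remove stored outputs from code cells"

    def transform(self, cells: list[dict]) -> tuple[list[dict], dict[str, int]]:
        new_cells = copy.deepcopy(cells)  # never mutate the caller's cells
        cleared = 0
        for cell in new_cells:
            if cell.get("cell_type") == "code" and cell.get("outputs"):
                cell["outputs"] = []
                cleared += 1
        return new_cells, {"outputs_cleared": cleared}


original = [
    {
        "cell_type": "code",
        "execution_count": 1,
        "metadata": {},
        "outputs": [{"output_type": "stream"}],
        "source": ["print(1)"],
    }
]
transformer = ClearOutputsTransformer()
once, stats_once = transformer.transform(original)
twice, stats_twice = transformer.transform(once)  # second run changes nothing
```

The deep copy is what makes the transformer safe to reuse: the input list is left intact, and a second run reports zero changes, which is the idempotence property described above.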

Transformation Implementations

MatplotlibInlineTransformer

class notebook_utils.transformations.matplotlib.MatplotlibInlineTransformer

Bases: NotebookTransformer

Adds %matplotlib inline magic before first code cell.

Ensures: Notebooks have matplotlib inline backend configured for proper display.

transform(cells: list[NotebookCell]) -> tuple[list[NotebookCell], dict[str, int]]

Add %matplotlib inline magic if not present.

Parameters:

cells – Notebook cells

Returns:

Tuple of (modified_cells, {“magic_added”: count})

name() -> str

Returns:

“matplotlib_inline”

description() -> str

Returns:

“Add %matplotlib inline magic for inline plotting”

should_apply(cells: list[NotebookCell]) -> bool

Returns:

True if magic not already present

Example:

from notebook_utils.transformations.matplotlib import MatplotlibInlineTransformer

transformer = MatplotlibInlineTransformer()
result, stats = transformer.transform(cells)
print(f"Added {stats['magic_added']} magic(s)")
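
The documented behaviour (insert the magic before the first code cell, and do nothing when it is already present) could be sketched like this. The function names are hypothetical, and the detection is assumed to be a substring check on code cells.

```python
MAGIC = "%matplotlib inline"


def has_magic(cells: list[dict]) -> bool:
    """True if any code cell already contains the magic (assumed substring check)."""
    for cell in cells:
        source = cell.get("source", "")
        text = source if isinstance(source, str) else "".join(source)
        if cell.get("cell_type") == "code" and MAGIC in text:
            return True
    return False


def add_matplotlib_magic_sketch(cells: list[dict]) -> tuple[list[dict], dict[str, int]]:
    """Return a new cell list with the magic inserted before the first code cell."""
    if has_magic(cells):
        return list(cells), {"magic_added": 0}
    # Index of the first code cell, falling back to 0 when there is none.
    first_code = next(
        (i for i, cell in enumerate(cells) if cell.get("cell_type") == "code"), 0
    )
    magic_cell = {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": [MAGIC],
    }
    return [*cells[:first_code], magic_cell, *cells[first_code:]], {"magic_added": 1}


notebook_cells = [
    {"cell_type": "markdown", "metadata": {}, "source": ["# Analysis"]},
    {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": ["import matplotlib.pyplot as plt"],
    },
]
first_pass, first_stats = add_matplotlib_magic_sketch(notebook_cells)
second_pass, second_stats = add_matplotlib_magic_sketch(first_pass)
```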

IPythonDisplayImportTransformer

class notebook_utils.transformations.imports.IPythonDisplayImportTransformer

Bases: NotebookTransformer

Adds from IPython.display import display when display() is used.

Prevents: NameError when notebooks use display() without importing it.

transform(cells: list[NotebookCell]) -> tuple[list[NotebookCell], dict[str, int]]

Add IPython.display import if needed.

Parameters:

cells – Notebook cells

Returns:

Tuple of (modified_cells, {“import_added”: count})

name() -> str

Returns:

“ipython_display_import”

description() -> str

Returns:

“Add IPython.display import when display() is used”

should_apply(cells: list[NotebookCell]) -> bool

Returns:

True if display() used and import not present

PltShowReplacementTransformer

notebook_utils.transformations.plt_show.find_figure_variable(source: list[str], show_line_idx: int) -> str

Find figure variable name by looking backwards from plt.show().

Parameters:
  • source – Source code lines

  • show_line_idx – Index of line containing plt.show()

Returns:

Figure variable name or “plt.gcf()” as fallback

Looks for patterns:

  • fig = plt.figure()

  • fig, ax = plt.subplots()
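
A backwards search for the two patterns above might be written as in the sketch below. The regular expression and function name are assumptions made for illustration. The real implementation may recognize more forms.

```python
import re

# Matches "name = plt.figure(...)" and "name, rest = plt.subplots(...)";
# group 1 captures the first assigned name, taken to be the figure.
FIGURE_ASSIGNMENT = re.compile(
    r"^\s*(\w+)\s*(?:,\s*[\w\s,()]+)?=\s*plt\.(?:figure|subplots)\("
)


def find_figure_variable_sketch(source: list[str], show_line_idx: int) -> str:
    """Scan upwards from the plt.show() line for the nearest figure assignment."""
    for line in reversed(source[:show_line_idx]):
        match = FIGURE_ASSIGNMENT.match(line)
        if match:
            return match.group(1)
    return "plt.gcf()"  # documented fallback when no figure variable is found


with_subplots = ["fig, ax = plt.subplots()\n", "ax.plot([1, 2, 3])\n", "plt.show()\n"]
with_figure = ["canvas = plt.figure(figsize=(8, 4))\n", "plt.plot([1, 2])\n", "plt.show()\n"]
implicit = ["plt.plot([1, 2, 3])\n", "plt.show()\n"]

name_subplots = find_figure_variable_sketch(with_subplots, 2)
name_figure = find_figure_variable_sketch(with_figure, 2)
name_fallback = find_figure_variable_sketch(implicit, 1)
```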

notebook_utils.transformations.plt_show.replace_plt_show(source: list[str]) -> tuple[list[str], int]

Replace plt.show() with display pattern using context-aware logic.

Parameters:

source – Source code lines

Returns:

Tuple of (modified_source, num_replacements)

Replacement Pattern:

# Before
plt.show()

# After
plt.tight_layout()
display(fig)
plt.close(fig)

Skips:

  • Comments (lines starting with #)

  • String literals (inside quotes)

  • Complex statements (not standalone plt.show())
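
One simple way to satisfy the skip rules is to rewrite only lines that consist of nothing but `plt.show()`. The sketch below takes that approach. It is an assumption about the mechanism, not the package's code. The figure name is passed in rather than discovered, and a bare `plt.show()` line inside a multi-line string would still be rewritten.

```python
def replace_plt_show_sketch(
    source: list[str], figure: str = "fig"
) -> tuple[list[str], int]:
    """Replace standalone plt.show() lines with the tight_layout/display/close trio."""
    result: list[str] = []
    replacements = 0
    for line in source:
        # Exact match on the stripped line skips comments, string literals on the
        # same line, and compound statements such as "x = 1; plt.show()".
        if line.strip() == "plt.show()":
            indent = line[: len(line) - len(line.lstrip())]
            newline = "\n" if line.endswith("\n") else ""
            result.append(f"{indent}plt.tight_layout()\n")
            result.append(f"{indent}display({figure})\n")
            result.append(f"{indent}plt.close({figure}){newline}")
            replacements += 1
        else:
            result.append(line)
    return result, replacements


source = [
    "fig, ax = plt.subplots()\n",
    "# plt.show() would block here\n",
    "print('call plt.show() later')\n",
    "if ax is not None:\n",
    "    plt.show()\n",
]
rewritten, count = replace_plt_show_sketch(source)
```

Preserving the original indentation keeps the replacement valid inside `if` blocks and loops.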

class notebook_utils.transformations.plt_show.PltShowReplacementTransformer

Bases: NotebookTransformer

Replaces plt.show() calls with display/close pattern.

Improves:

  1. Layout with plt.tight_layout()

  2. Display with explicit display()

  3. Memory management with plt.close()

transform(cells: list[NotebookCell]) -> tuple[list[NotebookCell], dict[str, int]]

Replace plt.show() in all code cells.

Parameters:

cells – Notebook cells

Returns:

Tuple of (modified_cells, {“replacements”: count, “cells_modified”: count})

name() -> str

Returns:

“plt_show_replacement”

description() -> str

Returns:

“Replace plt.show() with display/close pattern”

Pipeline Orchestration

class notebook_utils.pipeline.TransformationPipeline

Composes multiple transformations with rollback support (Chain of Responsibility pattern).

Provides:

  • Atomic commit semantics

  • Automatic rollback on errors

  • Validation of all transformations

  • Statistics collection

__init__(transformers: list[NotebookTransformer])

Initialize pipeline with transformers.

Parameters:

transformers – List of transformers to apply in order

run(notebook_path: Path, backup: bool = False, dry_run: bool = False) -> dict[str, dict]

Run all transformations with atomic commit.

Parameters:
  • notebook_path – Path to notebook file

  • backup – Create .bak file before writing

  • dry_run – Don’t write changes, just return stats

Returns:

Dictionary mapping transformer names to their stats

Raises:

Exception – If any transformation fails (with rollback)

Example:

from pathlib import Path
from notebook_utils.pipeline import TransformationPipeline
from notebook_utils.transformations import (
    MatplotlibInlineTransformer,
    IPythonDisplayImportTransformer,
)

pipeline = TransformationPipeline(
    [
        MatplotlibInlineTransformer(),
        IPythonDisplayImportTransformer(),
    ]
)

stats = pipeline.run(Path("example.ipynb"), backup=True, dry_run=False)

print(stats)
# {
#   "matplotlib_inline": {"magic_added": 1},
#   "ipython_display_import": {"import_added": 1}
# }
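
How the rollback is implemented is not described here. One common way to get atomic commit semantics is to run every transformer on a working copy and keep the result only if all of them succeed. The in-memory sketch below illustrates that idea. `run_pipeline_sketch`, `add_magic`, and `broken` are hypothetical, and transformers are simplified to (name, callable) pairs.

```python
import copy


def run_pipeline_sketch(cells, transformers, dry_run=False):
    """Apply (name, transform) pairs to a copy; commit only if every step succeeds."""
    working = copy.deepcopy(cells)
    all_stats = {}
    for name, transform in transformers:
        # An exception here propagates to the caller while `cells` stays untouched,
        # which is the rollback.
        working, stats = transform(working)
        if not isinstance(working, list):
            raise ValueError(f"{name} did not return a list of cells")
        all_stats[name] = stats
    committed = cells if dry_run else working
    return committed, all_stats


def add_magic(cells):
    magic = {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": ["%matplotlib inline"],
    }
    return [magic, *cells], {"magic_added": 1}


def broken(cells):
    cells.clear()  # damages the working copy, then fails
    raise RuntimeError("simulated transformer failure")


original = [
    {"cell_type": "code", "execution_count": None, "metadata": {}, "outputs": [], "source": ["import numpy as np"]}
]

try:
    run_pipeline_sketch(original, [("matplotlib_inline", add_magic), ("broken", broken)])
    failed = False
except RuntimeError:
    failed = True

committed, stats = run_pipeline_sketch(original, [("matplotlib_inline", add_magic)])
preview, preview_stats = run_pipeline_sketch(original, [("matplotlib_inline", add_magic)], dry_run=True)
```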

get_transformers() -> list[NotebookTransformer]

Get list of transformers in pipeline.

Returns:

List of transformer instances (copy)

add_transformer(transformer: NotebookTransformer) -> None

Add a transformer to the end of the pipeline.

Parameters:

transformer – Transformer instance to add

describe() -> list[dict[str, str]]

Get description of all transformers in pipeline.

Returns:

List of dicts with ‘name’ and ‘description’ keys

Incremental Processing

class notebook_utils.tracking.ProcessingTracker

Tracks processed notebooks to enable incremental updates.

Uses: SHA-256 checksums to detect changes. Stores state in .notebook_transforms.json.

__init__(state_file: Path = None)

Initialize tracker with state file.

Parameters:

state_file – Path to state file (default: .notebook_transforms.json in current directory)

needs_processing(notebook_path: Path, transformations: list[str]) -> bool

Check if notebook needs processing.

Parameters:
  • notebook_path – Path to notebook

  • transformations – List of transformation names to apply

Returns:

True if notebook should be processed

Returns True if:

  • Notebook is new (not in state)

  • File content changed (different checksum)

  • Transformation set changed

mark_processed(notebook_path: Path, transformations: list[str], stats: dict = None) -> None

Mark notebook as processed.

Parameters:
  • notebook_path – Path to notebook

  • transformations – List of transformation names applied

  • stats – Optional statistics from processing

Updates:

  • Checksum (SHA-256 of file content)

  • Transformations list (sorted)

  • Last processed timestamp

  • Processing statistics
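
The change-detection rules for needs_processing and the fields updated by mark_processed can be sketched together. The state layout below (a dict keyed by path string, with `checksum`, `transformations`, `last_processed`, and `stats` keys) is an assumption for illustration. The actual format of .notebook_transforms.json is not specified in this reference.

```python
import hashlib
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def file_checksum(path: Path) -> str:
    """SHA-256 hex digest of the file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def needs_processing_sketch(state: dict, notebook_path: Path, transformations: list[str]) -> bool:
    entry = state.get(str(notebook_path))
    if entry is None:
        return True  # notebook is new
    if entry["checksum"] != file_checksum(notebook_path):
        return True  # file content changed
    return entry["transformations"] != sorted(transformations)  # transformation set changed


def mark_processed_sketch(
    state: dict, notebook_path: Path, transformations: list[str], stats: Optional[dict] = None
) -> None:
    state[str(notebook_path)] = {
        "checksum": file_checksum(notebook_path),
        "transformations": sorted(transformations),
        "last_processed": datetime.now(timezone.utc).isoformat(),
        "stats": stats or {},
    }


with tempfile.TemporaryDirectory() as workdir:
    nb = Path(workdir) / "example.ipynb"
    nb.write_text('{"nbformat": 4, "cells": []}', encoding="utf-8")
    state: dict = {}

    is_new = needs_processing_sketch(state, nb, ["matplotlib_inline"])
    mark_processed_sketch(state, nb, ["matplotlib_inline"], {"magic_added": 1})
    unchanged = needs_processing_sketch(state, nb, ["matplotlib_inline"])
    other_set = needs_processing_sketch(state, nb, ["matplotlib_inline", "plt_show_replacement"])

    nb.write_text('{"nbformat": 4, "cells": [], "metadata": {}}', encoding="utf-8")
    after_edit = needs_processing_sketch(state, nb, ["matplotlib_inline"])
```

Storing the transformation names sorted makes the comparison independent of the order in which callers list them.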

clear() -> None

Clear all processing state.

Deletes the state file and clears the in-memory state.

get_stats() -> dict

Get statistics about tracked notebooks.

Returns:

Dictionary with tracking statistics

Example:

from notebook_utils.tracking import ProcessingTracker

tracker = ProcessingTracker()
stats = tracker.get_stats()
print(stats)
# {
#   "total_tracked": 42,
#   "state_file": "/path/to/.notebook_transforms.json",
#   "state_file_exists": True
# }

Example Usage:

from pathlib import Path
from notebook_utils.tracking import ProcessingTracker
from notebook_utils.pipeline import TransformationPipeline
from notebook_utils.transformations import MatplotlibInlineTransformer

# Initialize
tracker = ProcessingTracker()
pipeline = TransformationPipeline([MatplotlibInlineTransformer()])
transform_names = ["matplotlib_inline"]

# Process only if needed
notebook_path = Path("example.ipynb")
if tracker.needs_processing(notebook_path, transform_names):
    stats = pipeline.run(notebook_path)
    tracker.mark_processed(notebook_path, transform_names, stats)
else:
    print("Notebook already up-to-date")

See Also