# Benchmark Organization
## Current Structure
The benchmark code is organized into three layers:
### 1. Utility Functions (src/novelentitymatcher/utils/)
Internal helper functions used by benchmark scripts:
- benchmarks.py - Core benchmark utilities
- benchmark_dataset.py - Dataset loading for benchmarks
- benchmark_reporting.py - Report formatting and CLI helpers
These are internal utilities not exported in the public API.
### 2. Benchmark Scripts (scripts/)
Executable scripts that run benchmarks:
- benchmark_embeddings.py - Embedding model benchmarks
- benchmark_async.py - Async API benchmarks
- benchmark_bert.py - BERT classifier benchmarks
- visualize_benchmarks.py - Visualization tools
- render_benchmark_report.py - Report generation
### 3. Results (artifacts/benchmarks/)
Benchmark output data:
- JSON result files
- Generated reports
## Design Rationale
Why keep utilities in src/novelentitymatcher/utils/?
- Internal vs. External: Benchmark utilities are internal helpers, not user-facing features
- Import Path: Scripts in scripts/ can import from novelentitymatcher.utils.benchmark*
- Testing: Tests are already in tests/test_utils/test_benchmarks.py
- Clarity: The utils/ directory clearly indicates these are helper functions
Why not create a top-level benchmarks/ directory?
- Scope: These are utilities, not standalone benchmarks
- Imports: Would require updating all import paths in scripts/
- Consistency: Other utilities (validation, logging, etc.) are in utils/
- Separation: Actual benchmark scripts are already separated in scripts/
## Usage
### Running Benchmarks
```bash
# Embedding benchmarks
uv run python scripts/benchmark_embeddings.py

# Async benchmarks
uv run python scripts/benchmark_async.py

# BERT benchmarks
uv run python scripts/benchmark_bert.py
```
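The visualization and report-rendering scripts follow the same invocation pattern. Whether they take additional flags is not documented here, so the bare commands below are assumptions:

```bash
# Visualize results from prior runs (any required flags are not shown here)
uv run python scripts/visualize_benchmarks.py

# Render a report from the collected results
uv run python scripts/render_benchmark_report.py
```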
### Using Utilities Directly
```python
from novelentitymatcher.utils.benchmarks import benchmark_accuracy
from novelentitymatcher.utils.benchmark_dataset import load_processed_sections
from novelentitymatcher.utils.benchmark_reporting import format_benchmark_summary
```
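Taken together, these helpers compose into a small pipeline. The sketch below is illustrative only; the exact signatures are internal and not documented here, so the call shapes and return values are assumptions:

```python
from novelentitymatcher.utils.benchmarks import benchmark_accuracy
from novelentitymatcher.utils.benchmark_dataset import load_processed_sections
from novelentitymatcher.utils.benchmark_reporting import format_benchmark_summary

sections = load_processed_sections()      # assumed: loads the benchmark dataset
results = benchmark_accuracy(sections)    # assumed: runs the accuracy benchmark
print(format_benchmark_summary(results))  # assumed: formats a printable summary
```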
## Future Considerations
If the benchmark utilities become part of the public API, consider:
1. Creating a novelentitymatcher.benchmarks module (a sketch follows below)
2. Exporting it in the main __init__.py
3. Adding documentation for the public benchmark API
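Such a module could start as a thin re-export of the existing helpers. A minimal sketch, assuming the helper names stay as they are today:

```python
# Hypothetical future file: src/novelentitymatcher/benchmarks/__init__.py
# Re-exports the current internal helpers under a public, stable import path.
from novelentitymatcher.utils.benchmarks import benchmark_accuracy
from novelentitymatcher.utils.benchmark_dataset import load_processed_sections
from novelentitymatcher.utils.benchmark_reporting import format_benchmark_summary

__all__ = [
    "benchmark_accuracy",
    "load_processed_sections",
    "format_benchmark_summary",
]
```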
For now, the current structure serves internal use well.