Monitoring

novelentitymatcher.monitoring.metrics

Metrics schema and utilities for library-centric observability.

Provides the MetricEvent dataclass and helper functions for emitting metrics through user-provided callbacks. The module has no external dependencies, which makes it suitable for use inside a library.

Classes

MetricEvent(name, value, unit, labels, timestamp) dataclass

Standardized metric event structure.

Can be emitted from any component (matcher, stages, detectors) and sent to user-provided callbacks for custom handling.

Attributes:

Name        Type             Description
name        str              Metric name (e.g., "match_latency", "novelty_rate")
value       float            Numeric value of the metric
unit        str              Unit of measurement (e.g., "seconds", "count", "ratio")
labels      dict[str, str]   Key-value pairs for categorization (e.g., {"stage": "ood"})
timestamp   datetime         When the metric was recorded
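A minimal sketch of the callback pattern described above. The MetricEvent below is a local stand-in that mirrors the documented fields so the example runs without the library installed; `collect` and `emit` are hypothetical names used only for illustration and are not part of the documented API.

```python
from collections.abc import Callable
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


# Stand-in mirroring the documented fields; not the library's own class.
@dataclass
class MetricEvent:
    name: str
    value: float
    unit: str
    labels: dict[str, str]
    timestamp: datetime


received: list[MetricEvent] = []


def collect(event: MetricEvent) -> None:
    """Hypothetical user callback: keep events for later inspection."""
    received.append(event)


def emit(callback: Optional[Callable[[MetricEvent], None]], event: MetricEvent) -> None:
    """Hypothetical helper: forward the event only if a callback was supplied."""
    if callback is not None:
        callback(event)


emit(collect, MetricEvent("match_latency", 0.042, "seconds", {"stage": "ood"}, datetime.now()))
emit(None, MetricEvent("novelty_rate", 0.1, "ratio", {}, datetime.now()))  # no callback, dropped

print(len(received))      # 1
print(received[0].name)   # match_latency
```

Keeping the callback optional means a component can emit unconditionally, and users who never register a callback pay almost nothing.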

Functions

create_metric(name, value, unit, labels=None)

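Helper that creates a MetricEvent stamped with the current local time (datetime.now(), which is timezone-naive). A labels value of None is stored as an empty dict.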

Parameters:

Name     Type                    Description                  Default
name     str                     Metric name                  required
value    float                   Numeric value                required
unit     str                     Unit of measurement          required
labels   dict[str, str] | None   Optional labels dictionary   None

Returns:

Type          Description
MetricEvent   MetricEvent with current timestamp

Source code in src/novelentitymatcher/monitoring/metrics.py
def create_metric(
    name: str,
    value: float,
    unit: str,
    labels: dict[str, str] | None = None,
) -> MetricEvent:
    """Helper to create MetricEvent with current timestamp.

    Args:
        name: Metric name
        value: Numeric value
        unit: Unit of measurement
        labels: Optional labels dictionary

    Returns:
        MetricEvent with current timestamp
    """
    return MetricEvent(
        name=name,
        value=value,
        unit=unit,
        labels=labels or {},
        timestamp=datetime.now(),
    )
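The following sketch exercises the helper's two defaults. It re-declares a stand-in MetricEvent and copies the body of create_metric from the listing above so that it runs on its own; in real use both would be imported from novelentitymatcher.monitoring.metrics.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


# Stand-in mirroring the documented fields; not the library's own class.
@dataclass
class MetricEvent:
    name: str
    value: float
    unit: str
    labels: dict[str, str]
    timestamp: datetime


def create_metric(
    name: str,
    value: float,
    unit: str,
    labels: Optional[dict[str, str]] = None,
) -> MetricEvent:
    # Same body as the source listing above.
    return MetricEvent(
        name=name,
        value=value,
        unit=unit,
        labels=labels or {},
        timestamp=datetime.now(),
    )


plain = create_metric("match_latency", 0.042, "seconds")
tagged = create_metric("novelty_rate", 0.1, "ratio", {"stage": "ood"})

print(plain.labels)            # {}
print(tagged.labels)           # {'stage': 'ood'}
print(plain.timestamp.tzinfo)  # None
```

Because the timestamp is naive local time, callers that aggregate events across hosts or time zones may want to normalize it in their own callback.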

get_metric_summary(events)

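Summarize metric events by name. Events are grouped on name only (labels and unit are ignored), and each group reports count, sum, mean, min, and max.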

Parameters:

Name     Type                Description                     Default
events   list[MetricEvent]   List of MetricEvent instances   required

Returns:

Type                          Description
dict[str, dict[str, float]]   Dictionary mapping metric names to summary statistics: {metric_name: {"count": N, "sum": X, "mean": Y, ...}}

Source code in src/novelentitymatcher/monitoring/metrics.py
def get_metric_summary(events: list[MetricEvent]) -> dict[str, dict[str, float]]:
    """Summarize metric events by name.

    Args:
        events: List of MetricEvent instances

    Returns:
        Dictionary mapping metric names to summary statistics
        {metric_name: {"count": N, "sum": X, "mean": Y, ...}}
    """
    from collections import defaultdict

    grouped: dict[str, list[float]] = defaultdict(list)

    for event in events:
        grouped[event.name].append(event.value)

    summary: dict[str, dict[str, float]] = {}
    for name, values in grouped.items():
        summary[name] = {
            "count": len(values),
            "sum": sum(values),
            "mean": sum(values) / len(values),
            "min": min(values),
            "max": max(values),
        }

    return summary
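A small worked example of the grouping, assuming a lightweight stand-in event type (`Event`, hypothetical) that carries only the attributes the function reads. The function body is copied from the listing above. Note that the two match_latency events carry different labels and are still merged into one group.

```python
from collections import defaultdict
from typing import NamedTuple


class Event(NamedTuple):
    """Hypothetical stand-in exposing the attributes used by the summary."""
    name: str
    value: float
    labels: dict


def get_metric_summary(events):
    # Same logic as the source listing above.
    grouped = defaultdict(list)
    for event in events:
        grouped[event.name].append(event.value)

    summary = {}
    for name, values in grouped.items():
        summary[name] = {
            "count": len(values),
            "sum": sum(values),
            "mean": sum(values) / len(values),
            "min": min(values),
            "max": max(values),
        }
    return summary


events = [
    Event("match_latency", 0.25, {"stage": "ood"}),
    Event("match_latency", 0.75, {"stage": "cluster"}),
    Event("novelty_rate", 0.5, {}),
]
summary = get_metric_summary(events)

print(summary["match_latency"])
# {'count': 2, 'sum': 1.0, 'mean': 0.5, 'min': 0.25, 'max': 0.75}
print(get_metric_summary([]))  # {}
```

To get per-label statistics, one option is to filter the event list by label before passing it in.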

novelentitymatcher.monitoring.performance

Performance monitoring utilities for semantic matchers.

Classes

PerformanceMonitor()

Simple performance tracking for matchers and other components.

Records timings per operation name and reports summary statistics (count, total, mean, min, max) for each.

Example

monitor = PerformanceMonitor()

with monitor.track("match_operation"):
    result = matcher.match(query)

logger.info(monitor.summary())

Source code in src/novelentitymatcher/monitoring/performance.py
def __init__(self) -> None:
    self.metrics: dict[str, list[float]] = {}

Functions

record(operation, duration)

Record a timing for an operation.

Source code in src/novelentitymatcher/monitoring/performance.py
def record(self, operation: str, duration: float) -> None:
    """Record a timing for an operation."""
    if operation not in self.metrics:
        self.metrics[operation] = []
    self.metrics[operation].append(duration)
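
track(operation)

Context manager that times the enclosed block with time.time() and records the elapsed seconds under operation. The timing is recorded even if the block raises.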

Source code in src/novelentitymatcher/monitoring/performance.py
@contextmanager
def track(self, operation: str):
    """Context manager for tracking operation timing."""
    start = time.time()
    try:
        yield
    finally:
        self.record(operation, time.time() - start)
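
The try/finally in track means a timing is stored whether or not the block succeeds. The sketch below shows this with a trimmed copy of the class containing only the methods listed so far; the operation names are arbitrary.

```python
import time
from contextlib import contextmanager


class PerformanceMonitor:
    """Trimmed copy of the methods shown in the listings above."""

    def __init__(self) -> None:
        self.metrics: dict[str, list[float]] = {}

    def record(self, operation: str, duration: float) -> None:
        if operation not in self.metrics:
            self.metrics[operation] = []
        self.metrics[operation].append(duration)

    @contextmanager
    def track(self, operation: str):
        start = time.time()
        try:
            yield
        finally:
            # Runs on normal exit and when the block raises.
            self.record(operation, time.time() - start)


monitor = PerformanceMonitor()

with monitor.track("ok"):
    time.sleep(0.01)

try:
    with monitor.track("failing"):
        raise ValueError("boom")
except ValueError:
    pass  # the exception still propagates out of the with block

print(sorted(monitor.metrics))          # ['failing', 'ok']
print(len(monitor.metrics["failing"]))  # 1
```

Failed operations therefore contribute to the same statistics as successful ones; track them under a separate operation name if they should be reported apart.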

summary()

Return summary statistics for all tracked operations.

Returns:

Type                          Description
dict[str, dict[str, float]]   Dict mapping operation names to statistics:

  • count: Number of recordings
  • total: Total time
  • mean: Average time
  • min: Minimum time
  • max: Maximum time

Source code in src/novelentitymatcher/monitoring/performance.py
def summary(self) -> dict[str, dict[str, float]]:
    """
    Return summary statistics for all tracked operations.

    Returns:
        Dict mapping operation names to statistics:
            - count: Number of recordings
            - total: Total time
            - mean: Average time
            - min: Minimum time
            - max: Maximum time
    """
    return {
        op: self._stats(timings) for op, timings in self.metrics.items() if timings
    }
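
The private helper self._stats is not shown on this page. As a rough sketch only, assuming it computes exactly the five statistics listed in the docstring, the summary could be reproduced like this (`stats` is a hypothetical stand-in, not the library's implementation):

```python
def stats(timings: list[float]) -> dict[str, float]:
    """Hypothetical stand-in for the private _stats helper."""
    total = sum(timings)
    return {
        "count": len(timings),
        "total": total,
        "mean": total / len(timings),
        "min": min(timings),
        "max": max(timings),
    }


# Raw timings in the same shape as PerformanceMonitor.metrics.
metrics = {"match": [0.5, 1.5], "unused": []}

# Mirrors the comprehension in summary(): empty timing lists are skipped,
# which also avoids dividing by zero in the mean.
summary = {op: stats(timings) for op, timings in metrics.items() if timings}

print(summary)
# {'match': {'count': 2, 'total': 2.0, 'mean': 1.0, 'min': 0.5, 'max': 1.5}}
```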

reset()

Clear all recorded metrics.

Source code in src/novelentitymatcher/monitoring/performance.py
def reset(self) -> None:
    """Clear all recorded metrics."""
    self.metrics.clear()

get_operation_metrics(operation)

Get summary statistics for a specific operation, or None if no timings have been recorded under that name.

Source code in src/novelentitymatcher/monitoring/performance.py
def get_operation_metrics(self, operation: str) -> dict[str, float] | None:
    """Get metrics for a specific operation."""
    timings = self.metrics.get(operation, [])
    return self._stats(timings) if timings else None

to_dict()

Return the raw timings as a dictionary mapping each operation name to a copy of its list of durations.

Source code in src/novelentitymatcher/monitoring/performance.py
def to_dict(self) -> dict[str, list[float]]:
    """Return raw metrics as dictionary."""
    return {k: list(v) for k, v in self.metrics.items()}
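
export_json(filepath)

Export all metrics to a JSON file. The file contains the export time (timestamp), the summary statistics (metrics), and the raw timings (raw_timings).

Parameters:

Name       Type   Description                Default
filepath   str    Path to output JSON file   required

Returns:

Type   Description
Path   Path to saved file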

Source code in src/novelentitymatcher/monitoring/performance.py
def export_json(self, filepath: str) -> Path:
    """Export all metrics to JSON file.

    Args:
        filepath: Path to output JSON file

    Returns:
        Path to saved file
    """
    output_path = Path(filepath)
    data = {
        "timestamp": datetime.now().isoformat(),
        "metrics": self.summary(),
        "raw_timings": self.to_dict(),
    }
    output_path.write_text(json.dumps(data, indent=2), encoding="utf-8")
    logger.info("Exported metrics to %s", output_path)
    return output_path
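
To illustrate the file layout, the sketch below writes a payload with the same three top-level keys as the listing above and reads it back. The summary values are hand-written for the example rather than produced by the library, and the temporary file name is arbitrary.

```python
import json
import tempfile
from datetime import datetime
from pathlib import Path

# Hand-written sample data in the shape produced by to_dict() and summary().
raw_timings = {"match": [0.5, 1.5]}
summary = {"match": {"count": 2, "total": 2.0, "mean": 1.0, "min": 0.5, "max": 1.5}}

data = {
    "timestamp": datetime.now().isoformat(),
    "metrics": summary,
    "raw_timings": raw_timings,
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "metrics.json"
    path.write_text(json.dumps(data, indent=2), encoding="utf-8")
    loaded = json.loads(path.read_text(encoding="utf-8"))

print(sorted(loaded))                 # ['metrics', 'raw_timings', 'timestamp']
print(loaded["raw_timings"]["match"])  # [0.5, 1.5]

# The ISO 8601 string parses back into a (naive) datetime.
exported_at = datetime.fromisoformat(loaded["timestamp"])
```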

export_csv(filepath)

Export metrics to a CSV file.

Parameters:

Name       Type   Description               Default
filepath   str    Path to output CSV file   required

Returns:

Type   Description
Path   Path to saved file

Source code in src/novelentitymatcher/monitoring/performance.py
def export_csv(self, filepath: str) -> Path:
    """Export metrics to CSV file.

    Args:
        filepath: Path to output CSV file

    Returns:
        Path to saved file
    """
    output_path = Path(filepath)
    self._write_csv(output_path)
    logger.info("Exported metrics to %s", output_path)
    return output_path

Functions

track_performance(func)

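Decorator to track method performance metrics. The metrics are stored on the instance as self._metrics, created on the first completed call. A call that raises is not counted, because the metrics are updated only after the wrapped method returns.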

Tracks

  • Number of calls
  • Total time
  • Average time per call
  • Last call duration

Usage

@track_performance
def match(self, query, top_k=5):
    ...

Access metrics

matcher._metrics  # {'calls': 10, 'total_time': 1.5, 'avg_time': 0.15, 'last_time': 0.12}

Source code in src/novelentitymatcher/monitoring/performance.py
def track_performance(func: Callable[..., Any]) -> Callable[..., Any]:
    """
    Decorator to track method performance metrics.

    Tracks:
        - Number of calls
        - Total time
        - Average time per call
        - Last call duration

    Usage:
        @track_performance
        def match(self, query, top_k=5):
            ...

        # Access metrics
        matcher._metrics  # {'calls': 10, 'total_time': 1.5, 'avg_time': 0.15}
    """

    @functools.wraps(func)
    def wrapper(self: Any, *args: Any, **kwargs: Any) -> Any:
        start = time.time()
        result = func(self, *args, **kwargs)
        elapsed = time.time() - start

        if not hasattr(self, "_metrics"):
            self._metrics = {
                "calls": 0,
                "total_time": 0.0,
                "avg_time": 0.0,
                "last_time": 0.0,
            }

        self._metrics["calls"] += 1
        self._metrics["total_time"] += elapsed
        self._metrics["avg_time"] = self._metrics["total_time"] / self._metrics["calls"]
        self._metrics["last_time"] = elapsed

        return result

    return wrapper
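
The sketch below applies a copy of the decorator to a hypothetical `Matcher` class to show two consequences of the implementation above: metrics are kept per instance, and a call that raises is not counted.

```python
import functools
import time
from typing import Any, Callable


def track_performance(func: Callable[..., Any]) -> Callable[..., Any]:
    # Same logic as the source listing above.
    @functools.wraps(func)
    def wrapper(self: Any, *args: Any, **kwargs: Any) -> Any:
        start = time.time()
        result = func(self, *args, **kwargs)
        elapsed = time.time() - start

        if not hasattr(self, "_metrics"):
            self._metrics = {"calls": 0, "total_time": 0.0, "avg_time": 0.0, "last_time": 0.0}

        self._metrics["calls"] += 1
        self._metrics["total_time"] += elapsed
        self._metrics["avg_time"] = self._metrics["total_time"] / self._metrics["calls"]
        self._metrics["last_time"] = elapsed
        return result

    return wrapper


class Matcher:
    """Hypothetical class used only to exercise the decorator."""

    @track_performance
    def match(self, query: str, top_k: int = 5) -> list[str]:
        if not query:
            raise ValueError("empty query")
        return [query] * top_k


a, b = Matcher(), Matcher()
a.match("x")
a.match("y")
try:
    a.match("")  # raises before the metrics are updated
except ValueError:
    pass

print(a._metrics["calls"])     # 2
print(hasattr(b, "_metrics"))  # False
```

Since all decorated methods on one instance write to the same _metrics dict, decorating more than one method per class mixes their timings together.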