Monitoring¶

`novelentitymatcher.monitoring.metrics` ¶

Metrics schema and utilities for library-centric observability.

Provides the `MetricEvent` dataclass and helper functions for emitting metrics through user-provided callbacks. The module has no external dependencies, which makes it suitable for use inside libraries.

Classes¶

`MetricEvent(name, value, unit, labels, timestamp)` `dataclass` ¶

Standardized metric event structure.

Can be emitted from any component (matcher, stages, detectors) and sent to user-provided callbacks for custom handling.

Attributes:

Name	Type	Description
`name`	`str`	Metric name (e.g., "match_latency", "novelty_rate")
`value`	`float`	Numeric value of the metric
`unit`	`str`	Unit of measurement (e.g., "seconds", "count", "ratio")
`labels`	`dict[str, str]`	Key-value pairs for categorization (e.g., {"stage": "ood"})
`timestamp`	`datetime`	When the metric was recorded

Functions¶

`create_metric(name, value, unit, labels=None)` ¶

Helper to create MetricEvent with current timestamp.

Parameters:

Name	Type	Description	Default
`name`	`str`	Metric name	required
`value`	`float`	Numeric value	required
`unit`	`str`	Unit of measurement	required
`labels`	`dict[str, str] \| None`	Optional labels dictionary	`None`

Returns:

Type	Description
`MetricEvent`	MetricEvent with current timestamp

Source code in src/novelentitymatcher/monitoring/metrics.py

def create_metric(
    name: str,
    value: float,
    unit: str,
    labels: dict[str, str] | None = None,
) -> MetricEvent:
    """Helper to create MetricEvent with current timestamp.

    Args:
        name: Metric name
        value: Numeric value
        unit: Unit of measurement
        labels: Optional labels dictionary

    Returns:
        MetricEvent with current timestamp
    """
    return MetricEvent(
        name=name,
        value=value,
        unit=unit,
        labels=labels or {},
        timestamp=datetime.now(),
    )
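The module description says events are delivered through user-provided callbacks, but the callback wiring itself is not shown on this page. The following is a minimal sketch of that pattern, assuming a callback is any callable that accepts a single `MetricEvent`. To stay self-contained it defines stand-ins for `MetricEvent` and `create_metric` that mirror the documented fields; `emit` and `collected` are hypothetical names, not part of the library.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime
from typing import Callable


@dataclass
class MetricEvent:
    """Stand-in mirroring the documented MetricEvent fields."""

    name: str
    value: float
    unit: str
    labels: dict[str, str] = field(default_factory=dict)
    timestamp: datetime = field(default_factory=datetime.now)


def create_metric(name, value, unit, labels=None):
    """Stand-in with the same behaviour as the listing above."""
    return MetricEvent(
        name=name,
        value=value,
        unit=unit,
        labels=labels or {},
        timestamp=datetime.now(),
    )


# Hypothetical helper: forward an event to an optional user callback.
def emit(callback: Callable[[MetricEvent], None] | None, event: MetricEvent) -> None:
    if callback is not None:
        callback(event)


# Any callable works as a callback; here a list's append collects events.
collected: list[MetricEvent] = []
emit(collected.append, create_metric("match_latency", 0.042, "seconds", {"stage": "ood"}))
emit(None, create_metric("novelty_rate", 0.1, "ratio"))  # no callback: event is dropped
```

Because `datetime.now()` is called without a timezone, the timestamp on each event is a naive local time.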

`get_metric_summary(events)` ¶

Summarize metric events by name.

Parameters:

Name	Type	Description	Default
`events`	`list[MetricEvent]`	List of MetricEvent instances	required

Returns:

Type	Description
`dict[str, dict[str, float]]`	Dictionary mapping metric names to summary statistics: `{metric_name: {"count": N, "sum": X, "mean": Y, "min": A, "max": B}}`

Source code in src/novelentitymatcher/monitoring/metrics.py

def get_metric_summary(events: list[MetricEvent]) -> dict[str, dict[str, float]]:
    """Summarize metric events by name.

    Args:
        events: List of MetricEvent instances

    Returns:
        Dictionary mapping metric names to summary statistics
        {metric_name: {"count": N, "sum": X, "mean": Y, ...}}
    """
    from collections import defaultdict

    grouped: dict[str, list[float]] = defaultdict(list)

    for event in events:
        grouped[event.name].append(event.value)

    summary: dict[str, dict[str, float]] = {}
    for name, values in grouped.items():
        summary[name] = {
            "count": len(values),
            "sum": sum(values),
            "mean": sum(values) / len(values),
            "min": min(values),
            "max": max(values),
        }

    return summary
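A small worked example may make the grouping behaviour concrete. This sketch uses a condensed copy of the function above and a hypothetical `Event` stand-in that carries only the two fields the summary reads (`name` and `value`). Note that labels play no part in the grouping, so events with the same name but different labels are pooled together.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Event:
    """Hypothetical stand-in: only the fields the summary reads."""

    name: str
    value: float


def get_metric_summary(events):
    """Condensed copy of the listing above."""
    grouped = defaultdict(list)
    for event in events:
        grouped[event.name].append(event.value)
    return {
        name: {
            "count": len(values),
            "sum": sum(values),
            "mean": sum(values) / len(values),
            "min": min(values),
            "max": max(values),
        }
        for name, values in grouped.items()
    }


events = [
    Event("match_latency", 0.25),
    Event("match_latency", 0.75),
    Event("novelty_rate", 0.5),
]
summary = get_metric_summary(events)
print(summary["match_latency"])
# {'count': 2, 'sum': 1.0, 'mean': 0.5, 'min': 0.25, 'max': 0.75}
print(get_metric_summary([]))
# {}
```

An empty input yields an empty dictionary, so the division in `mean` is never reached with zero values.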

`novelentitymatcher.monitoring.performance` ¶

Performance monitoring utilities for semantic matchers.

Classes¶

`PerformanceMonitor()` ¶

Simple performance tracking for matchers and other components.

Records timings per operation name and reports count, total, mean, minimum, and maximum for each.

Example

    monitor = PerformanceMonitor()

    with monitor.track("match_operation"):
        result = matcher.match(query)

    logger.info(monitor.summary())

Source code in src/novelentitymatcher/monitoring/performance.py

def __init__(self) -> None:
    self.metrics: dict[str, list[float]] = {}

Functions¶

`record(operation, duration)` ¶

Record a timing for an operation.

Source code in src/novelentitymatcher/monitoring/performance.py

def record(self, operation: str, duration: float) -> None:
    """Record a timing for an operation."""
    if operation not in self.metrics:
        self.metrics[operation] = []
    self.metrics[operation].append(duration)

`track(operation)` ¶

Context manager for tracking operation timing.

Source code in src/novelentitymatcher/monitoring/performance.py

@contextmanager
def track(self, operation: str):
    """Context manager for tracking operation timing."""
    start = time.time()
    try:
        yield
    finally:
        self.record(operation, time.time() - start)
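Because the timing is recorded in a `finally` block, a duration is stored even when the tracked body raises. The sketch below illustrates this with `MiniMonitor`, a hypothetical stand-in that copies only `record()` and `track()` from the listings above so the example runs without the library installed.

```python
import time
from contextlib import contextmanager


class MiniMonitor:
    """Hypothetical stand-in copying record() and track() from the listings above."""

    def __init__(self):
        self.metrics = {}

    def record(self, operation, duration):
        self.metrics.setdefault(operation, []).append(duration)

    @contextmanager
    def track(self, operation):
        start = time.time()
        try:
            yield
        finally:
            # Runs on success and on failure, so failed calls are timed too.
            self.record(operation, time.time() - start)


monitor = MiniMonitor()

with monitor.track("ok"):
    time.sleep(0.01)

try:
    with monitor.track("failing"):
        raise ValueError("simulated failure")
except ValueError:
    pass  # the exception still propagates out of the with block

print(sorted(monitor.metrics))  # ['failing', 'ok']
```

If failed operations should not skew the averages, one option is to track them under a separate operation name.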

`summary()` ¶

Return summary statistics for all tracked operations.

Returns:

Type	Description
`dict[str, dict[str, float]]`	Dict mapping operation names to statistics: `count` (number of recordings), `total` (total time), `mean` (average time), `min` (minimum time), `max` (maximum time)

Source code in src/novelentitymatcher/monitoring/performance.py

def summary(self) -> dict[str, dict[str, float]]:
    """
    Return summary statistics for all tracked operations.

    Returns:
        Dict mapping operation names to statistics:
            - count: Number of recordings
            - total: Total time
            - mean: Average time
            - min: Minimum time
            - max: Maximum time
    """
    return {
        op: self._stats(timings) for op, timings in self.metrics.items() if timings
    }

`reset()` ¶

Clear all recorded metrics.

Source code in src/novelentitymatcher/monitoring/performance.py

def reset(self) -> None:
    """Clear all recorded metrics."""
    self.metrics.clear()

`get_operation_metrics(operation)` ¶

Get metrics for a specific operation.

Source code in src/novelentitymatcher/monitoring/performance.py

def get_operation_metrics(self, operation: str) -> dict[str, float] | None:
    """Get metrics for a specific operation."""
    timings = self.metrics.get(operation, [])
    return self._stats(timings) if timings else None
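Both `summary()` and `get_operation_metrics()` delegate to a private `_stats` helper whose source is not included on this page. The sketch below assumes `_stats` returns the five keys listed in the `summary()` docstring; the free functions `stats` and `get_operation_metrics` here are illustrative stand-ins, not the library's code. It also shows that an unknown operation and an operation with no recordings both yield `None`.

```python
from __future__ import annotations


def stats(timings: list[float]) -> dict[str, float]:
    """Assumed shape of the private _stats helper (its source is not shown)."""
    return {
        "count": len(timings),
        "total": sum(timings),
        "mean": sum(timings) / len(timings),
        "min": min(timings),
        "max": max(timings),
    }


def get_operation_metrics(metrics, operation):
    """Same logic as the listing above, written as a free function."""
    timings = metrics.get(operation, [])
    return stats(timings) if timings else None


metrics = {"match": [0.5, 1.5], "empty": []}
print(get_operation_metrics(metrics, "match"))
# {'count': 2, 'total': 2.0, 'mean': 1.0, 'min': 0.5, 'max': 1.5}
print(get_operation_metrics(metrics, "empty"))    # None
print(get_operation_metrics(metrics, "unknown"))  # None
```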

`to_dict()` ¶

Return raw metrics as dictionary.

Source code in src/novelentitymatcher/monitoring/performance.py

def to_dict(self) -> dict[str, list[float]]:
    """Return raw metrics as dictionary."""
    return {k: list(v) for k, v in self.metrics.items()}

`export_json(filepath)` ¶

Export all metrics to JSON file.

Parameters:

Name	Type	Description	Default
`filepath`	`str`	Path to output JSON file	required

Returns:

Type	Description
`Path`	Path to saved file

Source code in src/novelentitymatcher/monitoring/performance.py

def export_json(self, filepath: str) -> Path:
    """Export all metrics to JSON file.

    Args:
        filepath: Path to output JSON file

    Returns:
        Path to saved file
    """
    output_path = Path(filepath)
    data = {
        "timestamp": datetime.now().isoformat(),
        "metrics": self.summary(),
        "raw_timings": self.to_dict(),
    }
    output_path.write_text(json.dumps(data, indent=2), encoding="utf-8")
    logger.info("Exported metrics to %s", output_path)
    return output_path
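The exported file has three top-level keys: `timestamp`, `metrics`, and `raw_timings`. As a rough sketch of consuming such a file, the example below writes a sample payload with that shape and reads it back. The numbers are made up for illustration, and the per-operation keys under `metrics` assume the statistics named in the `summary()` docstring.

```python
import json
import tempfile
from datetime import datetime
from pathlib import Path

# Sample payload with the same three top-level keys that export_json writes.
# The values are invented for illustration only.
payload = {
    "timestamp": datetime.now().isoformat(),
    "metrics": {"match": {"count": 2, "total": 2.0, "mean": 1.0, "min": 0.5, "max": 1.5}},
    "raw_timings": {"match": [0.5, 1.5]},
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "metrics.json"
    path.write_text(json.dumps(payload, indent=2), encoding="utf-8")

    # Read the file back the way a downstream consumer might.
    loaded = json.loads(path.read_text(encoding="utf-8"))

# The timestamp was written with isoformat(), so fromisoformat() parses it.
exported_at = datetime.fromisoformat(loaded["timestamp"])
for operation, timings in loaded["raw_timings"].items():
    print(operation, len(timings), loaded["metrics"][operation]["mean"])
```

Keeping `raw_timings` alongside the summary means percentiles or histograms can be computed later without re-running the workload.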

`export_csv(filepath)` ¶

Export metrics to CSV file.

Parameters:

Name	Type	Description	Default
`filepath`	`str`	Path to output CSV file	required

Returns:

Type	Description
`Path`	Path to saved file

Source code in src/novelentitymatcher/monitoring/performance.py

def export_csv(self, filepath: str) -> Path:
    """Export metrics to CSV file.

    Args:
        filepath: Path to output CSV file

    Returns:
        Path to saved file
    """
    output_path = Path(filepath)
    self._write_csv(output_path)
    logger.info("Exported metrics to %s", output_path)
    return output_path

Functions¶

`track_performance(func)` ¶

Decorator to track method performance metrics.

Tracks:

- Number of calls
- Total time
- Average time per call
- Last call duration

Usage:

    @track_performance
    def match(self, query, top_k=5):
        ...

    # Access metrics
    matcher._metrics  # {'calls': 10, 'total_time': 1.5, 'avg_time': 0.15}

Source code in src/novelentitymatcher/monitoring/performance.py

def track_performance(func: Callable[..., Any]) -> Callable[..., Any]:
    """
    Decorator to track method performance metrics.

    Tracks:
        - Number of calls
        - Total time
        - Average time per call
        - Last call duration

    Usage:
        @track_performance
        def match(self, query, top_k=5):
            ...

        # Access metrics
        matcher._metrics  # {'calls': 10, 'total_time': 1.5, 'avg_time': 0.15}
    """

    @functools.wraps(func)
    def wrapper(self: Any, *args: Any, **kwargs: Any) -> Any:
        start = time.time()
        result = func(self, *args, **kwargs)
        elapsed = time.time() - start

        if not hasattr(self, "_metrics"):
            self._metrics = {
                "calls": 0,
                "total_time": 0.0,
                "avg_time": 0.0,
                "last_time": 0.0,
            }

        self._metrics["calls"] += 1
        self._metrics["total_time"] += elapsed
        self._metrics["avg_time"] = self._metrics["total_time"] / self._metrics["calls"]
        self._metrics["last_time"] = elapsed

        return result

    return wrapper
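Two consequences of the wrapper above are easy to miss: all decorated methods on one instance write to the same `_metrics` dictionary, and a call that raises is not counted because the bookkeeping runs only after `func` returns. The sketch below demonstrates both with a condensed copy of the decorator and a hypothetical `DemoMatcher` class that exists only for this example.

```python
import functools
import time


def track_performance(func):
    """Condensed copy of the decorator listed above."""

    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        start = time.time()
        result = func(self, *args, **kwargs)
        elapsed = time.time() - start
        if not hasattr(self, "_metrics"):
            self._metrics = {"calls": 0, "total_time": 0.0, "avg_time": 0.0, "last_time": 0.0}
        self._metrics["calls"] += 1
        self._metrics["total_time"] += elapsed
        self._metrics["avg_time"] = self._metrics["total_time"] / self._metrics["calls"]
        self._metrics["last_time"] = elapsed
        return result

    return wrapper


class DemoMatcher:
    """Hypothetical class used only to exercise the decorator."""

    @track_performance
    def match(self, query):
        return query.upper()

    @track_performance
    def explain(self, query):
        return f"explained {query}"

    @track_performance
    def fail(self):
        raise RuntimeError("simulated failure")


m = DemoMatcher()
m.match("a")
m.match("b")
m.explain("a")
try:
    m.fail()
except RuntimeError:
    pass

# match and explain share one counter; the failed call is not counted.
print(m._metrics["calls"])  # 3
```

When per-method figures are needed, `PerformanceMonitor.track()` with a distinct operation name per method may be a better fit than the decorator.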
