Backends
novelentitymatcher.backends.base
Classes
RerankerBackend
Bases: ABC
Abstract base class for reranker backends.
Functions
score(query, docs) (abstract method)
rerank(query, candidates, top_k=5, text_field='text')
Rerank candidates and return the top_k results.
The default implementation relies on score(); subclasses can override it for optimization.
Source code in src/novelentitymatcher/backends/base.py
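The relationship between the abstract score() and the default rerank() can be illustrated with a minimal, self-contained sketch. It does not reproduce the library's source: SketchRerankerBackend and WordOverlapReranker are hypothetical stand-ins, and the candidate format (dicts holding their text under text_field), the descending sort, and the added "score" key are all assumptions.

```python
from abc import ABC, abstractmethod


class SketchRerankerBackend(ABC):
    """Hypothetical stand-in for RerankerBackend; not the library's actual code."""

    @abstractmethod
    def score(self, query: str, docs: list[str]) -> list[float]:
        """Return one relevance score per document."""

    def rerank(self, query, candidates, top_k=5, text_field="text"):
        # Assumption: each candidate is a dict whose text lives under text_field.
        docs = [candidate[text_field] for candidate in candidates]
        scores = self.score(query, docs)
        # Assumption: higher scores mean more relevant, so sort descending.
        ranked = sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)
        # Assumption: the score is attached to each returned candidate.
        return [{**candidate, "score": s} for candidate, s in ranked[:top_k]]


class WordOverlapReranker(SketchRerankerBackend):
    """Toy backend: fraction of query words that appear in the document."""

    def score(self, query, docs):
        query_words = set(query.lower().split())
        return [
            len(query_words & set(doc.lower().split())) / max(len(query_words), 1)
            for doc in docs
        ]


candidates = [
    {"id": 1, "text": "Paris is the capital of France"},
    {"id": 2, "text": "Berlin is in Germany"},
    {"id": 3, "text": "The capital of Germany is Berlin"},
]
top = WordOverlapReranker().rerank("capital of Germany", candidates, top_k=2)
print([c["id"] for c in top])
```

Under these assumptions a subclass only has to supply score(); overriding rerank() makes sense when a backend can rank more cheaply than scoring every document.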
novelentitymatcher.backends.static_embedding
Classes
StaticEmbeddingBackend(model_name, embedding_dim=None)
Bases: EmbeddingBackend
Backend for static embeddings.
Supports two approaches:
1. SentenceTransformer's StaticEmbedding module
2. model2vec StaticModel (for minishlab potion models and custom distillations)
Models:
- RikkaBotan/stable-static-embedding-fast-retrieval-mrl-en (StaticEmbedding)
- minishlab/potion-base-8M (model2vec)
- minishlab/potion-base-32M (model2vec)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model_name | str | HuggingFace model name or local path | required |
| embedding_dim | int \| None | Optional dimension for MRL models (None = full dimension) | None |
Source code in src/novelentitymatcher/backends/static_embedding.py
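For MRL (Matryoshka) models, a reduced embedding_dim generally means keeping only the leading components of each vector. The sketch below illustrates that idea in plain Python; truncate_embedding is a hypothetical helper, and the re-normalization step is an assumption, since this page does not show how StaticEmbeddingBackend applies embedding_dim internally.

```python
import math


def truncate_embedding(vector, embedding_dim=None):
    """Keep the first embedding_dim components and L2-normalize the result.

    Hypothetical helper for illustration; None keeps the full dimension.
    """
    if embedding_dim is None:
        kept = list(vector)
    else:
        if embedding_dim <= 0 or embedding_dim > len(vector):
            raise ValueError("embedding_dim must be between 1 and len(vector)")
        kept = list(vector[:embedding_dim])
    # Assumption: vectors are re-normalized after truncation so that
    # cosine similarity and dot product agree.
    norm = math.sqrt(sum(x * x for x in kept))
    return [x / norm for x in kept] if norm > 0 else kept


full = [3.0, 4.0, 12.0, 0.5]
short = truncate_embedding(full, embedding_dim=2)
print(len(short), short)
```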
Attributes
embedding_dimension (property)
Return the embedding dimension of the model.
Functions
encode(texts, batch_size=32)
Generate embeddings using static lookup.
Source code in src/novelentitymatcher/backends/static_embedding.py
similarity(query_embeddings, corpus_embeddings, top_k)
Compute similarity using cosine similarity.
Source code in src/novelentitymatcher/backends/static_embedding.py
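As a rough illustration of cosine-similarity search with a top_k cutoff, here is a dependency-free sketch. cosine_top_k is a hypothetical function, and its return format (a list of (corpus_index, score) pairs per query) is an assumption rather than the documented return value of similarity().

```python
import math


def cosine_top_k(query_embeddings, corpus_embeddings, top_k):
    """For each query vector, return the top_k (corpus_index, cosine) pairs."""

    def normalize(vector):
        norm = math.sqrt(sum(x * x for x in vector))
        return [x / norm for x in vector] if norm > 0 else list(vector)

    # Normalizing once up front turns cosine similarity into a dot product.
    corpus = [normalize(v) for v in corpus_embeddings]
    results = []
    for query in query_embeddings:
        q = normalize(query)
        sims = [sum(a * b for a, b in zip(q, c)) for c in corpus]
        order = sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)
        results.append([(i, sims[i]) for i in order[:top_k]])
    return results


corpus = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
hits = cosine_top_k([[2.0, 0.0]], corpus, top_k=2)
print(hits[0])
```

A production backend would typically do the same computation as a single matrix product over normalized arrays; the loop form here is only meant to make each step visible.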
novelentitymatcher.backends.reranker_st
SentenceTransformer-based cross-encoder reranker backend.
Classes
STReranker(model_name='BAAI/bge-reranker-v2-m3', device=None, batch_size=32)
Bases: RerankerBackend
SentenceTransformer cross-encoder reranker.
Uses CrossEncoder models for precise query-document scoring.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model_name | str | Name or path of the CrossEncoder model | 'BAAI/bge-reranker-v2-m3' |
| device | str \| None | Device to run model on (None for auto-detection) | None |
| batch_size | int | Batch size for inference | 32 |
Source code in src/novelentitymatcher/backends/reranker_st.py
Functions
score(query, docs)
Score query-document pairs.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| query | str | Query text | required |
| docs | list[str] | List of document texts | required |
Returns:
| Type | Description |
|---|---|
| list[float] | List of scores (one per document) |
Source code in src/novelentitymatcher/backends/reranker_st.py
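Cross-encoder scoring works on (query, document) pairs, usually processed in batches. The sketch below shows that pattern with a pluggable predict callable so it runs without any model download. score_in_batches and toy_predict are hypothetical names, and nothing here is taken from STReranker's actual implementation; a real backend would pass the pairs to a CrossEncoder model instead of the toy function.

```python
def score_in_batches(query, docs, predict, batch_size=32):
    """Pair the query with every document and score the pairs batch by batch."""
    pairs = [(query, doc) for doc in docs]
    scores = []
    for start in range(0, len(pairs), batch_size):
        batch = pairs[start:start + batch_size]
        # predict is expected to return one score per pair in the batch.
        scores.extend(float(s) for s in predict(batch))
    return scores


def toy_predict(pairs):
    """Stand-in for a model: counts words shared by query and document."""
    return [
        len(set(q.lower().split()) & set(d.lower().split()))
        for q, d in pairs
    ]


docs = ["a red apple", "green pear", "red car"]
scores = score_in_batches("red apple", docs, toy_predict, batch_size=2)
print(scores)
```

The output keeps the input order, one score per document, which matches the documented return shape of score().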
rerank(query, candidates, top_k=5, text_field='text')
Rerank candidates and return the top_k results.
The default implementation relies on score(); subclasses can override it for optimization.