How do I implement BM25 alongside vector search?

To implement BM25 alongside vector search, you need to combine traditional keyword-based ranking with modern semantic search. Start by running both search methods independently, then merge their results. BM25 calculates relevance from term frequency, inverse document frequency, and document-length normalization, while vector search uses embeddings (dense numerical representations) to find semantically similar content. The key is to normalize the scores from both methods and combine them with a weighted sum or a hybrid ranking algorithm. For example, you might assign 60% weight to BM25 scores and 40% to vector similarity scores, depending on your use case. Tools like Elasticsearch (for BM25) and FAISS or Annoy (for vector search) can handle each part separately, with custom code to merge the results.
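
At its simplest, score fusion is a weighted sum over normalized scores. The sketch below assumes both scores are already scaled to 0-1; the 0.6/0.4 split mirrors the illustrative weighting above, not a universal recommendation:

```python
# Minimal weighted score fusion. Assumes both inputs are already
# normalized to the 0-1 range; the 0.6/0.4 weights are illustrative.
def fuse_scores(bm25_score: float, vector_score: float,
                bm25_weight: float = 0.6, vector_weight: float = 0.4) -> float:
    return bm25_weight * bm25_score + vector_weight * vector_score

print(fuse_scores(0.8, 0.5))  # 0.6 * 0.8 + 0.4 * 0.5 = 0.68
```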

Preprocessing and indexing are critical for consistency. For BM25, tokenize text into terms, remove stopwords, and store document-term frequencies. For vector search, generate embeddings with models like BERT or sentence-transformers and index them in a vector database. Ensure both systems process the same text input: if you lowercase text for BM25, apply the same transformation before generating embeddings.

When handling queries, run them through both pipelines: BM25 retrieves documents with matching keywords, while vector search finds contextually similar content. To merge results, normalize scores (e.g., with min-max scaling) so that BM25's unbounded scores and vector similarity's cosine range (typically -1 to 1) are comparable. You might also use reciprocal rank fusion (RRF), which combines the rankings without relying on raw scores by summing 1/(k + rank) across systems. For instance, with k = 1, a document that ranks 1st in BM25 and 3rd in vector search gets a combined score of 1/(1+1) + 1/(3+1) = 0.75.
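
Here is a small sketch of RRF, assuming each retriever returns an ordered list of document IDs. The constant k defaults to 1 to match the worked example above; production systems often use a larger value such as 60:

```python
# Reciprocal rank fusion over several ranked lists of document IDs.
# Each document's fused score is the sum of 1 / (k + rank) across lists.
def reciprocal_rank_fusion(rankings, k=1):
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by fused score, highest first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# "d1" ranks 1st in the BM25 list and 3rd in the vector-search list:
bm25_ranking = ["d1", "d2", "d3"]
vector_ranking = ["d3", "d2", "d1"]
print(reciprocal_rank_fusion([bm25_ranking, vector_ranking]))
# d1 scores 1/(1+1) + 1/(1+3) = 0.75, matching the example above.
```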

For a practical implementation, consider libraries like rank_bm25 for BM25 and sentence-transformers for embeddings. Here’s a simplified workflow (a runnable sketch follows the list):

  1. Index documents with BM25 using tokenized text.
  2. Generate embeddings for all documents and index them in FAISS.
  3. For a query, retrieve top-k results from BM25 and vector search.
  4. Normalize scores (e.g., scale BM25 scores to 0-1 using the highest score in the batch).
  5. Combine scores using a formula like final_score = (bm25_weight * bm25_score) + (vector_weight * vector_score).
  6. Sort and return the merged list.
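
Below is a runnable sketch of this workflow, assuming the rank_bm25, sentence-transformers, and faiss packages are installed. The tiny corpus, the model name all-MiniLM-L6-v2, and the 0.6/0.4 weights are illustrative placeholders:

```python
import numpy as np
import faiss
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer

corpus = [
    "Milvus is a vector database built for scalable similarity search.",
    "BM25 ranks documents by term frequency and document length.",
    "Sentence embeddings capture semantic meaning beyond keywords.",
]

# 1. Index documents with BM25 using tokenized text (lowercased, whitespace-split).
tokenized = [doc.lower().split() for doc in corpus]
bm25 = BM25Okapi(tokenized)

# 2. Generate embeddings and index them in FAISS. Inner product over
#    L2-normalized vectors is equivalent to cosine similarity.
model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(corpus, normalize_embeddings=True)
index = faiss.IndexFlatIP(int(embeddings.shape[1]))
index.add(embeddings)

def hybrid_search(query, k=3, bm25_weight=0.6, vector_weight=0.4):
    # 3. Run the query through both pipelines, applying the same
    #    lowercasing used at indexing time.
    bm25_scores = bm25.get_scores(query.lower().split())
    query_emb = model.encode([query.lower()], normalize_embeddings=True)
    sims, ids = index.search(query_emb, index.ntotal)
    vector_scores = np.zeros(len(corpus))
    vector_scores[ids[0]] = sims[0]

    # 4. Min-max normalize each score set to 0-1.
    def normalize(scores):
        span = scores.max() - scores.min()
        return (scores - scores.min()) / span if span > 0 else np.zeros_like(scores)

    # 5. Combine with a weighted sum of normalized scores.
    final = bm25_weight * normalize(bm25_scores) + vector_weight * normalize(vector_scores)

    # 6. Sort and return the merged list.
    top = np.argsort(final)[::-1][:k]
    return [(corpus[i], float(final[i])) for i in top]

print(hybrid_search("semantic similarity search"))
```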

Testing is crucial: adjust the weights based on metrics like precision@k or user feedback. If BM25 outperforms vector search for exact matches, increase its weight; if vector search handles paraphrased queries better, prioritize it. Tools like Elasticsearch’s built-in hybrid search support simplify this by integrating both methods natively, reducing custom code.
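
For tuning, a simple precision@k sweep over labeled queries can guide the weights. This sketch assumes you have relevance judgments (a set of relevant document IDs per query) and a search function like the hybrid_search sketch above, here parameterized by the two weights:

```python
# Hypothetical weight sweep using precision@k. `search_fn(query, w_bm25, w_vec)`
# is assumed to return a ranked list of document IDs.
def precision_at_k(retrieved_ids, relevant_ids, k=10):
    top_k = retrieved_ids[:k]
    return sum(1 for doc_id in top_k if doc_id in relevant_ids) / k

def best_weights(search_fn, judged_queries, k=10):
    # judged_queries: list of (query_text, set_of_relevant_doc_ids) pairs.
    best, best_score = (0.5, 0.5), -1.0
    for w_bm25 in (0.0, 0.2, 0.4, 0.6, 0.8, 1.0):
        w_vec = 1.0 - w_bm25
        avg = sum(
            precision_at_k(search_fn(query, w_bm25, w_vec), relevant, k)
            for query, relevant in judged_queries
        ) / len(judged_queries)
        if avg > best_score:
            best, best_score = (w_bm25, w_vec), avg
    return best, best_score
```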
