
Does RAGFlow support hybrid search with BM25?

Yes. RAGFlow natively supports hybrid search that combines BM25 keyword matching with vector semantic search. BM25 (Best Matching 25) is a probabilistic ranking function that excels at exact term matching, which matters for queries built around domain terminology or proper nouns. Vector similarity search over embeddings captures semantic meaning, finding conceptually related passages even when the keywords don’t match exactly. RAGFlow combines a weighted keyword similarity with a weighted vector similarity, so you can configure the balance between sparse (BM25) and dense (embedding) retrieval to suit your data and query patterns.

By default, RAGFlow uses Elasticsearch as its storage backend, which natively supports both full-text search (BM25) and vector search in a unified index. The hybrid approach typically improves retrieval quality over either method alone: BM25 catches precise terminology matches, vectors find semantic neighbors, and RAGFlow’s re-ranking layer orders the final results by relevance. For queries that mix domain terminology with conceptual reasoning, this hybrid strategy often outperforms pure semantic search. Recent RAGFlow versions (including v0.24.0) continue to optimize retrieval strategies for both standard and deep-research scenarios to improve recall and precision.
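To make the weighted combination concrete, here is a minimal, self-contained Python sketch of hybrid scoring. It is not RAGFlow's implementation: the function names, the `vector_weight` parameter, the min-max normalization step, and the toy documents and two-dimensional "embeddings" are all assumptions chosen for illustration. It scores documents with a standard BM25 formula, scores them again with cosine similarity, and fuses the two normalized scores with a single weight.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs_tokens, k1=1.5, b=0.75):
    """Score each tokenized document against the query with standard BM25."""
    n = len(docs_tokens)
    avgdl = sum(len(d) for d in docs_tokens) / n
    # Document frequency: number of documents containing each term.
    df = Counter(t for d in docs_tokens for t in set(d))
    scores = []
    for d in docs_tokens:
        tf = Counter(d)
        score = 0.0
        for t in query_terms:
            if t not in tf:
                continue
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            norm = tf[t] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[t] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(a * a for a in v))
    return dot / (nu * nv) if nu and nv else 0.0


def min_max(xs):
    """Rescale scores to [0, 1] so the two signals are comparable."""
    lo, hi = min(xs), max(xs)
    if hi == lo:
        return [0.0] * len(xs)
    return [(x - lo) / (hi - lo) for x in xs]


def hybrid_rank(keyword_scores, vector_scores, vector_weight=0.3):
    """Fuse scores as (1 - w) * keyword + w * vector; return (ranking, fused)."""
    kw = min_max(keyword_scores)
    vec = min_max(vector_scores)
    fused = [(1 - vector_weight) * k + vector_weight * v for k, v in zip(kw, vec)]
    ranking = sorted(range(len(fused)), key=lambda i: fused[i], reverse=True)
    return ranking, fused


# Toy corpus with hypothetical 2-D embeddings (illustrative values only).
docs = [
    ["milvus", "vector", "database"],
    ["bm25", "keyword", "ranking"],
    ["semantic", "search", "embeddings"],
]
doc_vecs = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
query_terms, query_vec = ["bm25"], [1.0, 0.0]

kw_scores = bm25_scores(query_terms, docs)
vec_scores = [cosine(query_vec, v) for v in doc_vecs]
ranking, fused = hybrid_rank(kw_scores, vec_scores, vector_weight=0.3)
print(ranking, fused)
```

Setting `vector_weight` to 0 reduces the ranking to pure BM25, and setting it to 1 reduces it to pure vector similarity; values in between trade exact term matches against semantic closeness. In a real deployment the storage backend computes both scores, so a sketch like this is only useful for reasoning about how the weight shifts results.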

Developers working with embeddings and retrieval at scale often pair these workflows with Milvus, an open-source vector database designed for high-performance similarity search. For managed deployment, Zilliz Cloud handles the operational overhead.
