
How does UltraRag perform retrieval?

UltraRag treats retrieval as a core component of its modular Retrieval-Augmented Generation (RAG) framework. Instead of relying on a single, fixed retrieval method, it wraps retrieval capabilities in independent “MCP Servers” (Model Context Protocol servers), which are invoked through standardized, function-level “Tool” interfaces. Because every retrieval mechanism sits behind the same kind of interface, mechanisms can be added or swapped in a plug-and-play fashion. This design, central to UltraRag’s MCP architecture, lets users combine and orchestrate retrieval components into RAG systems tailored to a specific application, for both text and multimodal data.
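The plug-and-play idea can be illustrated with a small registry of retrieval functions that share one call signature. This is a minimal sketch of the general pattern only: the names `register_tool`, `call_tool`, `keyword_search`, and `substring_search` are hypothetical and are not UltraRag’s or MCP’s actual API.

```python
import re
from typing import Callable, Dict, List

# Hypothetical tool signature: (query, top_k) -> list of matching documents.
RetrievalTool = Callable[[str, int], List[str]]

# Hypothetical registry standing in for a set of retrieval "servers".
TOOLS: Dict[str, RetrievalTool] = {}

CORPUS = [
    "Milvus is a vector database for similarity search.",
    "YAML files can describe pipelines declaratively.",
    "Keyword search matches exact terms in documents.",
]


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def register_tool(name: str) -> Callable[[RetrievalTool], RetrievalTool]:
    """Decorator that exposes a retrieval function under a tool name."""
    def decorator(fn: RetrievalTool) -> RetrievalTool:
        TOOLS[name] = fn
        return fn
    return decorator


@register_tool("keyword_search")
def keyword_search(query: str, top_k: int) -> List[str]:
    """Rank documents by how many query terms they contain."""
    terms = set(tokenize(query))
    scored = [(len(terms & set(tokenize(doc))), doc) for doc in CORPUS]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


@register_tool("substring_search")
def substring_search(query: str, top_k: int) -> List[str]:
    """Return documents that contain the query as a literal substring."""
    needle = query.lower()
    return [doc for doc in CORPUS if needle in doc.lower()][:top_k]


def call_tool(name: str, query: str, top_k: int = 3) -> List[str]:
    """Invoke a registered tool by name; callers never import it directly."""
    if name not in TOOLS:
        raise KeyError(f"unknown retrieval tool: {name}")
    return TOOLS[name](query, top_k)
```

A caller that only knows the tool name, such as `call_tool("keyword_search", "vector database")`, keeps working when a different retrieval function is registered under that name, which is the property the modular design aims for.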

The framework supports diverse retrieval strategies by allowing multiple retrieval backends and embedding models to be integrated. UltraRag can therefore use dense vector search for semantic similarity, keyword-based search for exact matches, or hybrid retrieval that combines the two. For vector-based retrieval, source documents are converted into numerical vector embeddings, which are stored and indexed in a vector database such as Milvus, a system designed for high-performance similarity search across very large datasets. At query time, the query is embedded in the same way, and the vector database returns the document chunks or passages whose vectors are most similar to it. The orchestration of these workflows, including sequential steps, loops, and conditional branching, is defined declaratively in YAML configuration files, which gives a low-code way to build multi-stage retrieval pipelines.
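The embed, index, and search flow described above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not UltraRag’s implementation: the bag-of-words `embed` function is a toy stand-in for a real embedding model, and the hypothetical `InMemoryVectorIndex` class stands in for a vector database such as Milvus, which would use an approximate nearest-neighbor index rather than a linear scan.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text: str, vocab: List[str]) -> List[float]:
    """Toy embedding (assumption): term counts over a fixed vocabulary."""
    counts = Counter(tokenize(text))
    return [float(counts[term]) for term in vocab]


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class InMemoryVectorIndex:
    """Hypothetical stand-in for a vector database: brute-force search."""

    def __init__(self, chunks: List[str]) -> None:
        self.chunks = chunks
        # Build the vocabulary from the corpus, then embed every chunk once.
        self.vocab = sorted({t for chunk in chunks for t in tokenize(chunk)})
        self.vectors = [embed(chunk, self.vocab) for chunk in chunks]

    def search(self, query: str, top_k: int = 2) -> List[Tuple[float, str]]:
        """Embed the query the same way and return the top_k closest chunks."""
        query_vec = embed(query, self.vocab)
        scored = [
            (cosine(query_vec, vec), chunk)
            for vec, chunk in zip(self.vectors, self.chunks)
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:top_k]


if __name__ == "__main__":
    index = InMemoryVectorIndex([
        "Milvus indexes vector embeddings for similarity search.",
        "YAML files define the pipeline steps.",
        "Keyword search finds exact term matches.",
    ])
    for score, chunk in index.search("vector similarity search"):
        print(f"{score:.3f}  {chunk}")
```

In a real deployment the two stand-ins would be replaced by an embedding model and a vector database client, while the overall shape (embed documents once, embed each query the same way, rank by similarity) stays the same.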

A significant advantage of UltraRag’s retrieval system is its native support for multimodal inputs, which extends it beyond text-only retrieval to vision and cross-modal data. Information can be ingested, indexed, and retrieved from diverse sources, including document formats such as TXT, PDF, and Markdown, and potentially images or other media types. The architecture allows new retrieval models or algorithms to be integrated without invasive changes to the core system. Combined with the framework’s knowledge base management, which covers encoding and indexing, this lets users adapt and fine-tune retrieval components for domain-specific knowledge and so improve the relevance of the information retrieved in a RAG application.
