queryNode-related Configurations

Configuration parameters for queryNode, the component used to run hybrid search across vector and scalar data.

`queryNode.stats.publishInterval`

Description	Default Value
The interval at which the query node publishes node statistics, including segment status, CPU usage, memory usage, and health status. Unit: ms.	1000
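
The parameter names on this page are dotted paths. As a rough sketch, and assuming (this page does not state it) that the parameters are set as nested keys in a YAML-style configuration file, the dotted names can be expanded into a nested structure as follows. The helper name `nest` is hypothetical.

```python
import json


def nest(overrides):
    """Expand {'a.b.c': value} style keys into nested dictionaries."""
    root = {}
    for dotted_key, value in overrides.items():
        node = root
        *parents, leaf = dotted_key.split(".")
        for part in parents:
            # Create intermediate levels on demand.
            node = node.setdefault(part, {})
        node[leaf] = value
    return root


# Values copied from the defaults listed on this page.
overrides = {
    "queryNode.stats.publishInterval": 1000,
    "queryNode.segcore.chunkRows": 128,
}
nested = nest(overrides)
print(json.dumps(nested, indent=2))
```

The printed structure groups `stats` and `segcore` under a single `queryNode` key, which is the shape a nested configuration file would take under the stated assumption.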

`queryNode.segcore.knowhereThreadPoolNumRatio`

Description	Default Value
The number of threads in knowhere's thread pool. If disk is enabled, the pool size is multiplied by knowhereThreadPoolNumRatio. Valid range: [1, 32].	4
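
A minimal sketch of the multiplication described above. The base pool size is not given on this page, so `base_threads` is a hypothetical input, and rejecting values outside [1, 32] is an assumption about how the range is enforced.

```python
def knowhere_pool_size(base_threads, ratio=4, disk_enabled=False):
    """Return the thread pool size after applying the ratio.

    base_threads is a hypothetical starting size; the page does not say
    how the base is chosen.
    """
    if not 1 <= ratio <= 32:
        raise ValueError("knowhereThreadPoolNumRatio must be in [1, 32]")
    # The ratio only applies when disk is enabled.
    return base_threads * ratio if disk_enabled else base_threads


size_with_disk = knowhere_pool_size(8, ratio=4, disk_enabled=True)
size_without_disk = knowhere_pool_size(8, ratio=4, disk_enabled=False)
print(size_with_disk, size_without_disk)
```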

`queryNode.segcore.chunkRows`

Description	Default Value
Row count by which Segcore divides a segment into chunks.	128

`queryNode.segcore.interimIndex.enableIndex`

Description	Default Value
Whether to create a temporary index for growing segments and for sealed segments that are not yet indexed, improving search performance. Milvus eventually seals and indexes all segments, but enabling this option speeds up searches issued immediately after data insertion. The default is true: Milvus creates a temporary index for growing segments and for unindexed sealed segments when they are searched.	true

`queryNode.segcore.interimIndex.nlist`

Description	Default Value
nlist of the temporary (interim) index. The recommended value is sqrt(chunkRows); it must be smaller than chunkRows/8.	128

`queryNode.segcore.interimIndex.nprobe`

Description	Default Value
nprobe used to search the interim index. Set it according to your accuracy requirement; it must be smaller than nlist.	16
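
The constraints stated for `nlist` and `nprobe` can be checked with a few lines of arithmetic. The sketch below is illustrative only; the function names are hypothetical, and rounding sqrt(chunkRows) to the nearest integer is an assumption.

```python
import math


def recommended_nlist(chunk_rows):
    """Recommended nlist: sqrt(chunkRows), rounded to an integer (assumption)."""
    return round(math.sqrt(chunk_rows))


def check_interim_index(chunk_rows, nlist, nprobe):
    """Return a list of violated constraints (empty if all hold)."""
    problems = []
    if not nlist < chunk_rows / 8:
        problems.append(
            f"nlist={nlist} must be smaller than chunkRows/8={chunk_rows / 8:g}"
        )
    if not nprobe < nlist:
        problems.append(f"nprobe={nprobe} must be smaller than nlist={nlist}")
    return problems


# Hypothetical values chosen to satisfy both constraints.
ok = check_interim_index(chunk_rows=32768, nlist=recommended_nlist(32768), nprobe=16)

# The default values listed on this page.
defaults = check_interim_index(chunk_rows=128, nlist=128, nprobe=16)
print(ok, defaults)
```

Applied to the default values listed on this page (chunkRows 128, nlist 128, nprobe 16), the check reports the nlist bound, since 128 is not smaller than 128 / 8 = 16.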

`queryNode.segcore.interimIndex.memExpansionRate`

Description	Default Value
Extra memory needed for building the interim index	1.15

`queryNode.segcore.interimIndex.buildParallelRate`

Description	Default Value
The ratio of parallel interim index builds to the number of CPUs	0.5

`queryNode.segcore.multipleChunkedEnable`

Description	Default Value
Enable multiple chunked search	true

`queryNode.segcore.knowhereScoreConsistency`

Description	Default Value
Enable knowhere strong consistency score computation logic	false

`queryNode.loadMemoryUsageFactor`

Description	Default Value
The multiplication factor used to calculate memory usage while loading segments	1

`queryNode.enableDisk`

Description	Default Value
Enable the query node to load disk indexes and search on them	false

`queryNode.cache.memoryLimit`

Description	Default Value
2 GB, i.e. 2 * 1024 * 1024 * 1024 bytes	2147483648
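
The default value is simply 2 GB written out in bytes. A short sketch of the arithmetic, useful when choosing a different limit (the helper name is hypothetical):

```python
BYTES_PER_GB = 1024 * 1024 * 1024  # the page uses 1024-based gigabytes


def gb_to_bytes(gigabytes):
    """Convert a size in 1024-based gigabytes to a byte count."""
    return int(gigabytes * BYTES_PER_GB)


default_limit = gb_to_bytes(2)
print(default_limit)
```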

`queryNode.cache.readAheadPolicy`

Description	Default Value
The read ahead policy of chunk cache, options: `normal, random, sequential, willneed, dontneed`	willneed

`queryNode.cache.warmup`

Description	Default Value
Whether to warm up the chunk cache. Options: async, sync, disable. 1. If set to "sync" or "async", the original vector data is loaded synchronously or asynchronously into the chunk cache during the load process. This can substantially reduce query/search latency for a period after load, at the cost of increased disk usage. 2. If set to "disable", the original vector data is loaded into the chunk cache only during search/query.	disable

`queryNode.mmap.vectorField`

Description	Default Value
Enable mmap for loading vector data	false

`queryNode.mmap.vectorIndex`

Description	Default Value
Enable mmap for loading vector index	false

`queryNode.mmap.scalarField`

Description	Default Value
Enable mmap for loading scalar data	false

`queryNode.mmap.scalarIndex`

Description	Default Value
Enable mmap for loading scalar index	false

`queryNode.mmap.chunkCache`

Description	Default Value
Enable mmap for chunk cache (raw vector retrieving).	true

`queryNode.mmap.growingMmapEnabled`

Description	Default Value
Enable memory mapping (mmap) to optimize the handling of growing raw data. Activating this feature significantly reduces the memory overhead associated with newly added or modified data. However, this optimization may come at the cost of slightly higher query latency for the affected data segments.	false

`queryNode.mmap.fixedFileSizeForMmapAlloc`

Description	Default Value
Temporary file size for the mmap chunk manager	1

`queryNode.mmap.maxDiskUsagePercentageForMmapAlloc`

Description	Default Value
Percentage of disk used by the mmap chunk manager	50

`queryNode.lazyload.enabled`

Description	Default Value
Enable lazyload for loading data	false

`queryNode.lazyload.waitTimeout`

Description	Default Value
Maximum wait time in milliseconds before starting lazy-load search and retrieval	30000

`queryNode.lazyload.requestResourceTimeout`

Description	Default Value
max timeout in milliseconds for waiting request resource for lazy load, 5s by default	5000

`queryNode.lazyload.requestResourceRetryInterval`

Description	Default Value
retry interval in milliseconds for waiting request resource for lazy load, 2s by default	2000

`queryNode.lazyload.maxRetryTimes`

Description	Default Value
max retry times for lazy load, 1 by default	1

`queryNode.lazyload.maxEvictPerRetry`

Description	Default Value
max evict count for lazy load, 1 by default	1

`queryNode.indexOffsetCacheEnabled`

Description	Default Value
Enable the index offset cache for some scalar indexes (currently only the bitmap index). Enabling this parameter can improve performance when retrieving raw data from the index.	false

`queryNode.scheduler.maxReadConcurrentRatio`

Description	Default Value
The concurrency ratio of read tasks (search and query tasks). Max read concurrency is hardware.GetCPUNum * maxReadConcurrentRatio; with a ratio of 2.0, for example, it is hardware.GetCPUNum * 2. Max read concurrency must be greater than or equal to 1 and less than or equal to hardware.GetCPUNum * 100. Valid range of the ratio: (0, 100].	1
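
The relationship between the ratio and the resulting concurrency can be sketched as follows. This is not the scheduler's code: the function name is hypothetical, and truncating the product to an integer and raising it to the lower bound of 1 are assumptions about how the stated limits are applied.

```python
def max_read_concurrency(cpu_num, ratio):
    """Estimate max read concurrency as cpu_num * ratio.

    The ratio must lie in (0, 100], so the result never exceeds
    cpu_num * 100. The lower bound of 1 is enforced explicitly.
    """
    if not 0 < ratio <= 100:
        raise ValueError("maxReadConcurrentRatio must be in (0, 100]")
    return max(1, int(cpu_num * ratio))


with_ratio_1 = max_read_concurrency(cpu_num=8, ratio=1)
with_ratio_2 = max_read_concurrency(cpu_num=8, ratio=2.0)
small_ratio = max_read_concurrency(cpu_num=4, ratio=0.1)
print(with_ratio_1, with_ratio_2, small_ratio)
```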

`queryNode.scheduler.cpuRatio`

Description	Default Value
ratio used to estimate read task cpu usage.	10

`queryNode.scheduler.scheduleReadPolicy.name`

Description	Default Value
fifo: tasks are scheduled from a FIFO queue. user-task-polling: each user's tasks are polled one by one and scheduled, so scheduling is fair at task granularity. The policy identifies users by the username used for authentication; an empty username is treated as a single user. When there is only one user, the policy degrades to FIFO.	fifo
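
To illustrate the idea behind user-task-polling, here is a minimal round-robin sketch over per-user queues. It is not Milvus's scheduler; the class and method names are hypothetical, and it only demonstrates fairness at task granularity and the fallback to FIFO order with a single user.

```python
from collections import OrderedDict, deque


class UserTaskPollingQueue:
    """Round-robin over per-user FIFO queues (illustrative only)."""

    def __init__(self):
        self._queues = OrderedDict()  # username -> deque of pending tasks

    def push(self, username, task):
        # An empty username is treated like any other single user.
        self._queues.setdefault(username, deque()).append(task)

    def pop(self):
        """Take one task from the user at the front, then rotate that user to the back."""
        if not self._queues:
            return None
        username, queue = next(iter(self._queues.items()))
        task = queue.popleft()
        if queue:
            self._queues.move_to_end(username)
        else:
            del self._queues[username]
        return task


q = UserTaskPollingQueue()
for task in ("a1", "a2", "a3"):
    q.push("alice", task)
q.push("bob", "b1")

order = []
while (task := q.pop()) is not None:
    order.append(task)
print(order)  # ['a1', 'b1', 'a2', 'a3']
```

Bob's single task is served after Alice's first task rather than after all three, which is the fairness property the policy description refers to. With only one user, tasks come out in plain FIFO order.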

`queryNode.scheduler.scheduleReadPolicy.taskQueueExpire`

Description	Default Value
How long, in seconds, a queue is retained after it becomes empty	60

`queryNode.scheduler.scheduleReadPolicy.enableCrossUserGrouping`

Description	Default Value
Enable cross-user grouping when using the user-task-polling policy. Disable it if users' tasks cannot be merged with each other.	false

`queryNode.scheduler.scheduleReadPolicy.maxPendingTaskPerUser`

Description	Default Value
Max pending task per user in scheduler	1024

`queryNode.levelZeroForwardPolicy`

Description	Default Value
Delegator level-zero deletion forward policy. Options: "FilterByBF", "RemoteLoad".	FilterByBF

`queryNode.streamingDeltaForwardPolicy`

Description	Default Value
Delegator streaming deletion forward policy. Options: "FilterByBF", "Direct".	FilterByBF

`queryNode.dataSync.flowGraph.maxQueueLength`

Description	Default Value
The maximum size of task queue cache in flow graph in query node.	16

`queryNode.dataSync.flowGraph.maxParallelism`

Description	Default Value
Maximum number of tasks executed in parallel in the flowgraph	1024

`queryNode.enableSegmentPrune`

Description	Default Value
Use partition stats to prune data in search/query on the shard delegator	false

`queryNode.queryStreamBatchSize`

Description	Default Value
Minimum batch size returned by a stream query	4194304

`queryNode.queryStreamMaxBatchSize`

Description	Default Value
Maximum batch size returned by a stream query	134217728

`queryNode.bloomFilterApplyParallelFactor`

Description	Default Value
Parallel factor used when applying primary keys (pk) to the bloom filter; the default of 4 means 4*CPU_CORE_NUM	4

`queryNode.workerPooling.size`

Description	Default Value
Size of the worker queryNode client pool	10

`queryNode.ip`

Description	Default Value
TCP/IP address of queryNode. If not specified, use the first unicastable address

`queryNode.port`

Description	Default Value
TCP port of queryNode	21123

`queryNode.grpc.serverMaxSendSize`

Description	Default Value
The maximum size of each RPC request that the queryNode can send, unit: byte	536870912

`queryNode.grpc.serverMaxRecvSize`

Description	Default Value
The maximum size of each RPC request that the queryNode can receive, unit: byte	268435456

`queryNode.grpc.clientMaxSendSize`

Description	Default Value
The maximum size of each RPC request that the clients on queryNode can send, unit: byte	268435456

`queryNode.grpc.clientMaxRecvSize`

Description	Default Value
The maximum size of each RPC request that the clients on queryNode can receive, unit: byte	536870912
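
The four gRPC limits are given in bytes, which makes them hard to read at a glance. The sketch below converts the default values listed on this page to MiB; it is only a reading aid and does not interact with Milvus.

```python
MIB = 1024 * 1024

# Default values copied from the tables on this page (unit: byte).
grpc_limits = {
    "queryNode.grpc.serverMaxSendSize": 536870912,
    "queryNode.grpc.serverMaxRecvSize": 268435456,
    "queryNode.grpc.clientMaxSendSize": 268435456,
    "queryNode.grpc.clientMaxRecvSize": 536870912,
}

limits_mib = {name: size // MIB for name, size in grpc_limits.items()}
for name, mib in limits_mib.items():
    print(f"{name}: {mib} MiB")
```

The defaults work out to 512 MiB for server send and client receive, and 256 MiB for server receive and client send.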
