Warm Up
Compatible with Milvus 2.6.4+
Warm Up complements Tiered Storage by preloading selected fields or indexes into the cache before a segment becomes queryable. You can configure warmup at the cluster, collection, or individual field/index level, allowing fine-grained control over first-query latency and resource usage.
Why warm up
Lazy Load in Tiered Storage improves efficiency by loading only metadata initially. However, this can cause latency on the first query to cold data, since required chunks or indexes must be fetched from remote storage.
Warm Up solves this problem by proactively caching critical data during segment initialization.
It is especially beneficial when:
Certain scalar indexes are frequently used in filter conditions.
Vector indexes are essential for search performance and must be ready immediately.
Cold-start latency after QueryNode restart or new segment load is unacceptable.
In contrast, Warm Up is not recommended for fields or indexes that are queried infrequently. Disabling Warm Up shortens segment load time and conserves cache space—ideal for large vector fields or non-critical scalar fields.
Configuration levels
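| Level | Scope | Configuration method | Priority |
|---|---|---|---|
| Field/Index | Single field or index | SDK methods: `add_field()`, `alter_collection_field()`, `add_index()`, `alter_index_properties()` | Highest |
| Collection | All fields/indexes in a collection | SDK methods: `create_collection()`, `alter_collection_properties()` | Medium |
| Cluster | All collections in the cluster | `milvus.yaml` | Lowest (default) |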
Override behavior:
If a field has its own warmup setting, that setting takes precedence over collection-level and cluster-level settings.
If no field- or index-level setting exists, the collection-level setting applies.
If neither a field- or index-level setting nor a collection-level setting exists, the cluster-level setting applies.
When using alter operations, the most recent alter value takes effect.
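The override rules above can be sketched as a small lookup. This is only an illustrative model of the priority order, not Milvus code; the function name `resolve_warmup` and its parameters are hypothetical, and `None` stands for "no setting at this level".

```python
from typing import Optional

VALID_SETTINGS = {"sync", "disable"}


def resolve_warmup(cluster: str,
                   collection: Optional[str] = None,
                   field_or_index: Optional[str] = None) -> str:
    """Return the effective warmup setting for one field or index (hypothetical helper)."""
    # Walk the levels from highest to lowest priority; None means "not set".
    for value in (field_or_index, collection, cluster):
        if value is not None:
            if value not in VALID_SETTINGS:
                raise ValueError(f"unsupported warmup setting: {value!r}")
            return value
    raise ValueError("a cluster-level setting is required")


# Field/index setting wins over collection and cluster settings.
print(resolve_warmup("sync", collection="disable", field_or_index="sync"))
# No field/index setting: the collection setting applies.
print(resolve_warmup("sync", collection="disable"))
# Nothing set above the cluster: the cluster default applies.
print(resolve_warmup("sync"))
```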
Configure warmup at cluster level
Cluster-level warmup is configured in the Milvus configuration file milvus.yaml and applies to all collections in the cluster. This serves as the baseline default.
Each target type supports two settings:
| Warmup Setting | Description | Typical scenario |
|---|---|---|
| `sync` | Preload before the segment becomes queryable. Load time increases slightly, but the first query incurs no on-demand loading latency. | Use for performance-critical data that must be immediately available, such as high-frequency scalar indexes or key vector indexes used in search. |
| `disable` | Skip preloading. The segment becomes queryable faster, but the first query may trigger on-demand loading. | Use for infrequently accessed or large data such as raw vector fields or non-critical scalar fields. |
Example YAML:
queryNode:
  segcore:
    tieredStorage:
      warmup:
        # options: sync, disable.
        # Specifies the timing for warming up the Tiered Storage cache.
        # - `sync`: data will be loaded into the cache before a segment is considered loaded.
        # - `disable`: data will not be proactively loaded into the cache, and loaded only if needed by search/query tasks.
        # Defaults to `sync`, except for vector field which defaults to `disable`.
        scalarField: sync
        scalarIndex: sync
        vectorField: disable # cache warmup for vector field raw data is by default disabled.
        vectorIndex: sync
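| Parameter | Warmup Setting | Description | Recommended use case |
|---|---|---|---|
| `scalarField` | `sync` or `disable` | Controls whether scalar field data is preloaded. | Use `sync` for performance-critical scalar fields; use `disable` for large or non-critical ones. |
| `scalarIndex` | `sync` or `disable` | Controls whether scalar indexes are preloaded. | Use `sync` for scalar indexes that are frequently used in filter conditions. |
| `vectorField` | `sync` or `disable` | Controls whether vector field data is preloaded. | Generally `disable`, because raw vector data is large and preloading it consumes cache space. |
| `vectorIndex` | `sync` or `disable` | Controls whether vector indexes are preloaded. | Use `sync` for vector indexes that are essential for search performance. |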
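As a rough illustration of how the documented defaults combine with overrides, the sketch below merges a partial `warmup` section onto the defaults stated in the YAML comments (`sync` everywhere except `vectorField`). The helper `effective_cluster_warmup` is hypothetical and only models the configuration; Milvus performs its own parsing of `milvus.yaml`.

```python
# Defaults taken from the comments in the example YAML above.
DEFAULT_WARMUP = {
    "scalarField": "sync",
    "scalarIndex": "sync",
    "vectorField": "disable",
    "vectorIndex": "sync",
}
VALID_SETTINGS = ("sync", "disable")


def effective_cluster_warmup(overrides: dict) -> dict:
    """Merge overrides onto the documented defaults and validate the result."""
    unknown = set(overrides) - set(DEFAULT_WARMUP)
    if unknown:
        raise KeyError(f"unknown warmup target(s): {sorted(unknown)}")
    merged = {**DEFAULT_WARMUP, **overrides}
    for target, setting in merged.items():
        if setting not in VALID_SETTINGS:
            raise ValueError(
                f"{target}: expected one of {VALID_SETTINGS}, got {setting!r}"
            )
    return merged


# Example: keep the defaults but skip preloading vector indexes.
print(effective_cluster_warmup({"vectorIndex": "disable"}))
```

A check like this can catch a misspelled target or value before the configuration file is deployed.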
Configure warmup at collection level
Compatible with Milvus 2.6.11+
Collection-level warmup allows you to override cluster defaults for a specific collection. This is useful when a collection has different access patterns than the cluster-wide baseline.
Set warmup when creating a collection
from pymilvus import MilvusClient

client = MilvusClient(uri="http://localhost:19530")

# `schema` is assumed to have been built beforehand with
# MilvusClient.create_schema() and schema.add_field().
client.create_collection(
    collection_name="my_collection",
    schema=schema,
    properties={
        "warmup.scalarField": "sync",
        "warmup.scalarIndex": "sync",
        "warmup.vectorField": "disable",
        "warmup.vectorIndex": "sync"
    }
)
Alter warmup settings on an existing collection
You must alter collection properties before calling load(). Altering a loaded collection returns an error. Changes to warmup settings take effect the next time you load the collection.
client.alter_collection_properties(
    collection_name="my_collection",
    properties={
        "warmup.vectorIndex": "disable",
        "warmup.scalarField": "sync"
    }
)
Property reference:
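| Property | Warmup Setting | Description |
|---|---|---|
| `warmup.scalarField` | `sync` or `disable` | Warmup setting for all scalar fields in the collection. |
| `warmup.scalarIndex` | `sync` or `disable` | Warmup setting for all scalar indexes in the collection. |
| `warmup.vectorField` | `sync` or `disable` | Warmup setting for all vector fields in the collection. |
| `warmup.vectorIndex` | `sync` or `disable` | Warmup setting for all vector indexes in the collection. |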
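Because the collection-level keys are plain strings, a typo such as `warmup.vectorindex` is easy to miss. One possible safeguard, sketched here with a hypothetical helper named `warmup_properties`, is to build the `properties` dictionary from keyword arguments and reject anything outside the four targets and two settings listed above.

```python
ALLOWED_TARGETS = ("scalarField", "scalarIndex", "vectorField", "vectorIndex")
ALLOWED_SETTINGS = ("sync", "disable")


def warmup_properties(**targets: str) -> dict:
    """Build a collection `properties` dict, e.g. {"warmup.scalarField": "sync"}."""
    properties = {}
    for target, setting in targets.items():
        if target not in ALLOWED_TARGETS:
            raise KeyError(f"unknown warmup target: {target!r}")
        if setting not in ALLOWED_SETTINGS:
            raise ValueError(f"unsupported warmup setting: {setting!r}")
        # Collection-level keys use the "warmup." prefix shown in the table.
        properties[f"warmup.{target}"] = setting
    return properties


props = warmup_properties(vectorIndex="disable", scalarField="sync")
print(props)
```

The resulting dictionary could then be passed as the `properties` argument of `create_collection()` or `alter_collection_properties()`.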
Configure warmup at field level
Compatible with Milvus 2.6.11+
Field-level warmup provides the finest granularity, allowing you to control warmup behavior for individual fields. This is useful when specific fields have unique access patterns.
Field-level warmup applies to field raw data only, not to indexes on that field. To configure warmup for an index, use index-level configuration.
Set warmup when creating a field
from pymilvus import MilvusClient, DataType

schema = MilvusClient.create_schema()

schema.add_field(
    field_name="id",
    datatype=DataType.INT64,
    is_primary=True
)
schema.add_field(
    field_name="category",
    datatype=DataType.VARCHAR,
    max_length=128,
    warmup="sync"  # Preload this field at load time
)
schema.add_field(
    field_name="embedding",
    datatype=DataType.FLOAT_VECTOR,
    dim=768,
    warmup="disable"  # Do not preload vector raw data
)
Alter warmup settings on an existing field
You must alter field settings before calling load(). Altering a field on a loaded collection returns an error. Changes to warmup settings take effect the next time you load the collection.
client.alter_collection_field(
    collection_name="my_collection",
    field_name="category",
    field_params={"warmup": "sync"}
)
Configure warmup at index level
Compatible with Milvus 2.6.11+
Index-level warmup allows you to control preloading for individual indexes, independent of the underlying field’s warmup setting.
Set warmup when creating an index
from pymilvus import MilvusClient

client = MilvusClient(uri="http://localhost:19530")

index_params = client.prepare_index_params()
index_params.add_index(
    field_name="embedding",
    index_type="HNSW",
    metric_type="COSINE",
    params={
        "M": 16,
        "efConstruction": 256,
        "warmup": "sync"  # Preload this index at load time
    }
)
index_params.add_index(
    field_name="category",
    index_type="AUTOINDEX",
    params={"warmup": "disable"}  # Do not preload this index
)

client.create_index(
    collection_name="my_collection",
    index_params=index_params
)
Alter warmup settings on an existing index
You must alter index settings before calling load(). Altering an index on a loaded collection returns an error. Changes to warmup settings take effect the next time you load the collection.
client.alter_index_properties(
    collection_name="my_collection",
    index_name="embedding",
    properties={"warmup": "sync"}
)
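Since altering a loaded collection returns an error, a change to an already loaded collection has to happen between a release and a reload. The sketch below shows one possible ordering, assuming that briefly releasing the collection is acceptable for your workload. The helper `change_index_warmup` is hypothetical; it only calls `release_collection()`, `alter_index_properties()`, and `load_collection()` on whatever client object it receives. The `RecordingClient` class is a stand-in used purely to show the call order without a running Milvus instance.

```python
def change_index_warmup(client, collection_name: str,
                        index_name: str, setting: str) -> None:
    """Release the collection, alter the index warmup setting, then reload."""
    if setting not in ("sync", "disable"):
        raise ValueError(f"unsupported warmup setting: {setting!r}")
    # 1. Release first: the collection must not be loaded while altering.
    client.release_collection(collection_name=collection_name)
    # 2. Apply the new warmup setting to the index.
    client.alter_index_properties(
        collection_name=collection_name,
        index_name=index_name,
        properties={"warmup": setting},
    )
    # 3. Reload: the new setting takes effect on this load.
    client.load_collection(collection_name=collection_name)


class RecordingClient:
    """Illustration-only stand-in that records calls instead of contacting Milvus."""

    def __init__(self):
        self.calls = []

    def release_collection(self, collection_name):
        self.calls.append("release_collection")

    def alter_index_properties(self, collection_name, index_name, properties):
        self.calls.append(("alter_index_properties", index_name, properties))

    def load_collection(self, collection_name):
        self.calls.append("load_collection")


stub = RecordingClient()
change_index_warmup(stub, "my_collection", "embedding", "sync")
print(stub.calls)
```

With a real deployment, a `MilvusClient` instance would be passed in place of the stub. The collection is unavailable for queries between the release and the reload, so this is best done in a maintenance window.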
Warmup behavior reference
The following table summarizes warmup behavior at different stages of the segment lifecycle.
| Warmup Setting | Load Phase | Search/Query Phase | Release Phase |
|---|---|---|---|
| `sync` | Data is loaded to local storage. Destination (disk or memory) depends on mmap setting. | Query hits local cache directly. | Local cached data is cleared. |
| `disable` | Data is not loaded to local storage. | Data is fetched on demand from object storage, then cached locally based on mmap setting. | Local cached data is cleared. |
Interaction with mmap:
| Warmup Setting | Mmap Enabled | Data Location |
|---|---|---|
| `sync` | Yes | Local disk ( |
| `sync` | No | Local memory |
| `disable` | Yes | Fetched to local disk on first access |
| `disable` | No | Fetched to local memory on first access |
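The interaction between warmup and mmap reduces to two independent choices: warmup decides when data arrives (at load time or on first access), and mmap decides where it lands (local disk or local memory). The sketch below encodes that reading as a hypothetical helper, `data_location`. It assumes the four table rows correspond to `sync`/`disable` crossed with mmap enabled/disabled.

```python
def data_location(warmup: str, mmap_enabled: bool) -> str:
    """Describe where data ends up for a warmup/mmap combination (illustrative)."""
    if warmup not in ("sync", "disable"):
        raise ValueError(f"unsupported warmup setting: {warmup!r}")
    # mmap decides the destination: disk when enabled, memory otherwise.
    medium = "local disk" if mmap_enabled else "local memory"
    # warmup decides the timing: load phase for sync, first access for disable.
    if warmup == "sync":
        return f"loaded to {medium} during the load phase"
    return f"fetched to {medium} on first access"


for warmup in ("sync", "disable"):
    for mmap_enabled in (True, False):
        print(warmup, mmap_enabled, "->", data_location(warmup, mmap_enabled))
```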
Local cache directory structure (when mmap is enabled):
| Data Type | Directory Path |
|---|---|
| Scalar/Vector field data | |
| Scalar/Vector index files | |
Best practices
Warm Up only affects the initial load. If cached data is later evicted, the next query will reload it on demand.
Avoid overusing `sync`. Preloading too many fields increases load time and cache pressure.
Start conservatively: enable Warm Up only for fields and indexes that are frequently accessed.
Monitor query latency and cache metrics, then expand preloading as needed.
For mixed workloads, apply `sync` to performance-sensitive collections and `disable` to capacity-oriented ones.