
Warm Up

Compatible with Milvus 2.6.4+

Warm Up complements Tiered Storage by preloading selected fields or indexes into the cache before a segment becomes queryable. You can configure warmup at the cluster, collection, or individual field/index level, allowing fine-grained control over first-query latency and resource usage.

Why warm up

Lazy Load in Tiered Storage improves efficiency by loading only metadata initially. However, this can cause latency on the first query to cold data, since required chunks or indexes must be fetched from remote storage.

Warm Up solves this problem by proactively caching critical data during segment initialization.

It is especially beneficial when:

  • Certain scalar indexes are frequently used in filter conditions.

  • Vector indexes are essential for search performance and must be ready immediately.

  • Cold-start latency after QueryNode restart or new segment load is unacceptable.

In contrast, Warm Up is not recommended for fields or indexes that are queried infrequently. Disabling Warm Up for them shortens segment load time and conserves cache space, which suits large vector fields and non-critical scalar fields.

Configuration levels

| Level | Scope | Configuration method | Priority |
| --- | --- | --- | --- |
| Field/Index | Single field or index | SDK methods: add_field(), alter_collection_field(), add_index(), alter_index_properties() | Highest |
| Collection | All fields/indexes in a collection | SDK methods: create_collection(), alter_collection_properties() | Medium |
| Cluster | All collections in the cluster | milvus.yaml config file | Lowest (default) |

Override behavior:

  • If a field or index has its own warmup setting, that setting takes precedence over collection-level and cluster-level settings.

  • If no field- or index-level setting exists, the collection-level setting applies.

  • If neither a field- or index-level setting nor a collection-level setting exists, the cluster-level setting applies.

  • When using alter operations, the most recent alter value takes effect.
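The precedence rules above can be expressed as a small lookup. The following is a minimal sketch, not Milvus code: the function name `resolve_warmup` and its parameters are hypothetical, and `None` stands for "no setting at this level".

```python
from typing import Optional

VALID_SETTINGS = {"sync", "disable"}


def resolve_warmup(
    cluster: str,
    collection: Optional[str] = None,
    field_or_index: Optional[str] = None,
) -> str:
    """Return the effective warmup setting for one field or index.

    The most specific level that is set wins:
    field/index > collection > cluster.
    """
    for candidate in (field_or_index, collection, cluster):
        if candidate is not None:
            if candidate not in VALID_SETTINGS:
                raise ValueError(f"unknown warmup setting: {candidate!r}")
            return candidate
    raise ValueError("a cluster-level setting is required")


# Cluster default is "sync", the collection overrides it to "disable",
# and one index overrides the collection back to "sync".
print(resolve_warmup("sync"))                     # sync
print(resolve_warmup("sync", "disable"))          # disable
print(resolve_warmup("sync", "disable", "sync"))  # sync
```

The sketch covers one target at a time. In practice the lookup runs separately for each target type (scalar field, scalar index, vector field, vector index).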

Configure warmup at cluster level

Cluster-level warmup is configured in the Milvus configuration file milvus.yaml and applies to all collections in the cluster. This serves as the baseline default.

Each target type supports two settings:

| Warmup Setting | Description | Typical scenario |
| --- | --- | --- |
| sync | Preload before the segment becomes queryable. Load time increases slightly, but the first query incurs no cold-load latency. | Use for performance-critical data that must be immediately available, such as high-frequency scalar indexes or key vector indexes used in search. |
| disable | Skip preloading. The segment becomes queryable faster, but the first query may trigger on-demand loading. | Use for infrequently accessed or large data such as raw vector fields or non-critical scalar fields. |

Example YAML:

queryNode:
  segcore:
    tieredStorage:
      warmup:
        # options: sync, disable.
        # Specifies the timing for warming up the Tiered Storage cache.
        # - `sync`: data will be loaded into the cache before a segment is considered loaded.
        # - `disable`: data will not be proactively loaded into the cache, and loaded only if needed by search/query tasks.
        # Defaults to `sync`, except for vector field which defaults to `disable`.
        scalarField: sync
        scalarIndex: sync
        vectorField: disable # cache warmup for vector field raw data is by default disabled.
        vectorIndex: sync

| Parameter | Allowed values | Description | Recommended use case |
| --- | --- | --- | --- |
| scalarField | sync, disable | Controls whether scalar field data is preloaded. | Use sync only if scalar fields are small and accessed frequently in filters. Otherwise, disable to reduce load time. |
| scalarIndex | sync, disable | Controls whether scalar indexes are preloaded. | Use sync for scalar indexes involved in frequent filter conditions or range queries. |
| vectorField | sync, disable | Controls whether vector field data is preloaded. | Generally disable to avoid heavy cache use. Enable sync only when raw vectors must be retrieved immediately after search (for example, when search results return the stored vectors). |
| vectorIndex | sync, disable | Controls whether vector indexes are preloaded. | Use sync for vector indexes that are critical to search latency. In batch or low-frequency workloads, disable for faster segment readiness. |
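Because a typo in milvus.yaml is easy to miss, it can help to check the warmup block before rolling it out. The sketch below assumes the `warmup` section has already been parsed into a Python dict (for example with a YAML library). The helper `check_warmup_section` is hypothetical, and the defaults are copied from the example YAML above rather than read from Milvus.

```python
# Defaults copied from the example milvus.yaml above.
DEFAULT_WARMUP = {
    "scalarField": "sync",
    "scalarIndex": "sync",
    "vectorField": "disable",
    "vectorIndex": "sync",
}
VALID_SETTINGS = {"sync", "disable"}


def check_warmup_section(section: dict) -> dict:
    """Validate a parsed `warmup` mapping and fill missing keys with defaults."""
    unknown = set(section) - set(DEFAULT_WARMUP)
    if unknown:
        raise ValueError(f"unknown warmup targets: {sorted(unknown)}")
    merged = {**DEFAULT_WARMUP, **section}
    for target, setting in merged.items():
        if setting not in VALID_SETTINGS:
            raise ValueError(
                f"{target}: expected 'sync' or 'disable', got {setting!r}"
            )
    return merged


# Hypothetical override: keep the defaults but skip preloading vector indexes.
print(check_warmup_section({"vectorIndex": "disable"}))
```

This is only a pre-deployment check on the operator's side. It does not describe how Milvus itself parses or validates the file.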

Configure warmup at collection level

Compatible with Milvus 2.6.11+

Collection-level warmup allows you to override cluster defaults for a specific collection. This is useful when a collection has different access patterns than the cluster-wide baseline.

Set warmup when creating a collection

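from pymilvus import MilvusClient, DataType

client = MilvusClient(uri="http://localhost:19530")

# Minimal example schema; replace it with your own fields.
schema = MilvusClient.create_schema()
schema.add_field(field_name="id", datatype=DataType.INT64, is_primary=True)
schema.add_field(field_name="embedding", datatype=DataType.FLOAT_VECTOR, dim=768)

# Collection-level warmup settings override the cluster defaults.
client.create_collection(
    collection_name="my_collection",
    schema=schema,
    properties={
        "warmup.scalarField": "sync",
        "warmup.scalarIndex": "sync",
        "warmup.vectorField": "disable",
        "warmup.vectorIndex": "sync"
    }
)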

Alter warmup settings on an existing collection

You must alter collection properties before calling load(). Altering a loaded collection returns an error. Changes to warmup settings take effect the next time you load the collection.

client.alter_collection_properties(
    collection_name="my_collection",
    properties={
        "warmup.vectorIndex": "disable",
        "warmup.scalarField": "sync"
    }
)
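Since a loaded collection cannot be altered, changing warmup on a live collection means releasing it, altering it, and loading it again. The following is a minimal sketch of that sequence. The helper `apply_collection_warmup` is hypothetical. It relies on the MilvusClient methods `release_collection()`, `alter_collection_properties()`, and `load_collection()`, and the `RecordingClient` stub exists only so that the call order can be shown without a running server.

```python
def apply_collection_warmup(client, collection_name: str, warmup: dict) -> None:
    """Release the collection, update its warmup properties, then load it again.

    `client` is expected to be a pymilvus MilvusClient (or any object with the
    same three methods). `warmup` maps targets such as "vectorIndex" to
    "sync" or "disable".
    """
    properties = {f"warmup.{target}": setting for target, setting in warmup.items()}
    client.release_collection(collection_name=collection_name)
    client.alter_collection_properties(
        collection_name=collection_name,
        properties=properties,
    )
    # The new settings are picked up by this load.
    client.load_collection(collection_name=collection_name)


class RecordingClient:
    """Stand-in used only to show the call order; not part of pymilvus."""

    def __init__(self):
        self.calls = []

    def release_collection(self, collection_name):
        self.calls.append(("release", collection_name))

    def alter_collection_properties(self, collection_name, properties):
        self.calls.append(("alter", collection_name, properties))

    def load_collection(self, collection_name):
        self.calls.append(("load", collection_name))


demo = RecordingClient()
apply_collection_warmup(demo, "my_collection", {"vectorIndex": "disable"})
print([call[0] for call in demo.calls])  # ['release', 'alter', 'load']
```

The collection cannot serve queries between the release and the completed load, so a change like this is best scheduled for a low-traffic window.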

Property reference:

| Property | Allowed values | Description |
| --- | --- | --- |
| warmup.scalarField | sync, disable | Warmup setting for all scalar fields in the collection. |
| warmup.scalarIndex | sync, disable | Warmup setting for all scalar indexes in the collection. |
| warmup.vectorField | sync, disable | Warmup setting for all vector fields in the collection. |
| warmup.vectorIndex | sync, disable | Warmup setting for all vector indexes in the collection. |

Configure warmup at field level

Compatible with Milvus 2.6.11+

Field-level warmup provides the finest granularity, allowing you to control warmup behavior for individual fields. This is useful when specific fields have unique access patterns.

Field-level warmup applies to field raw data only, not to indexes on that field. To configure warmup for an index, use index-level configuration.

Set warmup when creating a field

from pymilvus import MilvusClient, DataType

schema = MilvusClient.create_schema()

schema.add_field(
    field_name="id",
    datatype=DataType.INT64,
    is_primary=True
)

schema.add_field(
    field_name="category",
    datatype=DataType.VARCHAR,
    max_length=128,
    warmup="sync"  # Preload this field at load time
)

schema.add_field(
    field_name="embedding",
    datatype=DataType.FLOAT_VECTOR,
    dim=768,
    warmup="disable"  # Do not preload vector raw data
)

Alter warmup settings on an existing field

You must alter field settings before calling load(). Altering a field on a loaded collection returns an error. Changes to warmup settings take effect the next time you load the collection.

client.alter_collection_field(
    collection_name="my_collection",
    field_name="category",
    field_params={"warmup": "sync"}
)

Configure warmup at index level

Compatible with Milvus 2.6.11+

Index-level warmup allows you to control preloading for individual indexes, independent of the underlying field’s warmup setting.

Set warmup when creating an index

from pymilvus import MilvusClient

client = MilvusClient(uri="http://localhost:19530")

index_params = client.prepare_index_params()

index_params.add_index(
    field_name="embedding",
    index_type="HNSW",
    metric_type="COSINE",
    params={
        "M": 16,
        "efConstruction": 256,
        "warmup": "sync"  # Preload this index at load time
    }
)

index_params.add_index(
    field_name="category",
    index_type="AUTOINDEX",
    params={"warmup": "disable"}  # Do not preload this index
)

client.create_index(
    collection_name="my_collection",
    index_params=index_params
)

Alter warmup settings on an existing index

You must alter index settings before calling load(). Altering an index on a loaded collection returns an error. Changes to warmup settings take effect the next time you load the collection.

client.alter_index_properties(
    collection_name="my_collection",
    index_name="embedding",
    properties={"warmup": "sync"}
)

Warmup behavior reference

The following table summarizes warmup behavior at different stages of the segment lifecycle.

| Warmup Setting | Load Phase | Search/Query Phase | Release Phase |
| --- | --- | --- | --- |
| sync | Data is loaded to local storage. Destination (disk or memory) depends on mmap setting. | Query hits local cache directly. | Local cached data is cleared. |
| disable | Data is not loaded to local storage. | Data is fetched on demand from object storage, then cached locally based on mmap setting. | Local cached data is cleared. |

Interaction with mmap:

| Warmup Setting | Mmap Enabled | Data Location |
| --- | --- | --- |
| sync | true | Local disk (localStorage.path/cache/...) |
| sync | false | Local memory |
| disable | true | Fetched to local disk on first access |
| disable | false | Fetched to local memory on first access |
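Read together, the two settings are independent: warmup decides when data arrives locally, and mmap decides where it lands. A minimal sketch of that mapping follows. The names `Placement` and `warmup_placement` are hypothetical and only restate the table.

```python
from typing import NamedTuple


class Placement(NamedTuple):
    when: str   # "load" or "first access"
    where: str  # "local disk" or "local memory"


def warmup_placement(warmup: str, mmap_enabled: bool) -> Placement:
    """Restate the warmup/mmap table: warmup picks the time, mmap the place."""
    if warmup not in ("sync", "disable"):
        raise ValueError(f"unknown warmup setting: {warmup!r}")
    when = "load" if warmup == "sync" else "first access"
    where = "local disk" if mmap_enabled else "local memory"
    return Placement(when, where)


for setting in ("sync", "disable"):
    for mmap_enabled in (True, False):
        print(setting, mmap_enabled, warmup_placement(setting, mmap_enabled))
```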

Local cache directory structure (when mmap is enabled):

| Data Type | Directory Path |
| --- | --- |
| Scalar/Vector field data | localStorage.path/cache/<collection_id>/local_chunk/... |
| Scalar/Vector index files | localStorage.path/cache/<collection_id>/local_chunk/index_files/... |
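With this layout, the disk space that one collection's cache occupies on a QueryNode can be estimated by walking the directory. The sketch below assumes it runs on the QueryNode host with read access to the configured localStorage.path. The helper names are hypothetical, and because the layout below `local_chunk` is not specified here, the code simply sums every file it finds.

```python
from pathlib import Path


def collection_cache_dir(local_storage_path: str, collection_id: int) -> Path:
    """Build <localStorage.path>/cache/<collection_id>/local_chunk."""
    return Path(local_storage_path) / "cache" / str(collection_id) / "local_chunk"


def cache_usage_bytes(local_storage_path: str, collection_id: int) -> dict:
    """Sum file sizes under local_chunk, split into index files and field data."""
    root = collection_cache_dir(local_storage_path, collection_id)
    index_root = root / "index_files"
    usage = {"field_data": 0, "index_files": 0}
    if not root.is_dir():
        return usage  # nothing cached on disk for this collection
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        kind = "index_files" if index_root in path.parents else "field_data"
        usage[kind] += path.stat().st_size
    return usage
```

A call such as `cache_usage_bytes("/var/lib/milvus/data", 123)` would return both totals in bytes. The path and the collection ID in that call are placeholders, not values taken from a real deployment.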

Best practices

Warm Up only affects the initial load. If cached data is later evicted, the next query will reload it on demand.

  • Avoid overusing sync. Preloading too many fields increases load time and cache pressure.

  • Start conservatively: enable Warm Up only for fields and indexes that are frequently accessed.

  • Monitor query latency and cache metrics, then expand preloading as needed.

  • For mixed workloads, set sync on performance-sensitive collections and disable on capacity-oriented ones.
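One simple way to judge whether a field or index needs Warm Up is to compare the first query after a load with the queries that follow it. The helper below is a minimal sketch with a hypothetical name. It times any zero-argument callable, so the actual search or query call is supplied by the caller.

```python
import time
from typing import Callable


def first_vs_repeat_latency(run_query: Callable[[], object], repeats: int = 5) -> dict:
    """Time the first call to `run_query` separately from the calls after it."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")

    def timed() -> float:
        start = time.perf_counter()
        run_query()
        return time.perf_counter() - start

    first = timed()
    later = [timed() for _ in range(repeats)]
    return {
        "first_s": first,
        "repeat_avg_s": sum(later) / len(later),
        "calls": repeats + 1,
    }


# Example with a stand-in workload; pass a function that runs your own
# search or query against a freshly loaded collection instead.
print(first_vs_repeat_latency(lambda: sum(range(10_000)), repeats=3))
```

A large gap between `first_s` and `repeat_avg_s` right after a load suggests that the data the query touched was fetched on demand, which makes it a candidate for sync. A small gap suggests that leaving it on disable costs little.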