How does serverless handle long-running processes?

Serverless architectures handle long-running processes by breaking them into smaller, time-bound tasks and using orchestration tools to manage the workflow. Platforms such as AWS Lambda impose hard timeout limits (for Lambda, a maximum of 15 minutes per invocation), so a single function cannot run for hours. To work around this, developers split the work into steps that fit within the limit and use services such as AWS Step Functions or Azure Durable Functions to coordinate execution. For example, processing a large dataset might involve a Lambda function that processes one chunk of data, saves its progress to a database, and triggers the next iteration until the job completes. This approach keeps individual functions short while maintaining end-to-end workflow continuity.
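The chunk-and-checkpoint loop described above can be sketched in plain Python. This is a minimal illustration, not a Lambda deployment: the in-memory `CHECKPOINTS` dict stands in for a database table, and `run_job` stands in for whatever re-triggers the function. The names `process_chunk`, `run_job`, and `CHECKPOINTS` are hypothetical.

```python
# Hypothetical stand-in for a database table that stores job progress.
CHECKPOINTS = {}


def process_chunk(job_id, items, chunk_size=3):
    """Simulate one short-lived invocation: resume, process one chunk, checkpoint."""
    state = CHECKPOINTS.get(job_id, {"offset": 0, "total": 0})
    offset, total = state["offset"], state["total"]

    chunk = items[offset:offset + chunk_size]
    total += sum(chunk)          # the "work" here is just summing numbers
    offset += len(chunk)

    # Save progress so the next invocation can pick up where this one stopped.
    CHECKPOINTS[job_id] = {"offset": offset, "total": total}
    return {"done": offset >= len(items), "total": total}


def run_job(job_id, items, chunk_size=3):
    """Stand-in for the trigger that re-invokes the function until the job is done."""
    invocations = 0
    while True:
        result = process_chunk(job_id, items, chunk_size)
        invocations += 1
        if result["done"]:
            return result["total"], invocations


if __name__ == "__main__":
    print(run_job("demo-job", list(range(10))))
```

Because each invocation reads its starting offset from the checkpoint, a crashed or timed-out invocation can simply be retried without reprocessing chunks that were already recorded.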

Orchestration tools are critical for managing state and retries in long-running serverless workflows. Step Functions lets you define state machines that track progress, handle errors, and pass data between Lambda invocations. A video encoding job, for instance, could be split into stages: splitting the file, encoding segments in parallel, and merging the results. Each stage runs as a separate Lambda invocation, with Step Functions managing dependencies and retrying a step if it fails. Similarly, Azure Durable Functions uses an “orchestrator” pattern to checkpoint progress and resume from the last known state after interruptions. These tools abstract away the complexity of tracking execution across short-lived functions.

Developers also use asynchronous, event-driven patterns with queues or streams for long tasks. For example, an initial Lambda function might place a message in Amazon SQS, or write to a DynamoDB table whose stream triggers subsequent processing. Each message represents a unit of work, such as resizing an image or aggregating logs, and is handled by a separate function. Services like AWS Batch or Fargate can complement serverless for compute-heavy tasks, but pure serverless solutions favor stateless, event-triggered steps. This adds complexity in state management and error handling, but it avoids server maintenance and scales automatically. The key is to design workflows that respect platform limits while maintaining reliability through decomposition and orchestration.
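The queue-driven pattern can be illustrated with Python's standard `queue` module standing in for a managed queue such as SQS. This is a sketch under stated assumptions: the message shape, the `HANDLERS` table, the `drain` function, and the `max_receives` limit that routes repeatedly failing messages to a dead-letter list are all hypothetical.

```python
import queue


def resize_image(payload):
    """Hypothetical unit of work: pretend to resize an image."""
    return f"resized:{payload}"


def aggregate_logs(payload):
    """Hypothetical unit of work: pretend to aggregate log lines."""
    return f"aggregated:{len(payload)} lines"


# Each message type maps to its own small, stateless handler.
HANDLERS = {"resize_image": resize_image, "aggregate_logs": aggregate_logs}


def drain(work_queue, max_receives=2):
    """Process messages one at a time; set aside those that keep failing."""
    results, dead_letters = [], []
    while not work_queue.empty():
        message = work_queue.get()
        try:
            handler = HANDLERS[message["type"]]
            results.append(handler(message["payload"]))
        except Exception:
            # Count the failed delivery, then retry or give up.
            message["receives"] = message.get("receives", 0) + 1
            if message["receives"] >= max_receives:
                dead_letters.append(message)
            else:
                work_queue.put(message)
    return results, dead_letters
```

Keeping each handler stateless means any worker can pick up any message, which is what allows this pattern to scale out automatically.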
