Add cold start search blog #3520
Merged: nateynateynate merged 9 commits into opensearch-project:main from kolchfa-aws:cold-search on Jan 7, 2025
Commits
- 9497749 Add cold start search blog (kolchfa-aws)
- 94b8790 Doc review (kolchfa-aws)
- e4c8495 Apply suggestions from code review (kolchfa-aws)
- 0b25d05 Add picture (kolchfa-aws)
- faaffab Add authors info (kolchfa-aws)
- cbee8b7 Apply suggestions from code review (kolchfa-aws)
- 5332c56 Add new pic and github (kolchfa-aws)
- eb19136 Apply suggestions from code review (kolchfa-aws)
- 194e993 Update _posts/2024-12-23-cold-start-search.md (kolchfa-aws)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,24 @@ | ||
--- | ||
short_name: allan | ||
name: Allan Pienaar | ||
linkedin: allan-pienaar-37769178 | ||
photo: "/assets/media/community/members/allan.png" | ||
title: 'OpenSearch Community Member: Allan Pienaar' | ||
primary_title: Allan Pienaar | ||
breadcrumbs: | ||
  icon: community | ||
  items: | ||
    - title: Community | ||
      url: /community/index.html | ||
    - title: Members | ||
      url: /community/members/index.html | ||
    - title: "Allan Pienaar's Profile" | ||
      url: '/community/members/allan-pienaar.html' | ||
job_title_and_company: 'Senior Search Engine Architect at AWS focusing on OpenSearch' | ||
personas: | ||
- author | ||
permalink: '/community/members/allan-pienaar.html' | ||
redirect_from: '/authors/allan/' | ||
--- | ||
|
||
**Allan Pienaar** is an OpenSearch SME and Customer Success Engineer at AWS. He works closely with enterprise customers to ensure operational excellence, maintain production stability, and optimize costs using Amazon OpenSearch Service. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,24 @@ | ||
--- | ||
short_name: aswath | ||
name: Aswath Srinivasan | ||
github: aswath86 | ||
photo: "/assets/media/community/members/aswath.jpg" | ||
title: 'OpenSearch Community Member: Aswath Srinivasan' | ||
primary_title: Aswath Srinivasan | ||
breadcrumbs: | ||
  icon: community | ||
  items: | ||
    - title: Community | ||
      url: /community/index.html | ||
    - title: Members | ||
      url: /community/members/index.html | ||
    - title: "Aswath Srinivasan's Profile" | ||
      url: '/community/members/aswath-srinivasan.html' | ||
job_title_and_company: 'Senior Search Engine Architect at AWS focusing on OpenSearch' | ||
personas: | ||
- author | ||
permalink: '/community/members/aswath-srinivasan.html' | ||
redirect_from: '/authors/aswath/' | ||
--- | ||
|
||
**Aswath Srinivasan** is a Senior Search Engine Architect at AWS currently based in Munich, Germany. With over 17 years of experience in various search technologies, Aswath currently focuses on OpenSearch. He is a search and open source enthusiast and helps customers and the search community with their search problems. | ||
Check failure on line 24 in _community_members/aswath.md GitHub Actions / style-job
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,56 @@ | ||
--- | ||
layout: post | ||
title: "Solving the cold start search problem in OpenSearch" | ||
authors: | ||
- aswath | ||
- allan | ||
- kolchfa | ||
date: 2025-01-07 | ||
categories: | ||
- technical-posts | ||
meta_keywords: cold start search, OpenSearch refresh interval, search latency, search performance optimization | ||
meta_description: Explore the cold start search problem in OpenSearch after upgrading from older Elasticsearch versions. Learn about the root causes and discover practical solutions to optimize search performance for various workload scenarios. | ||
has_math: false | ||
has_science_table: true | ||
--- | ||
|
||
Upgrading to OpenSearch offers many advantages, but it can also introduce unexpected challenges. One such issue we've encountered while assisting with upgrades from older Elasticsearch versions is the "cold start search" problem. You might notice that the first search after a period of inactivity is unusually slow, even though subsequent searches perform as expected. This blog post will explore the root cause of this behavior and offer potential solutions tailored to your needs. | ||
|
||
## Understanding the cold start search problem | ||
|
||
After upgrading from Elasticsearch 6.x to OpenSearch (or even to later Elasticsearch versions), you may see a pattern: the first search after some inactivity is slow, while subsequent searches run much faster. After another idle period, the slow search recurs. This issue is particularly noticeable in non-production environments, where search activity isn't as constant as in live systems. The following image presents a typical search rate metric illustrating this behavior. | ||
|
||
![Search rate metric](/assets/media/blog-images/2024-12-23-cold-start-search/search-metric.png) | ||
|
||
At first glance, this might look like a cache-warming issue. However, the pattern persists even for queries that don't use caching. Both simple and complex queries are affected equally, and slow logs don't identify these as slow queries. This means that neither caching nor query complexity is the cause of the problem. | ||
|
||
## Uncovering the root cause | ||
|
||
Through detailed investigation using [search slow logs](https://opensearch.org/docs/latest/install-and-configure/configuring-opensearch/logs/#shard-slow-logs) and [query profiling](https://opensearch.org/docs/latest/api-reference/profile/), we traced the root cause to two key settings in OpenSearch: | ||
|
||
- **`refresh_interval`**: OpenSearch buffers newly indexed documents in memory until a refresh operation transfers them to searchable segments. By default, `refresh_interval` is set to 1 second for near real-time (NRT) search. However, if a shard becomes idle (determined by the `index.search.idle.after` time period), it stops refreshing until a search request triggers a refresh. | ||
|
||
- **`index.search.idle.after`**: This setting defines how long a shard can stay idle before it stops automatic refreshes. Its default value is 30 seconds. While this improves bulk indexing performance by reducing refresh frequency, it introduces a delay for the first search after a period of inactivity. | ||
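To see which of these two settings an index sets explicitly and which it inherits, you can request its settings with defaults included. The following Python sketch summarizes such a response. It assumes the response came from `GET /<index>/_settings?include_defaults=true&flat_settings=true`; the function name `summarize_refresh_settings`, the index name `my-index`, and the sample response are illustrative and were not taken from a real cluster.

```python
# The two index-level settings this post discusses.
REFRESH_KEYS = ("index.refresh_interval", "index.search.idle.after")


def summarize_refresh_settings(settings_response: dict, index: str) -> dict:
    """Report each refresh-related setting's value and whether it is explicit.

    Assumes `settings_response` is the parsed JSON body of
    GET /<index>/_settings?include_defaults=true&flat_settings=true.
    """
    entry = settings_response[index]
    explicit = entry.get("settings", {})
    defaults = entry.get("defaults", {})
    summary = {}
    for key in REFRESH_KEYS:
        if key in explicit:
            summary[key] = {"value": explicit[key], "explicit": True}
        else:
            # Fall back to the reported default (None if it is absent).
            summary[key] = {"value": defaults.get(key), "explicit": False}
    return summary


# Hand-written sample response for a hypothetical index named "my-index".
SAMPLE_RESPONSE = {
    "my-index": {
        "settings": {"index.number_of_shards": "1"},
        "defaults": {
            "index.refresh_interval": "1s",
            "index.search.idle.after": "30s",
        },
    }
}

for key, info in summarize_refresh_settings(SAMPLE_RESPONSE, "my-index").items():
    print(key, info)
```

In this sample, neither setting is explicit, which is the configuration in which shards can become search idle.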
|
||
When upgrading from Elasticsearch 6.x to OpenSearch or Elasticsearch 7.x, this behavior can cause the first search after a long idle period to wait for the refresh to complete before executing. Older Elasticsearch versions didn't exhibit this behavior because `index.search.idle.after` didn't exist. The severity of the delay depends on how much data needs to be refreshed, which in turn depends on how much indexing occurred during the idle period. | ||
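The interaction described above can be summarized as a small decision rule. The following Python sketch is a simplified model for illustration only, not OpenSearch's actual implementation, and the class and function names are hypothetical. It also encodes the point, made in the next section, that an explicitly set `refresh_interval` keeps scheduled refreshes running.

```python
from dataclasses import dataclass


@dataclass
class ShardState:
    """Simplified, hypothetical view of one shard."""
    seconds_since_last_search: float
    pending_unrefreshed_docs: int
    refresh_interval_explicit: bool = False  # True if the index sets refresh_interval itself
    search_idle_after_seconds: float = 30.0  # default of index.search.idle.after


def is_search_idle(shard: ShardState) -> bool:
    """A shard is treated as idle only when refresh_interval is left at its default."""
    if shard.refresh_interval_explicit:
        return False
    return shard.seconds_since_last_search > shard.search_idle_after_seconds


def first_search_waits_for_refresh(shard: ShardState) -> bool:
    """The next search pays the refresh cost if the shard is idle and has buffered writes."""
    return is_search_idle(shard) and shard.pending_unrefreshed_docs > 0


# A shard left alone for five minutes while indexing continued.
idle_shard = ShardState(seconds_since_last_search=300, pending_unrefreshed_docs=5000)
# The same shard, but with refresh_interval set explicitly on the index.
explicit_shard = ShardState(
    seconds_since_last_search=300,
    pending_unrefreshed_docs=5000,
    refresh_interval_explicit=True,
)

print(first_search_waits_for_refresh(idle_shard))
print(first_search_waits_for_refresh(explicit_shard))
```

The model deliberately ignores how long the refresh takes. As noted above, that depends on how much was indexed while the shard was idle.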
|
||
## Practical solutions for cold start searches | ||
|
||
The best way to address this issue depends on your workload. The following are common scenarios and recommended solutions: | ||
|
||
- **Predictable business hours with idle periods**: If your search activity is heavy during specific times (for example, during typical 9--5 work hours) and indexing happens off-hours, you can leave the default settings in place. Perform a [manual refresh](https://opensearch.org/docs/latest/api-reference/index-apis/refresh/) before the busy period begins or right after nightly indexing completes. | ||
|
||
- **Write-heavy use cases (for example, observability or log analytics)**: For workloads where search latency isn't as critical, increasing `refresh_interval` to 30 or 60 seconds can improve indexing performance. Explicitly setting `refresh_interval` avoids interference from `index.search.idle.after`. | ||
|
||
- **Read-heavy use cases with sporadic writes**: Explicitly setting `refresh_interval` to 1 second (rather than relying on the default) ensures NRT search and eliminates delays caused by idle shards. | ||
|
||
- **Balanced workloads (where search latency, indexing, and NRT results are equally important)**: Retain the default settings. Don't base your decision on behavior in non-production systems because live systems typically have more consistent search activity. | ||
|
||
- **Predictable but infrequent searches**: Consider increasing `index.search.idle.after` to 5 or 10 minutes if search patterns are predictable. Shards then keep refreshing through gaps shorter than that threshold, so only a search that follows a longer gap waits for a refresh. | ||
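The settings changes and the manual refresh mentioned in the preceding list map onto two REST calls: `PUT /<index>/_settings` and `POST /<index>/_refresh`. The following Python sketch builds those requests with the standard library without sending them. The base URL, the index name `logs-app`, and the scenario labels are assumptions for illustration, and the values are the ones suggested above. Authentication and TLS are omitted.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:9200"  # hypothetical cluster endpoint

# Scenario labels are illustrative; the values mirror the suggestions in this post.
SCENARIO_SETTINGS = {
    "write_heavy": {"index.refresh_interval": "30s"},
    "read_heavy": {"index.refresh_interval": "1s"},
    "infrequent_searches": {"index.search.idle.after": "10m"},
}


def settings_request(index: str, settings: dict) -> Request:
    """Build a PUT /<index>/_settings request carrying the given index settings."""
    body = json.dumps(settings).encode("utf-8")
    return Request(
        f"{BASE_URL}/{index}/_settings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def refresh_request(index: str) -> Request:
    """Build a POST /<index>/_refresh request (a manual refresh)."""
    return Request(f"{BASE_URL}/{index}/_refresh", method="POST")


request = settings_request("logs-app", SCENARIO_SETTINGS["write_heavy"])
print(request.get_method(), request.full_url, request.data.decode("utf-8"))

manual_refresh = refresh_request("logs-app")
print(manual_refresh.get_method(), manual_refresh.full_url)
```

To apply a change, pass the request to `urllib.request.urlopen` against a test cluster first. Keeping request construction separate from sending makes it easy to review exactly what will be changed.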
|
||
## Conclusion | ||
|
||
Addressing the cold start search problem requires understanding your specific workload and priorities. Explicitly setting `refresh_interval` or adjusting `index.search.idle.after` can help, but each solution comes with trade-offs. For most production systems, this issue is less likely to occur because of continuous search activity. | ||
|
||
Always test these configurations in your environment to find the right balance for your needs. For more tips on optimizing refresh intervals, check out our [blog post on optimizing OpenSearch refresh intervals](https://opensearch.org/blog/optimize-refresh-interval/). |
Binary file added: assets/media/blog-images/2024-12-23-cold-start-search/search-metric.png (+31.6 KB)
"search slow logs" => "shard slow logs"?
I understand that the link's heading says "Shard slow logs", but it is still commonly referred to as "Search slow logs", so I will leave it as it is.