Is your feature request related to a problem?
We need to provide users with more accurate and accessible metrics for estimating the size and document count of their Anomaly Detection result index. This will help users better plan their storage requirements and manage their cluster resources more effectively.
Current solution
Users can manually estimate storage requirements using the formula provided in the documentation, or they can run their detector for a while on a test cluster to see how much storage their results actually consume.
For example:
Default result index: The size depends on the number of result documents (both anomalous and non-anomalous), their size (approximately 1 KB each), the retention period (default 30 days), and the number of shard replicas. Example:
A detector with a 10-minute interval and 1 million entities can generate roughly 144 GB/day, resulting in approximately 4,320 GB over 30 days.
Adjusting the primary shard and replica settings changes the total disk requirements.
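As a rough sketch, the figures above (144 GB/day, ~4,320 GB over 30 days) can be reproduced with a small helper. The function name and defaults are illustrative, not an existing API; the doc-size and unit conversions are approximations:

```python
def estimate_result_index_size_gb(
    interval_minutes: int,
    entity_count: int,
    retention_days: int = 30,
    doc_size_kb: float = 1.0,   # ~1 KB per result document
    replica_count: int = 0,     # each replica stores a full extra copy
) -> float:
    """Estimate total disk usage of the AD result index in GB (decimal units)."""
    intervals_per_day = (24 * 60) // interval_minutes
    docs_per_day = intervals_per_day * entity_count
    gb_per_day = docs_per_day * doc_size_kb / 1_000_000  # KB -> GB
    return gb_per_day * retention_days * (1 + replica_count)
```

For the example above, `estimate_result_index_size_gb(10, 1_000_000)` gives 4,320 GB; setting `replica_count=1` doubles it, matching the note that replica settings change the total disk requirements.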
Custom result index: Users have more control over index settings, such as the number of shards and replicas, and they can even configure their own ISM policy.
What solution would you like?
Option 1: Add an estimation API on the backend, or a frontend-only calculator feature, to make it easier for users to estimate how much storage they will be utilizing.
Option 2: Add result index storage estimates to our current validation API and warn users if there won't be enough disk space for the results. This might not always return the estimate to the user unless we make slight changes to the validation API response.
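The warning logic in Option 2 could look roughly like this. This is only a sketch of the check, not the validation API's actual behavior; the function name, headroom factor, and return shape are all assumptions:

```python
from typing import Optional

def check_disk_for_results(
    estimated_gb: float,
    free_disk_gb: float,
    headroom: float = 0.8,  # leave 20% of free disk untouched (assumed policy)
) -> Optional[str]:
    """Return a warning string if projected result storage exceeds the usable budget."""
    usable_gb = free_disk_gb * headroom
    if estimated_gb > usable_gb:
        return (
            f"Estimated result storage {estimated_gb:.0f} GB exceeds "
            f"usable disk budget {usable_gb:.0f} GB"
        )
    return None
```

A validation response could then surface the warning only when the check fails, which is why the estimate itself might not always be returned without response-format changes.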
Additional notes:
For HC detectors, a large part of estimating how much storage the results will need comes down to the number of entities. We can query the historical data to approximate this, but the estimate may be inaccurate if there isn't enough historical data or if the number of active entities fluctuates heavily.
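The entity-count lookup against historical data could use OpenSearch's cardinality aggregation, which returns an approximate distinct count. The helper below just builds the query body; the category field name passed in is whatever the detector is configured with:

```python
def entity_cardinality_query(category_field: str) -> dict:
    """Build an OpenSearch search body that approximates the number of
    distinct entities via the (probabilistic) cardinality aggregation."""
    return {
        "size": 0,  # no document hits needed, only the aggregation result
        "aggs": {
            "entity_count": {
                "cardinality": {"field": category_field}
            }
        },
    }
```

Because the cardinality aggregation is itself approximate, and past cardinality may not reflect future active entities, any storage estimate built on it should be presented as a rough figure.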
What alternatives have you considered?
If we believe this is a simple enough task, what we might be missing is just more documentation and examples showing how users can easily estimate the disk space they will need.