CSM O11y: improve the retry settings of list_time_series API call #30

Merged 1 commit on Feb 2, 2024
24 changes: 24 additions & 0 deletions tests/gamma/csm_observability_test.py
@@ -19,6 +19,8 @@

from absl import flags
from absl.testing import absltest
from google.api_core import exceptions as gapi_errors
from google.api_core import retry as gapi_retries
from google.cloud import monitoring_v3
import yaml

@@ -398,6 +400,27 @@ def query_metrics(
A helper function to make the cloud monitoring API call to query
metrics created by this test run.
"""

# Based on the default retry settings for the list_time_series method:
# https://github.com/googleapis/google-cloud-python/blob/google-cloud-monitoring-v2.18.0/packages/google-cloud-monitoring/google/cloud/monitoring_v3/services/metric_service/transports/base.py#L210-L218
# Modified: the predicate is extended to retry on a wider range of errors.
retry_settings = gapi_retries.Retry(
    initial=0.1,
    maximum=30.0,
    multiplier=1.3,
    predicate=gapi_retries.if_exception_type(
        # Retry on all 5xx errors, not just 503 ServiceUnavailable. This
        # also covers the gRPC Unknown, DataLoss, and DeadlineExceeded
        # statuses. 501 MethodNotImplemented is not excluded: the most
        # likely reason we'd see it is a server misconfiguration, so we
        # want to give the server a chance to recover from that too.
        gapi_errors.ServerError,
        # Retry on 429/ResourceExhausted: recoverable rate limiting.
        gapi_errors.TooManyRequests,
    ),
    deadline=90.0,
)

results = {}
for metric in metric_names:
    logger.info("Requesting list_time_series for metric %s", metric)
@@ -406,6 +429,7 @@ def query_metrics(
        filter=build_query_fn(metric),
        interval=interval,
        view=monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
        retry=retry_settings,
    )
    time_series = list(response)
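
For a rough sense of what the `initial`, `multiplier`, `maximum`, and `deadline` values above imply, the sketch below computes an idealized backoff schedule using only the standard library. `backoff_delays` is a hypothetical helper written for this illustration and is not part of `google.api_core`. It assumes no jitter and ignores the time spent in the requests themselves, whereas the real `Retry` randomizes each sleep and measures the deadline against wall-clock time, so actual retry counts will differ.

```python
def backoff_delays(initial, maximum, multiplier, deadline):
    """Yield idealized (un-jittered) sleep intervals for exponential backoff.

    Hypothetical helper for illustration only: stops once the next sleep
    would push the cumulative sleep time past the deadline.
    """
    delay = initial
    elapsed = 0.0
    while elapsed + delay <= deadline:
        yield delay
        elapsed += delay
        # Grow the delay geometrically, capped at the configured maximum.
        delay = min(delay * multiplier, maximum)


# Same numeric values as retry_settings in the diff above.
delays = list(backoff_delays(initial=0.1, maximum=30.0, multiplier=1.3, deadline=90.0))

print(f"sleeps before giving up: {len(delays)}")
print(f"longest single sleep:    {max(delays):.2f}s")
print(f"total time sleeping:     {sum(delays):.2f}s")
```

Under these simplifying assumptions, the small 1.3 multiplier means the 30-second cap on a single sleep is not reached within the 90-second deadline, so the deadline rather than `maximum` is what bounds the retry loop.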
