-
Notifications
You must be signed in to change notification settings - Fork 311
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore: Made Compute Engine Metadata service exceptions more informative #1637
base: main
Are you sure you want to change the base?
Conversation
Previously, the information about the original exceptions was omitted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
attempt, | ||
retry_count, | ||
response.status, | ||
raise exceptions.TransportError( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the retryable status code is expected in this case so I'm not sure a conversion to exception is the right thing and it makes the code a bit harder to understand.
(1) Can the customer set log level to an appropriate level to see these attempts? This is the best solution because it preserves the full history of retry attempts.
(2) If the code gets here, response
is set and you can just use that to raise the error below. This would also catch the non-retryable error case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please take another look at the code execution flow. It's likely that the code change is not changing the behavior the way you think it does. There are no new exceptions thrown to the user. And the logging still happens the same way as before. It's just that now in one of the cases instead of having duplicated logging code there the logging branch is triggered by an exception.
The initial solution was simple and did not change this area. However, @viacheslav-rostovtsev noticed an issue when the retriable error limit is exhausted. In that case the information about the service responses (retriable errors) is lost. A simplest way to capture that information in the final exception is to wrap it in TransportError and let it get assigned to last_exception
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we do need this change, then I agree with Wes's proposal in (2).
We could build the message "Response status: {}; Response:\n{}".format( response.status, response.data ),
and store in a string, use it for the warning message and also use it in the final exception. The string will have the last response status.
I do not understand why the execution flow needs to be changed to just log the last response.
else: | ||
break | ||
|
||
except exceptions.TransportError as e: | ||
last_exception = e |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good idea. Clear response
here if you're following my suggestions.
@@ -229,7 +231,7 @@ def get( | |||
raise exceptions.TransportError( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we should consult last_exception
and response
to construct the right error:
(1) Non-retryable error encountered: response.status
(2) Ran out of retries: last_exception
(3) Ran out of retries: response.status
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this was the first iteration of the idea. It seems to me that converting response to an exception works just as well and simplifies the logic since only the last exception needs to be tracked. The logging etc does not change. Do you want us to change this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the code flow is more confusing with exceptions, consistent with the Python style guide, so I'd like it changed. Happy to get a 2nd opinion from someone with more Python experience like @sai-sunder-s or @parthea
/cc @Ark-kun
@@ -202,22 +202,24 @@ def get( | |||
|
|||
backoff = ExponentialBackoff(total_attempts=retry_count) | |||
|
|||
last_exception = None | |||
for attempt in backoff: | |||
try: | |||
response = request(url=url, method="GET", headers=headers_to_use) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clear last_exception
here if you're following my suggestions.
) | ||
continue | ||
else: | ||
break |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clear last_exception
here if you follow my suggestions.
Previously, the information about the original exceptions was omitted.