feat: Add core watcher module #397
base: main
Conversation
jneo8 commented on Jan 9, 2025
- Add core watcher module
- Refactor guests_on_hypervisor
Force-pushed from 1aadaf1 to 2547e84
Some minor suggestions
Force-pushed from 9cbea7d to aeb87ba
- Add core watcher module
- Refactor guests_on_hypervisor
- fix/Read watcher configuration from clusterdb to build watcher client
- doc/Update guests_on_hypervisor func docstring
- doc/Add comment for TIMEOUT and TIMEOUT_INTERVAL variables
Force-pushed from aeb87ba to 2ed5ac4
LOG.info(f"All Action plan for Audit {audit.uuid} execution successfully")

@timeout_decorator.timeout(TIMEOUT)
Each action plan corresponds to multiple actions, and each action can be a migration. If there are many VMs to be migrated, the action plan may take longer than TIMEOUT, i.e., 3 minutes.
The expected behavior is as follows (a sketch of the re-run check follows this list):
- If the migration times out, the command should fail.
- When the user re-runs the command, it should fail if any VMs are still migrating, displaying a message indicating that some instances are in the migration process.
- Once all migration watcher actions are complete, re-running the command should result in success.
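A minimal sketch of that re-run check, assuming Watcher's documented action states (PENDING, ONGOING, SUCCEEDED, FAILED); the helper name `check_rerun_allowed` and the `MaintenanceError` type are hypothetical, not part of this PR:

```python
class MaintenanceError(Exception):
    """Hypothetical error type for a failed maintenance run."""


def check_rerun_allowed(client, audit) -> None:
    """Fail fast if a previous run left migrations in flight.

    `client` is a watcherclient v1 Client and `audit` a watcher Audit,
    matching the aliases used elsewhere in this PR.
    """
    actions = client.action.list(audit=audit.uuid, detail=True, limit=0)
    ongoing = [a for a in actions if a.state in ("PENDING", "ONGOING")]
    if ongoing:
        raise MaintenanceError(
            f"{len(ongoing)} instance(s) are still in the migration "
            "process; re-run the command once they complete."
        )
```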
…her action
- Refactor: Replace timeout decorator by tenacity
- Fix: Raise exception on failed watcher action
Minor comments inline.
A couple of things to improve upon as separate PRs:
- Show progress of maintenance actions in the CLI (similar to how the control plane deployment shows progress during the bootstrap command)
- Return meaningful output at the end of maintenance (see the sketch below)
For example, if maintenance fails or times out, list the actions that completed, the actions in an error state (with the reason), and the actions that are still pending.
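A possible shape for that end-of-maintenance report, sketched on the assumption that each action exposes `state`, `action_type`, and `uuid` attributes (names taken from Watcher's data model; `summarize_actions` itself is hypothetical):

```python
from collections import defaultdict


def summarize_actions(actions) -> str:
    """Group watcher actions by state and render a short report."""
    by_state: dict[str, list] = defaultdict(list)
    for action in actions:
        by_state[action.state].append(action)

    lines = []
    # Completed first, then errored, then anything still outstanding.
    for state in ("SUCCEEDED", "FAILED", "PENDING", "ONGOING"):
        for action in by_state.get(state, []):
            lines.append(f"{state:>9}  {action.action_type}  {action.uuid}")
    return "\n".join(lines)
```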
@@ -98,3 +98,7 @@ min-file-size = 1

[tool.ruff.lint.mccabe]
max-complexity = 15

[[tool.mypy.overrides]]
module = ["watcherclient.*", "timeout_decorator"]
timeout_decorator is not needed here
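The override would then presumably shrink to just the watcher client stubs; a sketch of the resulting pyproject.toml entry (the `ignore_missing_imports` flag is an assumption about why the override exists):

```toml
[[tool.mypy.overrides]]
module = ["watcherclient.*"]
ignore_missing_imports = true  # assumed; whatever the PR sets today
```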
@tenacity.retry(
    reraise=True,
    stop=tenacity.stop_after_delay(WAIT_TIMEOUT),
As discussed, WAIT_TIMEOUT should be adjusted based on the number of actions to be performed. Otherwise the user might end up running the `sunbeam maintenance` command multiple times if there are more instances to migrate.
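Since a decorator's stop condition is fixed at import time, one way to scale the deadline with the plan size is to build the retry controller at call time; a sketch using tenacity's `Retrying` object (the per-action budget, poll interval, and `wait_for_audit` helper are assumptions, not the PR's actual values):

```python
import tenacity

# Assumed per-action time budget and poll interval; not the PR's values.
PER_ACTION_TIMEOUT = 180  # seconds allowed per watcher action
POLL_INTERVAL = 10  # seconds between status checks


def wait_for_audit(client, audit) -> None:
    """Poll until every action in the audit's plan finishes.

    The overall deadline grows with the number of actions, so large
    migration plans do not hit a fixed WAIT_TIMEOUT.
    """
    actions = client.action.list(audit=audit.uuid, detail=True, limit=0)
    deadline = PER_ACTION_TIMEOUT * max(1, len(actions))

    for attempt in tenacity.Retrying(
        reraise=True,
        stop=tenacity.stop_after_delay(deadline),
        wait=tenacity.wait_fixed(POLL_INTERVAL),
    ):
        with attempt:
            actions = client.action.list(
                audit=audit.uuid, detail=True, limit=0
            )
            if any(a.state in ("PENDING", "ONGOING") for a in actions):
                raise TimeoutError("watcher actions still in progress")
```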
    client: watcher_client.Client, audit: watcher.Audit
) -> list[watcher.Action]:
    """Get list of actions by audit."""
    return client.action.list(audit=audit.uuid, detail=True)
Add `limit=0` to get all the actions [1]
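Presumably the return statement then becomes (keeping the signature shown above):

```python
return client.action.list(audit=audit.uuid, detail=True, limit=0)
```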