-
Notifications
You must be signed in to change notification settings - Fork 4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support scaling up node groups to the configured min size if needed #5195
Conversation
This PR is verified in an AKS cluster with the following steps. At the beginning, the node group had 3 nodes.
Then cluster-autoscaler was deployed to the cluster with the follow config. We can see the min size 5 in the command is greater than the node group size at that moment.
As expected, the node group was scaling up to 5 nodes in the first main loop.
After few minutes, the node group became 5 nodes, which means the cluster-autoscaler was able to scale up the node group to meet the min size requirement.
The next experiment is to verify if the node group can be scaled up again if a node is deleted manually.
Checking the log right after the instance deletion, we saw the node group is being scaled up from 4 to 5. And after few minutes, the new node appeared in the node list.
In addition, the cluster-autoscaler pod worked well, and no restarts happened.
|
Thanks @liuxintong for the PR! ❤️ I have one question, because I would not expect an additional flag to enable an expected behavior. |
Hi @mickare, I set the feature disabled as default just in case others have dependencies on the old "unexpected" behaviors. What you said also makes sense to me, I'm open for suggestions. |
@x13n, thanks for the review! I've addressed all comments in the latest iteration, please take another look. |
Fixed the go lint and the build. |
/auto-cc |
/cc feiskyer towca Jeffwan |
1ea3671
to
3ae0a0a
Compare
@x13n, thanks again for your code review. The latest iteration has resolved all comments. Could you please take another look? Please let me know if anybody else needs to be involved in this PR. |
3ae0a0a
to
81b8507
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall LGTM, just two minor comments.
Resolved git conflicts. |
/assign @x13n |
/lgtm |
/assign feiskyer towca |
3f66775
to
2d512f1
Compare
2d512f1
to
524886f
Compare
Rebased to resolve git conflicts with master. |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: feiskyer, liuxintong The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Thanks for contributing the feature and squashed the commits. |
Which component this PR applies to?
cluster-autoscaler
What type of PR is this?
/kind feature
What this PR does / why we need it:
The node group size can be smaller than the minimum size configured in cluster-autoscaler, but cluster-autoscaler doesn't support scaling up the cluster to the desired state. Here are 2 common scenarios that could cause this issue.
To support scenarios above, we need this feature to scale up node groups that have less nodes than the configured node group min size.
Which issue(s) this PR fixes:
Fixes #5162
Fixes #4942
Special notes for your reviewer:
This feature is disabled by default. It has been verified in unit tests. I'll also test it in a real cluster.
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: