Add tolerations for targetless agent. #3033

meowjesty
Member

@aviramha
Member

Wait, the wording in the issue might be confusing, but we don't want to use default tolerations on the agent when running targetless.

@meowjesty
Member Author

What should happen then? I thought it was about: if user sets tolerations in operator.agent-config, use those, otherwise ... do?

@aviramha
Member

aviramha commented Jan 23, 2025

> What should happen then? I thought it was about: if user sets tolerations in operator.agent-config, use those, otherwise ... do?

Otherwise don't set tolerations. (only in targetless)

@meowjesty
Member Author

Alright!

@meowjesty
Member Author

Removed the default, now it'll just pass the Option.
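
A minimal sketch of the behaviour agreed on above, not the actual change in this PR. The `Toleration` struct, `default_tolerations`, and `agent_tolerations` are hypothetical names used only for illustration, and the "tolerate everything" default for targeted agents is an assumption:

```rust
/// Hypothetical stand-in for a Kubernetes toleration (not the real mirrord or k8s type).
#[derive(Debug, Clone, PartialEq)]
pub struct Toleration {
    pub key: Option<String>,
    pub operator: String,
    pub value: Option<String>,
    pub effect: Option<String>,
}

/// Assumed default for targeted agents: a single toleration that matches every taint.
pub fn default_tolerations() -> Vec<Toleration> {
    vec![Toleration {
        key: None,
        operator: "Exists".to_string(),
        value: None,
        effect: None,
    }]
}

/// Chooses the tolerations to put on the agent pod.
pub fn agent_tolerations(
    configured: Option<Vec<Toleration>>,
    targetless: bool,
) -> Option<Vec<Toleration>> {
    if targetless {
        // Pass the Option through untouched: `None` means "set no tolerations".
        configured
    } else {
        // Assumption: targeted agents still fall back to a default.
        Some(configured.unwrap_or_else(default_tolerations))
    }
}
```

With this shape, a targetless agent only gets tolerations when the user configured them, so the scheduler is free to avoid tainted nodes.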

mirrord/kube/src/api/container/pod.rs (outdated, resolved)
@aviramha aviramha self-requested a review January 23, 2025 19:48
@aviramha aviramha dismissed their stale review January 23, 2025 19:48

idk really

Contributor

@Razz4780 Razz4780 left a comment

Not sure why we even have tolerations 👀 :

> The default Kubernetes scheduler takes taints and tolerations into account when selecting a node to run a particular Pod. However, if you manually specify the .spec.nodeName for a Pod, that action bypasses the scheduler

We either:

  1. (targetless) Don't care about the node at all
  2. (targeted) Require a specific node and we use .spec.nodeName

mirrord/config/src/agent.rs (resolved)
@aviramha
Member

aviramha commented Jan 24, 2025

I believe we saw agent pods failing to start because they lacked the tolerations.
I think the scheduler might not care, but the pod will still not start if the taint isn't tolerated.
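
A rough model of why this can happen even when `.spec.nodeName` bypasses the scheduler: the Kubernetes taints and tolerations documentation notes that a pod bound to a node carrying a `NoExecute` taint is ejected unless it has a matching toleration, while `NoSchedule` taints only affect scheduling. The types and function names below are hypothetical and only approximate the matching rules:

```rust
/// Hypothetical stand-ins for the Kubernetes taint and toleration objects.
#[derive(Debug, Clone, PartialEq)]
pub struct Taint {
    pub key: String,
    pub value: String,
    pub effect: String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Toleration {
    pub key: Option<String>,
    pub operator: String, // "Exists" or "Equal"
    pub value: Option<String>,
    pub effect: Option<String>,
}

/// Approximates the Kubernetes rule for whether a toleration matches a taint.
pub fn tolerates(tol: &Toleration, taint: &Taint) -> bool {
    // An unset effect matches every effect.
    let effect_ok = tol.effect.as_deref().map_or(true, |e| e == taint.effect);
    // An unset key matches every key, but only together with `Exists`.
    let key_ok = match tol.key.as_deref() {
        None => tol.operator == "Exists",
        Some(k) => k == taint.key,
    };
    let value_ok = match tol.operator.as_str() {
        "Exists" => true,
        "Equal" => tol.value.as_deref().unwrap_or("") == taint.value,
        _ => false,
    };
    effect_ok && key_ok && value_ok
}

/// True if a pod pinned via `.spec.nodeName` would be ejected from the node:
/// some `NoExecute` taint is not covered by any toleration.
pub fn evicted_when_pinned(taints: &[Taint], tolerations: &[Toleration]) -> bool {
    taints
        .iter()
        .filter(|taint| taint.effect == "NoExecute")
        .any(|taint| !tolerations.iter().any(|tol| tolerates(tol, taint)))
}
```

Under this model, a targeted agent without tolerations lands on the node but does not stay there if the node has a `NoExecute` taint, which would match the failures described above.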

Contributor

@Razz4780 Razz4780 left a comment

Looks good ^^

@aviramha aviramha mentioned this pull request Jan 27, 2025
@Razz4780 Razz4780 enabled auto-merge January 27, 2025 12:09
@Razz4780 Razz4780 added this pull request to the merge queue Jan 27, 2025
Merged via the queue into metalbear-co:main with commit c416f79 Jan 27, 2025
17 checks passed