CASMTRIAGE-7308: Add Troubleshooting Steps #5430

Merged
merged 1 commit into from
Oct 10, 2024

Conversation

arka-pramanik-hpe
Contributor

@arka-pramanik-hpe arka-pramanik-hpe commented Oct 8, 2024

Istio Pods Crashing post upgrade due to fs.inotify Limits

Description

CASMTRIAGE-7308
After the Istio upgrade, the nodes have not yet been rebooted into the new image, in which the fs.inotify.max_user_instances and fs.inotify.max_user_watches limits are increased. As a result, restarted pods may try to watch more files or create more inotify instances than the system allows. The pods can then crash or fail because they cannot watch the files or directories they need, which may be critical for service mesh operations such as traffic management, logging, or configuration changes.
A power outage or node reboot mid-upgrade would hit the same problem, because the pods would restart before the required kernel parameters have been updated on the nodes.
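
To confirm whether a node is still running with the old limits, the current values can be read from `/proc/sys/fs/inotify`. The following is a minimal sketch, not part of this PR: the function names and the `ASSUMED_MINIMUMS` thresholds are hypothetical, since the description does not state the values the new image applies.

```python
from pathlib import Path

# Hypothetical target values, for illustration only; the PR description does
# not state the limits that the new node image applies.
ASSUMED_MINIMUMS = {
    "max_user_instances": 1024,
    "max_user_watches": 1048576,
}


def read_inotify_limits(proc_dir="/proc/sys/fs/inotify"):
    """Return the current fs.inotify limits as a dict of name -> int."""
    limits = {}
    for name in ASSUMED_MINIMUMS:
        # Each limit is exposed by the kernel as a one-line text file.
        limits[name] = int((Path(proc_dir) / name).read_text().strip())
    return limits


def limits_below(limits, minimums=ASSUMED_MINIMUMS):
    """Return the sorted names of limits that are lower than the minimums."""
    return sorted(name for name, floor in minimums.items() if limits[name] < floor)


if __name__ == "__main__":
    current = read_inotify_limits()
    for name, value in current.items():
        print(f"fs.inotify.{name} = {value}")
    low = limits_below(current)
    if low:
        print("Below the assumed minimums:", ", ".join(low))
```

Run on the affected node, this prints both limits and flags any that fall below the assumed thresholds; replace the thresholds with the values from the new image before relying on the result.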

Relates to:

Checklist

  • If I added any command snippets, the steps they belong to follow the prompt conventions (see example).
  • If I added a new directory, I also updated .github/CODEOWNERS with the corresponding team in Cray-HPE.
  • My commits or Pull-Request Title contain my JIRA information, or I do not have a JIRA.

@arka-pramanik-hpe arka-pramanik-hpe force-pushed the CASMTRIAGE-7308 branch 3 times, most recently from 0eb551d to 33a95aa Compare October 8, 2024 15:55
@arka-pramanik-hpe arka-pramanik-hpe changed the title Add Troubleshooting Steps CASMTRIAGE-7308: Add Troubleshooting Steps Oct 8, 2024
@arka-pramanik-hpe arka-pramanik-hpe self-assigned this Oct 8, 2024
@arka-pramanik-hpe arka-pramanik-hpe force-pushed the CASMTRIAGE-7308 branch 4 times, most recently from e507f4a to 82e7101 Compare October 9, 2024 06:39
* Istio Pods Crashing post upgrade due to `fs.inotify` Limits
@arka-pramanik-hpe arka-pramanik-hpe merged commit 4f5784e into release/1.6 Oct 10, 2024
8 checks passed
@arka-pramanik-hpe arka-pramanik-hpe deleted the CASMTRIAGE-7308 branch October 10, 2024 15:42
Labels
None yet
Projects
None yet

3 participants