Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix incorrect propagation of frozen state during unfreeze #6632

Open
wants to merge 1 commit into
base: rpi-6.6.y
Choose a base branch
from

Conversation

umarsync
Copy link

@umarsync umarsync commented Jan 26, 2025

In the cgroup_propagate_frozen function, during unfreezing (when frozen is false), the code subtracts desc (which increments for each unfrozen parent) from each ancestor's nr_frozen_descendants. This results in over-subtraction, leading to negative counts and incorrect frozen states.

Example Scenario:

  • Hierarchy: A -> B -> C
  • All cgroups (A, B, C) are frozen.
  • Unfreeze C:
  • Current Code: Subtracts desc=1 from B's count (correctly setting it to 0). Then subtracts desc=2 from A's count, reducing it from 2 to 0. If A has another frozen child D, this incorrectly sets A's count to 0 instead of 1 (since D is still frozen).

Explanation of the Fix

  • Unfreeze Case Adjustment: Change cgrp->freezer.nr_frozen_descendants -= desc; to cgrp->freezer.nr_frozen_descendants -= 1; to decrement each ancestor's count by exactly 1, not the accumulated desc.
  • Check for State Change: After decrementing, check if the ancestor's nr_frozen_descendants no longer matches nr_descendants. If so, unfreeze and propagate further.
  • Remove desc Increment: The desc variable isn't needed for unfreezing since each ancestor is handled individually.

Additional Considerations

  • Negative Count Prevention: Added WARN_ON_ONCE to catch underflows, though proper decrement by 1 should prevent negatives.
  • Propagation Continuation: The loop continues upwards until all affected ancestors are updated.

Test Case: Incorrect unfreezing of parent cgroup when one of multiple frozen descendants is unfrozen.

  1. Setup Hierarchy:
# Create cgroups
mkdir /sys/fs/cgroup/A
mkdir /sys/fs/cgroup/A/B
mkdir /sys/fs/cgroup/A/D
mkdir /sys/fs/cgroup/A/B/C
  1. Freeze Leaf Cgroups:
# Freeze C and D (A and B will auto-freeze via propagation)
echo 1 > /sys/fs/cgroup/A/B/C/cgroup.freeze
echo 1 > /sys/fs/cgroup/A/D/cgroup.freeze
  1. Verify All Cgroups Are Frozen:
cat /sys/fs/cgroup/A/cgroup.freeze # Should show 1 (frozen)
cat /sys/fs/cgroup/A/B/cgroup.freeze # Should show 1 (frozen)
cat /sys/fs/cgroup/A/B/C/cgroup.freeze # Should show 1 (frozen)
cat /sys/fs/cgroup/A/D/cgroup.freeze # Should show 1 (frozen)
  1. Unfreeze Cgroup C:
echo 0 > /sys/fs/cgroup/A/B/C/cgroup.freeze
  1. Observe Incorrect Behavior:
cat /sys/fs/cgroup/A/cgroup.freeze # ❌ Shows 0 (unfrozen) **BUG**
cat /sys/fs/cgroup/A/D/cgroup.freeze # Still shows 1 (frozen)

Expected Behavior:

  • A should remain frozen because its descendant D is still frozen.
  • B should unfreeze (correctly) because its only descendant C was unfrozen.

Actual Behavior (with Bug):

  • A incorrectly unfreezes due to over-decrement in nr_frozen_descendants.

The existing freezer subsystem incorrectly propagates the frozen state
when unfreezing a cgroup with multiple descendants. The critical bug
was in cgroup_propagate_frozen() where ancestor cgroups' nr_frozen_descendants
were being decremented by an increasing counter ('desc') instead of 1,
leading to:
1. Premature unfreezing of parent cgroups
2. Negative nr_frozen_descendants counts
3. Incorrect frozen state reporting
@pelwell
Copy link
Contributor

pelwell commented Jan 26, 2025

That seems like a good catch. However, that file is not one that we've changed in our version of the kernel, so the patch should be sent to the Linux kernel maintainers. Given the clarity of your explanation, I'm sure @htejun would be he happy to guide you through to its acceptance.

@umarsync
Copy link
Author

That seems like a good catch. However, that file is not one that we've changed in our version of the kernel, so the patch should be sent to the Linux kernel maintainers. Given the clarity of your explanation, I'm sure @htejun would be he happy to guide you through to its acceptance.

I appreciate the guidance, I’ll submit this patch upstream to the Linux kernel maintainers as suggested. I’ll also reach out to @htejun for feedback and to ensure alignment with the cgroup subsystem’s workflow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants