Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

update containerd to v2.0.2 #3828

Merged
merged 1 commit into from
Jan 17, 2025

Conversation

AkihiroSuda
Copy link
Member

@AkihiroSuda AkihiroSuda commented Dec 20, 2024

Fix #3768

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Dec 20, 2024
@k8s-ci-robot k8s-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Dec 20, 2024
@AkihiroSuda
Copy link
Member Author

/retest

@stmcginnis
Copy link
Contributor

I plan on doing some more local testing, but this looks good to me and CI is happy.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 20, 2024
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 20, 2024
}
var snapshotter string
switch configVersion {
case 2: // Introduced in containerd v1.3. Still supported in containerd v2.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we probably need to allow user supplied config patches to similarly target 2 vs 3 separately, though that doesn't need to be done in this PR

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we depend on the version of the config?

if len(ctx.Config.ContainerdConfigPatches) > 0 || len(ctx.Config.ContainerdConfigPatchesJSON6902) > 0 {
fns := make([]func() error, len(kubeNodes))
for i, node := range kubeNodes {
node := node // capture loop variable
fns[i] = func() error {
// read and patch the config
const containerdConfigPath = "/etc/containerd/config.toml"
var buff bytes.Buffer
if err := node.Command("cat", containerdConfigPath).SetStdout(&buff).Run(); err != nil {
return errors.Wrap(err, "failed to read containerd config from node")
}
patched, err := patch.TOML(buff.String(), ctx.Config.ContainerdConfigPatches, ctx.Config.ContainerdConfigPatchesJSON6902)
if err != nil {
return errors.Wrap(err, "failed to patch containerd config")
}
if err := nodeutils.WriteFile(node, containerdConfigPath, patched); err != nil {
return errors.Wrap(err, "failed to write patched containerd config")
}
// restart containerd now that we've re-configured it
// skip if containerd is not running
if err := node.Command("bash", "-c", `! pgrep --exact containerd || systemctl restart containerd`).Run(); err != nil {
return errors.Wrap(err, "failed to restart containerd after patching config")
}
return nil
}
}
if err := errors.UntilErrorConcurrent(fns); err != nil {
return err

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we depend on the version of the config?

user's patches will not work as intended, they will wind up merging in keys that are not valid for v3 because all the keys have new namespacing again.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(and if they simply switch to v3 keys, it will not work with older kind images)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, so you refer to the "existing" user patches, I think is a matter of how you see it then ... for me those are containerd specific patches, and yes, users need to adapt them to containerd 2.0 unless there is some containerd feature that allows to migrate between formats

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for me those are containerd specific patches, and yes, users need to adapt them to containerd 2.0 unless there is some containerd feature that allows to migrate between formats

But then their config won't work for older kind releases/images.

The way we do this with kubeadm config patches is you can have multiple patches that are targeting different kubeadm API versions. So far that's been irrelevant because for nearly all of our history we've only had containerd 2.0 config.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But actually, that is for a future PR, because in this PR we're still using a v2.0 config (images/base/files/etc/containerd/config.toml), this part of the diff seems to be technically unecessary for now.

Copy link
Contributor

@aojea aojea Jan 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, I got it now, so we keep using containerd 2.0 version for the config and let containerd deal with the translation internall to version 3 https://github.com/containerd/containerd/blob/main/docs/cri/config.md#config-versions

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, before we switch our config to v3 we should give users a way to deal with it smoothly, but we're not actually switching to v3 in this PR, even though this PR adds partial support for v3.

@aojea
Copy link
Contributor

aojea commented Jan 7, 2025

/lgtm

I'm positively surprised the diff is so small ... IIRC we already had issues with ctr and storage formats in the past

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 7, 2025
@BenTheElder
Copy link
Member

I'm positively surprised the diff is so small ... IIRC we already had issues with ctr and storage formats in the past

I don't think this addresses all of the issues, and since CRI only has v1 and not v1alpha2 in containerd v2, we break kubernetes < 1.26 (which is probably OK, but worth noting, we have not actively broken an old k8s version for some time now).

@AkihiroSuda
Copy link
Member Author

since CRI only has v1 and not v1alpha2 in containerd v2, we break kubernetes < 1.26

CRI v1 seems introduced in Kubernetes 1.20: kubernetes/kubernetes@9fcede9

@aojea
Copy link
Contributor

aojea commented Jan 7, 2025

and since CRI only has v1 and not v1alpha2 in containerd v2, we break kubernetes < 1.26 (which is probably OK, but worth noting, we have not actively broken an old k8s version for some time now).

v1 was added in 1.20 kubernetes/kubernetes#96387
it was made default in 1.23 kubernetes/kubernetes#106501
In 1.26 v1alpha2 was dropped kubernetes/kubernetes#110618, but we should support until 1.20 IIUIC

@BenTheElder
Copy link
Member

CRI v1 seems introduced in Kubernetes 1.20: kubernetes/kubernetes@9fcede9

ACK, definitely no problem with that.

We will still need to make sure to note this for the release, and we should consider if this is sufficient excuse to drop other < 1.20 bits from the project. We've been very lenient about dropping support outright so far.

And re: other parts: #3828 (comment)

@BenTheElder
Copy link
Member

BenTheElder commented Jan 7, 2025

At some point <1.19 will be widely unusable because of cgroups v2 anyhow.

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jan 8, 2025
@AkihiroSuda AkihiroSuda marked this pull request as draft January 10, 2025 00:31
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 10, 2025
Fix issue 3768

Signed-off-by: Akihiro Suda <[email protected]>
@AkihiroSuda AkihiroSuda changed the title update containerd to v2.0.1 update containerd to v2.0.2 Jan 14, 2025
@AkihiroSuda AkihiroSuda marked this pull request as ready for review January 14, 2025 06:16
@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jan 14, 2025
@AkihiroSuda
Copy link
Member Author

/retest

@aojea
Copy link
Contributor

aojea commented Jan 14, 2025

/lgtm
/assign @BenTheElder

Thanks

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 14, 2025
@BenTheElder
Copy link
Member

/lgtm
/approve
/hold cancel

Note: we have automated base image builds after merge, so most of the CI here is using the current base image still. After this merges we will send another PR to pickup the image, it's possible we'll find issues and revert. This is fine, just pointing it out in case.

Sometimes we preemptively test by temporarily adding a commit using a locally built & pushed image, but only to test. We don't need to do that here, but we will get more signal when we adopt a built image.

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: AkihiroSuda, BenTheElder

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 17, 2025
@BenTheElder
Copy link
Member

Thanks for working on this and the upstream bug fixes, as always!

@k8s-ci-robot k8s-ci-robot merged commit f528b02 into kubernetes-sigs:main Jan 17, 2025
30 checks passed
@BenTheElder
Copy link
Member

filed #3848 after the build job completed so we can keep this moving along

@BenTheElder
Copy link
Member

BenTheElder commented Jan 17, 2025

I did find an issue #3848 (comment)

We can mitigate, but the underlying cause may be worth investigating.

Previously we did not wait for containerd to be ready because it was fully booted in ~0.3s even when cross-architecture, so we never had any issue, now that takes approximately 1s, so image pulls fail because the socket is not available yet.

Mitigating that would be pretty trivial (we can either retry image pulls which we should consider anyhow, watch for containerd to be ready before doing any pulls, or both), but that seems like a pretty big regression that will impact startup time at runtime as well (both initially, and anytime we need to restart). Not sure if that's something the containerd project would track or just accept.

cc @samuelkarp

@BenTheElder
Copy link
Member

(These times are v1.7.24 vs v2.0.2, arm64 builds on an amd64 GCE machine, more details in #3848 (comment))

@BenTheElder
Copy link
Member

Update: this is probably not worth discussing upstream, because 2.0.2 is still < 0.07s with amd64 + amd64 host. It is however consistently longer than 1.7.4. Something with arm64 qemu must be even more pathological.

@BenTheElder
Copy link
Member

Moving further discussion to #3848

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Containerd 2.0 support
5 participants