Do not assume systemd-resolved for resolv.conf #11813
base: master
Thanks @VannTen
If systemd-resolved is not enabled, use
I think that should solve your situation.
stub-resolv.conf uses the systemd-resolved stub resolver in
Pods don't have access to the host's localhost network, so stub-resolv.conf wouldn't work.
@@ -6,7 +6,7 @@ kubelet_address: "{{ ip | default(fallback_ip) }}{{ (',' + ip6) if enable_dual_s
 kubelet_bind_address: "{{ ip | default('0.0.0.0') }}"
 
 # resolv.conf to base dns config
-kube_resolv_conf: "/etc/resolv.conf"
+kube_resolv_conf: "{{ '/run/systemd/resolve/resolv.conf' if 'systemd-resolved' in active_dns_services else '/etc/resolv.conf' }}"
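For readers who do not parse Jinja at a glance, the logic of the new default can be sketched in plain Python. This is only an illustration of the conditional in the diff, not code from the PR: `active_dns_services` is the variable named in the diff and is assumed here to be a list of service names, while the function name `default_kube_resolv_conf` is made up for this sketch.

```python
def default_kube_resolv_conf(active_dns_services):
    """Mirror the Jinja conditional from the diff in plain Python.

    `active_dns_services` is assumed to be a list of service names.
    """
    if "systemd-resolved" in active_dns_services:
        # systemd-resolved is active: use the file listing the real upstream servers
        return "/run/systemd/resolve/resolv.conf"
    # Any other setup: fall back to the traditional location
    return "/etc/resolv.conf"
```

With this default, a host running another DNS service (or none) gets `/etc/resolv.conf` without any per-distribution special case.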
I know that some resolv.conf modes (static, if I remember correctly) check /etc/resolv.conf for files.
Using systemd-resolved enablement as the check doesn't seem accurate.
kube_resolv_conf is by default still /etc/resolv.conf.
If systemd-resolved is enabled and /etc/resolv.conf is a soft link, change kube_resolv_conf to /run/systemd/resolve/resolv.conf.
What do you think?
/etc/resolv.conf is a soft link to /run/systemd/resolve/stub-resolv.conf, not /run/systemd/resolve/resolv.conf.
Given the advice in https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/#known-issues, changing the config to /run/systemd/resolve/resolv.conf is OK.
Actually, what I'm trying to say is that systemd-resolved enablement is not the only thing to check: if systemd-resolved is enabled but /etc/resolv.conf is a regular file (not a soft link), the value should probably be /etc/resolv.conf instead of /run/systemd/resolve/resolv.conf.
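The stricter rule proposed in this thread (switch only when systemd-resolved is enabled and /etc/resolv.conf is a soft link) could be sketched as follows. This is a hypothetical illustration, not what the PR implements: the function name `pick_resolv_conf` and its parameters are assumptions, and the symlink test is injectable so the rule can be tried out without touching the real file.

```python
import os


def pick_resolv_conf(resolved_enabled, path="/etc/resolv.conf", islink=os.path.islink):
    """Return the resolv.conf path to hand to the kubelet under the reviewer's rule.

    resolved_enabled: whether systemd-resolved is enabled on the host.
    path:             location of the traditional resolv.conf.
    islink:           symlink predicate, injectable for experimentation.
    """
    if resolved_enabled and islink(path):
        # Managed by systemd-resolved: use the file with the upstream servers
        return "/run/systemd/resolve/resolv.conf"
    # Regular file, or systemd-resolved not enabled: keep the traditional file
    return path
```

Note that this rule depends on two independent inputs, so combinations such as "symlink present but service disabled" still need a decision.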
Actually, if systemd-resolved is enabled/running but /etc/resolv.conf is not a symlink to any of /run/systemd/resolve/stub-resolv.conf, /run/systemd/resolve/resolv.conf, or /usr/lib/systemd/resolv.conf, systemd-resolved will use it as its source of DNS configuration. See this excerpt from systemd-resolved(8):
/ETC/RESOLV.CONF
Four modes of handling /etc/resolv.conf (see resolv.conf(5)) are supported:
• systemd-resolved maintains the /run/systemd/resolve/stub-resolv.conf file for compatibility with traditional Linux programs. This file lists the
127.0.0.53 DNS stub (see above) as the only DNS server. It also contains a list of search domains that are in use by systemd-resolved. The list of
search domains is always kept up-to-date. Note that /run/systemd/resolve/stub-resolv.conf should not be used directly by applications, but only
through a symlink from /etc/resolv.conf. This file may be symlinked from /etc/resolv.conf in order to connect all local clients that bypass local
DNS APIs to systemd-resolved with correct search domains settings. This mode of operation is recommended.
• A static file /usr/lib/systemd/resolv.conf is provided that lists the 127.0.0.53 DNS stub (see above) as only DNS server. This file may be symlinked
from /etc/resolv.conf in order to connect all local clients that bypass local DNS APIs to systemd-resolved. This file does not contain any search
domains.
• systemd-resolved maintains the /run/systemd/resolve/resolv.conf file for compatibility with traditional Linux programs. This file may be symlinked
from /etc/resolv.conf and is always kept up-to-date, containing information about all known DNS servers. Note the file format's limitations: it does
not know a concept of per-interface DNS servers and hence only contains system-wide DNS server definitions. Note that
/run/systemd/resolve/resolv.conf should not be used directly by applications, but only through a symlink from /etc/resolv.conf. If this mode of
operation is used local clients that bypass any local DNS API will also bypass systemd-resolved and will talk directly to the known DNS servers.
• Alternatively, /etc/resolv.conf may be managed by other packages, in which case systemd-resolved will read it for DNS configuration data. In this
mode of operation systemd-resolved is consumer rather than provider of this configuration file.
Note that the selected mode of operation for this file is detected fully automatically, depending on whether /etc/resolv.conf is a symlink to
/run/systemd/resolve/resolv.conf or lists 127.0.0.53 as DNS server.
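To make the four modes in the excerpt concrete, here is a small sketch that classifies /etc/resolv.conf by its symlink target. It is a simplification under stated assumptions: the mode labels ("stub", "static", "uplink", "foreign") and the function names are chosen for this example, and the content-based check mentioned in the excerpt (whether the file lists 127.0.0.53) is deliberately left out, so this is not a reimplementation of systemd-resolved's own detection.

```python
import os

STUB = "/run/systemd/resolve/stub-resolv.conf"
STATIC = "/usr/lib/systemd/resolv.conf"
UPLINK = "/run/systemd/resolve/resolv.conf"


def classify_target(link_target):
    """Map a symlink target (or None for a regular file) to a mode label."""
    modes = {STUB: "stub", STATIC: "static", UPLINK: "uplink"}
    # Anything else, including a regular file, is managed by another package
    return modes.get(link_target, "foreign")


def classify_resolv_conf(path="/etc/resolv.conf"):
    """Classify the file at `path` on the local machine."""
    target = os.path.realpath(path) if os.path.islink(path) else None
    return classify_target(target)
```

Only the "uplink" file lists the real upstream servers; the "stub" and "static" files list 127.0.0.53.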
So I think relying on systemd-resolved alone might actually work 🤔
I was initially going to check for a symlink, but I'm not sure we should, because we end up with several booleans instead of one, which leads to ambiguity in certain cases:
- /etc/resolv.conf is a symlink to one of the managed files, but systemd-resolved is not started/enabled. Or the reverse.
- for that matter, systemd-resolved is started, but not enabled, or the reverse.
WDYT?
We encountered this issue recently on Ubuntu. Initially, though, the cluster ran smoothly and CoreDNS could forward to local
This will only happen if there is no external DNS in
This issue is quite subtle and is likely to occur when CoreDNS restarts, posing a significant threat to many users' production environments.
Hi @cyclinder
Regarding #11813 (comment): it looks like either a lack of upstream DNS, or /etc/resolv.conf was changed directly while systemd-resolved was running. FYI, when So maybe we could add a check that at least one configured nameserver is fixed when these conditions are met:
or
The configured nameserver may be set via resolved.conf(5) or the network interface configuration (netplan on Ubuntu); we could easily use We once accepted PR #9502, which does something similar to what I mentioned above, but with a narrower scope. Taking
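One possible reading of the check suggested above is "the resolv.conf handed to the kubelet lists at least one nameserver that is not a loopback address". The sketch below illustrates that reading only; the comment does not spell out the exact conditions, so the function name `upstream_nameservers` and the loopback criterion are assumptions of this example.

```python
import ipaddress


def upstream_nameservers(resolv_conf_text):
    """Return the non-loopback nameserver entries found in resolv.conf text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        # Only lines of the form "nameserver <address>" are relevant
        if len(fields) < 2 or fields[0] != "nameserver":
            continue
        # Drop an IPv6 zone suffix such as "%eth0" before parsing
        address = fields[1].split("%", 1)[0]
        try:
            ip = ipaddress.ip_address(address)
        except ValueError:
            continue  # skip malformed entries
        if not ip.is_loopback:
            servers.append(fields[1])
    return servers
```

An empty result would mean pods have no usable upstream resolver, which matches the "lack of upstream DNS" situation described in this comment.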
What type of PR is this?
/kind bug
What this PR does / why we need it:
On some distributions we currently assume that systemd-resolved is in use,
and therefore that we can pass /run/systemd/resolve/resolv.conf to the
kubelet configuration.
This breaks if the distribution is configured differently (for example, it
uses another DNS service) and forces us to add special cases.
Instead, dynamically detect whether systemd-resolved is running and set
the kube_resolv_conf default accordingly.
Which issue(s) this PR fixes:
Fixes #11810
Special notes for your reviewer:
I'm just wondering if this could cause breakage when systemd-resolved is running but /etc/resolv.conf does not point to its managed files and has other settings 🤔
I'm not sure what we should do in that case (but I don't think hardcoding is the answer).
Does this PR introduce a user-facing change?: