Nginx may be missing a restart on worker refresh? #94

Comments

@PietroPasotti
Contributor

Bug Description

When you deploy cos-pro and refresh the worker, tempo's coordinator fails to route to the /ready endpoint.

An nginx restart seems to fix it, which is odd because the FQDNs haven't changed.

To Reproduce

deploy cos-pro
refresh tempo-worker

Environment

n/a

Relevant log output

n/a

Additional context

No response

@PietroPasotti changed the title from "Nginx may be missing a restart" to "Nginx may be missing a restart on worker refresh?" on Nov 7, 2024
@PietroPasotti
Contributor Author

probably an instance of https://tenzer.dk/nginx-with-dynamic-upstreams/

@PietroPasotti
Contributor Author

PietroPasotti commented Nov 11, 2024

looks like adding a resolver: ns.dns.cluster.local statement to each route fixes it (updates the IP within 5 seconds).
TBD: can we assume this dns address is stable?

if not, we could use this approach:

dig NS <my fqdn>

(we'd need to add dig to our rocks/images or apt install it)

@PietroPasotti
Contributor Author

or:

# cat /etc/resolv.conf                                           
search foo.svc.cluster.local svc.cluster.local cluster.local home
nameserver 10.152.183.10                                         
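A minimal sketch of the /etc/resolv.conf approach, assuming the first nameserver entry is the one nginx should use as its resolver. The function name first_nameserver is hypothetical and not part of any existing charm code:

```python
from typing import Optional


def first_nameserver(resolv_conf_text: str) -> Optional[str]:
    """Return the address of the first `nameserver` entry, or None if absent."""
    for raw_line in resolv_conf_text.splitlines():
        line = raw_line.strip()
        # resolv.conf comment lines start with '#' or ';'
        if not line or line[0] in "#;":
            continue
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            return fields[1]
    return None


if __name__ == "__main__":
    # Inside the container this would read the real file.
    with open("/etc/resolv.conf") as f:
        print(first_nameserver(f.read()))
```

This avoids adding dig to the image, at the cost of trusting whatever the container runtime wrote into the file.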

@michaeldmitry
Contributor

Reopening, as we're still experiencing this issue with tempo-coordinator-k8s latest/edge (revision 46).

@michaeldmitry
Contributor

michaeldmitry commented Dec 12, 2024

It seems that adding the dynamic resolver alone won't fix the issue.
See more in https://www.f5.com/company/blog/nginx/dns-service-discovery-nginx-plus

Solutions considered:

  1. Adding a dynamic resolver fetched from /etc/resolv.conf → Nginx will only do lookups at startup and cache the result.
  2. Adding a dynamic resolver + using proxy_pass with variables → This will enable DNS lookup during runtime. However, this will not work for our case because it doesn't work with upstream groups. So, we can’t leverage the load balancing functionality of upstream groups.
  3. Use Nginx 1.27.3, which pulls the relevant Nginx Plus functionality (runtime DNS re-resolution for servers in upstream groups) into open-source nginx.
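For illustration, solution 3 could look roughly like the sketch below, which renders an nginx fragment meant to sit inside the http block. It assumes the `resolve` parameter of the upstream `server` directive (available in open-source nginx from 1.27.3), which requires the upstream group to have a shared-memory `zone` and a `resolver` to be configured. The function name, upstream name, FQDN, port, and zone size are all hypothetical values chosen for the example:

```python
def render_upstream(name: str, fqdn: str, port: int, nameserver: str,
                    valid_seconds: int = 5) -> str:
    """Render an nginx fragment that re-resolves `fqdn` at runtime."""
    lines = [
        # Resolver used for runtime lookups; `valid` overrides the DNS TTL.
        f"resolver {nameserver} valid={valid_seconds}s;",
        f"upstream {name} {{",
        # `resolve` requires the group to live in a shared-memory zone.
        f"    zone {name}_zone 64k;",
        # `resolve` makes nginx track IP changes for this name.
        f"    server {fqdn}:{port} resolve;",
        "}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_upstream(
        name="worker",
        fqdn="tempo-worker.foo.svc.cluster.local",  # hypothetical FQDN
        port=3200,                                   # hypothetical port
        nameserver="10.152.183.10",
    ))
```

Unlike solution 2, this keeps the upstream group, so its load balancing stays available.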

Solution 3 works best for our use-case. However, as of today, there's no Ubuntu-provided nginx image for 1.27.3

Sub-solutions considered:
A. Going ahead of Debian in adding an nginx package for plucky 25.04 with a development release (1.27.x)
B. Create our custom rock

Sub-solution A is probably not an option: odd-numbered nginx releases are development releases, so we can't take 1.27.x for 25.04 without knowing whether 1.28.x (the stable release) will be out in time for 26.04 (the LTS).

This leaves us with sub-solution B, where we could create a rock.
