[Ray debugger] Unable to use debugger on Ray Cluster on k8s #45541
Comments
Or do I need to configure launch.json in vscode? |
I think the problem is that the Ray debugger uses a random port, so it's not possible to know ahead of time which port to open when running on Kubernetes. From https://github.com/ray-project/ray/blob/master/python/ray/util/debugpy.py:
And from the definition of listen() in https://github.com/microsoft/debugpy/blob/main/src/debugpy/public_api.
In our case we're running ephemeral Ray clusters using |
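To illustrate the random-port problem described in the comment above, here is a minimal sketch using only the Python standard library. It assumes the debugger ends up binding to an OS-assigned port, as the comment describes. It does not reproduce Ray's or debugpy's actual code, and `bind_ephemeral` is a hypothetical helper name.

```python
import socket
from typing import Tuple


def bind_ephemeral(host: str = "127.0.0.1") -> Tuple[socket.socket, int]:
    """Bind a listening TCP socket to port 0 and return it with the chosen port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Port 0 asks the operating system to pick any free ephemeral port.
    sock.bind((host, 0))
    sock.listen(1)
    return sock, sock.getsockname()[1]


# Two listeners opened at the same time receive different, unpredictable ports.
first, first_port = bind_ephemeral()
second, second_port = bind_ephemeral()
print(first_port, second_port)
first.close()
second.close()
```

Because the port is only known after the bind, a Kubernetes Service or port-forward cannot be declared for it in advance.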
@brycehuang30 Does the new distributed debugger have this capability? If it doesn't, I say we move forward and add this as a feature request for it. |
The distributed debugger currently cannot customize the debugging ports. I think we could solve this in two steps:
|
I've run into an issue that seems very similar to this one; it may well be the same issue. I'm using Ray 2.30 and I get a connection refused error when I try to connect VS Code to the paused task. I noticed that debugpy on the task actually crashes soon after
|
…#49116) Why are these changes needed? This addresses #45541 and #49014. Signed-off-by: Philipp Moritz <[email protected]> Co-authored-by: angelinalg <[email protected]>
We are facing the same issue with our local Ray cluster, but in our case it runs behind docker-compose for local development and testing. Is the suggested solution, such as an optional parameter to specify the debugpy ports, still an option, or is there another recommended way to work around the issue? |
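As a rough sketch of what a configurable port range could look like, the snippet below picks the first bindable port from a fixed range, so that only that range needs to be exposed through docker-compose or Kubernetes. This is purely illustrative: per the thread, Ray does not currently offer such an option, and `first_free_port` and the range `5678-5688` are hypothetical.

```python
import socket
from typing import Optional


def first_free_port(start: int, end: int, host: str = "127.0.0.1") -> Optional[int]:
    """Return the first port in [start, end] that can be bound on host, or None."""
    for port in range(start, end + 1):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
            try:
                probe.bind((host, port))
            except OSError:
                continue  # Port is in use or not permitted; try the next one.
            return port
    return None


# Hypothetical range; only these ports would need to be published.
candidate = first_free_port(5678, 5688)
print(candidate)
```

Note that the probe socket is closed before the port is returned, so another process could claim the port in between. A real implementation would more likely try to listen on each port directly, for example by passing `(host, port)` to `debugpy.listen`, and move on to the next port when that fails.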
We ended up deploying https://docs.linuxserver.io/images/docker-code-server/ inside the Kubernetes cluster, which can then access the necessary ports |
@rasmus-unity Thank you for sharing. Could you explain the specific steps you followed? I also noticed that there is a related PR (#49116). Can I assume that this requirement can be met by following its documentation? cc @brycehuang30 |
@rasmus-unity and @Moonquakes, thank you for your insights! To test this, we created a Dockerfile based on the Ray images and installed an SSH server as mentioned in #49116, along with other necessary components. We wanted a setup for agile local development and debugging, so we also had to mount the source code under development as volumes on the Ray head node and install various tools we need for development, such as Devbox. This setup lets us develop directly on the Ray head and use the Ray Distributed Debugger extension, but it adds a lot of complexity and installs otherwise unnecessary software on the Ray head, both of which could potentially be avoided. While this approach does the trick for us for the moment, we still believe that an out-of-the-box solution, without the need to install SSH servers and other dependencies, would be extremely valuable given the already excellent Ray Distributed Debugger extension. Implementing a way to configure a range of ports for |
Hi @rogerfydp, could you describe your steps and Dockerfile in more detail? I installed SSH according to the instructions in #49116 and opened port 22, but this seems to cause other problems. KubeRay opens some ports by default when no ports are specified, but those defaults are not added once port 22 is specified manually (https://github.com/ray-project/kuberay/blob/v1.2.2/ray-operator/controllers/ray/common/service.go#L409-L417). |
What happened + What you expected to happen
I tried to use the debugger plug-in in VS Code following the guide (https://www.anyscale.com/blog/ray-distributed-debugger), but when I click on a paused task to attach the VS Code debugger, I always get the error `connect ECONNREFUSED $ip:port`. When I tried the plug-in locally, it worked normally.
I also tried adding the `ray-debugger-external` flag and confirmed that the native debugger works normally with the Ray Cluster on k8s. I don't know how to use the VS Code debugger plug-in with a Ray Cluster on k8s. Can you provide guidance or help?
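A quick way to narrow down an `ECONNREFUSED` error like the one above is to check, from the machine running VS Code, whether the reported address accepts TCP connections at all. The following is a minimal sketch using only the standard library. `can_connect` is a hypothetical helper, and the host and port in the example call are placeholders for the `$ip:port` shown in the error.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Placeholder address; substitute the ip and port from the error message.
print(can_connect("127.0.0.1", 5678))
```

If this returns `False` from the client machine but `True` from inside the cluster, the port is most likely not exposed outside the pod network, which matches the random-port explanation given in the comments.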
Versions / Dependencies
Ray 2.23.0
Python 3.10.12
Reproduction script
Sample code in guidance
Issue Severity
High: It blocks me from completing my task.