Can not use GPUMounter on k8s #20
Comments
There is no slave pod in my namespace: gpu-pool
@liuweibin6566396837
It looks like you edited the k8s version in this issue.
Thanks for your reply. I fixed that problem earlier (just make sure env:
Now I have hit a new bug in my cluster: when I request addgpu, it returns "Add GPU Success", but no slave pod appears in gpu-pool.
Thanks for your report.
environment:
problem: Following QuickStart.md, I installed GPUMounter successfully in my k8s cluster. However, neither the remove gpu request nor the add gpu request has ever succeeded. I pasted some logs from gpu-mounter-master-container:
remove gpu
2022-02-18T03:44:55.184Z INFO GPUMounter-master/main.go:120 access remove gpu service
2022-02-18T03:44:55.184Z INFO GPUMounter-master/main.go:134 GPU-5d237016-9ea5-77bd-8c2f-2b3fd4bfa2cd
2022-02-18T03:44:55.184Z INFO GPUMounter-master/main.go:135 GPU-5d237016-9ea5-77bd-8c2f-2b3fd4bfa2cd
2022-02-18T03:44:55.184Z INFO GPUMounter-master/main.go:146 Pod: jupyter-lab-54d76f5d58-rlklh Namespace: default UUIDs: GPU-5d237016-9ea5-77bd-8c2f-2b3fd4bfa2cd force: true
2022-02-18T03:44:55.188Z INFO GPUMounter-master/main.go:169 Found Pod: jupyter-lab-54d76f5d58-rlklh in Namespace: default on Node: dev06.ucd.qzm.stonewise.cn
2022-02-18T03:44:55.193Z INFO GPUMounter-master/main.go:265 Worker: gpu-mounter-workers-fbfj8 Node: dev05.ucd.qzm.stonewise.cn
2022-02-18T03:44:55.193Z INFO GPUMounter-master/main.go:265 Worker: gpu-mounter-workers-kwmsn Node: dev06.ucd.qzm.stonewise.cn
2022-02-18T03:44:55.201Z ERROR GPUMounter-master/main.go:217 Invalid UUIDs: GPU-5d237016-9ea5-77bd-8c2f-2b3fd4bfa2cd
add gpu
2022-02-18T03:42:22.897Z INFO GPUMounter-master/main.go:25 access add gpu service
2022-02-18T03:42:22.898Z INFO GPUMounter-master/main.go:30 Pod: jupyter-lab-54d76f5d58-rlklh Namespace: default GPU Num: 4 Is entire mount: false
2022-02-18T03:42:22.902Z INFO GPUMounter-master/main.go:66 Found Pod: jupyter-lab-54d76f5d58-rlklh in Namespace: default on Node: dev06.ucd.qzm.stonewise.cn
2022-02-18T03:42:22.907Z INFO GPUMounter-master/main.go:265 Worker: gpu-mounter-workers-fbfj8 Node: dev05.ucd.qzm.stonewise.cn
2022-02-18T03:42:22.907Z INFO GPUMounter-master/main.go:265 Worker: gpu-mounter-workers-kwmsn Node: dev06.ucd.qzm.stonewise.cn
2022-02-18T03:42:22.921Z ERROR GPUMounter-master/main.go:98 Failed to call add gpu service
2022-02-18T03:42:22.921Z ERROR GPUMounter-master/main.go:99 rpc error: code = Unknown desc = FailedCreated
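For anyone triaging similar output, here is a minimal Python sketch that pulls only the ERROR entries out of master logs shaped like the ones above. It assumes each line has the form `timestamp LEVEL file:line message`, as in the pasted logs. The names `parse_line`, `errors`, and `LogEntry` are hypothetical helpers for illustration and are not part of GPUMounter.

```python
import re
from typing import Iterable, List, NamedTuple, Optional

# Assumed line shape, taken from the pasted logs:
#   <timestamp> <LEVEL> <file>:<line> <message>
LOG_LINE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<level>[A-Z]+)\s+(?P<src>\S+:\d+)\s+(?P<msg>.*)$"
)


class LogEntry(NamedTuple):
    timestamp: str
    level: str
    source: str
    message: str


def parse_line(line: str) -> Optional[LogEntry]:
    """Parse one log line; return None for lines that do not match (e.g. labels)."""
    m = LOG_LINE.match(line.strip())
    if m is None:
        return None
    return LogEntry(m["ts"], m["level"], m["src"], m["msg"])


def errors(lines: Iterable[str]) -> List[LogEntry]:
    """Keep only the entries logged at ERROR level."""
    parsed = (parse_line(line) for line in lines)
    return [e for e in parsed if e is not None and e.level == "ERROR"]


# Three lines copied from the add gpu log above.
SAMPLE = [
    "2022-02-18T03:42:22.907Z INFO GPUMounter-master/main.go:265 Worker: gpu-mounter-workers-kwmsn Node: dev06.ucd.qzm.stonewise.cn",
    "2022-02-18T03:42:22.921Z ERROR GPUMounter-master/main.go:98 Failed to call add gpu service",
    "2022-02-18T03:42:22.921Z ERROR GPUMounter-master/main.go:99 rpc error: code = Unknown desc = FailedCreated",
]

if __name__ == "__main__":
    for entry in errors(SAMPLE):
        print(f"{entry.source}: {entry.message}")
```

Run against the full master log, this narrows the output to the failing calls (here `main.go:98`, `main.go:99`, and `main.go:217`), which are the lines to compare with the worker pod logs on the same node.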