Insufficient GPU on Node: xxx #23
Comments
Run kubectl describe node and check whether the GPU resources are idle.
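As a rough illustration of that check, the sketch below computes how many GPUs the scheduler still considers unallocated on a node. It assumes the inputs are dicts shaped like the output of `kubectl get node -o json` and `kubectl get pods -A -o json`, and that the GPU is exposed under the resource name `nvidia.com/gpu`; both are assumptions, not details taken from this issue. It reflects scheduler accounting only, not actual device utilisation.

```python
GPU_RESOURCE = "nvidia.com/gpu"  # assumed resource name; adjust for your device plugin


def free_gpus(node, pods, resource=GPU_RESOURCE):
    """Return allocatable GPUs on `node` minus GPUs requested by active pods on it."""
    node_name = node["metadata"]["name"]
    allocatable = int(node.get("status", {}).get("allocatable", {}).get(resource, 0))

    used = 0
    for pod in pods:
        # Only count pods scheduled onto this node.
        if pod.get("spec", {}).get("nodeName") != node_name:
            continue
        # Finished pods no longer hold their resources.
        if pod.get("status", {}).get("phase") in ("Succeeded", "Failed"):
            continue
        for container in pod["spec"].get("containers", []):
            limits = container.get("resources", {}).get("limits", {})
            used += int(limits.get(resource, 0))

    return allocatable - used
```

If this returns a positive number while the tool still reports "Insufficient GPU", the cause is probably somewhere other than plain resource exhaustion.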
1. Master log:
NetworkUnavailable False Tue, 07 Nov 2023 18:13:30 +0800 Tue, 07 Nov 2023 18:13:30 +0800 CalicoIsUp Calico is running on this node
calico-system calico-node-gfzjw 0 (0%) 0 (0%) 0 (0%) 0 (0%) 28d
cpu 1331m (4%) 2970m (9%)
Normal RegisteredNode 32m node-controller Node yigou-dev-102-46 event: Registered Node yigou-dev-102-46 in Controller
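I may have misspoken: the GPU was not mounted the first time either, but the GPU is idle. The information above is the master and worker logs plus the details for node 46. There is no slave pod under gpu-pool.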
There is a known issue on k8s 1.20+: an ownerReference is not allowed to cross namespaces, so creating the slavePod fails.
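The namespace rule behind this can be sketched as a small check. This is only an illustration of the Kubernetes ownership rule (a namespaced owner must live in the same namespace as its dependent, while a cluster-scoped owner may own namespaced objects); it is not code from this project, and the function name and the pod and namespace names in the usage example are hypothetical.

```python
def owner_reference_allowed(owner, dependent):
    """Return True if `dependent` may list `owner` in its ownerReferences.

    Both arguments are dicts shaped like Kubernetes object manifests.
    A missing metadata.namespace is treated as "cluster-scoped".
    """
    owner_ns = owner.get("metadata", {}).get("namespace")
    dependent_ns = dependent.get("metadata", {}).get("namespace")

    # A cluster-scoped owner can own namespaced or cluster-scoped dependents.
    if owner_ns is None:
        return True
    # A namespaced owner must be in the same namespace as the dependent.
    return owner_ns == dependent_ns


# Hypothetical example: the owner pod runs in "default",
# while the slave pod would be created in "gpu-pool".
owner_pod = {"metadata": {"name": "workload", "namespace": "default"}}
slave_pod = {"metadata": {"name": "workload-slave", "namespace": "gpu-pool"}}
print(owner_reference_allowed(owner_pod, slave_pod))
```

Under this rule, a slave pod placed in a different namespace from its owner would be rejected or garbage-collected, which would match the missing slave pod described above.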
[root@yigou-dev-102-45 ~]# cat /etc/docker/daemon.json
The first mount succeeded. After unmounting and deploying again, it shows: Insufficient GPU on Node: yigou-dev-102-46, even though the GPU is actually idle.