dask workers can be scheduled on hub pods with default config #59
If you want to keep non-core pods off your core (hub) pool, you need to add a taint that only core pods can tolerate. I tend to just size the core pool to the smallest possible size that fits the hub pods; if you don't leave space, things won't try to schedule there. You can also raise the node-purpose scheduling requirement for dask pods, but in my experience this is unnecessary. For posterity, I should also link to this blog post that describes all of this in more detail: https://medium.com/pangeo/pangeo-cloud-cluster-design-9d58a1bf1ad3
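For reference, a minimal sketch of the taint/toleration pattern described above. The taint key and value (hub.jupyter.org/dedicated=core) are assumptions for illustration; use whatever key your core (hub/proxy) pods are already configured to tolerate.

# Taint the core nodes so nothing schedules there by default, e.g.:
#   kubectl taint nodes <core-node> hub.jupyter.org/dedicated=core:NoSchedule
# Then give only the core pods a matching toleration in their pod spec
# (the key/value here is an assumption, not an established convention):
tolerations:
  - key: hub.jupyter.org/dedicated
    operator: Equal
    value: core
    effect: NoSchedule

With that in place, dask worker pods (which lack the toleration) can no longer land on the core pool.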
@jhamman - I'm thinking we might want the core pool to autoscale eventually if we try to consolidate multiple hubs on a single EKS cluster. If we add a taint to the core pool, it seems like pods in the kube-system namespace might have trouble (for example aws-node, tiller-deploy, cluster-autoscaler). Another approach is to expose match_node_purpose="require" in https://github.com/dask/dask-kubernetes/blob/ec4666a4af5acad03c24b84aca4fcf8ccd791b4f/dask_kubernetes/objects.py#L177
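If the core pool were tainted, the non-DaemonSet components mentioned above (e.g. tiller-deploy, cluster-autoscaler) would likely need a matching toleration patched into their pod templates. A hedged sketch, reusing the same illustrative taint key as above:

# Fragment of a kube-system Deployment spec; the taint key/value below is an
# assumption and should match whatever taint is applied to the core pool.
spec:
  template:
    spec:
      tolerations:
        - key: hub.jupyter.org/dedicated
          operator: Equal
          value: core
          effect: NoSchedule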
@jhamman is there a downside to the hard affinity (at least optionally)? It couldn't be the default, but it seems useful as an option.
FYI, rather than exposing it as a config / parameter in KubeCluster, we could document how to achieve it:

kind: Pod
metadata:
  labels:
    foo: bar
spec:
  restartPolicy: Never
  containers:
    - image: daskdev/dask:latest
      imagePullPolicy: IfNotPresent
      args: [dask-worker, --nthreads, '2', --no-bokeh, --memory-limit, 6GB, --death-timeout, '60']
      name: dask
      resources:
        limits:
          cpu: "2"
          memory: 6G
        requests:
          cpu: "2"
          memory: 6G
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: k8s.dask.org/node-purpose
                operator: In
                values:
                  - worker

On master, that'll result in both the preferred and required affinity types being applied.
I'm not sure how Kubernetes will handle that (presumably it's fine, just not the cleanest). Right now my preference would be to add a config option / argument to KubeCluster that's passed through to
Not really. I think this is a fine approach. Of course, there is no way to enforce that users follow this pattern, so dask workers may still end up in your core pool with this approach.
In thinking about this a little more, it may be easier for some to simply add a taint to the core pool that the hub and ingress pods can tolerate.
@jhamman are you doing this now on the Google clusters?
No. Not yet, but we could.
If you don't feel like modifying all of the JupyterHub services' configurations to include the toleration, this can also be accomplished by 1) adding a taint to the worker pools that prevents core services from being scheduled there, with corresponding tolerations added to the worker pods, and 2) adding a node selector to the worker pods with corresponding labels on the worker nodes. This pretty much guarantees that everything ends up on the right nodes without having to taint/tolerate the core services.
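A rough sketch of that inverse approach. The worker-pool taint key (k8s.dask.org/dedicated) is an assumption for illustration; the node-purpose label matches the convention already used elsewhere in this thread.

# Taint and label the worker nodes, e.g.:
#   kubectl taint nodes <worker-node> k8s.dask.org/dedicated=worker:NoSchedule
#   kubectl label nodes <worker-node> k8s.dask.org/node-purpose=worker
# Then give only the dask worker pods the matching toleration and node selector:
kind: Pod
spec:
  tolerations:
    - key: k8s.dask.org/dedicated
      operator: Equal
      value: worker
      effect: NoSchedule
  nodeSelector:
    k8s.dask.org/node-purpose: worker
  containers:
    - name: dask
      image: daskdev/dask:latest

Core services never get the worker toleration, so they stay off the worker pool, and the node selector keeps the workers off everything else.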
Our current setup allows for dask pods on hub nodes:
https://github.com/pangeo-data/pangeo-stacks/blob/master/base-notebook/binder/dask_config.yaml
This seems to be due to 'prefer' rather than 'require' when scheduling:
https://github.com/dask/dask-kubernetes/blob/ec4666a4af5acad03c24b84aca4fcf8ccd791b4f/dask_kubernetes/objects.py#L177
which results in only a preferred (not required) node-purpose affinity on the worker pods. Not sure how we modify the config file to get the stricter 'require' condition like we have for notebook pods (a possible sketch is below).
@jhamman , @TomAugspurger
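One way this might be done, as an untested sketch: put the required affinity into the worker pod template that dask-kubernetes reads from the dask config. The kubernetes.worker-template key layout below is an assumption about the dask-kubernetes config schema; the affinity block itself mirrors the one quoted earlier in this thread.

# Hypothetical dask_config.yaml fragment (assumes the dask-kubernetes
# `kubernetes.worker-template` config key):
kubernetes:
  worker-template:
    kind: Pod
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: k8s.dask.org/node-purpose
                    operator: In
                    values:
                      - worker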