slurm (slurmctld?) creating files in /tmp of head_node and not cleaning up #6572

Open
gwolski opened this issue Nov 18, 2024 · 0 comments

gwolski commented Nov 18, 2024

Seen on ParallelCluster 3.9.1 and 3.11.1.
Since the head_node started, many files of the following form have accumulated in /tmp:

-rw-r----- 1 slurm pcluster-slurm-share 1088 Nov 18 09:14 tmp.VBWqTAz4SS
-rw-r----- 1 slurm pcluster-slurm-share 262 Nov 18 09:05 tmp.zMJAymfADb
-rw-r----- 1 slurm pcluster-slurm-share 276 Nov 18 09:01 tmp.4FNDMLk4rC
-rw-r----- 1 slurm pcluster-slurm-share 282 Nov 18 08:59 tmp.xVx3sST9n3
-rw-r----- 1 slurm pcluster-slurm-share 282 Nov 18 08:55 tmp.a23WPhksqt
-rw-r----- 1 slurm pcluster-slurm-share 473 Nov 18 08:41 tmp.D3MvXoHL1g
-rw-r----- 1 slurm pcluster-slurm-share 1691 Nov 18 08:40 tmp.zYS9m1CU3j
-rw-r----- 1 slurm pcluster-slurm-share 1488 Nov 18 08:40 tmp.mCOXXmqOHt
-rw-r----- 1 slurm pcluster-slurm-share 1488 Nov 18 08:39 tmp.di3dHuc923
-rw-r----- 1 slurm pcluster-slurm-share 265 Nov 18 08:37 tmp.AXTQoAX1hL
-rw-r----- 1 slurm pcluster-slurm-share 265 Nov 18 08:37 tmp.45sPSvBlrB

I would expect slurm (slurmctld since we're on the head node?) to clean up and not leave crumbs.
The contents of each file seem to relate to starting jobs and are of the form:

{"jobs":[{"extra":null,"job_id":17836,"features":null,"nodes_alloc":"sp-r7a-m-dy-sp-8-gb-1-cores-40","nodes_resume":"sp-r7a-m-dy-sp-8-gb-1-cores-40","oversubscribe":"NO","partition":"sp-8-gb","reservation":null},{"extra":null,"job_id":17837,"features":null,"nodes_alloc":"sp-r7a-m-dy-sp-8-gb-1-cores-41","nodes_resume":"sp-r7a-m-dy-sp-8-gb-1-cores-41","oversubscribe":"NO","partition":"sp-8-gb","reservation":null}],"all_nodes_resume":"sp-r7a-m-dy-sp-8-gb-1-cores-[40-41]"}
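To see at a glance which jobs and nodes one of these files refers to, the JSON can be summarized with a few lines of Python. This is only a sketch based on the single sample above: the helper name `summarize_resume_file` is made up for illustration, and the keys `jobs`, `job_id`, and `all_nodes_resume` are taken from that sample, not from any documented schema.

```python
import json


def summarize_resume_file(text):
    """Return (job_ids, all_nodes_resume) parsed from one tmp.* file's JSON text.

    Hypothetical helper; the key names are assumed from the sample in this issue.
    """
    data = json.loads(text)
    # Collect the job IDs in the order they appear in the file.
    job_ids = [job["job_id"] for job in data.get("jobs", [])]
    return job_ids, data.get("all_nodes_resume")


# Trimmed copy of the sample shown above (only the keys the helper reads).
sample = (
    '{"jobs":[{"job_id":17836,"nodes_resume":"sp-r7a-m-dy-sp-8-gb-1-cores-40"},'
    '{"job_id":17837,"nodes_resume":"sp-r7a-m-dy-sp-8-gb-1-cores-41"}],'
    '"all_nodes_resume":"sp-r7a-m-dy-sp-8-gb-1-cores-[40-41]"}'
)

if __name__ == "__main__":
    print(summarize_resume_file(sample))
```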

Is this a bug, feature, or known issue? Should I be cleaning up head_node /tmp/ files older than N days?
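If manual cleanup does turn out to be the answer, something like the following could serve as a stopgap. It is a sketch under assumptions, not a recommended fix: the `tmp.*` pattern comes from the listing above, the function names and the 7-day threshold are arbitrary choices, and other programs may also create `tmp.*` files in /tmp, so the optional `owner_uid` filter and the dry-run default are there to reduce the risk of deleting something unrelated.

```python
import os
import stat
import time
from pathlib import Path


def find_stale_tmp_files(directory, max_age_days, owner_uid=None, now=None):
    """List regular files named tmp.* in `directory` older than `max_age_days`.

    Hypothetical helper. If `owner_uid` is given, only files owned by that
    uid are returned (e.g. the uid of the slurm user).
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    stale = []
    for path in Path(directory).glob("tmp.*"):
        try:
            st = path.lstat()  # lstat: do not follow symlinks
        except FileNotFoundError:
            continue  # file vanished between listing and stat
        if not stat.S_ISREG(st.st_mode):
            continue
        if owner_uid is not None and st.st_uid != owner_uid:
            continue
        if st.st_mtime < cutoff:
            stale.append(path)
    return sorted(stale)


def remove_files(paths, dry_run=True):
    """Delete the given files, or only print them when dry_run is True."""
    for path in paths:
        if dry_run:
            print(f"would remove {path}")
        else:
            path.unlink(missing_ok=True)


if __name__ == "__main__":
    # Dry run only: prints candidates in /tmp older than 7 days, deletes nothing.
    remove_files(find_stale_tmp_files("/tmp", max_age_days=7), dry_run=True)
```

Running it first with `dry_run=True` shows what would be removed; the uid to pass as `owner_uid` can be looked up with `pwd.getpwnam("slurm").pw_uid` on a host where that user exists.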

gwolski added the 3.x label Nov 18, 2024