Payu runs input checksums on every run when submitting with -n N #526
Comments
The point of the manifests is to record everything that goes into a run. Are you adding files to the manifest that aren't actually used? Typically directories were specified in the https://github.com/ACCESS-NRI/access-om2-configs/blob/release-025deg_jra55_ryf/config.yaml#L40-L52 This has the benefit of being much more specific about what the model needs to run; also, any changes to specific input files are more "atomic" and are reflected directly in the There are exceptions though, e.g. JRA-55 RYF forcing data has a heap of files, so we use a directory https://github.com/ACCESS-NRI/access-om2-configs/blob/release-025deg_jra55_ryf/config.yaml#L33 and even more for the IAF version https://github.com/ACCESS-NRI/access-om2-configs/blob/release-025deg_jra55_iaf/config.yaml#L33-L43
At least one of the configurations we want to support has ~1000 input files (Met forcing files, which are for some reason split into single-year chunks). It might well be that the original dataset is not split like this, but the user who originally pulled it split it to make the I/O handler easier to write. I like moving to explicitly specifying the input files (does it support
You have the option of using more CPUs. It is an embarrassingly parallel problem, so it will scale with nCPUs. We could create a version of This would also need some manual testing beforehand to check whether it is worth the bother, and I doubt it would make much difference (I think file operations like opening and closing have a big overhead).
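To make the "embarrassingly parallel" point concrete, here is a minimal sketch of hashing many input files concurrently. It is not payu's implementation: the function names `md5_of_file` and `checksum_files` are made up for illustration, and plain MD5 is assumed as the checksum. Threads are used because `hashlib` releases the GIL while hashing large buffers, so the work can spread across cores.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of one file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large forcing files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_files(paths, max_workers=None):
    """Hash every file in `paths` concurrently; returns {path: digest}.

    Hypothetical helper: each file is independent, so the work is simply
    spread over a pool of worker threads.
    """
    paths = [str(p) for p in paths]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs paths with their digests.
        digests = list(pool.map(md5_of_file, paths))
    return dict(zip(paths, digests))
```

Timing `checksum_files(paths, max_workers=1)` against a larger `max_workers` on a real input set would be one way to do the manual testing mentioned above. If open/close overhead dominates, the speed-up may well be small.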
Not currently. Originally it was just directories, but this logic branch was added to support adding specific filepaths https://github.com/payu-org/payu/blob/master/payu/models/model.py#L277-L285 (Note that it is slightly weird: it builds a mock iterator so that it can reuse the main code loop below.) I don't think it would be difficult to invert the logic, test for a directory and otherwise assume a If you think that is useful functionality, it is probably best to create a specific issue for it and link back to this one. In the meantime you could emulate this functionality with symbolic links: create some directories that group your inputs in some way and make symbolic links in the sub-dirs. That way you can select just the inputs you need.
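The symbolic-link workaround could be scripted along these lines. This is only a sketch: `link_inputs` is a hypothetical helper, not part of payu, and it assumes a POSIX filesystem where symlinks are allowed.

```python
from pathlib import Path


def link_inputs(files, group_dir):
    """Create `group_dir` and fill it with symlinks to the selected files.

    Hypothetical helper: each link keeps the source file's name and points
    at its resolved absolute path.
    """
    group_dir = Path(group_dir)
    group_dir.mkdir(parents=True, exist_ok=True)
    links = []
    for src in map(Path, files):
        link = group_dir / src.name
        # Replace a stale link from an earlier call; a real file with the
        # same name is left alone and symlink_to raises FileExistsError.
        if link.is_symlink():
            link.unlink()
        link.symlink_to(src.resolve())
        links.append(link)
    return links
```

The resulting directory could then be given as an input directory in place of the full set, so that only the linked files are picked up.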
Payu re-submissions in a -n N run job trigger re-generating the input manifest. For small jobs, this becomes a significant portion of run time (maybe this is only relevant for staged_cable jobs?). I don't think there's any reason to recompute the input manifest for subsequent runs.
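One possible way to avoid recomputing everything on each re-submission is sketched below: reuse a stored checksum whenever a file's size and modification time are unchanged. This is an illustration only. The manifest layout (`{path: {"size", "mtime", "md5"}}`) and the function names are assumptions, and they do not describe payu's actual manifest format or logic.

```python
import hashlib
import os


def file_signature(path):
    """Cheap change indicator: file size and modification time (ns)."""
    st = os.stat(path)
    return {"size": st.st_size, "mtime": st.st_mtime_ns}


def full_hash(path, chunk_size=1024 * 1024):
    """Expensive step: MD5 of the whole file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def update_manifest(paths, manifest):
    """Return (new_manifest, rehashed_paths).

    `manifest` is a hypothetical dict from a previous run. Entries whose
    size and mtime still match are copied over without re-reading the file.
    """
    updated, rehashed = {}, []
    for path in paths:
        sig = file_signature(path)
        old = manifest.get(path)
        unchanged = (
            old is not None
            and old["size"] == sig["size"]
            and old["mtime"] == sig["mtime"]
        )
        if unchanged:
            updated[path] = old
        else:
            updated[path] = {**sig, "md5": full_hash(path)}
            rehashed.append(path)
    return updated, rehashed
```

On the first run every file is hashed. On later runs in the same -n N sequence, `rehashed` would be empty unless an input changed. The trade-off is that a file modified without changing its size or mtime would go unnoticed.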