File handling changes #667

Open
wants to merge 2 commits into
base: master
from

Conversation

hellkite500
Contributor

In a large run with BMI models that leak file handles, it is easy to exhaust the OS's available file descriptors, which can lead to crashes with little context.

Additions

  • Report the OS error message (from errno) when a BMI init config file is not readable.
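
A minimal sketch of what such a check could look like, assuming a POSIX target (Linux/MacOS, as listed below). The helper name `require_readable` and the message wording are hypothetical and are not taken from this PR's diff. Opening the file with `open()` means a descriptor-exhaustion failure (`EMFILE`) is reported as well as a missing or unreadable file.

```cpp
#include <cerrno>
#include <cstring>
#include <stdexcept>
#include <string>
#include <fcntl.h>   // POSIX open()
#include <unistd.h>  // POSIX close()

// Hypothetical helper (not the project's actual function): throws if `path`
// cannot be opened for reading, and includes the OS-provided reason.
inline void require_readable(const std::string& path) {
    const int fd = ::open(path.c_str(), O_RDONLY);
    if (fd < 0) {
        // Capture errno right away, before another call can overwrite it.
        const int err = errno;
        throw std::runtime_error("Unable to read BMI init config '" + path +
                                 "': " + std::strerror(err));
    }
    ::close(fd);  // Only probing readability, so release the descriptor.
}
```

With descriptors exhausted, the message would end in the platform's text for `EMFILE` (typically "Too many open files") instead of a bare "file not readable".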

Removals

Changes

  • Change the FileStreamHandler to not hold each output's file descriptor for the duration of the program, but instead open and close as needed to write output.
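
To illustrate the open-and-close-as-needed idea, here is a rough sketch. `PerWriteFileSink` is a hypothetical stand-in; the real `FileStreamHandler` interface is not shown in this description and may differ.

```cpp
#include <fstream>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for FileStreamHandler: remembers only the path and
// holds no file descriptor between writes.
class PerWriteFileSink {
public:
    explicit PerWriteFileSink(std::string path) : path_(std::move(path)) {
        // Truncate once up front so a rerun does not append to stale output.
        std::ofstream out(path_, std::ios::trunc);
        if (!out) throw std::runtime_error("Cannot create output file: " + path_);
    }  // `out` is destroyed here, which closes the descriptor.

    void write(const std::string& text) {
        // Reopen in append mode for this write only.
        std::ofstream out(path_, std::ios::app);
        if (!out) throw std::runtime_error("Cannot open output file: " + path_);
        out << text;
    }  // Descriptor released again when `out` goes out of scope.

private:
    std::string path_;
};
```

The cost is one open and one close per write, in exchange for the process holding at most one output descriptor at a time instead of one per output.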

Testing

  1. All existing tests pass. Also tested with a faulty BMI module on a large regional domain.

Todos

  • Consider updating other file reading checks with errno outputs.

Checklist

  • PR has an informative and human-readable title
  • Changes are limited to a single goal (no scope creep)
  • Code can be automatically merged (no conflicts)
  • Code follows project standards (link if applicable)
  • Passes all existing automated tests
  • Any change in functionality is tested
  • New functions are documented (with a description, list of inputs, and expected output)
  • Placeholder code is flagged / future todos are captured in comments
  • Project documentation has been updated (including the "Unreleased" section of the CHANGELOG)
  • Reviewers requested with the Reviewers tool ➡️

Target Environment support

  • Linux
  • MacOS

Contributor

@donaldwj donaldwj left a comment

I would suggest adding a flag that determines whether we always close the file after each use. This change would have a rather high performance cost.
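
One way such a flag could look, as a sketch only: `ConfigurableFileSink` and `close_after_write` are hypothetical names, not part of the existing code.

```cpp
#include <fstream>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical sink with a policy flag: either release the descriptor after
// every write, or keep the stream open for the lifetime of the object.
class ConfigurableFileSink {
public:
    ConfigurableFileSink(std::string path, bool close_after_write)
        : path_(std::move(path)), close_after_write_(close_after_write) {}

    void write(const std::string& text) {
        if (!stream_.is_open()) {
            // Append mode, so reopening never discards earlier output.
            stream_.open(path_, std::ios::app);
            if (!stream_) throw std::runtime_error("Cannot open output file: " + path_);
        }
        stream_ << text;
        if (close_after_write_) {
            stream_.close();  // Frees the descriptor until the next write.
        } else {
            stream_.flush();  // Keep the descriptor, but push data out.
        }
    }

    // True while this object is holding an open file descriptor.
    bool holds_descriptor() const { return stream_.is_open(); }

private:
    std::string path_;
    bool close_after_write_;
    std::ofstream stream_;
};
```

With the flag off, behavior matches the old hold-open approach; with it on, the per-write open/close cost is paid only by runs that opt in.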

@PhilMiller
Contributor

Agreed with Donald that this is going to be a disaster for performance, especially running on any sort of HPC system with shared storage, and most especially Lustre.

That may not be a reason to hold off, though: we could do this now and optimize the performance impacts later if and as they arise.

@PhilMiller
Contributor

FileStreamHandler is used for what, CSV output? In the long run, that should go away anyway, in favor of NetCDF or Parquet or whatnot, but those will presumably have the same issues.

Thinking broadly, I don't think we can really expect to ship anything that does file-per-catchment output.

3 participants