Provide means of accessing partial filesets (with no ROI file) #87

Open
joefutrelle opened this issue Jan 2, 2025 · 1 comment

Use case: compute metrics for bins that are telemetered without their ROI files.

Right now DataDirectory requires all three files (.adc, .hdr, and .roi) to be present. Here's one place where that is enforced:

pyifcb/ifcb/data/files.py

Lines 255 to 283 in 5e822ad

def list_filesets(dirpath, blacklist=DEFAULT_BLACKLIST, whitelist=DEFAULT_WHITELIST, sort=True, validate=True):
    """
    Iterate over entire directory tree and yield a Fileset
    object for each .adc/.hdr/.roi fileset found. Warning: for
    large directories, this is slow.
    :param blacklist: list of directory names to ignore
    :param whitelist: list of directory names to include, even if they
      do not match a file's basename
    :param sort: whether to sort output (sorts by alpha)
    :param validate: whether to validate each path
    """
    if not set(blacklist).isdisjoint(set(whitelist)):
        raise ValueError('whitelist and blacklist must be disjoint')
    for dp, dirnames, filenames in os.walk(dirpath):
        for d in dirnames:
            if d in blacklist:
                dirnames.remove(d)
        if sort:
            dirnames.sort(reverse=True)
            filenames.sort(reverse=True)
        for f in filenames:
            basename, extension = f[:-4], f[-3:]
            if extension == 'adc' and basename+'.hdr' in filenames and basename+'.roi' in filenames:
                if validate:
                    reldir = dp[len(dirpath)+1:]
                    if not validate_path(os.path.join(reldir,basename), whitelist=whitelist, blacklist=blacklist):
                        continue
                yield dp, basename

FilesetBin constructs a RoiFile object for the specified path:

pyifcb/ifcb/data/files.py

Lines 115 to 121 in 5e822ad

def __init__(self, fileset):
    """
    :param fileset: the ``Fileset`` to represent
    """
    self.fileset = fileset
    self.adc_file = AdcFile(fileset.adc_path)
    self.roi_file = RoiFile(self.adc_file, fileset.roi_path)

but RoiFile's constructor doesn't open the file or test for existence.
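
To illustrate why a constructor that only records a path is harmless when the ROI file is absent, here is a minimal sketch. `LazyRoiFileSketch` and `read_bytes` are hypothetical names invented for this example, not the actual RoiFile implementation; the point is only that the missing file surfaces when data is requested, not at construction time.

```python
class LazyRoiFileSketch:
    """Hypothetical stand-in: remembers a path but touches the disk only on read."""

    def __init__(self, path):
        # No open() call and no existence check here.
        self.path = path

    def read_bytes(self, offset, length):
        # The file is opened only when data is actually requested,
        # so a missing file raises FileNotFoundError at this point.
        with open(self.path, 'rb') as f:
            f.seek(offset)
            return f.read(length)
```

Under this assumption, a bin built on a partial fileset can still serve anything derived from the ADC and header files, and only image access would fail.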

The desired solution is a configuration option on DataDirectory, called require_roi_files and defaulting to True, that skips the ROI-file existence check when set to False.
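
A rough sketch of how the directory walk might honor such a flag is shown below. `list_filesets_sketch` is a simplified, hypothetical stand-in for list_filesets: it omits the blacklist, whitelist, sorting, and path validation, and only `require_roi_files` comes from this issue. How DataDirectory would pass the option through is left open.

```python
import os


def list_filesets_sketch(dirpath, require_roi_files=True):
    """Yield (directory, basename) for each fileset found under dirpath.

    Simplified illustration only: no blacklist, whitelist, sorting,
    or path validation.
    """
    for dp, dirnames, filenames in os.walk(dirpath):
        present = set(filenames)
        for f in filenames:
            basename, extension = os.path.splitext(f)
            if extension != '.adc':
                continue
            if basename + '.hdr' not in present:
                continue
            # Only insist on the .roi file when the caller asks for it.
            if require_roi_files and basename + '.roi' not in present:
                continue
            yield dp, basename
```

With the default of True the behavior matches the current three-file requirement, so existing callers would not see partial filesets unless they opt in.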

joefutrelle added this to the 1.2 milestone Jan 2, 2025
joefutrelle self-assigned this Jan 2, 2025

joefutrelle commented Jan 10, 2025

There's also this exists method, which returns False if the ROI file is missing. I believe the solution is for that method to accept a flag called require_roi_file:

def exists(self):
    """
    Checks for existence of all three raw data files.
    :returns bool: whether or not all files exist
    """
    if not os.path.exists(self.adc_path):
        return False
    if not os.path.exists(self.hdr_path):
        return False
    if not os.path.exists(self.roi_path):
        return False
    return True
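
One possible shape for that change is sketched here. `FilesetSketch` is a hypothetical minimal class built from a base path so the example runs on its own; only the `require_roi_file` flag and the three path attributes come from this thread.

```python
import os


class FilesetSketch:
    """Hypothetical stand-in: three sibling paths sharing one base path."""

    def __init__(self, basepath):
        self.adc_path = basepath + '.adc'
        self.hdr_path = basepath + '.hdr'
        self.roi_path = basepath + '.roi'

    def exists(self, require_roi_file=True):
        """
        Checks for existence of the raw data files.
        :param require_roi_file: if False, a missing ROI file is tolerated
        :returns bool: whether or not all required files exist
        """
        if not os.path.exists(self.adc_path):
            return False
        if not os.path.exists(self.hdr_path):
            return False
        # The ROI check is skipped entirely when the flag is False.
        if require_roi_file and not os.path.exists(self.roi_path):
            return False
        return True
```

Keeping the default at True preserves the current meaning of exists() for callers that do not pass the flag.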
