When downloading corpora from versioned data stores, I would expect audiomate to take a tag or a specific hash of that dataset into account. That way, users can be sure that a specific version of audiomate always yields an identical corpus, which fosters reproducibility.
E.g., let's take the ESC-50 corpus: the root URL downloads directly from the master branch.
To improve reproducibility, I suggest that audiomate use tags where possible (GitHub, Zenodo, ...) and, furthermore, provide a checksum mechanism that verifies a successful download.
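To make the suggestion a bit more concrete, here is a minimal sketch of what pinning plus checksum verification could look like. The function names (`pinned_archive_url`, `sha256_of_file`, `verify_download`) are hypothetical and not part of audiomate. The GitHub archive URL pattern and the choice of SHA-256 are assumptions for illustration.

```python
import hashlib


def pinned_archive_url(owner, repo, ref):
    """Build an archive URL for a fixed tag or commit hash instead of a branch.

    Assumption: the data store is a GitHub repository that serves
    archives under /archive/<ref>.zip.
    """
    return "https://github.com/{}/{}/archive/{}.zip".format(owner, repo, ref)


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash the file in chunks so that large corpora need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_sha256):
    """Raise ValueError if the downloaded file does not match the expected checksum."""
    actual = sha256_of_file(path)
    if actual != expected_sha256.lower():
        raise ValueError(
            "Checksum mismatch for {}: expected {}, got {}".format(
                path, expected_sha256, actual
            )
        )
```

A downloader would then fetch from `pinned_archive_url(...)` and call `verify_download(...)` on the result before extracting it, with the expected checksum stored next to the pinned reference.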
Yes, that is a good idea. I also had something similar in mind.
Due to some datasets changing "frequently", I wanted to introduce versions.
So you could actually select which version you want to use.
Of course this should/could be combined with your approach with tags and checksums.
> Due to some datasets changing "frequently", I wanted to introduce versions.
> So you could actually select which version you want to use.
Yes, that's also a good idea, and it could be added on top of tags. But I think you might end up with a less confusing API if one version of audiomate supports only a fixed set of dataset versions. Of course, this comes with the drawback that users would be unable to load an older dataset version while using a newer version of audiomate. But I don't think this would happen often in practice, as most users would use audiomate for a single dataset per project.
For now, I would suggest freezing as many downloads as possible instead of pointing to the latest version.
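As a rough sketch of the "fixed set of dataset versions per audiomate release" idea: the names `DatasetVersion`, `SUPPORTED_VERSIONS`, `DEFAULT_VERSION` and `resolve_version` are hypothetical, and the URLs and checksum strings are placeholders, not real values.

```python
from collections import namedtuple

# Hypothetical record: a frozen download location plus its expected checksum.
DatasetVersion = namedtuple("DatasetVersion", ["url", "sha256"])

# Placeholder entries; a real release would list actual tags and checksums.
SUPPORTED_VERSIONS = {
    "1.0": DatasetVersion(
        "https://example.org/corpus/archive/v1.0.zip",
        "<sha256 of the v1.0 archive>",
    ),
    "2.0": DatasetVersion(
        "https://example.org/corpus/archive/v2.0.zip",
        "<sha256 of the v2.0 archive>",
    ),
}

# The version a downloader uses when the caller does not ask for one.
DEFAULT_VERSION = "2.0"


def resolve_version(version=None):
    """Return the frozen download info for a supported dataset version."""
    key = DEFAULT_VERSION if version is None else version
    try:
        return SUPPORTED_VERSIONS[key]
    except KeyError:
        raise ValueError(
            "Unsupported dataset version {!r}; supported: {}".format(
                key, ", ".join(sorted(SUPPORTED_VERSIONS))
            )
        ) from None
```

With this layout, every audiomate release ships a frozen table, so the default download never silently follows "latest", and asking for an unknown version fails loudly.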
The ESC-50 root URL mentioned above is defined in audiomate/audiomate/corpus/io/esc.py, line 11 in 28696c0.
This issue is part of a JOSS review openjournals/joss-reviews#2135