Using File('<filename>').commit_shas
results in error due to missing .tch file "/da5_fast/f2cFullU.13.tch"
#50
Comments
Copying discussion from challenge as it helps clear misconceptions:
No functionality is removed, just misleading and poorly thought-out interfaces.
Common files (README.md) produce giant values. To handle those, one needs to return an iterator of iterators (over keys and then over values). This is not the job of a key-value database, as none allow unlimited value size. A better alternative is to
as a substantial fraction of the entire collection of commits is returned, there is no expectation of immediate
Running shell scripts in Python processes is, generally, a very bad idea that goes against the core design of Unix tools in general and the stream processing used in WoC in particular. Python filters could be used on the command line if one wants to use Python for processing these streams, but sed/awk/sort/join are much more effective and transparent alternatives in most cases.
It is important to note that other maps, such as a2c or b2c, may return huge values as well, but in such cases this is because of bad data (generic author ID, empty file, etc.), and keys for such values are not a typical use case (as in f2X). getValues actually handles such cases by storing these giant values using different means, but I am not sure whether the Python classes are actually able to access them (this needs to be documented, especially if the answer is negative).
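To illustrate the remark about Python filters on the command line, here is a minimal sketch of such a filter. It assumes the stream consists of lines shaped like `key;value1;value2;...` (the semicolon separator is an assumption about the output format, not something confirmed in this thread), and the names `iter_pairs` and `run` are hypothetical. It flattens each line into one `key;value` pair per line, so that tools like sort or join can take over afterwards.

```python
import sys
from typing import Iterable, Iterator, TextIO, Tuple


def iter_pairs(lines: Iterable[str], sep: str = ";") -> Iterator[Tuple[str, str]]:
    """Yield (key, value) pairs from lines shaped like 'key;v1;v2;...'.

    The separator is an assumption about the stream format.
    Lines are consumed lazily, so giant values are never held as a list of pairs.
    """
    for line in lines:
        fields = line.rstrip("\n").split(sep)
        key, values = fields[0], fields[1:]
        for value in values:
            if value:  # skip empty fields, e.g. from a trailing separator
                yield key, value


def run(stdin: TextIO, stdout: TextIO) -> None:
    """Read 'key;v1;v2;...' lines and write one 'key;value' line per pair."""
    for key, value in iter_pairs(stdin):
        stdout.write(f"{key};{value}\n")
```

Calling `run(sys.stdin, sys.stdout)` from a script's entry point would turn this into a filter that can sit in a shell pipeline between other stream tools.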
@audrism, could you please update https://ssc-oscar.github.io/oscar.py/, so that the information that
@jnareb I'm planning to fix this instead of deprecating. I cannot give a timeline, but I'll update the issue once it's done.
@user2589 thanks a lot in advance. As I see it (but I might be wrong, because I have not taken a look at the code), it could look like this:
Does this set of steps look right?
Yes, this is correct. There are two (or maybe three) independent tasks, however
Potentially, a third task concerns handling multiple backends
Would it use
When trying to find all commits that changed a given file (and then all projects that included the given file at some point), I tried to follow the example from the oscar.py documentation: https://ssc-oscar.github.io/oscar.py/#oscar.File
Unfortunately, it does not work, and instead returns the following error:
In the discussion for woc-hack/mining-challenge-msr-2023#2 (where I originally created an issue for this problem) @audrism wrote:
I propose that oscar use the same technique, that is, use a prior version of the f2c file if the current version does not exist (possibly also giving a warning).
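A minimal sketch of what such a fallback could look like, under stated assumptions: that a map file name embeds a single-letter version (as the `U` in `f2cFullU.13.tch` appears to), and that prior versions use earlier letters. The function `resolve_with_fallback` and the path template are hypothetical and are not part of oscar.py.

```python
import os
import warnings
from typing import Callable, Optional


def resolve_with_fallback(
    template: str,
    version: str,
    shard: int,
    oldest: str = "A",
    exists: Callable[[str], bool] = os.path.exists,
) -> Optional[str]:
    """Return the newest existing path at or before `version`, else None.

    `template` is a hypothetical format string such as
    '/da5_fast/f2cFull{ver}.{shard}.tch'; `version` and `oldest` are
    single uppercase letters (an assumption about the naming scheme).
    """
    # Walk backwards from the requested version letter to the oldest one.
    for code in range(ord(version), ord(oldest) - 1, -1):
        candidate = template.format(ver=chr(code), shard=shard)
        if exists(candidate):
            if chr(code) != version:
                # Tell the caller that older data is being used.
                warnings.warn(
                    f"version {version} not found; falling back to {candidate}"
                )
            return candidate
    return None
```

The `exists` parameter is injectable only so the lookup can be exercised without touching the real filesystem; returning `None` leaves it to the caller to decide how to report a map that is missing in every version.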