Make data directory parent configurable #8
I really like the concept of what you're doing here; honestly I should have thought to make this configurable before. One question I do have: what do you think about, instead of os env variables, going with a command line argument for when you call the API? Since this is more of a one-off app, I wasn't sure if folks might feel more comfortable doing that than adding env variables for it. The python code might look something like this:
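A minimal sketch of that idea, assuming `argparse` and two hypothetical helper names (`parse_database_dir`, `data_paths`). Only the `--database_dir` flag and the two directory names (`wiki-dataset`, `txtai-wikipedia`) come from this thread; the default of the current directory is an assumption.

```python
import argparse
import os


def parse_database_dir(argv=None):
    """Return the parent directory for the data folders (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Data directory options (sketch)")
    parser.add_argument(
        "--database_dir",
        default=".",  # assumption: fall back to the current directory
        help="Parent directory that holds wiki-dataset and txtai-wikipedia",
    )
    # argv=None makes argparse read sys.argv[1:], as in a normal script run.
    args = parser.parse_args(argv)
    return os.path.abspath(args.database_dir)


def data_paths(parent_dir):
    """Build the two data directory paths under the chosen parent."""
    return (
        os.path.join(parent_dir, "wiki-dataset"),
        os.path.join(parent_dir, "txtai-wikipedia"),
    )


# Example: same effect as passing "--database_dir /data/wiki" on the command line.
parent = parse_database_dir(["--database_dir", "/data/wiki"])
print(data_paths(parent))
```

Keeping the parsing in one small function means the wrapper scripts only have to forward their arguments unchanged.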
With this, you'd call
For the bat file, we need to capture the argument and pass it along to the python script. We already do something similar in Wilmer. Our .bat file might look like this:
Then we'd call the bat file with
For the .sh file, it might look like:
Thoughts? It expands the amount of code you'd add in the PR, but I feel like less tech-savvy folks, or folks just wanting to run this temporarily on a computer, might feel more comfortable adding --database_dir to the command line than setting an env variable.
I actually would have preferred that, but since you are using I'll amend my commit and use
Any preference on which line to put the
I don't have Windows, so I can't test the bat script. But I could write a bash equivalent.
Application accepts a CLI argument to override the default path for the database directories. Added a bash script that was tested on Linux. It might also work on macOS. The Windows bat script still needs modifications, but I had no way to test it.
Force-pushed from b7b9281 to f99cdb7.
lol! Honestly I had just let PyCharm arrange them, so I don't mind at all wherever it goes. I think at the top looks just fine to me. I can tackle the bat file and can test on Windows. That won't be a problem at all! If you have Linux and can test the .sh file for it, feel free to pull it out of untested. I left them in for 2 reasons:
I did this in my updated commit. Changed it a bit, though.
Maybe it has to do with git-lfs. Hugging Face uses this extension for large files, models etc. If you don't have it installed, you probably only get a file pointer but not the actual file data. Large files in git repos cause all kinds of problems. Git-lfs fixes that.
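One way to check for this, as a rough sketch: a Git LFS pointer is a small text file whose first line starts with `version https://git-lfs.github.com/spec/v1`, so a file that begins with those bytes was most likely never downloaded. The helper names `is_lfs_pointer` and `find_lfs_pointers` are hypothetical and not part of this project.

```python
from pathlib import Path

# Opening bytes of a Git LFS pointer file, per the Git LFS pointer format.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file looks like an LFS pointer rather than real data."""
    with open(path, "rb") as handle:
        head = handle.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


def find_lfs_pointers(root):
    """List files under root that still look like LFS pointers (hypothetical helper)."""
    return sorted(
        p for p in Path(root).rglob("*") if p.is_file() and is_lfs_pointer(p)
    )
```

If `find_lfs_pointers` returns anything for a cloned dataset directory, installing git-lfs and running `git lfs pull` in that clone should fetch the actual file contents.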
I completed your pull request, but you didn't get added to the contributor list. This happened to Jeff on Wilmer and to mrrober on both Wilmer and this project. It has something to do with your local email or username not matching your GitHub handle. If you'd like to show up on that contributor list, see if you can fix that and then just make a small update to the readme or something.
Ok, thanks for telling me. I think my commit email is not recognized by GitHub. Will change it for future commits.
Added environment variable to set a custom parent directory for wiki-dataset and txtai-wikipedia.
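For comparison, the environment-variable approach could look roughly like the sketch below. The variable name `WIKI_DATA_PARENT_DIR` and the helper `parent_dir_from_env` are hypothetical, since the thread does not state the real names, and the current-directory default is an assumption.

```python
import os

# Hypothetical variable name; the thread does not state the real one.
ENV_VAR = "WIKI_DATA_PARENT_DIR"


def parent_dir_from_env(environ=os.environ, default="."):
    """Return the configured parent directory, or the default when unset or empty."""
    return os.path.abspath(environ.get(ENV_VAR) or default)


# wiki-dataset and txtai-wikipedia would then live under this directory.
print(parent_dir_from_env())
```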