-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Package code, make pip installable, and add CLI #3
Closed
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
- move all code into src/drugex folder to comply with the Python community standard - updated imports to reflect package structure - optimize imports (removed unused, fix order with PyCharm) - updated all module docstrings to use Sphinx style - made sure all modules had a main function (or a more_main function for ones that already had one). This involved changing some functions and variable passing because of the scope problems that came from code in if __name__ == '__main__'
Leave the readme with installation in development mode because the repo still relies on hard coded file paths
It looks like the data for ZINC is already available in the repo (see XuhanLiu#2) but I reorganized some fo the code in main() to make more extensible for later - added better tqdm logging - excised regex to save time on parsing/compilation
Works with RF, didn't work with DNN. Didn't test the rest, but added them to README.
- Reorganize the way the log / output models are named - add CLI and update README
After some number of hours, I got the following error. I'm pretty sure it's not due to any changes I've made, but maybe you have some ideas. $ python -m drugex.pretrainer -d data/ -o output/
Exploitation network begins to be trained...
/Users/cthoyt/dev/DrugEx/src/drugex/util.py:119: FutureWarning: read_table is deprecated, use read_csv instead, passing sep='\t'.
df = pd.read_table(df)
Traceback (most recent call last):
File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/Users/cthoyt/dev/DrugEx/src/drugex/pretrainer.py", line 74, in <module>
output_directory=output_directory,
File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/click/core.py", line 764, in __call__
return self.main(*args, **kwargs)
File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/click/core.py", line 717, in main
rv = self.invoke(ctx)
File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/click/core.py", line 956, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/click/core.py", line 555, in invoke
return callback(*args, **kwargs)
File "/Users/cthoyt/dev/DrugEx/src/drugex/pretrainer.py", line 69, in main
def main(input_directory, output_directory, batch_size, cuda, use_tqdm):
File "/Users/cthoyt/dev/DrugEx/src/drugex/pretrainer.py", line 34, in _main_helper
if not os.path.exists(net_pr_pickle_path):
File "/Users/cthoyt/dev/DrugEx/src/drugex/model.py", line 419, in fit
best_error = np.inf
File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 560, in __next__
batch = self.collate_fn([self.dataset[i] for i in indices])
File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 560, in <listcomp>
batch = self.collate_fn([self.dataset[i] for i in indices])
File "/Users/cthoyt/dev/DrugEx/src/drugex/util.py", line 135, in __getitem__
encoded = self.voc.encode(self.tokens[i])
File "/Users/cthoyt/dev/DrugEx/src/drugex/util.py", line 83, in encode
arr[i] = self.tk2ix[char]
KeyError: '[BH-]' |
Hi,
The reason is that '[BH-]' is not contained in the data/voc.txt, you can add it manually.
Cheers,
Xuhan
…------------------ 原始邮件 ------------------
发件人: "Charles Tapley Hoyt"<[email protected]>;
发送时间: 2019年6月24日(星期一) 晚上11:10
收件人: "XuhanLiu/DrugEx"<[email protected]>;
抄送: "Subscribed"<[email protected]>;
主题: Re: [XuhanLiu/DrugEx] Package code, make pip installable, and add CLI(#3)
After some number of hours, I got the following error. I'm pretty sure it's not due to any changes I've made, but maybe you have some ideas.
$ python -m drugex.pretrainer -d data/ -o output/ Exploitation network begins to be trained... /Users/cthoyt/dev/DrugEx/src/drugex/util.py:119: FutureWarning: read_table is deprecated, use read_csv instead, passing sep='\t'. df = pd.read_table(df) Traceback (most recent call last): File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/runpy.py", line 193, in _run_module_as_main "__main__", mod_spec) File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/runpy.py", line 85, in _run_code exec(code, run_globals) File "/Users/cthoyt/dev/DrugEx/src/drugex/pretrainer.py", line 74, in <module> output_directory=output_directory, File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/click/core.py", line 764, in __call__ return self.main(*args, **kwargs) File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/click/core.py", line 717, in main rv = self.invoke(ctx) File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/click/core.py", line 956, in invoke return ctx.invoke(self.callback, **ctx.params) File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/click/core.py", line 555, in invoke return callback(*args, **kwargs) File "/Users/cthoyt/dev/DrugEx/src/drugex/pretrainer.py", line 69, in main def main(input_directory, output_directory, batch_size, cuda, use_tqdm): File "/Users/cthoyt/dev/DrugEx/src/drugex/pretrainer.py", line 34, in _main_helper if not os.path.exists(net_pr_pickle_path): File "/Users/cthoyt/dev/DrugEx/src/drugex/model.py", line 419, in fit best_error = np.inf File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 560, in __next__ batch = self.collate_fn([self.dataset[i] for i in indices]) File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 560, in <listcomp> batch = self.collate_fn([self.dataset[i] for i in indices]) File "/Users/cthoyt/dev/DrugEx/src/drugex/util.py", line 135, in __getitem__ encoded = self.voc.encode(self.tokens[i]) File "/Users/cthoyt/dev/DrugEx/src/drugex/util.py", line 83, in encode arr[i] = self.tk2ix[char] KeyError: '[BH-]'
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub, or mute the thread.
|
martin-sicho
added a commit
to martin-sicho/DrugEx
that referenced
this pull request
Nov 12, 2019
…rly stages of learning XuhanLiu/DrugEx/XuhanLiu#3 (XuhanLiu#3 (comment))
martin-sicho
added a commit
to martin-sicho/DrugEx
that referenced
this pull request
Nov 15, 2019
Closes XuhanLiu#1, Closes XuhanLiu#2, Closes XuhanLiu#5, Closes XuhanLiu#6 (Integration of changes from PR XuhanLiu#3, also closes PR XuhanLiu#7 since those changes are now merged with the integrated changes from PR XuhanLiu#3) # Conflicts: # .gitignore # README.md # agent.py # pretrainer.py # src/drugex/dataset.py # src/drugex/designer.py # src/drugex/environ.py # src/drugex/model.py
Quite a few years later, but we now have an improved version that also includes CLI, packaging and other goodies: https://github.com/CDDLeiden/DrugEx :) |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Closes #1
This is a really big PR, so very quickly it does a few things:
setup.cfg
andsetup.py
and update imports so it can be pip installed (+ updated README)click
(+ updated README)As I'm making updates, I'm running the code to see if I can reproduce the results. I've had to change a few things around in the process:
if __name__ == '__main__'
when there were global variablestqdm
I'm submitting it as a draft PR. After the conference in Sheffield, I was excited to get back to work and try it out, but it was more than I could handle/my computer could process in one day. I'll continue updating the other scripts so they can be ran with the CLI until I've reproduced everything, then I'll give feedback on what I wasn't able to.
I would be happy to explain any of the changes I made in more detail, or give you some resources that I've read along the way while I was learning all of this packaging stuff.