
Generative Adversarial Networks #55

Open · maxhodak opened this issue Jan 14, 2017 · 13 comments

maxhodak (Owner) commented Jan 14, 2017

In case you guys haven't seen it, this paper came out recently and looks kind of interesting: https://arxiv.org/abs/1701.01329

My first couple of read-throughs leave me with some questions. The paper triggers a couple of my first-order heuristics (explaining basic stuff like RNNs, seemingly magical performance at generating long valid SMILES that suggests overfitting) and has kind of a weird application of fine-tuning as transfer learning, among other things. I'm planning on working up some parts of this paper, like the stacked LSTMs as a SMILES generator for transfer to a property-prediction network, over the weekend. Anyone else have comments on this paper or things to try?

@pechersky @dribnet @dakoner
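For concreteness, the stacked-LSTM generator I have in mind looks roughly like this. This is only a sketch; the layer sizes, vocabulary size, and max length are placeholders, not the paper's values:

```python
# Sketch: 3-layer stacked LSTM that predicts the next SMILES character
# at every position. Sizes here are placeholders, not the paper's settings.
from keras.models import Sequential
from keras.layers import LSTM, Dense, TimeDistributed

n_chars = 35   # SMILES character vocabulary size (assumed)
maxlen = 120   # maximum SMILES length (assumed)

model = Sequential()
model.add(LSTM(256, return_sequences=True, input_shape=(maxlen, n_chars)))
model.add(LSTM(256, return_sequences=True))
model.add(LSTM(256, return_sequences=True))
# Per-timestep softmax over the character vocabulary.
model.add(TimeDistributed(Dense(n_chars, activation='softmax')))
model.compile(loss='categorical_crossentropy', optimizer='adam')
```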

pechersky (Collaborator) commented Jan 17, 2017 via email

maxhodak (Owner) commented Jan 17, 2017

So I let a 3-layer stacked LSTM run overnight on Saturday and the loss fell to near zero, but it was definitely overfitting; it clearly wasn't extracting any interesting information about the underlying chemistry. At that point I got distracted by the idea of using a GAN instead, which is what I've been working on since. It's pretty difficult to get it to train well, since the discriminator's task is much easier to learn than the generator's (discriminating is pretty trivial while the generator is weak), so I haven't yet figured out how to keep the two in reasonable balance. I'm planning on asking a couple of friends at OpenAI for advice later today. I'll post my ipynb once I have it working a little better!
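For anyone following along, the alternation I'm running is roughly the standard Keras GAN pattern below; toy dense models so it runs standalone, with all sizes made up (the real version operates on one-hot SMILES sequences, not flat vectors):

```python
# Toy sketch of the alternating GAN updates. All sizes are made up.
import numpy as np
from keras.models import Sequential, Model
from keras.layers import Dense, Input

latent_dim, data_dim, batch = 16, 32, 64

generator = Sequential([
    Dense(64, activation='relu', input_dim=latent_dim),
    Dense(data_dim, activation='tanh'),
])
discriminator = Sequential([
    Dense(64, activation='relu', input_dim=data_dim),
    Dense(1, activation='sigmoid'),
])
discriminator.compile(loss='binary_crossentropy', optimizer='adam')

# Freeze D inside the stacked model so the generator update moves only G.
discriminator.trainable = False
z = Input(shape=(latent_dim,))
gan = Model(z, discriminator(generator(z)))
gan.compile(loss='binary_crossentropy', optimizer='adam')

real = np.random.normal(size=(batch, data_dim))  # stand-in for real data
for step in range(200):
    noise = np.random.normal(size=(batch, latent_dim))
    fake = generator.predict(noise)
    # Discriminator sees real as 1, fake as 0.
    d_loss = discriminator.train_on_batch(real, np.ones((batch, 1)))
    d_loss += discriminator.train_on_batch(fake, np.zeros((batch, 1)))
    # Generator is rewarded when D calls its fakes "real".
    g_loss = gan.train_on_batch(noise, np.ones((batch, 1)))
```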

pechersky (Collaborator) commented Jan 17, 2017 via email

maxhodak (Owner) commented:

I'm using presence/absence in the training set. SMILES validity is an arguably even easier metric to satisfy, as CCCCCcccccccccccccccccccccccccccccccccc is valid SMILES but not representative of the distribution we want to learn.
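(For reference, checking validity is a one-liner with RDKit, which is part of why it's so easy to game:)

```python
# SMILES validity check via RDKit: MolFromSmiles returns None when a
# string fails to parse or sanitize.
from rdkit import Chem

def is_valid_smiles(s):
    return Chem.MolFromSmiles(s) is not None
```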

maxhodak (Owner) commented:

This is pretty typical of attempts to train my network right now:

[screenshot: training loss curves, 2017-01-17 11:16 AM]

Sampling from it gives me stuff that looks like Caaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

pechersky (Collaborator) commented Jan 17, 2017 via email

maxhodak (Owner) commented:

On pretraining: worth noting that if I don't pretrain the generator, no interesting training happens at all when I try to train the GAN; discriminator loss just goes to 0 and generator loss goes to ~16. It's not clear whether pretraining the discriminator matters, or whether it actually makes things worse.

Some posts suggest changing learning parameters at runtime depending on which side is "advantaged" (a sketch of that heuristic is below); see https://github.com/torch/torch.github.io/blob/master/blog/_posts/2015-11-13-gan.md

Some more ideas here I haven't worked through yet: https://github.com/soumith/ganhacks
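The version of the torch post's balancing trick I've been experimenting with is roughly this; the thresholds are arbitrary guesses on my part, not anything from the post:

```python
# Sketch of runtime balancing: skip updates for whichever network is
# "advantaged", i.e. its loss is tiny while the other side's has blown up.
def should_train(my_loss, other_loss, margin=0.3, gap=1.0):
    return not (my_loss < margin and other_loss > gap)

# In the training loop (d_loss / g_loss carried over from the last batch):
#   if should_train(d_loss, g_loss): update the discriminator
#   if should_train(g_loss, d_loss): update the generator
```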

maxhodak changed the title from "Generating Focussed Molecule Libraries for Drug Discovery with Recurrent Neural Networks by Segler et al" to "Generative Adversarial Networks" on Jan 17, 2017
pechersky (Collaborator) commented Jan 17, 2017 via email

maxhodak (Owner) commented Jan 17, 2017

I'm not sure that matters... this isn't an autoencoder; the input is just a source of entropy. The nonlinearities in the generator network mean the distribution of the input need not resemble the distribution of the output, unless I've misunderstood something.
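A toy illustration of the point, in plain numpy: unimodal Gaussian noise pushed through a nonlinearity comes out with a completely different (here bimodal) shape:

```python
import numpy as np

z = np.random.normal(size=100000)  # unimodal input noise
x = np.tanh(4 * z)                 # nonlinear "generator"

# Mass piles up near -1 and +1: two modes out of one.
hist, _ = np.histogram(x, bins=10, range=(-1, 1))
print(hist)
```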

maxhodak (Owner) commented:

I've got something looking much better now after working in a bunch of the tricks linked above, though it still has a lot of room for improvement:

[screenshot: training loss curves, 2017-01-17 3:36 PM]

After 200 iterations the generator samples out stuff like:

CCcccC
CCcccN
CCcccc
CCcccC
CCcCc
CCCcccccc
CCcccC

Updated notebook at https://github.com/maxhodak/keras-molecules/blob/gan/SMILES_GAN.ipynb
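For reference, "sampling" here means drawing one character per position from the generator's softmax outputs, with a temperature knob; this is a sketch rather than the notebook's exact code:

```python
import numpy as np

def sample_char(probs, temperature=1.0):
    # Reweight a softmax distribution by temperature and draw one
    # character index from it.
    logits = np.log(np.asarray(probs) + 1e-8) / temperature
    p = np.exp(logits)
    p /= p.sum()
    return np.random.choice(len(p), p=p)
```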

pechersky (Collaborator) commented Jan 18, 2017 via email

XericZephyr commented:

Hey guys, glad I found this thread. I'm also working in this field: I've been trying to use a seq2seq model to produce an unsupervised fingerprint for each molecule, and I'm also considering a GAN as future work. Does anyone have any updates on this GAN idea?
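Roughly what I mean by the seq2seq fingerprint, as a sketch (sizes and names are placeholders, not my actual model):

```python
# Sketch: train a SMILES autoencoder; the encoder's final LSTM state
# serves as an unsupervised "fingerprint". Sizes are placeholders.
from keras.models import Model
from keras.layers import Input, LSTM, Dense, TimeDistributed, RepeatVector

n_chars, maxlen, fp_dim = 35, 120, 256

inp = Input(shape=(maxlen, n_chars))
fingerprint = LSTM(fp_dim)(inp)                  # encoder: last hidden state
h = RepeatVector(maxlen)(fingerprint)            # feed it to the decoder
h = LSTM(fp_dim, return_sequences=True)(h)
out = TimeDistributed(Dense(n_chars, activation='softmax'))(h)

autoencoder = Model(inp, out)
autoencoder.compile(loss='categorical_crossentropy', optimizer='adam')
encoder = Model(inp, fingerprint)                # fingerprint extractor
```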
