
How to turn off task-specific component in initial layer. #23

Open
PavlosCh opened this issue Aug 28, 2019 · 6 comments
@PavlosCh

Hi Ayman,

I am trying to implement one of the architectures you mention in the paper related to this repository. Specifically, I am interested in MTL-DGP*, where the task-specific component is turned off.

I could not work out how to do this from reading the code. I understand that to switch off the shared components I set the MTL variable to false. Is there a similar way to do this for the task-specific components?

Thanks

@aboustati (Owner)

Hi Pavlos,

Apologies for the lack of good documentation. It is currently being worked on.

The multitask attribute in Layer objects propagates task labels through the cascade. It is not related to sharing or task-specific components.

To create a model resembling MTL-DGP*, you have to do this through the kernels. More specifically, you should use a kernel that acts on all your data regardless of task label. The SwitchedKernel object applies different kernels to different tasks, so to make a middle layer completely shared, refrain from using that kernel in its Layer. You can still use the MultiKernelLayer object if you want your latent space to contain multiple types of processes, even if they are all shared between the tasks.
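As a rough illustration of the shared-vs-switched distinction (a toy numpy sketch under my own naming, not the repository's actual SwitchedKernel implementation; the block-diagonal-by-task assumption is mine):

```python
import numpy as np

def rbf(X1, X2, lengthscale=1.0):
    # Squared-exponential kernel on the feature columns.
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def switched_cov(X1, X2, kernels):
    """Toy switched covariance: the last column of each input is an
    integer task label; same-task pairs use that task's kernel, and
    cross-task pairs get zero covariance (block-diagonal by task)."""
    F1, t1 = X1[:, :-1], X1[:, -1].astype(int)
    F2, t2 = X2[:, :-1], X2[:, -1].astype(int)
    K = np.zeros((len(X1), len(X2)))
    for t, k in enumerate(kernels):
        m1, m2 = t1 == t, t2 == t
        if m1.any() and m2.any():
            K[np.ix_(m1, m2)] = k(F1[m1], F2[m2])
    return K

def shared_cov(X1, X2, kernel):
    """Fully shared covariance: the task label is simply ignored,
    so all tasks see one common process."""
    return kernel(X1[:, :-1], X2[:, :-1])
```

A fully shared middle layer corresponds to the `shared_cov` style everywhere, i.e. never branching on the task label.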

I hope this clarifies things. Please let me know if you have any follow up questions.

Ayman

@rafaelcgon

Hello Ayman,

First of all, congratulations on your amazing work, and thanks for sharing your code.
I am reading your multi-task notebook and trying to relate it to the MTL-DGP models from the paper. I have a few questions which I think are related to Pavlos's.

The model in the notebook apparently creates a two-layer model, with an input layer composed of 2 shared kernels and an output layer with task-specific kernels. Are my assumptions correct?

If so, how would you create something similar to MTL-DGP, with an input layer with shared and task-specific components, and an output layer that combines those two?

Lastly, why does the multi-kernel layer have the restriction that the output dimension must be a multiple of the number of kernels? Thinking about shared kernels and using the paper's notation (equation 1), you could have i = 1 to I GP priors, with I independent of the number of tasks, right?

Thanks,
Rafael

@aboustati (Owner)

Hi Rafael,

Thanks for your interest and kind words about my work.

Your assumption about the model in the notebook is correct. If you want to create something similar to mMDGP (MTL-DGP in the old version of the paper), you need to use the SwitchedKernel object. In my code, task-specific processes are handled at the kernel level, and a SwitchedKernel wrapped around a list of kernels handles this. The output layer should stay the same as in the notebook.
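One way to picture a layer with both shared and task-specific components, as asked above, is a covariance that sums a kernel common to all tasks with a per-task kernel added only inside each task's block. This is a hand-rolled numpy sketch under my own naming, not the repository's API:

```python
import numpy as np

def rbf(X1, X2, lengthscale=1.0):
    # Squared-exponential kernel on the feature columns.
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def shared_plus_specific_cov(X1, X2, shared_ls, task_ls):
    """Toy mMDGP-style covariance: one kernel shared across all tasks,
    plus a task-specific kernel contributing only within each task's
    block. The last input column holds the integer task label."""
    F1, t1 = X1[:, :-1], X1[:, -1].astype(int)
    F2, t2 = X2[:, :-1], X2[:, -1].astype(int)
    K = rbf(F1, F2, shared_ls)            # shared component, all pairs
    for t, ls in enumerate(task_ls):      # task-specific components
        m1, m2 = t1 == t, t2 == t
        if m1.any() and m2.any():
            K[np.ix_(m1, m2)] += rbf(F1[m1], F2[m2], ls)
    return K
```

Cross-task covariance then comes only from the shared term, while same-task pairs get both contributions.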

The reason for restricting the output dimension to be a multiple of the number of kernels is purely for ease of implementation. In theory, you can use an arbitrary number of kernels and outputs associated with their processes; however, this was difficult to implement so I opted for this restriction.
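Concretely, the restriction means the layer can hand each kernel an equal slice of the output dimensions. A toy sketch of that bookkeeping (`assign_outputs` is a name I made up, not part of the codebase):

```python
def assign_outputs(num_outputs, num_kernels):
    """Split a layer's output dimensions into equal contiguous slices,
    one slice per kernel; this is only well-defined when the output
    dimension is a multiple of the number of kernels."""
    if num_outputs % num_kernels != 0:
        raise ValueError(
            "output dimension must be a multiple of the number of kernels")
    per = num_outputs // num_kernels
    return [list(range(k * per, (k + 1) * per)) for k in range(num_kernels)]
```

With an arbitrary number of kernels you would instead need an explicit kernel-to-output mapping, which is the implementation burden the restriction avoids.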

For a more up-to-date version of the code, compatible with the alpha version of GPflow 2.0, check out the gpflow2.0-port branch. The multitask notebook in this branch contains an example of how to construct mMDGP. This version depends on my own fork of GPflow, which is included in the requirements.txt file. To install the requirements, you can create a virtual environment and run pip install -r requirements.txt. There are plans to port this to the stable version of GPflow 2.0, but I haven't had the time yet.

Let me know if you have any further questions, and good luck with your work!

@rafaelcgon

Thanks a lot for the quick reply. What version of TensorFlow do you run with the gpflow2.0-port branch?

@aboustati (Owner)

This is compatible with TensorFlow 2.0 (not 2.1). The exact version is 2.0.0-dev20190916. I know this is quite brittle, so I am happy to help with getting it to work. I will be porting this to the stable version of GPflow in the next few months.

@rafaelcgon

I got it working with TensorFlow 2.0, thanks!
