
NaN predictions when the inference is outside of the training data. #32

Open

mat-ej opened this issue Sep 25, 2023 · 3 comments
@mat-ej commented Sep 25, 2023

Hi, @wil-j-wil

I randomly stumbled upon your work while researching temporal GPs and found this cool package. (Thanks for all the awesome work behind it!)

I am running into the following issue when using your package:

  • whenever I use a more complex kernel, all of my predictions outside the range of X_train are NaNs.

For example, the airline passengers dataset with a default setup:

import bayesnewton  # assuming the standard bayesnewton layout: kernels, likelihoods and models submodules

kernel = bayesnewton.kernels.QuasiPeriodicMatern32()
likelihood = bayesnewton.likelihoods.Gaussian()
model = bayesnewton.models.MarkovVariationalGP(kernel=kernel, likelihood=likelihood, X=X_train, Y=y_train)
  • Again, whenever I predict a value "within" X_train, it works well:
mean_in, var_in = model.predict(X=X_train[-1]) # inside X_train, E[f] = "perfetto"
  • Whenever I try to "extrapolate", I get NaNs:
mean_out, var_out = model.predict(X=X_test[0]) # outside X_train, E[f] = nan

I also get NaN for the first observation of X_train (see plots below).

  • This does not happen when using basic kernels such as:
kernel = Matern12()
  • Specifically, it does happen whenever I try to use a sum of kernels or any of the combination kernels.

[Plot: matern12]
[Plot: qperiodmatern32]

Am I perhaps misunderstanding the purpose of the model, or doing something wrong? (Thanks in advance for your help. I am just a beginner GP enthusiast looking into what these models are capable of.)

P.S. I installed bayesnewton (hopefully) according to the requirements:

[tool.poetry.dependencies]
python = "3.10.11"
tensorflow-macos = "2.13.0"
tensorflow-probability = "^0.21.0"
numba = "^0.58.0"
gpflow = "^2.9.0"
scipy = "^1.11.2"
pandas = "^2.1.1"
jax = "0.4.2"
jaxlib = "0.4.2"
objax = "1.6.0"
ipykernel = "^6.25.2"
plotly = "^5.17.0"
seaborn = "^0.12.2"
nbformat = "^5.9.2"
scikit-learn = "^1.3.1"
convertbng = "^0.6.43"
@ThoreWietzke (Contributor) commented

Hi mat-ej,

I've also encountered this bug. It happens inside model.predict: effectively, inf and -inf are appended to either side of your input data. Because the periodic kernel now uses expm() to calculate the discretized state transition matrix, this results in NaNs.
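
A minimal sketch of that failure mode (the feedback matrix F below stands in for a single periodic state-space component and is purely illustrative, not bayesnewton's actual code):

import numpy as np
from scipy.linalg import expm

w = 2 * np.pi                # angular frequency of the periodic component
F = np.array([[0.0,  -w],
              [w,   0.0]])   # harmonic-oscillator feedback matrix; note the zero entries

dt = np.inf                  # the gap to the appended edge point is effectively infinite
print(F * dt)                # the zero entries become 0 * inf = nan
print(expm(F * dt))          # expm of a matrix containing NaN can only return NaNs

Because a periodic component's feedback matrix always contains zeros, an effectively infinite time gap poisons the discretized transition matrix.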

I don't know why this is done, so @wil-j-wil has to enlighten us.

@wil-j-wil (Collaborator) commented

Apologies for the delay in replying to this. Thanks for pointing out the issue. This is indeed due to the use of expm when the gap between time steps is too large, which is the case at the edges, where we append a very large number. We append like this because it's needed when using the "doubly sparse" model (i.e., sparse in time). We also append in the non-sparse case because this conveniently allowed us to share much of the prediction code between the different models. We'll have to come up with a better solution.
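
To make the padding concrete, here is a sketch of what is being described (the sentinel value and toy inputs are made up, not bayesnewton's actual code):

import numpy as np

X_train = np.linspace(0.0, 10.0, 50)                   # toy training inputs
X_padded = np.concatenate([[-1e10], X_train, [1e10]])  # huge sentinel points at both edges
dt = np.diff(X_padded)
print(dt[0], dt[-1])   # ~1e10: time gaps this large are what break expm-based transitions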

@wil-j-wil (Collaborator) commented

I have now implemented closed form transition matrices for the periodic and quasi-periodic kernels. This means that (almost) all of the implemented kernels have closed form solutions and don't need to use expm. So this should fix your issue @mat-ej. In addition, these kernels are now much more efficient, which should also help @ThoreWietzke if you are still working with these models.
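
For the periodic component the closed form is simply a rotation: for F = [[0, -w], [w, 0]], expm(F * dt) = [[cos(w*dt), -sin(w*dt)], [sin(w*dt), cos(w*dt)]]. A sketch of that identity (illustrative, not the library's implementation):

import numpy as np
from scipy.linalg import expm

def periodic_transition(w, dt):
    # closed-form expm(F * dt) for F = [[0, -w], [w, 0]]: a rotation by w * dt
    c, s = np.cos(w * dt), np.sin(w * dt)
    return np.array([[c, -s],
                     [s,  c]])

w, dt = 2 * np.pi, 0.1
assert np.allclose(periodic_transition(w, dt),
                   expm(np.array([[0.0, -w], [w, 0.0]]) * dt))
print(periodic_transition(w, 1e10))  # well-defined for any finite gap, with no call to expm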

I am going to leave this issue open because the bug still exists whenever expm is used, which would be the case if someone implements a custom kernel without a closed form solution for the transition matrix.
