Maximum Likelihood Demonstration #26

Draft · wants to merge 4 commits into main

Conversation

charlesknipp (Collaborator)

Overview

I added a quick little MLE demonstration, which should work with almost every AD backend (more on this below). The example is adapted from Kalman.jl and uses Optimisers.jl to build a custom Newton's method.
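For context, the core of such a demonstration is a Newton update on the negative log-likelihood. The sketch below is not the PR's code (which wraps the update as an Optimisers.jl rule); it is a minimal, self-contained illustration using a hypothetical Gaussian toy objective with ForwardDiff supplying the gradient and Hessian:

```julia
using ForwardDiff, LinearAlgebra

# toy objective (illustrative only): Gaussian NLL in θ = [μ, log σ]
data = randn(100) .* 2.0 .+ 1.0
nll(θ) = 0.5 * sum(abs2, data .- θ[1]) / exp(2θ[2]) + length(data) * θ[2]

# plain Newton iteration: θ ← θ − H⁻¹ g
function newton(f, θ; iters = 20)
    for _ in 1:iters
        g = ForwardDiff.gradient(f, θ)
        H = ForwardDiff.hessian(f, θ)
        θ = θ - H \ g
    end
    return θ
end

θ̂ = newton(nll, [0.0, 0.0])  # maximum-likelihood estimates of μ and log σ
```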

A Note on Automatic Differentiation

All the AD here is done via DifferentiationInterface.jl, which gives us a universal interface: we can quickly test various backends and switch to whichever suits the user's needs. Luckily, in this instance, every relevant backend can at least evaluate gradients. Unfortunately, since this is a Newton's method, the Hessian fails when using Enzyme. This is likely due to a bug in DifferentiationInterface, so it's not our problem.
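To show what "switching backends" looks like in practice, here is a minimal sketch of the DifferentiationInterface pattern (the function `f` is a placeholder, not the PR's objective; recent DI versions re-export the ADTypes backend structs):

```julia
using DifferentiationInterface  # re-exports AutoForwardDiff, AutoZygote, ...
import ForwardDiff, Zygote      # a backend is only usable once its package is loaded

f(x) = sum(abs2, x)

# swapping backends is a one-argument change under DI's common interface
for backend in (AutoForwardDiff(), AutoZygote())
    @show gradient(f, backend, ones(3))
end
```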

Speaking of Enzyme, it requires type stability to efficiently compute gradients on the lowered code, which allowed me to catch an instability in the Kalman filter. See my post here where one of the Enzyme devs caught the issue.
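A quick way to catch this kind of instability locally is `@inferred` from the Test stdlib, which errors whenever the compiler cannot infer a concrete return type. The functions below are made-up examples, not the filter code:

```julia
using Test

unstable(x) = x > 0 ? 1 : 1.0   # returns Int or Float64 ⇒ type-unstable
stable(x)   = x > 0 ? 1.0 : -1.0

@inferred stable(2.0)       # passes silently
# @inferred unstable(2.0)   # errors: return type does not match inferred Union{Float64, Int64}
```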

Requests

  • I don't have the environment properly set up since GeneralisedFilters is not on the official registry, and I can't figure out how to add it to Project.toml
  • The SSMProblems dependency is not yet up to date, and I needed to run pkg> dev SSMProblems for it to execute
  • If you have any suggestions, feel free to tear it to shreds

@charlesknipp (Collaborator, Author)

I have been testing various backends and there seem to be some problems. I'm not entirely sure it's our fault, but Zygote, Enzyme, and Mooncake all yield some sort of error.

Zygote fails rather spectacularly because the observation coefficient matrix is non-square. I'm not sure how that even factors into the computation of the gradient; and, honestly, if Zygote is the one backend we can't get working, I'm okay with that.

Both Enzyme and Mooncake error specifically when computing the Hessian. For Mooncake, it's a recursive tangent type error; I'm not entirely sure why this wasn't an issue for the gradient, but I suspect something deeper is wrong. Enzyme has issues with the additional context (the Constant seen in the objective), which is an underlying bug in how it interacts with DifferentiationInterface. Lastly, the gradient computation for both is incredibly slow, even for short series, which leads me to believe there might be something else going on under the hood.
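For readers unfamiliar with DI's context mechanism, here is a sketch of the pattern in question: the data are passed as a non-differentiated `Constant` argument alongside the parameters. The objective is a made-up stand-in, and ForwardDiff is used as a backend where both calls succeed:

```julia
using DifferentiationInterface
import ForwardDiff

# hypothetical objective; `data` enters as a DI `Constant` context,
# the same pattern that reportedly trips up Enzyme's Hessian path
nll(θ, data) = 0.5 * sum(abs2, data .- θ[1]) / exp(2θ[2]) + length(data) * θ[2]

θ, data = [0.0, 0.0], randn(50)
backend = AutoForwardDiff()
g = gradient(nll, backend, θ, Constant(data))
H = hessian(nll, backend, θ, Constant(data))
```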

Takeaways

Of all the working interfaces, I'm glad it's ForwardDiff, since it's a stable and widely adopted framework. That said, the future of AD clearly lies with Mooncake and Enzyme, and in the future I'd really like to get those two backends working consistently.

@THargreaves (Collaborator)

> I don't have the environment properly set up since GeneralisedFilters is not on the official registry, and I can't figure out how to add it to Project.toml

I've just been deving it using the relative path. But yes, we should get it registered soon. I was just waiting to tidy things up a touch more, but it should be there by the end of the week.

@THargreaves (Collaborator)

All sounds very strange, especially with it not working for non-square H. Glad the ForwardDiff version is working though.

I'll see if Will (Mooncake.jl author) can have a quick look and see if anything stands out to him.
