
Added qmc_quad based method for estimation of the constrained normalization factor #839

Draft: wants to merge 13 commits into main

Conversation

@JasperMartins (Contributor) commented Oct 25, 2024

This PR implements a new method to estimate the normalization factor for constrained priors.
The changes are two-fold:

  1. The integration is performed with scipy.integrate.qmc_quad, a quasi-Monte Carlo-based integration routine that is expected to yield better results than regular Monte Carlo integration. However, the routine requires a rescaling step from the unit cube rather than direct sampling (a sketch follows below this list).
  2. The termination of the integration is based on its relative statistical error rather than the number of accepted samples.
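
To make the approach concrete, here is a minimal, self-contained sketch of qmc_quad-based estimation of a constraint normalization factor. It is not the PR's actual code: rescale and constraint are hypothetical stand-ins for a 2D uniform prior on [-1, 1]² constrained to the unit circle, so the integral should come out near pi/4.

import numpy as np
from scipy.integrate import qmc_quad

def rescale(theta):
    # hypothetical stand-in: map unit-cube points (shape (d, n)) to prior samples
    return 2.0 * theta - 1.0  # uniform priors on [-1, 1]

def constraint(samples):
    # hypothetical stand-in: 1 inside the unit circle, 0 outside
    return (samples[0] ** 2 + samples[1] ** 2 < 1).astype(float)

def integrand(theta):
    # qmc_quad evaluates vectorized integrands on arrays of shape (d, n_points)
    return constraint(rescale(theta))

res = qmc_quad(integrand, a=np.zeros(2), b=np.ones(2))
print(res.integral, res.standard_error)  # expect roughly pi / 4 = 0.785...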

I have tested the two implementations with a relatively easy scenario: a 2D uniform prior on the [-1, 1]² square, constrained to an inscribed circle with different radii:

[Figure: runtime and relative error of the old and new methods across normalization factors]

The new method is significantly faster for high normalization factors, and the relative errors show a similar spread.

The implementation is marked as a draft because it requires the priors to implement a rescale method; it could therefore be good to keep the old method as a fallback. Also, the relative-error termination criterion could be applied just as well to the old implementation.

Related issue: #835

@ColmTalbot (Collaborator)

> The implementation is marked as a draft because it requires the priors to implement a rescale method; it could therefore be good to keep the old method as a fallback. Also, the relative-error termination criterion could be applied just as well to the old implementation.

We currently have a (soft) requirement that all priors should implement a rescale method (currently the base implementation just returns None, which is not ideal; see https://github.com/bilby-dev/bilby/blob/main/bilby/core/prior/base.py#L137-L153), so this approach should be safe.

Even if we make the base class raise an error when you attempt to rescale, it will still be possible to use some samplers in that case.

It's possible that people will implement their own prior subclasses that don't support rescale, so I'm not opposed to keeping a fallback. It may make everything easier if we change bilby.core.prior.base.Prior.rescale to raise a NotImplementedError, but that should probably get some more eyes and be done in a separate PR.
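
A minimal sketch of such a fallback check, assuming the current behavior where the base Prior.rescale returns None (supports_rescale is a hypothetical helper, not bilby API):

def supports_rescale(prior):
    # hypothetical helper: treat a None-returning base implementation and a
    # possible future NotImplementedError both as "rescale not supported"
    try:
        return prior.rescale(0.5) is not None
    except NotImplementedError:
        return False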

@ColmTalbot (Collaborator)

I think that actually the existing method won't work if a new prior doesn't implement rescale, since the default sample is built on rescale. I think it's sufficiently unlikely that people are manually defining sample without rescale.

@JasperMartins (Contributor, Author) commented Oct 30, 2024

I have updated the PR quite a bit. The core logic of the integration of the normalization factor is now handled by one of two functions: either MC integration based on samples from PriorDict.sample, or quasi-MC integration based on the rescale method. The user can choose which is used, but the code also checks whether the rescale method is implemented when qmc_quad is requested and defaults to from_samples if not. For both methods, termination of the integration is handled via a bound on the estimated relative error, and a max_trials kwarg can be used to limit the number of probability evaluations.

I have also optimized the from_samples implementation. Before, every time min_accept was not reached, new samples were appended to a list and the constraint was applied to the full list, yielding a steep increase in runtime with the number of iterations, even though it would have been sufficient to check only the new samples.
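
A minimal sketch of this incremental accumulation with relative-error termination; the names and defaults (sample_fn, constraint_fn, rtol) are illustrative, not the PR's actual signature:

import numpy as np

def estimate_normalization(sample_fn, constraint_fn, rtol=1e-2, max_trials=10**6, chunk=10**4):
    total = 0
    accepted = 0.0
    factor = 0.0
    while total < max_trials:
        # evaluate the constraint only on the newly drawn chunk
        passed = constraint_fn(sample_fn(chunk))
        total += chunk
        accepted += np.sum(passed)
        factor = accepted / total
        # standard error of a Bernoulli mean estimate
        std_err = np.sqrt(factor * (1.0 - factor) / total)
        if factor > 0 and std_err / factor < rtol:
            break
    return factor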

For the example I gave above, the qmc-based implementation is now actually slower than the from_samples method due to higher overhead. Priors that implement sample by just calling rescale on unit samples should perform much closer.
I still selected qmc_quad as the default because, as the attached plot shows, for normalization factors close to 1 (which is more likely in most applications), the relative error is smaller.
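
For context, the sample-via-rescale pattern looks roughly like this (a sketch with a made-up prior, not bilby's actual base-class code):

import numpy as np

class MyUniformPrior:
    # hypothetical minimal prior on [-1, 1]
    def rescale(self, val):
        # map unit-interval values to the prior's support
        return 2.0 * np.asarray(val) - 1.0

    def sample(self, size=None):
        # sampling is just rescaling uniform unit-interval draws,
        # so such priors pay little extra cost in the rescale-based method
        return self.rescale(np.random.uniform(0, 1, size))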

I have also improved the robustness against bugs by checking whether the chosen keys are sufficient to compute the constraint, and whether the PriorDict is constrained in the first place.

[Figure: relative error of the two methods as a function of the normalization factor]

@ColmTalbot (Collaborator) left a comment


This is looking good! I just have a few specific comments/questions.

5 review threads on bilby/core/prior/dict.py (outdated, resolved)
@JasperMartins force-pushed the improved_normalization_factor_estimation branch from 8e73e20 to 98bd7c6 on November 12, 2024
@ColmTalbot (Collaborator) left a comment


This is in great shape, just two last documentation-based comments.

2 review threads on bilby/core/prior/dict.py (resolved)
@ColmTalbot ColmTalbot added this to the 2.4.0 milestone Nov 12, 2024
@JasperMartins JasperMartins changed the title Draft: Added qmc_quad based method for estimation of the constrained normalization factor Added qmc_quad based method for estimation of the constrained normalization factor Nov 12, 2024
@JasperMartins force-pushed the improved_normalization_factor_estimation branch from 5cf5cb8 to 0ccf0c9 on November 13, 2024
@ColmTalbot (Collaborator) left a comment


Thanks for pushing this through!

@JasperMartins JasperMartins changed the title Added qmc_quad based method for estimation of the constrained normalization factor Draft: Added qmc_quad based method for estimation of the constrained normalization factor Nov 14, 2024
@JasperMartins (Contributor, Author)

Sorry, while taking a last look I actually caught a bug: the logic enabling marginalization over joint prior keys was not working quite right. I switched the PR back to draft so that it is not merged yet. I'll do one last push shortly with a new test case that checks JointPriors.

@ColmTalbot ColmTalbot removed this from the 2.4.0 milestone Nov 14, 2024
@ColmTalbot ColmTalbot marked this pull request as draft November 14, 2024 16:01
@mj-will mj-will added this to the 2.5.0 milestone Nov 14, 2024
@JasperMartins (Contributor, Author)

> I'll do one last push shortly with a new test case that checks JointPriors.

This turned out to be quite the deep dive. I have implemented two test cases with multivariate Gaussian priors, both with covariance matrices equal to the identity. One case has three dimensions and one has two, and both are constrained to the unit circle in the two shared dimensions, which means they should integrate to the same normalization constant (1/scipy.stats.chi2(df=2).cdf(1), in fact).
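
For reference, the expected constant can be evaluated directly, since the squared radius of a 2D standard normal is chi-squared distributed with two degrees of freedom:

import scipy.stats

# fraction of a 2D standard normal's probability mass inside the unit circle
inside = scipy.stats.chi2(df=2).cdf(1)
print(1 / inside)  # expected normalization constant, about 2.54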

However, using Halton sequences, the integration of the two-dimensional case consistently failed, while the three-dimensional case worked fine. I also tested scipy's native qmc_quad routine, which yielded the same wrong result.
I played around quite a bit: I tried different sizes of the sampling chunk (sizes of $2^m + 1$ for some integer m yielded the right result, actually), tried Sobol sequences (which also yielded wrong results), checked whether the problem was the seeding... and tried standard MC sampling on the unit cube, which worked consistently.

While trying to figure out what was going wrong, I also noticed this paragraph in the scipy documentation of the qmc module:

> Also, to see an improvement over IID MC, (R)QMC requires a bit of smoothness of the integrand, roughly the mixed first order derivative in each direction, [...], must be integral. For instance, a function that is 1 inside the hypersphere and 0 outside of it has infinite variation in the sense of Hardy and Krause for any dimension d > 2.

The integration we want to perform is of exactly that nature: we always integrate a characteristic function that is 0 somewhere and 1 elsewhere.
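
In symbols, the quantity being estimated is the integral of an indicator over the unit cube,

$Z = \int_{[0,1]^d} \mathbf{1}_C(\mathrm{rescale}(\theta)) \, \mathrm{d}\theta$,

where $C$ is the constraint-satisfying region; this is precisely the discontinuous, infinite-variation case the documentation warns about.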

For these reasons, using QMC appears to be a bad idea after all, although I find it concerning that a standard scipy integration scheme yields (quite) wrong results for such a simple case. I will open an Issue upstream when I find time.

Anyway, the changes to the termination of the integration still seem worthwhile to me, and the added test cases are certainly nice to have. I also think it is still worthwhile to have two integration methods (one "from_samples" and one "from_rescale"), which can be used to check the consistency of the methods. The new commits adopt these changes.

I'm sorry this was only unearthed so late in this pull request. I wonder if this should be moved to an entirely new PR, given how much of this one relates to QMC integration.

@JasperMartins (Contributor, Author) commented Nov 20, 2024

Another update from me: It turns out that the issue I have described above arises because the rescaling step of the multivariate Gaussian prior reveals the correlations between points produced by the Halton random number generator! Consider the following example:

import bilby
import scipy
import numpy as np

prior_dist = bilby.prior.MultivariateGaussianDist(
    names=["x", "y"],
    mus=[0, 0],
    sigmas=[1, 1],
)


joint_prior_x = bilby.prior.JointPrior(dist=prior_dist, name="x")
joint_prior_y = bilby.prior.JointPrior(dist=prior_dist, name="y")
prior_dict = bilby.prior.PriorDict(
    {
        "x": joint_prior_x_2,
        "y": joint_prior_y_2,
    },
)

First, let's produce samples using normal Monte Carlo:

keys = ("x", "y")
sampling_chunk = 2**14
theta = np.random.random(size=(len(keys), sampling_chunk))
samples = np.array(prior_dict.rescale(keys=keys, theta=theta)).reshape((len(keys), sampling_chunk))
samples = {key: np.array(samps) for key, samps in zip(keys, samples)}

If we look at a corner plot, everything looks good:
[Corner plot: the samples look like an uncorrelated 2D Gaussian]

Now, let's use Halton:

keys = ("x", "y")
sampling_chunk = 2**14
qrng = scipy.stats.qmc.Halton(d=len(keys), scramble=True)
theta = qrng.random(sampling_chunk).T
samples = np.array(prior_dict.rescale(keys=keys, theta=theta)).reshape((len(keys), sampling_chunk))
samples = {key: np.array(samps) for key, samps in zip(keys, samples)}

And the result is clearly broken:
[Corner plot: the Halton-rescaled samples show clear spurious correlations]

What led to considerable confusion on my part during debugging is that sampling_chunk = 2**14 + 1 (as well as other, seemingly arbitrary changes) works just fine:
[Corner plot: with sampling_chunk = 2**14 + 1 the samples look correct again]

Also, one can observe how much smoother, i.e. more evenly sampled, the working Halton case is compared to normal Monte Carlo random numbers, indicating advantages for integration that could be exploited.
However, in the end, I think it is more important for the normalization factor integration to be robust to various ways of defining priors than to squeeze out every bit of efficiency, so my conclusion above still stands.

Edit: Another interesting observation: the issue appears to depend on the dimensionality. 2D distributions appear to require an odd number of samples, while odd dimensions require an even count. I have only tested this for a limited number of sample sizes and dimensions, though.

Commit: "…ray and update in-place once all keys are requested. Changed (Conditional)PriorDict.rescale to always return samples in right shape."
@JasperMartins force-pushed the improved_normalization_factor_estimation branch from fe6fdaf to 8133a2c on November 25, 2024
@JasperMartins force-pushed the improved_normalization_factor_estimation branch from 8133a2c to 1b02af6 on November 25, 2024
@JasperMartins (Contributor, Author)

I hope this is the final update on this bug: I have found the cause of the correlations in the quasi-Monte Carlo rescaling step.

    def _integrate_normalization_factor_from_qmc_quad(self, keys, sampling_chunk):
        qrng = Halton(len(keys), seed=random.rng)
        theta = qrng.random(sampling_chunk).reshape((len(keys), -1))  # <-- 1.
        samples = self.rescale(keys=keys, theta=theta)
        samples = np.array(samples).reshape((len(keys), -1))  # <-- 2.
        samples = {key: samps for key, samps in zip(keys, samples)}
        factor = np.mean(self.evaluate_constraints(samples))
        return factor
  1. This line must use .T, not .reshape((len(keys), -1)), as the latter leads to a wrong order of the elements. For true Monte Carlo the difference does not matter, because all entries of theta are independent random variables; for quasi-Monte Carlo, however, the order matters (see the snippet after this list).
  2. This line was troubled by issues ultimately solved by Conserve rescale shape #863; hence, this PR now also depends on that PR.
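
A small illustration of point 1: for an (n, d) array of QMC points, .T keeps each point's coordinates together, whereas reshape interleaves coordinates across points.

import numpy as np
from scipy.stats import qmc

qrng = qmc.Halton(d=2, scramble=False)
pts = qrng.random(4)           # shape (4, 2): four 2D points, one per row
print(pts.T)                   # row i correctly holds coordinate i of every point
print(pts.reshape((2, -1)))    # row 0 mixes x and y coordinates of different points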

> Also, one can observe how much smoother, i.e. more evenly sampled, the working Halton case is compared to normal Monte Carlo random numbers, indicating advantages for integration that could be exploited.
> However, in the end, I think it is more important for the normalization factor integration to be robust to various ways of defining priors than to squeeze out every bit of efficiency, so my conclusion above still stands.

I would leave it up to you to decide whether to use quasi-Monte Carlo or "normal" Monte Carlo. Regarding the failing test: conversion_function and so on now require casts to numpy arrays to deal with lists of samples; see #863, where I have started a discussion on the matter.

@JasperMartins JasperMartins changed the title Draft: Added qmc_quad based method for estimation of the constrained normalization factor Added qmc_quad based method for estimation of the constrained normalization factor Nov 25, 2024
Successfully merging this pull request may close: Improve performance of the normalization factor estimation for constrained priors (#835).