
Added qmc_quad based method for estimation of the constrained normalization factor #839

Draft: wants to merge 13 commits into main

Conversation

@JasperMartins (Contributor) commented Oct 25, 2024

This PR implements a new method to estimate the normalization factor for constrained priors.
The changes are two-fold:

  1. The integration is performed with scipy.integrate.qmc_quad, a quasi-Monte Carlo-based integration routine that is expected to yield better results than regular Monte Carlo integration. However, the routine requires a rescaling step from the unit cube rather than direct sampling (a sketch follows below this list).
  2. The termination of the integration is based on its relative statistical error rather than the number of accepted samples.
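
To make the approach concrete, here is a minimal, self-contained sketch of qmc_quad-based estimation of a constraint normalization factor. It is not the PR's actual code: rescale and constraint are hypothetical stand-ins for a 2D uniform prior on [-1, 1]² constrained to the unit circle, so the integral should come out near pi/4.

import numpy as np
from scipy.integrate import qmc_quad

def rescale(theta):
    # hypothetical stand-in: map unit-cube points (shape (d, n)) to prior samples
    return 2.0 * theta - 1.0  # uniform priors on [-1, 1]

def constraint(samples):
    # hypothetical stand-in: 1 inside the unit circle, 0 outside
    return (samples[0] ** 2 + samples[1] ** 2 < 1).astype(float)

def integrand(theta):
    # qmc_quad evaluates vectorized integrands on arrays of shape (d, n_points)
    return constraint(rescale(theta))

res = qmc_quad(integrand, a=np.zeros(2), b=np.ones(2))
print(res.integral, res.standard_error)  # expect roughly pi / 4 = 0.785...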

I have tested the two implementations with a relatively easy scenario: a 2D uniform prior on the [-1, 1]² square, constrained to an inscribed circle with different radii:

[Figure: runtime and relative error of the old and new methods across normalization factors]

The new method is significantly faster for high normalization factors, and the relative errors show a similar spread.

The implementation is marked as a draft because it requires the priors to implement a rescale method; it could therefore be good to keep the old method as a fallback. Also, the relative-error termination criterion could be applied just as well to the old implementation.

Related issue: #835

@ColmTalbot (Collaborator)

> The implementation is marked as a draft because it requires the priors to implement a rescale method; it could therefore be good to keep the old method as a fallback. Also, the relative-error termination criterion could be applied just as well to the old implementation.

We currently have a (soft) requirement that all priors should implement a rescale method (currently the base implementation just returns None, which is not ideal; see https://github.com/bilby-dev/bilby/blob/main/bilby/core/prior/base.py#L137-L153), so this approach should be safe.

Even if we make the base class raise an error when you attempt to rescale, it will still be possible to use some samplers in that case.

It's possible that people will implement their own prior subclasses that don't support rescale, so I'm not opposed to keeping a fallback. It may make everything easier if we change bilby.core.prior.base.Prior.rescale to raise a NotImplementedError, but that should probably get some more eyes and be done in a separate PR.
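
A minimal sketch of such a fallback check, assuming the current behavior where the base Prior.rescale returns None (supports_rescale is a hypothetical helper, not bilby API):

def supports_rescale(prior):
    # hypothetical helper: treat a None-returning base implementation and a
    # possible future NotImplementedError both as "rescale not supported"
    try:
        return prior.rescale(0.5) is not None
    except NotImplementedError:
        return False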

@ColmTalbot (Collaborator)

I think that actually the existing method won't work if a new prior doesn't implement rescale, since the default sample is built on rescale. I think it's sufficiently unlikely that people are manually defining sample without rescale.

@JasperMartins (Contributor, Author) commented Oct 30, 2024

I have updated the PR quite a bit. The core logic of the integration of the normalization factor is now handled by one of two functions: either MC integration based on samples from PriorDict.sample, or quasi-MC integration based on the rescale method. The user can choose which is used, but the code also checks whether the rescale method is implemented when qmc_quad is requested and defaults to from_samples if not. For both methods, termination of the integration is handled via a bound on the estimated relative error, and a max_trials kwarg can be used to limit the number of probability evaluations.

I have also optimized the from_samples implementation. Before, every time min_accept was not reached, new samples were appended to a list and the constraint was applied to the full list, yielding a steep increase in runtime with the number of iterations, even though it would have been sufficient to check only the new samples.
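
A minimal sketch of this incremental accumulation with relative-error termination; the names and defaults (sample_fn, constraint_fn, rtol) are illustrative, not the PR's actual signature:

import numpy as np

def estimate_normalization(sample_fn, constraint_fn, rtol=1e-2, max_trials=10**6, chunk=10**4):
    total = 0
    accepted = 0.0
    factor = 0.0
    while total < max_trials:
        # evaluate the constraint only on the newly drawn chunk
        passed = constraint_fn(sample_fn(chunk))
        total += chunk
        accepted += np.sum(passed)
        factor = accepted / total
        # standard error of a Bernoulli mean estimate
        std_err = np.sqrt(factor * (1.0 - factor) / total)
        if factor > 0 and std_err / factor < rtol:
            break
    return factor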

For the example I gave above, the qmc-based implementation is now actually slower than the from_samples method due to higher overhead. Priors that implement sample by just calling rescale on unit samples should perform much closer.
I still selected qmc_quad as the default because, as the attached plot shows, for normalization factors close to 1 (which is more likely in most applications), the relative error is smaller.
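
For context, the sample-via-rescale pattern looks roughly like this (a sketch with a made-up prior, not bilby's actual base-class code):

import numpy as np

class MyUniformPrior:
    # hypothetical minimal prior on [-1, 1]
    def rescale(self, val):
        # map unit-interval values to the prior's support
        return 2.0 * np.asarray(val) - 1.0

    def sample(self, size=None):
        # sampling is just rescaling uniform unit-interval draws,
        # so such priors pay little extra cost in the rescale-based method
        return self.rescale(np.random.uniform(0, 1, size))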

I have also improved the robustness against bugs by checking whether the chosen keys are sufficient to compute the constraint, and whether the PriorDict is constrained in the first place.

[Figure: relative error of the two methods as a function of the normalization factor]

@ColmTalbot (Collaborator) left a comment


This is looking good! I just have a few specific comments/questions.

5 review threads on bilby/core/prior/dict.py (outdated, resolved)
@JasperMartins force-pushed the improved_normalization_factor_estimation branch from 8e73e20 to 98bd7c6 on November 12, 2024
@ColmTalbot (Collaborator) left a comment


This is in great shape, just two last documentation-based comments.

2 review threads on bilby/core/prior/dict.py (resolved)
@ColmTalbot ColmTalbot added this to the 2.4.0 milestone Nov 12, 2024
@JasperMartins JasperMartins changed the title Draft: Added qmc_quad based method for estimation of the constrained normalization factor Added qmc_quad based method for estimation of the constrained normalization factor Nov 12, 2024
@JasperMartins force-pushed the improved_normalization_factor_estimation branch from 5cf5cb8 to 0ccf0c9 on November 13, 2024
@ColmTalbot (Collaborator) left a comment


Thanks for pushing this through!

@JasperMartins JasperMartins changed the title Added qmc_quad based method for estimation of the constrained normalization factor Draft: Added qmc_quad based method for estimation of the constrained normalization factor Nov 14, 2024
@JasperMartins (Contributor, Author)

Sorry, while taking a last look I actually caught a bug: the logic enabling marginalization over joint prior keys was not working quite right. I switched the PR back to draft so that it is not merged yet. I'll do one last push shortly with a new test case that checks JointPriors.

@ColmTalbot ColmTalbot removed this from the 2.4.0 milestone Nov 14, 2024
@ColmTalbot ColmTalbot marked this pull request as draft November 14, 2024 16:01
@mj-will mj-will added this to the 2.5.0 milestone Nov 14, 2024
@JasperMartins (Contributor, Author)

> I'll do one last push shortly with a new test case that checks JointPriors.

This turned out to be quite the deep dive. I have implemented two test cases with multivariate Gaussian priors, both with covariance matrices equal to the identity. One case has three dimensions and one has two, and both are constrained to the unit circle in the two shared dimensions, which means they should integrate to the same normalization constant (1/scipy.stats.chi2(df=2).cdf(1), in fact).
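
For reference, the expected constant can be evaluated directly, since the squared radius of a 2D standard normal is chi-squared distributed with two degrees of freedom:

import scipy.stats

# fraction of a 2D standard normal's probability mass inside the unit circle
inside = scipy.stats.chi2(df=2).cdf(1)
print(1 / inside)  # expected normalization constant, about 2.54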

However, using Halton sequences, the integration of the two-dimensional case consistently failed, while the three-dimensional case worked fine. I also tested scipy's native qmc_quad routine, which yielded the same wrong result.
I played around quite a bit: I tried different sizes of the sampling chunk (sizes of $2^m + 1$ for some integer m yielded the right result, actually), tried Sobol sequences (which also yielded wrong results), checked whether the problem was the seeding... and tried standard MC sampling on the unit cube, which worked consistently.

While trying to figure out what was going wrong, I also noticed this paragraph in the scipy documentation of the qmc module:

> Also, to see an improvement over IID MC, (R)QMC requires a bit of smoothness of the integrand, roughly the mixed first order derivative in each direction, [...], must be integral. For instance, a function that is 1 inside the hypersphere and 0 outside of it has infinite variation in the sense of Hardy and Krause for any dimension d > 2.

The integration we want to perform is of exactly that nature: we always integrate a characteristic function that is 0 somewhere and 1 elsewhere.
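
In symbols, the quantity being estimated is the integral of an indicator over the unit cube,

$Z = \int_{[0,1]^d} \mathbf{1}_C(\mathrm{rescale}(\theta)) \, \mathrm{d}\theta$,

where $C$ is the constraint-satisfying region; this is precisely the discontinuous, infinite-variation case the documentation warns about.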

For these reasons, using QMC appears to be a bad idea after all, although I find it concerning that a standard scipy integration scheme yields (quite) wrong results for such a simple case. I will open an Issue upstream when I find time.

Anyway, the changes to the termination of the integration still seem worthwhile to me, and the added test cases are certainly nice to have. I also think it is still worthwhile to have two integration methods (one "from_samples" and one "from_rescale"), which can be used to check the consistency of the methods. The new commits adopt these changes.

I'm sorry this was only unearthed so late in this pull request. I wonder if this should be moved to an entirely new PR, given how much of this one relates to QMC integration.

@JasperMartins (Contributor, Author) commented Nov 20, 2024

Another update from me: It turns out that the issue I have described above arises because the rescaling step of the multivariate Gaussian prior reveals the correlations between points produced by the Halton random number generator! Consider the following example:

import bilby
import scipy
import numpy as np

prior_dist = bilby.prior.MultivariateGaussianDist(
    names=["x", "y"],
    mus=[0, 0],
    sigmas=[1, 1],
)


joint_prior_x = bilby.prior.JointPrior(dist=prior_dist, name="x")
joint_prior_y = bilby.prior.JointPrior(dist=prior_dist, name="y")
prior_dict = bilby.prior.PriorDict(
    {
        "x": joint_prior_x_2,
        "y": joint_prior_y_2,
    },
)

First, let's produce samples using normal Monte Carlo:

keys = ("x", "y")
sampling_chunk = 2**14
theta = np.random.random(size=(len(keys), sampling_chunk))
samples = np.array(prior_dict.rescale(keys=keys, theta=theta)).reshape((len(keys), sampling_chunk))
samples = {key: np.array(samps) for key, samps in zip(keys, samples)}

If we look at a corner plot, everything looks good:
[Corner plot: the samples look like an uncorrelated 2D Gaussian]

Now, let's use Halton:

keys = ("x", "y")
sampling_chunk = 2**14
qrng = scipy.stats.qmc.Halton(d=len(keys), scramble=True)
theta = qrng.random(sampling_chunk).T
samples = np.array(prior_dict.rescale(keys=keys, theta=theta)).reshape((len(keys), sampling_chunk))
samples = {key: np.array(samps) for key, samps in zip(keys, samples)}

And the result is clearly broken:
[Corner plot: the Halton-rescaled samples show clear spurious correlations]

What led to considerable confusion on my part during debugging is that sampling_chunk = 2**14 + 1 (as well as other, seemingly arbitrary changes) works just fine:
[Corner plot: with sampling_chunk = 2**14 + 1 the samples look correct again]

Also, one can observe how much smoother, i.e. more evenly sampled, the working Halton case is compared to normal Monte Carlo random numbers, indicating advantages for integration that could be exploited.
However, in the end, I think it is more important for the normalization factor integration to be robust to various ways of defining priors than to squeeze out every bit of efficiency, so my conclusion above still stands.

Edit: Another interesting observation: the issue appears to depend on the dimensionality. 2D distributions appear to require an odd number of samples, while odd dimensions require an even count. I have only tested this for a limited number of sample sizes and dimensions, though.

Commit: "…ray and update in-place once all keys are requested. Changed (Conditional)PriorDict.rescale to always return samples in right shape."
@JasperMartins force-pushed the improved_normalization_factor_estimation branch from fe6fdaf to 8133a2c on November 25, 2024
@JasperMartins force-pushed the improved_normalization_factor_estimation branch from 8133a2c to 1b02af6 on November 25, 2024
@JasperMartins (Contributor, Author)

I hope this is the final update on this bug: I have found the cause of the correlations in the quasi-Monte Carlo rescaling step.

    def _integrate_normalization_factor_from_qmc_quad(self, keys, sampling_chunk):
        qrng = Halton(len(keys), seed=random.rng)
        theta = qrng.random(sampling_chunk).reshape((len(keys), -1))  # <-- 1.
        samples = self.rescale(keys=keys, theta=theta)
        samples = np.array(samples).reshape((len(keys), -1))  # <-- 2.
        samples = {key: samps for key, samps in zip(keys, samples)}
        factor = np.mean(self.evaluate_constraints(samples))
        return factor
  1. This line must use .T, not .reshape((len(keys), -1)), as the latter leads to a wrong order of the elements. For true Monte Carlo the difference does not matter, because all entries of theta are independent random variables; for quasi-Monte Carlo, however, the order matters (see the snippet after this list).
  2. This line was troubled by issues ultimately solved by Conserve rescale shape #863; hence, this PR now also depends on that PR.
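
A small illustration of point 1: for an (n, d) array of QMC points, .T keeps each point's coordinates together, whereas reshape interleaves coordinates across points.

import numpy as np
from scipy.stats import qmc

qrng = qmc.Halton(d=2, scramble=False)
pts = qrng.random(4)           # shape (4, 2): four 2D points, one per row
print(pts.T)                   # row i correctly holds coordinate i of every point
print(pts.reshape((2, -1)))    # row 0 mixes x and y coordinates of different points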

> Also, one can observe how much smoother, i.e. more evenly sampled, the working Halton case is compared to normal Monte Carlo random numbers, indicating advantages for integration that could be exploited.
> However, in the end, I think it is more important for the normalization factor integration to be robust to various ways of defining priors than to squeeze out every bit of efficiency, so my conclusion above still stands.

I would leave it up to you to decide whether to use quasi-Monte Carlo or "normal" Monte Carlo. Regarding the failing test: conversion_function and so on now require casts to numpy arrays to deal with lists of samples; see #863, where I have started a discussion on the matter.

@JasperMartins JasperMartins changed the title Draft: Added qmc_quad based method for estimation of the constrained normalization factor Added qmc_quad based method for estimation of the constrained normalization factor Nov 25, 2024
Successfully merging this pull request may close: Improve performance of the normalization factor estimation for constrained priors (#835).