Samples parameterization #33
Comments
I'm not actually sure what it means to have a PDF parametrization whose parameters are the sample values. How do you compute the pdf, cdf, or ppf from sample values? It seems to me that, given some sample values, you need to convert to some other representation, e.g., qp.spline_from_samples. Perhaps what you would like is a class that lets users store sample values and provides an easy interface to the conversion routines.
Functionally, I agree that the methods would have to be implemented using the KDE as a default intermediary, so a class that initializes an ensemble from samples, outputs to samples, and connects to the conversion functions would indeed be useful.
So, the KDE representation was really inefficient for large samples because it evaluated the PDF with an operation that involved all the samples, and also because there wasn't really a smart way to implement _cdf or _ppf. So what I did was convert it to a spline: Spline_Gen.create_from_samples will create a PDF from samples. And of course you can generate samples from any ensemble using ens.rvs(). We could put in an explicit KDE that computes things using the samples, but we are going to want to tell people not to use it for more than a few samples or a few PDFs, because it is really not performant.
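To illustrate the cost argument in the comment above, here is a minimal sketch that uses only NumPy and SciPy, not qp itself. It evaluates a Gaussian KDE once on a fixed grid and then answers pdf, cdf, and ppf queries by interpolating the tabulated values, so later queries no longer touch the samples. The helper names (`tabulate_from_samples`, `pdf_at`, `cdf_at`, `ppf_at`), the grid size, and the padding are hypothetical choices, and linear interpolation stands in for the spline; this is not how Spline_Gen.create_from_samples is implemented.

```python
import numpy as np
from scipy.stats import gaussian_kde


def tabulate_from_samples(samples, n_grid=201, pad_bandwidths=4.0):
    """Evaluate a KDE once on a grid; return grid, normalized pdf, and cdf."""
    samples = np.asarray(samples, dtype=float)
    kde = gaussian_kde(samples)  # every kde(x) call sums over all samples
    # Kernel width implied by SciPy's default (Scott's rule) bandwidth factor.
    bandwidth = kde.factor * samples.std(ddof=1)
    pad = pad_bandwidths * bandwidth
    grid = np.linspace(samples.min() - pad, samples.max() + pad, n_grid)
    pdf = kde(grid)  # the only step whose cost scales with the sample size
    # Cumulative trapezoid rule gives a non-decreasing cdf on the grid.
    steps = 0.5 * (pdf[1:] + pdf[:-1]) * np.diff(grid)
    cdf = np.concatenate([[0.0], np.cumsum(steps)])
    norm = cdf[-1]
    return grid, pdf / norm, cdf / norm


def pdf_at(x, grid, pdf):
    return np.interp(x, grid, pdf, left=0.0, right=0.0)


def cdf_at(x, grid, cdf):
    return np.interp(x, grid, cdf, left=0.0, right=1.0)


def ppf_at(q, grid, cdf):
    # Invert the tabulated cdf by swapping the roles of the two arrays.
    return np.interp(q, cdf, grid)
```

After the one-time tabulation, each query costs a lookup on `n_grid` points regardless of how many samples were supplied, which is the property the spline conversion is after.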
It would probably be a better long-term solution just to make a notebook that shows how to invoke Spline_Gen.create_from_samples, and maybe a function that does ensemble.write_samples() for any PDF.
If for whatever reason you want something ensemblish that ties together the reading and writing of samples, I would actually consider using the newly minted ancillary data to do that. I.e., a spline_pdf that carries around the samples used to generate it.
Re: KDE, I think the dominant use case would do it for lots of samples and lots of PDFs, so your concern about computation is a fair one. Perhaps the most natural thing to do is actually quantiles, where the N sample values {z} naturally define regular quantiles separated by 1/N, which could then be binned down upon conversion.
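The quantile idea in the comment above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function names are hypothetical and not part of qp, and the choice of mid-point levels (i + 0.5)/N is one of several conventions that give levels separated by 1/N.

```python
import numpy as np


def samples_to_quantiles(samples, n_quantiles=None):
    """Treat N sorted samples as quantiles at levels spaced by 1/N.

    If n_quantiles is given and smaller than N, interpolate down to that
    many regularly spaced levels.
    """
    z = np.sort(np.asarray(samples, dtype=float))
    n = z.size
    levels = (np.arange(n) + 0.5) / n  # assumed mid-point convention
    if n_quantiles is None or n_quantiles >= n:
        return levels, z
    coarse = (np.arange(n_quantiles) + 0.5) / n_quantiles
    return coarse, np.interp(coarse, levels, z)


def cdf_from_quantiles(x, levels, locs):
    # Linear interpolation of level as a function of location.
    return np.interp(x, locs, levels, left=0.0, right=1.0)


def ppf_from_quantiles(q, levels, locs):
    # Linear interpolation of location as a function of level.
    return np.interp(q, levels, locs)
```

Because the parameters are just the sorted sample values, no KDE is needed, and reducing the number of quantiles on conversion is a single interpolation.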
#170 is a duplicate of this but the fresher conversation makes it the more reasonable issue to keep open.
Currently we can make PDFs from samples via
qp.spline_from_samples
but, unless I'm missing something, there isn't a parameterization whose parameters are the sample values themselves rather than the spline parameters derived from a KDE thereof. This would be very helpful for things like the PIT metric used in RAIL, which is a 1D probability distribution defined by samples.