Merge pull request #61 from LSSTDESC/issue/4/evaluation-rearrange
Issue/4/evaluation rearrangement
aimalz authored Jul 7, 2021
2 parents 81ff354 + a16fe02 commit 58e470a
Showing 54 changed files with 3,808 additions and 1,933 deletions.
5 changes: 2 additions & 3 deletions .gitignore
@@ -104,12 +104,11 @@ venv.bak/
.mypy_cache/


# MAC OS
# MAC OS
.DS_Store

# PyCharm
# PyCharm
.idea

# VSCode
.vscode

48 changes: 27 additions & 21 deletions README.md
@@ -3,9 +3,9 @@

# RAIL: Redshift Assessment Infrastructure Layers

This repo is home to a series of LSST-DESC projects aiming to quantify the impact of imperfect prior information on probabilistic redshift estimation.
RAIL differs from [PZIncomplete](https://github.com/LSSTDESC/pz_incomplete) in that it is broken into stages, each corresponding to a manageable unit of infrastructure advancement, a specific question, and a potential publication opportunity.
By pursuing the piecemeal development of RAIL, we aim to achieve the broad goals of PZIncomplete.
RAIL's purpose is to be the infrastructure enabling the PZ WG Deliverables in [the LSST-DESC Science Roadmap (see Sec. 5.18)](https://lsstdesc.org/assets/pdf/docs/DESC_SRM_latest.pdf), aiming to guide the selection and implementation of redshift estimators in DESC pipelines.
RAIL differs from previous plans for PZ pipeline infrastructure in that it is broken into stages, each corresponding to a manageable unit of infrastructure advancement, a specific question to answer with that code, and a guaranteed publication opportunity.
RAIL uses [qp](https://github.com/LSSTDESC/qp) as a back-end for handling univariate probability density functions (PDFs) such as photo-z posteriors or n(z) samples.

## Contributing

@@ -19,25 +19,31 @@ Once the changes have been approved, you can merge and squash the pull request.
## Immediate Plans

An outline of the baseline RAIL is illustrated [here](https://docs.google.com/drawings/d/1or8xyBqLkpc_4_Cr-ROSA3F7fBm3RMRnRzytorw_FYM/edit?usp=sharing).
1. _MonoRAIL_: Build the basic infrastructure for controlled experiments of forward-modeled photo-z posteriors
* a `rail.creation` submodule that can generate true photo-z posteriors and mock photometry
* a `rail.estimation` submodule with a class for photo-z posterior estimation routines, including a template example implementing the trainZ (experimental control) algorithm
* a `rail.evaluation.metric` submodule that calculates the metrics from the [PZ DC1 Paper](https://github.com/LSSTDESC/PZDC1paper) for estimated photo-z posteriors relative to the true photo-z posteriors
* documented scripts that demonstrate the use of RAIL in a DC1-like experiment on NERSC
* an LSST-DESC Note presenting the RAIL infrastructure
1. _Golden Spike_: Build the basic infrastructure for controlled experiments of forward-modeled photo-z posteriors
- [X] a `rail.creation` subpackage that can generate true photo-z posteriors and mock photometry
- [X] a `rail.estimation` subpackage with a superclass for photo-z posterior estimation routines and at least one subclass template example implementing the trainZ (experimental control) algorithm
- [X] a `rail.evaluation` subpackage that calculates at least the metrics from the [PZ DC1 Paper](https://github.com/LSSTDESC/PZDC1paper) for estimated photo-z posteriors relative to the true photo-z posteriors
- [ ] documented scripts that demonstrate the use of RAIL in a DC1-like experiment on NERSC
- [ ] sufficient documentation for a v1.0 release
- [ ] an LSST-DESC Note presenting the RAIL infrastructure
2. _RAILroad_: Quantify the impact of nonrepresentativity (imbalance and incompleteness) of a training set on estimated photo-z posteriors by multiple machine learning methods
* a `rail.creation.degradation` submodule that introduces an imperfect prior of the form of nonrepresentativity into the observed photometry
* at least two `rail.estimation.estimator` wrapped machine learning-based codes for estimating photo-z posteriors
* additional `rail.evaluation.metric` modules implementing the [qp](https://github.com/LSSTDESC/qp) metrics
* documented scripts that demonstrate the use of RAIL in a blinded experiment on NERSC
* an LSST-DESC paper presenting the results of a controlled experiment of non-representativity
- [ ] parameter specifications for degrading an existing `Creator` to introduce an imperfect prior, in the form of nonrepresentativity, into the observed photometry
- [ ] at least two `Estimator` wrapped machine learning-based codes for estimating photo-z posteriors
- [ ] additional `Evaluator` metrics with feed-through access to the [qp](https://github.com/LSSTDESC/qp) metrics
- [ ] end-to-end documented scripts that demonstrate a blinded experiment on NERSC
- [ ] an LSST-DESC paper presenting the results of the experiment
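
The superclass-plus-subclass layout described for `rail.estimation` in the Golden Spike stage can be illustrated with a minimal sketch of the trainZ control, which ignores the photometry and assigns every galaxy the normalized redshift histogram of the training set. This is not RAIL's actual API: the method names `train` and `estimate`, the constructor argument `z_edges`, and the `mag_i` key are hypothetical, and the sketch returns plain Python lists where RAIL would hand PDFs to qp.

```python
from abc import ABC, abstractmethod
from bisect import bisect_right


class Estimator(ABC):
    """Hypothetical superclass: train on known redshifts, then estimate p(z)."""

    def __init__(self, z_edges):
        # Bin edges of the redshift grid shared by all estimated posteriors.
        self.z_edges = list(z_edges)

    @abstractmethod
    def train(self, train_redshifts):
        """Fit the estimator to a training set of known redshifts."""

    @abstractmethod
    def estimate(self, test_photometry):
        """Return one binned p(z) per galaxy in the test set."""


class TrainZ(Estimator):
    """Experimental control: every galaxy gets the training-set n(z)."""

    def train(self, train_redshifts):
        counts = [0] * (len(self.z_edges) - 1)
        for z in train_redshifts:
            # Ignore redshifts that fall outside the grid.
            if self.z_edges[0] <= z < self.z_edges[-1]:
                counts[bisect_right(self.z_edges, z) - 1] += 1
        total = sum(counts)
        if total == 0:
            raise ValueError("no training redshifts fall inside the grid")
        widths = [hi - lo for lo, hi in zip(self.z_edges[:-1], self.z_edges[1:])]
        # Divide by the bin width so the histogram integrates to one.
        self.pdf = [c / (total * w) for c, w in zip(counts, widths)]

    def estimate(self, test_photometry):
        # The photometry is deliberately unused: that is what makes this a control.
        return [list(self.pdf) for _ in test_photometry]


if __name__ == "__main__":
    estimator = TrainZ(z_edges=[0.0, 0.5, 1.0, 1.5])
    estimator.train([0.1, 0.2, 0.7, 1.2])
    for pdf in estimator.estimate([{"mag_i": 22.1}, {"mag_i": 24.3}]):
        print(pdf)
```

Because the output is identical for every galaxy, any metric that scores an estimator well on this control is insensitive to per-galaxy information, which is why it serves as a baseline.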

## Future Plans

The next stages (tentative project codenames subject to change) can be executed in any order or even simultaneously and may be broken into smaller pieces each corresponding to an LSST-DESC Note.
* Extend the imperfect prior models and experimental design to accommodate template-fitting codes _(name TBD)_
* _Off the RAILs_: Investigate the effects of erroneous spectroscopic redshifts (or uncertain narrow-band photo-zs) in a training set
* _Third RAIL_: Investigate the effects of imperfect deblending on estimated photo-z posteriors
* _RAIL gauge_: Investigate the impact of measurement errors (PSF, aperture photometry, flux calibration, etc.) on estimated photo-z posteriors
* _DERAIL_: Propagate the impact of imperfect prior information to 3x2pt cosmological parameter constraints
* _RAIL line_: Implement a more sophisticated true photo-z posterior model with SEDs and emission lines
RAIL's design aims to break up the PZ WG's pipeline responsibilities into smaller milestones that can be accomplished by individuals or small groups on short timescales, i.e. under a year.
The next stages of RAIL development (tentative project codenames subject to change) are intended to be paper-able projects, each of which addresses one or more SRM Deliverables by incrementally advancing the code along the way to project completion.
They are scoped so that they can be executed in any order, or even simultaneously.
* _RAILyard_: Assess the performance of template-fitting codes by extending the creation subpackage to forward model templates
* _RAIL network_: Assess the performance of clustering-redshift methods by extending the creation subpackage to forward model positions
* _Off the RAILs_: Investigate the effects of erroneous spectroscopic redshifts (or uncertain narrow-band photo-zs) in a training set by extending the creation subpackage's imperfect prior model
* _Third RAIL_: Investigate the effects of imperfect deblending on estimated photo-z posteriors by extending the creation subpackage to forward model the effect of imperfect deblending
* _RAIL gauge_: Investigate the impact of measurement errors (PSF, aperture photometry, flux calibration, etc.) on estimated photo-z posteriors by including their effects in the forward model of the creation subpackage
* _DERAIL_: Investigate the impact of imperfect photo-z posterior estimation on a probe-specific (e.g. 3x2pt) cosmological parameter constraint by connecting the estimation subpackage to other DESC pipelines
* _RAIL line_: Assess the sensitivity of estimated photo-z posteriors to photometry impacted by emission lines by extending the creation subpackage's forward model

Informal library of fun train-themed names for future projects/pipelines built with RAIL: _monoRAIL_, _tRAILblazer_, _tRAILmix_, _tRAILer_