v3.0.0
Version 3.0.0
Implements out-of-memory reading of input data and parallel computation using
the MPI standard.
Warning
This version presents a major rework of the package, which is incompatible
with any version 2 code. The changes listed below are a summary of the most
important differences in the public API and not necessarily complete.
Note
Data files produced by version 2 can still be read from version 3 (except
for cached catalogs).
Added features
- Implemented parallel processing using the MPI standard to support running on
  multi-node compute systems. This is optional, and Python multiprocessing
  remains the default approach to parallel processing.
- Catalogs can now be created from large datasets by reading and processing
  input data in chunks using a parallelised pipeline. This removes one of the
  main memory restrictions of version 2 and allows processing arbitrarily
  large inputs.
- Improved the performance by a factor of 3-5, depending on the task and
  hardware.
- Improved integration of random generators. Added a random generator that
  generates uniform randoms within the constraints of a HealPix map. Catalogs
  can be generated directly from the generator without creating an
  intermediate input file.
- Added support for units when specifying correlation scales. Scales may now
  also be angles (radian, degrees, arcmin/sec) or comoving distances (kpc/h,
  Mpc/h).
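To illustrate what unit-aware scales imply, here is a minimal sketch of converting the listed angular and comoving units to a common base. The table keys and the function names angle_to_radian and comoving_to_mpc_h are hypothetical and chosen for illustration only; they are not part of the package's API, and the package may handle units differently.

```python
import math

# Hypothetical conversion tables. The unit names follow the list above, but
# the exact spelling of the keys is an assumption made for this sketch.
ANGLE_TO_RADIAN = {
    "radian": 1.0,
    "degree": math.pi / 180.0,
    "arcmin": math.pi / (180.0 * 60.0),
    "arcsec": math.pi / (180.0 * 3600.0),
}
COMOVING_TO_MPC_H = {
    "kpc/h": 1.0e-3,
    "Mpc/h": 1.0,
}


def angle_to_radian(value, unit):
    """Convert an angular scale to radians (illustrative helper)."""
    if unit not in ANGLE_TO_RADIAN:
        raise ValueError(f"unknown angular unit: {unit!r}")
    return value * ANGLE_TO_RADIAN[unit]


def comoving_to_mpc_h(value, unit):
    """Convert a comoving distance to Mpc/h (illustrative helper)."""
    if unit not in COMOVING_TO_MPC_H:
        raise ValueError(f"unknown comoving unit: {unit!r}")
    return value * COMOVING_TO_MPC_H[unit]


# Example: 30 arcmin expressed in radians, and 500 kpc/h expressed in Mpc/h.
scale_rad = angle_to_radian(30.0, "arcmin")
scale_mpc = comoving_to_mpc_h(500.0, "kpc/h")
```

Rejecting unknown units with an explicit error, rather than silently assuming a default, keeps a misspelled unit from producing a wrong correlation scale.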
Removed features
- Catalogs can no longer be constructed in memory and instead always require a
  cache directory (previously optional).
- Bootstrap resampling has been removed permanently (previously not yet
  implemented).
- Removed the treecorr catalog and backend to compute correlations.
- The external package yet_another_wizz_cli, which implements the command line
  client yaw_cli, is no longer supported. In a future version, a limited
  subset of its features may be integrated directly into this package.
- Removed the Docker image.
Changes
- In yaw.catalogs:
  - Removed the treecorr catalog and the NewCatalog factory class.
  - There is only a single catalog class (yaw.Catalog) that is created
    directly from its factory methods yaw.Catalog.from_file,
    yaw.Catalog.from_dataframe, and yaw.Catalog.from_random. The factory
    methods now require as first argument a path serving as the cache
    directory.
  - Most method arguments have been renamed slightly to be more consistent
    throughout the package.
  - yaw.Catalog now serves as a dictionary of yaw.patch.Patch instances and
    most of its previous methods have been removed.
  - Removed the correlate() and true_redshifts() methods from yaw.Catalog.
    The latter is now implemented as a constructor for yaw.HistData.
- In yaw.config:
  - Removed the BackendConfig and ResamplingConfig, as both treecorr catalogs
    and bootstrap resampling are no longer supported.
  - Removed the backend attribute of yaw.Configuration.
  - Renamed the serialisation methods from to/from_yaml() to to/from_file().
  - In the yaw.Config.create and yaw.Config.modify methods, renamed rbin_num
    to resolution, zbin_num to num_bins, zbins to edges, and thread_num to
    max_workers. Removed rbin_slop (no longer needed) and added closed, which
    indicates which side of each bin interval is closed.
- In yaw.correlation:
  - Removed the linkage argument from yaw.autocorrelate and
    yaw.crosscorrelate. Added max_workers, which overrides the value given in
    the configuration.
  - yaw.autocorrelate and yaw.crosscorrelate now always return a list of
    yaw.CorrFunc instances. In the previous version, this was only the case
    if multiple scales were configured.
  - Changed the internal structure of correlation function HDF5 files.
  - Removed the attributes related to the redshift binning in yaw.CorrFunc
    and yaw.CorrData. These can now be accessed through the binning attribute
    (replacing get_binning()). Renamed n_bins (n_patches) to num_bins
    (num_patches).
  - Changed the get_data(), get_error(), get_covariance(), and
    get_correlation() methods of yaw.CorrData to attributes called data,
    error, covariance, and correlation.
- In yaw.redshifts:
  - The changes to yaw.CorrData listed above also apply to yaw.RedshiftData
    and yaw.HistData.
  - Removed the rebin(), mean(), and shift() methods from yaw.RedshiftData
    and yaw.HistData.
  - The constructor function yaw.RedshiftData.from_corrfuncs no longer
    accepts the *_est arguments or the config parameter. The resampling
    always defaults to using the Davis-Peebles estimator, or the Landy-Szalay
    estimator if random-random pair counts are available. This is consistent
    with the previous default behaviour.
  - Added a new constructor to yaw.HistData to compute a redshift histogram
    directly from a yaw.Catalog instance.
- Fully reimplemented yaw.randoms and added a new HealPix-map based random
  generator.
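When porting version 2 code, the keyword renames listed for yaw.config above can be applied mechanically. The following is a minimal sketch of such a translation step. The helper migrate_config_kwargs and the example key other_option are hypothetical and not part of the package; only the old and new keyword names are taken from the list above.

```python
# Old-to-new keyword names, as listed in the yaw.config changes above.
RENAMED = {
    "rbin_num": "resolution",
    "zbin_num": "num_bins",
    "zbins": "edges",
    "thread_num": "max_workers",
}
# Keywords that were removed without a replacement.
REMOVED = {"rbin_slop"}


def migrate_config_kwargs(old_kwargs):
    """Translate version 2 keyword arguments to their version 3 names.

    Hypothetical helper: renamed keys are mapped to their new names, removed
    keys are dropped, and all other keys are passed through unchanged.
    """
    new_kwargs = {}
    for key, value in old_kwargs.items():
        if key in REMOVED:
            continue
        new_kwargs[RENAMED.get(key, key)] = value
    return new_kwargs


# "other_option" stands in for any keyword that is unaffected by the renames.
v2_kwargs = {"other_option": 1, "zbin_num": 30, "thread_num": 4, "rbin_slop": 0.01}
v3_kwargs = migrate_config_kwargs(v2_kwargs)
```

The sketch only renames keywords. It does not set the new closed parameter and does not check whether the remaining arguments are still accepted in version 3, so the result should be reviewed against the current API documentation.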