Salmon Beta v0.4.0
Pre-release
Important change to output format
The FPKM column of the output has been removed. This closes issue #3, as Richard has made a compelling case that FPKM should be completely deprecated from use and its inclusion in the output might promote undesirable practices in downstream analysis.
Fixes
This release also fixes some build issues that prevented v0.3.2 from compiling successfully under OSX using the default compiler and linker. These issues have been resolved, and v0.4.0 should build properly on OSX 10.10 (using the latest version of the system's compiler). Note: Only the official system compiler (Apple's clang) is supported on OSX; building with a non-native compiler (e.g. GCC) on OSX is still experimental.
Some new features and deprecations
This beta brings some important improvements and changes to Salmon. Certain program options that previously affected the behavior of secondary streaming passes over the data have been marked as deprecated and will be removed in a future release. The most notable of these is the -n (number of required observations) parameter. Salmon now runs its optimization until a data-dependent convergence criterion is met. This is possible because this release introduces a novel hybrid optimization strategy that requires only a single streaming pass over the data, regardless of dataset size. This leads to much faster run times on small datasets and to improved overall accuracy. The new hybrid optimization strategy is the default.
New command line flags
A new command-line flag, --useVBOpt, has been added. This flag allows the user to toggle between the Variational Bayesian EM algorithm and the "standard" EM algorithm in the second phase of the hybrid optimization. Some previous literature has shown that the Variational Bayesian EM is often slightly more accurate, though we are currently testing both.
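To make the difference between the two update rules concrete, here is a minimal Python sketch. It is not Salmon's implementation: the function names, the equivalence-class input format, the prior value, and the convergence tolerance are all assumptions chosen for illustration, and transcript lengths are ignored. The standard EM weights each transcript by its current expected read count, while the variational Bayesian variant weights it by exp(digamma(count + prior)). Both loops stop when the largest change in any count falls below a tolerance, which is one simple form of data-dependent convergence.

```python
import math

def digamma(x):
    """Digamma function for x > 0 (recurrence, then an asymptotic series)."""
    result = 0.0
    while x < 6.0:              # shift x upward so the series is accurate
        result -= 1.0 / x
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    # psi(x) ~ ln(x) - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
    result += math.log(x) - 0.5 * inv
    result -= inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result

def estimate_abundances(eq_classes, n_tx, use_vb=False,
                        prior=0.01, tol=1e-8, max_iter=10000):
    """Estimate expected read counts per transcript.

    eq_classes: list of (transcript_ids, read_count) pairs (hypothetical
    input format); every read in a class is compatible with each listed
    transcript. Returns (counts, iterations_used).
    """
    total = sum(count for _, count in eq_classes)
    alpha = [total / n_tx] * n_tx           # uniform starting point
    iterations = 0
    for iterations in range(1, max_iter + 1):
        if use_vb:
            # Variational Bayes: weight by exp(E[log theta]) up to a constant
            weights = [math.exp(digamma(a + prior)) for a in alpha]
        else:
            # Standard EM: weight by the current expected count
            weights = alpha
        new_alpha = [0.0] * n_tx
        for tx_ids, count in eq_classes:
            denom = sum(weights[t] for t in tx_ids)
            if denom <= 0.0:
                continue
            for t in tx_ids:
                new_alpha[t] += count * weights[t] / denom
        delta = max(abs(n - o) for n, o in zip(new_alpha, alpha))
        alpha = new_alpha
        if delta < tol:                      # data-dependent stopping rule
            break
    return alpha, iterations

# Toy data: 3 transcripts, 100 reads in 3 equivalence classes
classes = [((0,), 60), ((0, 1), 30), ((1, 2), 10)]
em_counts, em_iters = estimate_abundances(classes, 3, use_vb=False)
vb_counts, vb_iters = estimate_abundances(classes, 3, use_vb=True)
print("EM:", [round(c, 3) for c in em_counts], "iterations:", em_iters)
print("VB:", [round(c, 3) for c in vb_counts], "iterations:", vb_iters)
```

In this sketch both variants conserve the total number of reads. The variational version tends to push weakly supported transcripts toward zero more aggressively when the prior is small.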
To close issue #2, it is now possible to specify the parameters of the fragment length distribution for single-end experiments. This is useful for correctly estimating the effective length of the transcripts in single-end experiments. The mean of the fragment length distribution can be specified with --fldMean and the standard deviation with --fldSD.
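To illustrate why the fragment length distribution matters for effective length, here is a minimal Python sketch. It assumes one common definition, effective length = transcript length minus the mean fragment length (restricted to fragments that fit on the transcript) plus one, and models fragment lengths as a discretized normal distribution. The function name, the max_frag_len cap, and the fallback behavior are assumptions for illustration and may differ from what Salmon actually computes.

```python
import math

def effective_length(tx_len, fld_mean, fld_sd, max_frag_len=1000):
    """Effective length under a discretized, truncated normal fragment model.

    Fragment lengths are restricted to 1..min(tx_len, max_frag_len), since a
    fragment cannot be longer than the transcript it comes from.
    """
    upper = min(tx_len, max_frag_len)
    lengths = range(1, upper + 1)
    # Unnormalized normal density evaluated at each integer fragment length
    weights = [math.exp(-0.5 * ((f - fld_mean) / fld_sd) ** 2) for f in lengths]
    total = sum(weights)
    if total == 0.0:
        # No probability mass on feasible lengths; fall back to the raw length
        return float(tx_len)
    mean_frag = sum(f * w for f, w in zip(lengths, weights)) / total
    return tx_len - mean_frag + 1.0

# Hypothetical values, analogous to passing --fldMean 200 --fldSD 80
for tx_len in (150, 500, 1000):
    print(tx_len, round(effective_length(tx_len, 200.0, 80.0), 2))
```

Under this model, short transcripts lose a much larger share of their length than long ones, which is why a mis-specified mean or standard deviation distorts abundance estimates most for short transcripts.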