EPA-ng v0.2.0-beta

Pre-release
@pierrebarbera pierrebarbera released this 28 Mar 12:57
· 127 commits to master since this release

Interface Related Changes

  • changed CLI library from cxxopts to CLI11
  • removed the --repeats option; repeats are now linked to whether masking is used (no masking = repeats enabled)
  • removed the -O option that re-optimized the tree branch lengths and model parameters (do this upstream, then pass the parameters via --model)

Feature Changes

  • can work with amino acid data again
  • removed the experimental "pipeline" parallelization mode
  • added masking heuristic, enabled by default
  • slightly changed the bfast format to accommodate the masking information
  • changed default value of accumulated LWR candidate selection heuristic to 0.99999
  • added --model argument that takes either a raxml-ng style model descriptor specifying the exact model parameters to be used, or a RAxML_info file (raxml 8.x) from a -f e run
  • added pplacer's baseball heuristic, for now with static default settings (--strike-box 3.0, --max-strikes 6, --max-pitches 40)
  • added explicit --no-heur flag that disables all preplacement heuristics (does not disable the new masking heuristic!)
  • added --no-pre-mask flag to disable masking heuristic
  • added --split function that takes a reference MSA file and one or more query alignment files, in phylip format, and outputs a fasta file containing only the aligned query sequences. Usage: epa-ng --split ref_alignment query_alignments+ (a function tailored to preparing papara output for epa-ng) (issue #16)
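
The baseball heuristic listed above can be illustrated with a short Python sketch. This is not EPA-ng's or pplacer's actual code: the function name `baseball_select`, the callable `thorough_score`, and the exact strike rule (a candidate counts as a strike when its thorough log-likelihood falls more than the strike box below the best seen so far) are assumptions made for illustration. Only the three parameter names and their defaults come from the notes above.

```python
def baseball_select(candidates, thorough_score,
                    strike_box=3.0, max_strikes=6, max_pitches=40):
    """Evaluate candidate branches until too many of them look poor.

    candidates:     branch ids, assumed sorted best-first by preplacement score
    thorough_score: hypothetical callable, branch id -> thorough log-likelihood
    Returns a list of (branch, log-likelihood) pairs that were evaluated.
    """
    best = float("-inf")
    strikes = 0
    evaluated = []
    for pitches, branch in enumerate(candidates, start=1):
        logl = thorough_score(branch)
        evaluated.append((branch, logl))
        if logl > best:
            # New best placement; later candidates are measured against it.
            best = logl
        elif logl < best - strike_box:
            # Outside the strike box (assumed rule): count a strike.
            strikes += 1
        # Stop once either limit is reached.
        if strikes >= max_strikes or pitches >= max_pitches:
            break
    return evaluated


if __name__ == "__main__":
    # Toy scores: the first two branches are close, the rest are far worse.
    toy = {0: -100.0, 1: -101.0, 2: -110.0, 3: -111.0, 4: -112.0}
    print(baseball_select(list(toy), toy.get, max_strikes=2))
```

In this sketch the pitch limit caps the total number of thorough evaluations, while the strike limit ends the search early once candidates stop being competitive.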

Performance Changes

  • greatly increased parallel efficiency
  • added aforementioned masking heuristic that greatly increases overall performance
  • many smaller performance improvements

Bugfixes

  • fixed non-working --version option (issue #13)
  • fixed issues with RNA data (issue #14)
  • fixed a segfault that occurred when input query sequence volume got too high (issue #9)
  • fixed a bug that allowed empty pqueries to sometimes be printed to the jplace

Code Changes you probably don't care about

  • updated genesis and libpll/pll-modules to newer commits
  • whole bunch of refactoring
  • rewrote the jplace output module; it now supports MPI I/O operations for massively parallel, asynchronous writes, and scales indefinitely with input volume
  • separated and reworked the mid-level placement functions for preplacement and thorough placement (treating them differently made a big difference performance-wise)
  • added a layer of separation to the readers to enable better automatic file format detection
  • switched MSA_Stream to use genesis instead of pll for reading
  • altered behavior of MSA_Stream slightly
  • greatly improved performance of the LWR calculation
  • decreased reliance on Work container
  • resurrected the Range and call_focused code :spooky: and used it to implement masking
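
For readers unfamiliar with the LWR (likelihood weight ratio) and the accumulated-LWR candidate selection mentioned in these notes, the following Python sketch shows the general idea. It is not EPA-ng's implementation: the function names are hypothetical, and the max-subtraction trick for numerical stability is an assumption about how such a calculation is commonly done. Only the default threshold of 0.99999 comes from the notes above.

```python
import math


def likelihood_weight_ratios(log_likelihoods):
    """Turn per-branch log-likelihoods into weights that sum to 1."""
    # Subtract the maximum before exponentiating to avoid underflow.
    m = max(log_likelihoods)
    weights = [math.exp(l - m) for l in log_likelihoods]
    total = sum(weights)
    return [w / total for w in weights]


def select_by_accumulated_lwr(lwrs, threshold=0.99999):
    """Return branch indices, best first, until their LWRs sum to the threshold."""
    order = sorted(range(len(lwrs)), key=lambda i: lwrs[i], reverse=True)
    selected = []
    accumulated = 0.0
    for i in order:
        selected.append(i)
        accumulated += lwrs[i]
        if accumulated >= threshold:
            break
    return selected


if __name__ == "__main__":
    lwrs = likelihood_weight_ratios([-1000.0, -1001.0, -1020.0, -1050.0])
    print(lwrs)
    print(select_by_accumulated_lwr(lwrs))
```

A threshold close to 1 keeps nearly all of the probability mass, so a query with a sharply peaked likelihood needs only a few candidate branches, while an ambiguous query keeps more of them.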