From 10eabc7129f3f092eb84ac1b3b54a79a14cbfd0d Mon Sep 17 00:00:00 2001
From: Will Dumm <wrhdumm@gmail.com>
Date: Thu, 8 Aug 2024 13:09:26 -0400
Subject: [PATCH] Add filtering by reversions to naive sequence, generalize
 ranking strategy (#130)

* add reversion func and rewrite ranking again

* update tests

* working ranking expression parsing

* cleanup, format, and lint

* fix forest summary

* more fixes and cleanup

* finish fixing output for zero coefficients

* WIP

* remove reversion counting from default ordering

* fix tree counting with pickled forest input

* add nucleotide annotations

* change to default minimize all criteria

* simplify ranking assuming all criteria are minimized

* WIP add linear combination to tree_stats pairplot

* format and lint

* fix quickstart and docs

* restore backward compatibility and tests
---
 docs/index.rst                |   2 -
 docs/quickstart.rst           |  78 +++--
 gctree/branching_processes.py | 593 +++++++++++++++++++++++++---------
 gctree/cli.py                 |  86 +++--
 gctree/isotyping.py           |   2 +-
 gctree/mutation_model.py      |   8 +-
 tests/smalltest.sh            |  36 ++-
 7 files changed, 582 insertions(+), 223 deletions(-)

diff --git a/docs/index.rst b/docs/index.rst
index b1e8fd56..ea31d15c 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -34,8 +34,6 @@ gctree documentation
    :maxdepth: 1
    :caption: Notes
 
-   CHANGELOG
-   faq
 
 
 Indices and tables
diff --git a/docs/quickstart.rst b/docs/quickstart.rst
index fdf1fcdc..735f4fb5 100644
--- a/docs/quickstart.rst
+++ b/docs/quickstart.rst
@@ -67,8 +67,8 @@ You now have two new files, ``outfile`` and ``outtree``.
 Note: if you want to rerun the above ``dnapars`` command, you must delete these two files first!
 
 
-gctree
-======
+Gctree Ranking
+==============
 
 We're now ready to run ``gctree infer`` to use abundance data (in ``abundances.csv``) to rank the equally parsimonious trees (in ``outfile``).
 We can use the optional argument ``--frame`` to indicate the coding frame of the sequence start, so that amino acid substitutions can be annotated on our trees.
@@ -106,48 +106,82 @@ This file may be manipulated using ``gctree infer``, instead of providing
 a dnapars ``outfile``.
 
 .. note::
-  Although described below, using context likelihood, mutability parsimony, or isotype parsimony
-   as ranking criteria is experimental, and has not yet been shown in a careful
-   validation to improve tree inference. Only the default branching process
-   likelihood is recommended for tree ranking!
+  Although described below, using context likelihood log loss, mutability parsimony, or isotype parsimony
+  as ranking criteria is experimental, and has not yet been shown in a careful
+  validation to improve tree inference. Only the default branching process
+  likelihood is recommended for tree ranking!
 
 Criteria other than branching process likelihoods can be used to break ties
 between trees. Providing arguments ``--isotype_mapfile`` and
 ``--idmapfile`` will allow trees to be ranked by isotype parsimony. Providing
 arguments ``--mutability`` and ``--substitution`` allows trees to be ranked
-according to a context-sensitive mutation model. By default, trees are ranked
-lexicographically, first maximizing likelihood, then minimizing isotype
-parsimony, and finally maximizing a context-based poisson likelihood, if such information is provided.
-Ranking priorities can be adjusted using the argument ``--ranking_coeffs``.
-
-For example, to find the optimal tree
-according to a linear combination of likelihood, isotype parsimony,
-mutabilities, and alleles:
-
-.. command-output:: gctree infer gctree.out.inference.parsimony_forest.p --frame 1 --idmap idmap.txt --isotype_mapfile ../example/isotypemap.txt --mutability ../HS5F_Mutability.csv --substitution ../HS5F_Substitution.csv --ranking_coeffs 1 0.1 0 --outbase newranking --summarize_forest --tree_stats --verbose
+according to a context-sensitive Poisson model. By default, trees are ranked
+lexicographically, first minimizing branching process log loss (negative
+log-likelihood), then minimizing isotype
+parsimony, minimizing context-based poisson log loss, and finally minimizing number of alleles,
+if required information for all criteria is provided. Criteria for which
+required information is not provided will be skipped.
+
+Ranking priorities can be adjusted using the argument ``--ranking_strategy``.
+This argument accepts a string describing either an alternative lexicographic
+ordering, or a linear combination of criteria to be minimized. The following
+identifiers are used to specify available ranking criteria:
+
+* ``B`` - branching process log loss
+* ``I`` - isotype parsimony
+* ``C`` - context-based Poisson log loss
+* ``M`` - old mutability parsimony
+* ``A`` - number of alleles
+* ``R`` - sitewise reversions to naive sequence
+
+An alternative lexicographic ordering can be specified using a comma-separated
+list of identifiers. For example by passing ``--ranking_strategy "C, B, R"`` to
+``gctree infer``, we will minimize context-based Poisson log loss, then
+branching process log-loss, and finally naive reversions. If for some reason
+the user wishes to maximize instead of minimizing a criterion, a negative
+coefficient can be provided, as in ``"C, B, -R"``. To compute the value of
+a criterion without using it for ranking, a coefficient of zero can be
+prepended to its identifier.
+
+Alternatively, the ``--ranking_strategy`` option can be used to rank trees to
+minimize a linear combination of criteria.
+For example, to find the optimal tree according to a linear combination of
+branching process log loss, isotype parsimony, context-based Poisson log loss,
+and alleles:
+
+.. command-output:: gctree infer gctree.out.inference.parsimony_forest.p \
+                                --frame 1 \
+                                --idmap idmap.txt \
+                                --isotype_mapfile ../example/isotypemap.txt \
+                                --mutability ../HS5F_Mutability.csv \
+                                --substitution ../HS5F_Substitution.csv \
+                                --ranking_strategy "B + I + 0.1C + 0.01A" \
+                                --outbase newranking --summarize_forest \
+                                --tree_stats \
+                                --verbose
    :shell:
 
 The files ``HS5F_Mutability.csv`` and ``HS5F_Substitution.csv`` are a context
-sensitive mutation model which can be downloaded from the `Shazam Project <https://bitbucket.org/kleinstein/shazam/src/master/data-raw/`>_.
+sensitive mutation model which can be downloaded from the `Shazam Project <https://bitbucket.org/kleinstein/shazam/src/master/data-raw/>`_, and are required here to compute context-based Poisson log loss.
 
 By default, only the files listed above will be generated, with the optional argument ``--outbase`` specifying how the output files should be named.
 
 .. image:: newranking.inference.1.svg
    :width: 1000
 
-For detailed information about each tree used for ranking, as well as a pairplot like the one below comparing the highest ranked tree to all other ranked trees,use the argument ``--tree_stats``.
+For detailed information about each tree used for ranking, as well as a pairplot like the one below comparing the highest ranked tree to all other ranked trees, use the argument ``--tree_stats``.
 
-.. image:: newranking.tree_stats.pairplot.png
+.. image:: newranking.tree_stats.pairplot.svg
    :width: 1000
 
-Sometimes ranked trees are too numerous, and generating the output of ``--tree_stats`` would require too many resources. For a summary of the collection of trees used for ranking, the argument ``--summarize_forest`` is provided. Most importantly, this option summarizes how much less likely the top ranked tree is, compared to the most likely tree being ranked, for example to validate coefficients passed to ``--ranking_coeffs``.
+Sometimes ranked trees are too numerous to generate the output of ``--tree_stats`` in a reasonable amount of time. For a summary of the collection of trees used for ranking, the option ``--summarize_forest`` is provided. Most importantly, this option summarizes how much less likely the top ranked tree is, compared to the most likely tree being ranked, for example to validate coefficients passed to ``--ranking_strategy``.
 
 .. command-output:: cat newranking.forest_summary.log
    :shell:
 
 
-isotype
-=======
+Isotypes
+========
 
 If we would like to add observed isotype data to trees output by gctree
 inference, we can now do so.
diff --git a/gctree/branching_processes.py b/gctree/branching_processes.py
index 08cd2b6d..b382b90e 100755
--- a/gctree/branching_processes.py
+++ b/gctree/branching_processes.py
@@ -45,6 +45,10 @@
 )
 
 
+class RankingCriterionError(Exception):
+    pass
+
+
 class CollapsedTree:
     r"""A collapsed tree, modeled as an infinite type Galton-Watson process run
     to extinction.
@@ -461,6 +465,7 @@ def render(
         frame2: Optional[int] = None,
         position_map2: Optional[List] = None,
         show_support: bool = False,
+        show_nuc_muts: bool = False,
     ):
         r"""Render to tree image file.
 
@@ -477,6 +482,8 @@ def render(
             frame2: coding frame for 2nd sequence when using ``chain_split``
             position_map2: like ``position_map``, but for 2nd sequence when using ``chain_split``
             show_support: annotate bootstrap support if available
+            show_nuc_muts: If True, annotate branches with nucleotide mutations. If False, and frame is provided, then branches
+                will be annotated with amino acid mutations.
         """
         if frame is not None and frame not in (1, 2, 3):
             raise RuntimeError("frame must be 1, 2, or 3")
@@ -560,7 +567,28 @@ def my_layout(node):
                 if "sequence" in tree_copy.features and set(
                     node.sequence.upper()
                 ) == set("ACGT"):
-                    if frame is not None:
+                    if show_nuc_muts:
+                        mutations = [
+                            f"{pn}{idx + 1}{cn}"
+                            for idx, (pn, cn) in enumerate(
+                                zip(node.up.sequence, node.sequence)
+                            )
+                            if pn != cn
+                        ]
+                        if mutations:
+                            T = ete3.TextFace(
+                                "\n".join(mutations),
+                                fsize=6,
+                                tight_text=False,
+                            )
+                            T.margin_top = 6
+                            T.rotation = -90
+                            node.add_face(
+                                T,
+                                0,
+                                position=("branch-bottom"),
+                            )
+                    elif frame is not None:
                         if chain_split is not None and frame2 is None:
                             raise ValueError(
                                 "must define frame2 when using chain_split"
@@ -622,6 +650,7 @@ def my_layout(node):
                                     )
                                 if "*" in aa:
                                     nstyle["hz_line_color"] = "red"
+
             node.set_style(nstyle)
 
         ts = ete3.TreeStyle()
@@ -1135,7 +1164,7 @@ def mle(self, **kwargs) -> Tuple[np.float64, np.float64]:
     @_requires_dag
     def filter_trees(  # noqa: C901
         self,
-        ranking_coeffs: Optional[Sequence[float]] = None,
+        ranking_strategy: Optional[str] = None,
         mutability_file: Optional[str] = None,
         substitution_file: Optional[str] = None,
         ignore_isotype: bool = False,
@@ -1144,26 +1173,23 @@ def filter_trees(  # noqa: C901
         outbase: str = "gctree.out",
         summarize_forest: bool = False,
         tree_stats: bool = False,
+        img_type: str = "svg",
+        ranking_coeffs: Optional[Sequence[float]] = None,
         branching_process_ranking_coeff: float = -1,
         use_old_mut_parsimony: bool = False,
     ) -> CollapsedForest:
         """Filter trees according to specified criteria.
 
-        Trim the forest to minimize a linear
-        combination of branching process likelihood, isotype parsimony score,
-        context/mutability-based Poisson likelihood, and number of alleles, with coefficients
-        provided in the argument ``ranking_coeffs`, in that order.
+        By default, the forest will be trimmed to maximize branching process likelihood,
+        then minimize isotype parsimony, then maximize context-based Poisson likelihood,
+        and finally minimize number of alleles. Any criteria for which the necessary arguments
+        are not provided will be automatically ignored.
+
+        For other ranking strategies, see the `ranking_strategy` argument.
 
         Args:
-            ranking_coeffs: A list or tuple of coefficients for prioritizing tree weights.
-                The order of coefficients is: isotype parsimony score, context poisson likelihood,
-                and number of alleles. A coefficient of ``-1`` will be applied to branching process
-                likelihood by default, unless a different value is provided to the keyword argument
-                `branching_process_ranking_coeff`. Trees are chosen to minimize this linear combination
-                of tree weights, so weights for which larger values are more optimal (such as
-                likelihoods) should have negative coefficients.
-                If ranking_coeffs is not provided, trees will be ranked lexicographically
-                by likelihood, then by other traits, in the same order.
+            ranking_strategy: A string expression describing how to rank trees. See docs for command line
+                argument `--ranking_strategy` for description.
             mutability_file: A mutability model
             substitution_file: A substitution model
             ignore_isotype: Ignore isotype parsimony when ranking. By default, isotype information added with
@@ -1174,9 +1200,20 @@ def filter_trees(  # noqa: C901
             outbase: file name stem for a file with information for each tree in the DAG.
             summarize_forest: whether to write a summary of the forest to file `[outbase].forest_summary.log`
             tree_stats: whether to write stats for each tree in the forest to file `[outbase].tree_stats.log`
-            branching_process_ranking_coeff: Ranking coefficient to use for branching process likelihood. Value
+            img_type: format for output plots.
+            ranking_coeffs: (Deprecated. Use ``ranking_strategy`` instead) A list or tuple of coefficients
+                for prioritizing tree weights.
+                The order of coefficients is: isotype parsimony score, context poisson likelihood,
+                and number of alleles. A coefficient of ``-1`` will be applied to branching process
+                likelihood by default, unless a different value is provided to the keyword argument
+                `branching_process_ranking_coeff`. Trees are chosen to minimize this linear combination
+                of tree weights, so weights for which larger values are more optimal (such as
+                likelihoods) should have negative coefficients.
+            branching_process_ranking_coeff: (Deprecated. Use ``ranking_strategy`` instead)
+                Ranking coefficient to use for branching process likelihood. Value
                 is ignored unless `ranking_coeffs` argument is provided.
-            use_old_mut_parsimony: Whether to use the deprecated 'mutability parsimony' instead of
+            use_old_mut_parsimony: (Deprecated. Use ``ranking_strategy`` instead)
+                Whether to use the deprecated 'mutability parsimony' instead of
                 context-based poisson likelihood (only applicable if mutability and substitution files are
                 provided.
 
@@ -1185,40 +1222,35 @@ def filter_trees(  # noqa: C901
             of data about the trees in that forest, with format (branching process likelihood, isotype parsimony,
             context-based Poisson likelihood, alleles).
         """
-        dag = self._forest
 
-        if ranking_coeffs:
-            if len(ranking_coeffs) != 3:
-                raise ValueError(
-                    "If ranking_coeffs are provided to `filter_trees` method, a list of three values is expected."
-                )
-            coeffs = [branching_process_ranking_coeff] + list(ranking_coeffs)
-            if sum(abs(c) for c in coeffs) == 0:
-                raise ValueError(
-                    "At least one value provided to ranking_coeffs or the value of branching_process_ranking_coeff must be nonzero."
-                )
-        else:
-            coeffs = [1] * 4
-
-        (
-            nz_coeff_bplikelihood,
-            nz_coeff_isotype_pars,
-            nz_coeff_context,
-            _,
-        ) = [val != 0 for val in coeffs]
-        coeff_bplikelihood, coeff_isotype_pars, coeff_context, coeff_alleles = coeffs
-
-        dag_filters = []
-        if nz_coeff_bplikelihood:
+        # To add a new ranking criterion:
+        #     * Add a prepare_func below which checks for prerequisites
+        #           and throws a RankingCriterionError if not met
+        #     * Add a key, value pair to ranking_function_keys so that a user
+        #           can use the new criterion
+        #     * Add the prepare_func to the list of default ranking
+        #           constructors, if desired.
+        #     * Modify docs in cli.py to describe new criterion
+        #     * All ranking criteria should be minimized by default.
+
+        def prepare_bp_likelihood_funcs():
             if self.parameters is None:
                 self.mle(marginal=True)
             p, q = self.parameters
             ll_dagfuncs = _ll_genotype_dagfuncs(p, q)
-            dag_filters.append((ll_dagfuncs, coeff_bplikelihood))
             if verbose:
                 print(f"Branching process parameters to be used for ranking: {(p, q)}")
-        if nz_coeff_isotype_pars and self.is_isotyped and (not ignore_isotype):
-            # Check for missing isotype data in all but root node, and fake root-adjacent leaf node
+            return ll_dagfuncs
+
+        def prepare_isotype_parsimony_funcs():
+            if not self.is_isotyped:
+                raise RankingCriterionError(
+                    "Isotypes have not been added to this CollapsedForest."
+                )
+            if ignore_isotype:
+                raise RankingCriterionError(
+                    "Isotype ranking is disabled by passed parameters."
+                )
             rootname = list(self._forest.dagroot.children())[0].attr["name"]
             if any(
                 not node.attr["isotype"]
@@ -1229,66 +1261,227 @@ def filter_trees(  # noqa: C901
                     "Some isotype data seems to be missing. Isotype parsimony scores may be incorrect."
                 )
 
-            iso_funcs = _isotype_dagfuncs()
-            dag_filters.append((iso_funcs, coeff_isotype_pars))
-        if nz_coeff_context and mutability_file and substitution_file:
-            if use_old_mut_parsimony:
-                mut_funcs = _mutability_dagfuncs(
-                    mutability_file=mutability_file,
-                    substitution_file=substitution_file,
-                    splits=[] if chain_split is None else [chain_split],
+            return _isotype_dagfuncs()
+
+        def prepare_context_poisson_funcs():
+            if not (mutability_file and substitution_file):
+                raise RankingCriterionError(
+                    "Poisson context likelihood loss requires mutability and substitution files."
+                )
+            return _context_poisson_likelihood_dagfuncs(
+                mutability_file=mutability_file,
+                substitution_file=substitution_file,
+                splits=[] if chain_split is None else [chain_split],
+            )
+
+        def prepare_mutability_parsimony_funcs():
+            if not (mutability_file and substitution_file):
+                raise RankingCriterionError(
+                    "Mutability parsimony requires mutability and substitution files."
+                )
+            return _mutability_dagfuncs(
+                mutability_file=mutability_file,
+                substitution_file=substitution_file,
+                splits=[] if chain_split is None else [chain_split],
+            )
+
+        def prepare_reversions_funcs():
+            return _naive_reversion_dagfuncs(self._validation_stats["root_seq"])
+
+        ranking_function_keys = {
+            "B": prepare_bp_likelihood_funcs,
+            "I": prepare_isotype_parsimony_funcs,
+            "C": prepare_context_poisson_funcs,
+            "M": prepare_mutability_parsimony_funcs,
+            "A": _allele_dagfuncs,
+            "R": prepare_reversions_funcs,
+        }
+
+        default_ranking_constructors = [
+            prepare_bp_likelihood_funcs,
+            prepare_isotype_parsimony_funcs,
+            prepare_context_poisson_funcs,
+            _allele_dagfuncs,
+        ]
+
+        # ####### BEGIN backward compatibility adjustments with old ways of specifying ranking
+        # ####### strategy:
+        if (
+            (ranking_coeffs is not None)
+            or (branching_process_ranking_coeff != -1)
+            or use_old_mut_parsimony
+        ):
+            if ranking_strategy is None:
+                # Then we'll try to recover old behavior.
+
+                warnings.warn(
+                    "Arguments `branching_process_ranking_coeff`, `use_old_mut_parsimony`, and `ranking_coeffs` "
+                    "are deprecated! Use `ranking_strategy` argument instead."
+                )
+                if ranking_coeffs is not None:
+                    if use_old_mut_parsimony:
+                        criteria = ("B", "I", "M", "A")
+                        max_to_min = (-1, 1, 1, 1)
+                    else:
+                        criteria = ("B", "I", "C", "A")
+                        max_to_min = (-1, 1, -1, 1)
+
+                    coefficients = (branching_process_ranking_coeff,) + tuple(
+                        ranking_coeffs
+                    )
+                    ranking_strategy = "+".join(
+                        [
+                            f"{coeff * mtm}{crit}"
+                            for coeff, mtm, crit in zip(
+                                coefficients, max_to_min, criteria
+                            )
+                            if coeff != 0
+                        ]
+                    )
+                elif use_old_mut_parsimony:
+                    default_ranking_constructors = [
+                        prepare_bp_likelihood_funcs,
+                        prepare_isotype_parsimony_funcs,
+                        prepare_mutability_parsimony_funcs,
+                        _allele_dagfuncs,
+                    ]
+                # in absence of ranking_coeffs, branching_process_ranking_coeff
+                # was ignored.
+            else:
+                warnings.warn(
+                    "Arguments `branching_process_ranking_coeff`, `use_old_mut_parsimony`, and `ranking_coeffs` "
+                    "are deprecated! They will be ignored, since a `ranking_strategy` was provided."
                 )
+        # ####### END backward compatibility for old ways of specifying ranking strategy
+
+        lexicographic = True
+        # Parsing ranking_strategy, if provided:
+        if ranking_strategy:
+            ranking_strategy = ranking_strategy.replace(" ", "")
+            if (
+                "," in ranking_strategy
+                or sum(key in ranking_strategy for key in ranking_function_keys) == 1
+            ):
+                # Then we're doing lexicographic ranking
+                criteria = ranking_strategy.split(",")
+
+            else:
+                lexicographic = False
+                criteria = ranking_strategy.replace("-", "+-").split("+")
+                # If there's a leading '-', or a '+-' already, then we'll get an empty
+                # string somewhere
+                criteria = [cr for cr in criteria if cr != ""]
+
+            def expand_criterion(c):
+                if c[0] == "-":
+                    fac = -1
+                    c = c[1:]
+                else:
+                    fac = 1
+                if len(c) == 1:
+                    return c, fac
+                else:
+                    return c[-1], fac * float(c[:-1])
+
+            try:
+                expanded_criteria = [
+                    expand_criterion(criterion) for criterion in criteria
+                ]
+            except Exception as e:
+                raise RankingCriterionError(
+                    f"Error parsing ranking strategy string: '{ranking_strategy}'"
+                ) from e
+
+            ranking_funcs_needed = set(name for name, _ in expanded_criteria)
+            initialized_ranking_funcs = {
+                name: ranking_function_keys[name]() for name in ranking_funcs_needed
+            }
+            all_filters = [
+                (initialized_ranking_funcs[name], coeff)
+                for name, coeff in expanded_criteria
+            ]
+            # Account for lexicographic filters with coefficient of 0, which
+            # will be computed on trees but not used for ranking.
+            if lexicographic:
+                # Want all the observer filters to be at the end, so it's easy
+                # to deal with them in the tree_stats section.
+                dag_filters = [
+                    (filt, coeff) for filt, coeff in all_filters if coeff != 0
+                ]
+                zero_filters = [
+                    (filt, coeff) for filt, coeff in all_filters if coeff == 0
+                ]
+                observer_filters = dag_filters + zero_filters
             else:
-                mut_funcs = _context_poisson_likelihood_dagfuncs(
-                    mutability_file=mutability_file,
-                    substitution_file=substitution_file,
-                    splits=[] if chain_split is None else [chain_split],
+                dag_filters = all_filters.copy()
+                observer_filters = all_filters.copy()
+
+        else:
+            # Then we need to do defaults
+            dag_filters = []
+            for cons in default_ranking_constructors:
+                try:
+                    dagfunc = cons()
+                    dag_filters.append((dagfunc, 1))
+                except RankingCriterionError:
+                    # We're just seeing what we have info to handle, so failing
+                    # silently is fine
+                    pass
+            observer_filters = dag_filters.copy()
+
+        # Switch optimal functions based on signs of coefficients
+        adjusted_filters = []
+        for dag_filter, coeff in dag_filters:
+            if coeff < 0:
+                warnings.warn(
+                    f"Lower values for {dag_filter.weight_funcs.name} are generally better, but the "
+                    "provided ranking coefficient is negative, so trees with higher values will be preferred."
                 )
-            dag_filters.append((mut_funcs, coeff_context))
+                if lexicographic and dag_filter.optimal_func != min:
+                    warnings.warn(
+                        f"Unrecognized optimality function for {dag_filter.weight_funcs.name} may not support negation."
+                    )
+                dag_filter = hdag.utils.HistoryDagFilter(dag_filter.weight_funcs, max)
+            adjusted_filters.append((dag_filter, coeff))
+        dag_filters = adjusted_filters
+
+        dag = self._forest
 
-        # add allele funcs no matter what, for logging
-        allele_funcs = _allele_dagfuncs()
-        dag_filters.append((allele_funcs, coeff_alleles))
         # add 0-returning functions so dagfuncs return tuples, even if allele funcs are the only ones used for filtering
-        dag_filters.append(
-            (
-                hdag.utils.HistoryDagFilter(
-                    hdag.utils.AddFuncDict(
-                        {
-                            "start_func": lambda n: 0,
-                            "edge_weight_func": lambda n1, n2: 0,
-                            "accum_func": lambda ls: 0,
-                        },
-                        name="",
-                    ),
-                    min,
-                    ordering_name="",
-                ),
-                0,
-            )
+        null_filter = hdag.utils.HistoryDagFilter(
+            hdag.utils.AddFuncDict(
+                {
+                    "start_func": lambda n: 0,
+                    "edge_weight_func": lambda n1, n2: 0,
+                    "accum_func": lambda ls: 0,
+                },
+                name="",
+            ),
+            min,
+            ordering_name="",
         )
+        dag_filters.append((null_filter, 0))
+        observer_filters.append((null_filter, 0))
 
         combined_dag_filter = functools.reduce(
             lambda x, y: x + y, (dag_filter for dag_filter, _ in dag_filters)
         )
 
-        if ranking_coeffs:
-            if len(ranking_coeffs) != 3:
-                raise ValueError(
-                    "If ranking_coeffs are provided to `filter_trees` method, a list of three values is expected."
-                )
-            for dag_filter, coeff in dag_filters:
-                if dag_filter.optimal_func == max and coeff > 0:
-                    warnings.warn(
-                        f"Higher values for {dag_filter.weight_funcs.name} are generally better, but the "
-                        "provided ranking coefficient is positive, so trees with lower values will be preferred."
-                    )
-                if dag_filter.optimal_func == min and coeff < 0:
-                    warnings.warn(
-                        f"Lower values for {dag_filter.weight_funcs.name} are generally better, but the "
-                        "provided ranking coefficient is negative, so trees with higher values will be preferred."
-                    )
+        combined_observer_filter = functools.reduce(
+            lambda x, y: x + y, (dag_filter for dag_filter, _ in observer_filters)
+        )
 
+        if lexicographic:
+            ranking_dag_filter = combined_dag_filter
+            ranking_description = "Ranking trees to " + " then ".join(
+                opt_name[:3] + "imize " + ord_name
+                for (opt_name, _), ord_name in zip(
+                    ranking_dag_filter.ordering_names,
+                    ranking_dag_filter.weight_funcs.names,
+                )
+                if ord_name != ""
+            )
+        else:
             filtered_coefficients = [coeff for _, coeff in dag_filters]
 
             def linear_combinator(weighttuple):
@@ -1321,65 +1514,101 @@ def linear_combinator(weighttuple):
                     if coeff != 0
                 )
             )
-        else:
-            ranking_dag_filter = combined_dag_filter
-            ranking_description = "Ranking trees to " + " then ".join(
-                opt_name[:3] + "imize " + ord_name
-                for (opt_name, _), ord_name in zip(
-                    ranking_dag_filter.ordering_names,
-                    ranking_dag_filter.weight_funcs.names,
-                )
-                if ord_name != ""
-            )
+
         if verbose:
             print(ranking_description)
 
-        def print_stats(statlist, title, file=None, suppress_score=False):
-            show_score = ranking_coeffs and not suppress_score
+        def print_stats(
+            statlist,
+            title,
+            file=None,
+            suppress_score=False,
+            filter_list=observer_filters,
+        ):
+            df_dict = {"Tree": list(range(1, len(statlist) + 1))}
+            df_dict.update(
+                {
+                    dfilter.weight_funcs.name: vals
+                    for (dfilter, _), vals in zip(filter_list, zip(*statlist))
+                }
+            )
 
-            def reformat(field, n=10):
-                if isinstance(field, int):
-                    return format(field, "<" + str(n))
-                else:
-                    return f"{field:{n}.{n}}"
+            if (not lexicographic) and (not suppress_score):
+                df_dict["TreeScore"] = [
+                    linear_combinator(stattup) for stattup in statlist
+                ]
 
-            print("\n" + title + ":", file=file)
-            statstring = "\t".join(
-                tuple(
-                    reformat(dfilter.weight_funcs.name, n=15)
-                    for dfilter, _ in dag_filters[:-1]
-                )
-            )
-            print(
-                f"tree     \t{statstring}" + ("\ttreescore" if show_score else ""),
-                file=file,
-            )
-            for j, best_weighttuple in enumerate(statlist, 1):
-                # ignore always-0 entry at end:
-                statstring = "\t".join(reformat(it) for it in best_weighttuple[:-1])
+            df_dict.pop("")
+            printing_df = pd.DataFrame(df_dict)
+            # Hide index
+            printing_df.index = [""] * len(statlist)
+            # Set display options without side-effects
+            with pd.option_context(
+                "display.max_rows",
+                None,
+                "display.max_columns",
+                100,
+                "display.width",
+                1000,
+                "display.max_colwidth",
+                None,
+                "display.show_dimensions",
+                False,
+                "display.colheader_justify",
+                "right",
+            ):
+                print(printing_df, file=file)
+
+        trimdag = dag[ranking_dag_filter]
+
+        trimmed_forest = self._trimmed_self(trimdag)
+
+        if verbose:
+            if trimmed_forest.n_trees > 1:
+                n_topologies = trimmed_forest.n_topologies()
                 print(
-                    f"{j:<10}\t{statstring}"
-                    + (
-                        f"\t{reformat(linear_combinator(best_weighttuple))}"
-                        if show_score
-                        else ""
-                    ),
-                    file=file,
+                    "Degenerate ranking criteria: filtered forest contains "
+                    f"{trimmed_forest.n_trees} unique trees, with {n_topologies} unique collapsed topologies."
                 )
+                if n_topologies > 10:
+                    print(
+                        "A representative from each of the ten most abundant topologies will be sampled randomly for rendering."
+                    )
+                else:
+                    print(
+                        "A representative of each topology will be sampled randomly for rendering."
+                    )
 
-        trimdag = dag[ranking_dag_filter]
+        random.seed(trimmed_forest.n_trees)
+        topoclasses = list(trimmed_forest.iter_topology_classes())
 
-        best_weighttuple = trimdag.optimal_weight_annotate(
-            **combined_dag_filter,
-        )
+        sampled_histories = [topoclass._forest.sample() for topoclass in topoclasses]
+
+        weighttuples = [
+            history.optimal_weight_annotate(**combined_observer_filter)
+            for history in sampled_histories
+        ]
+
+        first_tree_weighttuple = weighttuples[0]
+
+        ctrees = [
+            topoclass._clade_tree_to_ctree(history)
+            for topoclass, history in zip(topoclasses, sampled_histories)
+        ]
 
         if verbose:
-            print_stats([best_weighttuple], "Stats for optimal trees")
+            if len(observer_filters) == len(dag_filters):
+                # Then all selected tree stats are the same, no need to print
+                # them multiple times:
+                print_stats(weighttuples[:1], "Stats for optimal trees")
+            else:
+                print_stats(weighttuples, "Stats for optimal trees")
 
         if summarize_forest:
             with open(outbase + ".forest_summary.log", "w") as fh:
                 independent_best = []
-                for dfilter, _ in dag_filters:
+
+                for dfilter, _ in observer_filters[:-1]:
                     tempdag = dag.copy()
                     min_val, max_val = tempdag.weight_range_annotate(**dfilter)
                     opt_weight = tempdag.trim_optimal_weight(**dfilter)
@@ -1388,7 +1617,7 @@ def reformat(field, n=10):
                         f"\nOverall {dfilter.weight_funcs.name} range {min_val} to {max_val}."
                         f"\nAmong trees with {dfilter.optimal_func.__name__} {dfilter.weight_funcs.name} of: {opt_weight}\n"
                     )
-                    for indfilter, _ in dag_filters:
+                    for indfilter, _ in observer_filters[:-1]:
                         if indfilter.weight_funcs.name != dfilter.weight_funcs.name:
                             minval, maxval = tempdag.weight_range_annotate(
                                 **indfilter.weight_funcs
@@ -1401,8 +1630,11 @@ def reformat(field, n=10):
                     [
                         [
                             stat - best
-                            for stat, best in zip(best_weighttuple, independent_best)
+                            for stat, best in zip(
+                                first_tree_weighttuple, independent_best
+                            )
                         ]
+                        + [0]
                     ],
                     "Highest ranked tree: loss from best value",
                     file=fh,
@@ -1410,33 +1642,49 @@ def reformat(field, n=10):
                 )
 
         if tree_stats:
-            dag_ls = list(dag.weight_count(**combined_dag_filter).elements())
+            dag_ls = list(dag.weight_count(**combined_observer_filter).elements())
             # To clear _dp_data fields of their large cargo
             dag.optimal_weight_annotate(edge_weight_func=lambda n1, n2: 0)
-            if ranking_coeffs:
-                minfunckey = linear_combinator
+            col_names = combined_observer_filter.weight_funcs.names
+
+            if not lexicographic:
+                col_names += (ranking_strategy.replace(" ", ""),)
+                dag_ls = [tup + (linear_combinator(tup),) for tup in dag_ls]
+                best_tree_tup = first_tree_weighttuple + (
+                    linear_combinator(first_tree_weighttuple),
+                )
+
+                def minfunckey(tup):
+                    return tup[-1]
+
             else:
-                minfunckey = ranking_dag_filter.optimal_func
+                best_tree_tup = first_tree_weighttuple
+                _tuple_coeffs = [
+                    -1 if filt.optimal_func == max else 1 for filt, _ in dag_filters
+                ]
+
+                def minfunckey(tup):
+                    return tuple(coeff * it for coeff, it in zip(_tuple_coeffs, tup))
+
             dag_ls.sort(key=minfunckey)
 
-            df = pd.DataFrame(
-                dag_ls, columns=combined_dag_filter.weight_funcs.names
-            ).drop(columns=[""])
+            df = pd.DataFrame(dag_ls, columns=col_names).drop(columns=[""])
             df.to_csv(outbase + ".tree_stats.csv")
             df["set"] = ["all_trees"] * len(df)
             bestdf = pd.DataFrame(
-                [best_weighttuple], columns=combined_dag_filter.weight_funcs.names
+                [best_tree_tup],
+                columns=col_names,
             )
             bestdf["set"] = ["best_tree"]
             toplot_df = pd.concat([df, bestdf], ignore_index=True)
             pplot = sns.pairplot(
-                toplot_df.drop(columns=["Alleles", ""], errors="ignore"),
+                toplot_df.drop(columns=[""], errors="ignore"),
                 hue="set",
                 diag_kind="hist",
             )
-            pplot.savefig(outbase + ".tree_stats.pairplot.pdf")
+            pplot.savefig(outbase + f".tree_stats.pairplot.{img_type}")
 
-        return (self._trimmed_self(trimdag), best_weighttuple)
+        return (ctrees, trimmed_forest, weighttuples)
 
     def likelihood_rankplot(self, outbase, p, q, img_type="svg"):
         """Save a rank plot of likelihoods to the file
@@ -1945,8 +2193,8 @@ def accum_func(cmsetlist: List[multiset.FrozenMultiset]):
 
 
 def _ll_genotype_dagfuncs(p: np.float64, q: np.float64) -> hdag.utils.HistoryDagFilter:
-    """Return functions for counting tree log branching process likelihood on
-    the history DAG.
+    """Return functions for counting tree negative log branching process
+    likelihood on the history DAG.
 
     For numerical consistency, we resort to the use of ``decimal.Decimal``.
     This is exactly for the purpose of solving the problem that float sum is
@@ -1986,7 +2234,7 @@ def edge_weight_ll_genotype(n1: hdag.HistoryDagNode, n2: hdag.HistoryDagNode):
             if n1.is_root() and c == 0 and m == 1:
                 # Add pseudocount for unobserved root unifurcation
                 c = 1
-            res = Decimal(CollapsedTree._ll_genotype(c, m, p, q)[0])
+            res = Decimal(-CollapsedTree._ll_genotype(c, m, p, q)[0])
             return hdag.utils.FloatState(float(round(res, 8)), state=res)
 
     def accum_func(weightlist):
@@ -2000,9 +2248,9 @@ def accum_func(weightlist):
                 "edge_weight_func": edge_weight_ll_genotype,
                 "accum_func": accum_func,
             },
-            name="LogBPLikelihood",
+            name="BPLikelihoodLogLoss",
         ),
-        max,
+        min,
     )
 
 
@@ -2029,3 +2277,40 @@ def _allele_dagfuncs() -> hdag.utils.HistoryDagFilter:
         ),
         min,
     )
+
+
+def _naive_reversion_dagfuncs(naive_seq: str) -> hdag.utils.HistoryDagFilter:
+    """Return functions for filtering trees in a history DAG by the number of
+    reversions to the naive nucleotide sequence.
+
+    Args:
+        naive_seq: The naive sequence. Sitewise reversions to this sequence will be counted.
+    Returns:
+        A :meth:`historydag.utils.AddFuncDict` which may be passed as keyword arguments
+        to :meth:`historydag.HistoryDag.weight_count`, :meth:`historydag.HistoryDag.trim_optimal_weight`,
+        or :meth:`historydag.HistoryDag.optimal_weight_annotate`
+        methods to trim or annotate a :meth:`historydag.HistoryDag` according to allele count.
+        Weight format is ``int``.
+    """
+
+    def edge_weight_func(n1, n2):
+        if n1.is_ua_node():
+            return 0
+        else:
+            return sum(
+                1
+                for p, c, n in zip(n1.label.sequence, n2.label.sequence, naive_seq)
+                if (p != n and c == n)
+            )
+
+    return hdag.utils.HistoryDagFilter(
+        hdag.utils.AddFuncDict(
+            {
+                "start_func": lambda n: 0,
+                "edge_weight_func": edge_weight_func,
+                "accum_func": sum,
+            },
+            name="NaiveReversions",
+        ),
+        min,
+    )
diff --git a/gctree/cli.py b/gctree/cli.py
index 46f0ee72..904ef95b 100644
--- a/gctree/cli.py
+++ b/gctree/cli.py
@@ -171,9 +171,6 @@ def isotype_add(forest):
         if forest.n_trees == 1:
             warnings.warn("only one parsimony tree reported from dnapars")
 
-        if args.verbose:
-            print("number of trees with integer branch lengths:", forest.n_trees)
-
         forest.mle(marginal=True)
         # Add isotypes to forest
         if args.isotype_mapfile:
@@ -198,10 +195,13 @@ def isotype_add(forest):
             "The filename of a pickled history DAG object, or a phylipfile and abundance file, are required."
         )
 
+    if args.verbose:
+        print("number of trees with integer branch lengths:", forest.n_trees)
+
     # Filter the forest according to specified criteria, and along the way,
     # write a log file containing stats for all trees in the forest:
-    trimmed_forest, _ = forest.filter_trees(
-        ranking_coeffs=args.ranking_coeffs,
+    ctrees, _, _ = forest.filter_trees(
+        ranking_strategy=args.ranking_strategy,
         verbose=args.verbose,
         outbase=args.outbase,
         summarize_forest=args.summarize_forest,
@@ -209,32 +209,12 @@ def isotype_add(forest):
         mutability_file=args.mutability,
         substitution_file=args.substitution,
         chain_split=args.chain_split,
+        img_type=args.img_type,
+        ranking_coeffs=args.ranking_coeffs,
         branching_process_ranking_coeff=args.branching_process_ranking_coeff,
         use_old_mut_parsimony=args.use_old_mut_parsimony,
     )
 
-    if args.verbose:
-        if trimmed_forest.n_trees > 1:
-            n_topologies = trimmed_forest.n_topologies()
-            print(
-                "Degenerate ranking criteria: trimmed history DAG contains "
-                f"{trimmed_forest.n_trees} unique trees, with {n_topologies} unique collapsed topologies."
-            )
-            if n_topologies > 10:
-                print(
-                    "A representative from each of the ten most abundant topologies will be sampled randomly for rendering."
-                )
-            else:
-                print(
-                    "A representative of each topology will be sampled randomly for rendering."
-                )
-
-    random.seed(forest.n_trees)
-    ctrees = [
-        topoclass.sample_tree()
-        for _, topoclass in zip(range(10), trimmed_forest.iter_topology_classes())
-    ]
-
     if args.colormapfile is not None:
         with open(args.colormapfile, "r") as f:
             colormap = {}
@@ -273,6 +253,7 @@ def isotype_add(forest):
             chain_split=args.chain_split,
             frame2=args.frame2,
             position_map2=position_map2,
+            show_nuc_muts=args.show_nucleotide_mutations,
         )
         collapsed_tree.newick(f"{args.outbase}.inference.{j}.nk")
         with open(f"{args.outbase}.inference.{j}.p", "wb") as f:
@@ -288,6 +269,8 @@ def isotype_add(forest):
     plt.xlabel("genotype")
     plt.ylabel("abundance")
     plt.savefig(args.outbase + ".inference.abundance_rank." + args.img_type)
+    if args.verbose:
+        print()
 
 
 def simulate(args):
@@ -482,7 +465,7 @@ def simulate(args):
 
 def get_parser():
     parser = argparse.ArgumentParser(
-        description="genotype collapsed tree inference and simulation"
+        description="genotype collapsed tree inference and simulation",
     )
     subparsers = parser.add_subparsers(
         title="subcommands",
@@ -617,8 +600,9 @@ def get_parser():
         type=float,
         default=-1,
         help=(
+            "This argument is deprecated. Use ``--ranking_strategy`` instead. "
             "Coefficient used for branching process likelihood, when ranking trees by a linear "
-            "combination of traits. This value will be ignored if `--ranking_coeffs` argument is not "
+            "combination of traits. This value will be ignored if ``--ranking_coeffs`` argument is not "
             "also provided."
         ),
     )
@@ -628,6 +612,7 @@ def get_parser():
         nargs=3,
         default=None,
         help=(
+            "This argument is deprecated. Use ``--ranking_strategy`` instead. "
             "List of coefficients for ranking trees by a linear combination of traits. "
             "Coefficients are in order: isotype parsimony, mutation model parsimony, number of alleles. "
             "A coefficient of -1 will be applied to branching process likelihood. "
@@ -635,28 +620,65 @@ def get_parser():
             "isotype parsimony, and context-based Poisson likelihood in that order."
         ),
     )
+    parser_infer.add_argument(
+        "--ranking_strategy",
+        type=str,
+        default=None,
+        help=(
+            "Expression describing tree ranking strategy. If provided, takes precedence over all other ranking arguments. "
+            "Two types of expressions are permitted: First are those describing lexicographic orderings, like ``B,C,A``, which means "
+            "choose trees to minimize branching process log loss, then minimize context log loss, then minimize number "
+            "of alleles. Next are expressions describing linear combinations of criteria, like ``B+2C-1.1A``, which means choose "
+            "trees to minimize the specified linear combination of criteria. "
+            "If linear combination expression has leading ``-``, use ``=`` instead of space to separate argument, "
+            "e.g. ``--ranking_strategy=-B+R``. "
+            "These two methods of ranking cannot be combined. For example, ``B+C,A`` is not a valid ranking strategy expression. "
+            "Ranking criteria are specified using the following identifiers. All are by default minimized:\n"
+            "B - branching process log loss,\n"
+            "I - isotype parsimony,\n"
+            "C - context-based Poisson log loss,\n"
+            "M - old mutability parsimony,\n"
+            "A - number of alleles,\n"
+            "R - sitewise reversions to naive sequence.\n"
+            "To compute the value of a criterion on ranked trees without affecting the ranking, include that ranking criterion "
+            "with a coefficient of zero, as in ``B+2C+0A``, or ``B,C,0A``.\n"
+            "To maximize instead of minimizing a criterion in lexicographic ranking, provide a negative coefficient. "
+            "For example, ``B,-A`` will first minimize branching process log loss, then maximize the number of alleles. \n"
+            "A ranking strategy string containing a single ranking criterion identifier will be interpreted as a lexicographic ordering. "
+            "``gctree infer --verbose`` will describe the ranking strategy used. Examine this output to make sure it's as expected."
+        ),
+    )
     parser_infer.add_argument(
         "--use_old_mut_parsimony",
         action="store_true",
         help=(
+            "This argument is deprecated. Use the identifier 'M' with the "
+            "argument ``--ranking_strategy`` instead. "
             "Use old mutability parsimony instead of poisson context likelihood. Not recommended "
             "unless attempting to reproduce results from older versions of gctree. "
             "This argument will have no effect unless an S5F model is provided with the arguments "
-            "`--mutability` and `--substitution`."
+            "``--mutability`` and ``--substitution``."
+        ),
+    )
+    parser_infer.add_argument(
+        "--show_nucleotide_mutations",
+        action="store_true",
+        help=(
+            "If provided, branches in rendered tree will be annotated with nucleotide mutations."
         ),
     )
     parser_infer.add_argument(
         "--summarize_forest",
         action="store_true",
         help=(
-            "write a file `[outbase].forest_summary.log` with a summary of traits for trees in the forest."
+            "write a file ``[outbase].forest_summary.log`` with a summary of traits for trees in the forest."
         ),
     )
     parser_infer.add_argument(
         "--tree_stats",
         action="store_true",
         help=(
-            "write a file `[outbase].tree_stats.log` with stats for all trees in the forest. "
+            "write a file ``[outbase].tree_stats.log`` with stats for all trees in the forest. "
             "For large forests, this is slow and memory intensive."
         ),
     )
diff --git a/gctree/isotyping.py b/gctree/isotyping.py
index 15d25a1a..37d9193a 100644
--- a/gctree/isotyping.py
+++ b/gctree/isotyping.py
@@ -442,7 +442,7 @@ def edge_weight_func(n1: hdag.HistoryDagNode, n2: hdag.HistoryDagNode):
                 "edge_weight_func": edge_weight_func,
                 "accum_func": sum,
             },
-            name="Isotype Pars.",
+            name="IsotypeParsimony",
         ),
         min,
     )
diff --git a/gctree/mutation_model.py b/gctree/mutation_model.py
index 2a29ae38..6fdad98b 100644
--- a/gctree/mutation_model.py
+++ b/gctree/mutation_model.py
@@ -492,7 +492,7 @@ def distance(node1, node2):
                 "edge_weight_func": distance,
                 "accum_func": sum,
             },
-            name="Mut. Pars.",
+            name="MutabilityParsimony",
         ),
         min,
     )
@@ -624,11 +624,11 @@ def _context_poisson_likelihood_dagfuncs(*args, splits: List[int] = [], **kwargs
                 "edge_weight_func": lambda n1, n2: (
                     0
                     if n1.is_ua_node()
-                    else distance(n1.label.sequence, n2.label.sequence)
+                    else -distance(n1.label.sequence, n2.label.sequence)
                 ),
                 "accum_func": sum,
             },
-            name="LogContextLikelihood",
+            name="ContextLikelihoodLogLoss",
         ),
-        max,
+        min,
     )
diff --git a/tests/smalltest.sh b/tests/smalltest.sh
index 5bbc6565..9011f4a8 100755
--- a/tests/smalltest.sh
+++ b/tests/smalltest.sh
@@ -5,20 +5,40 @@ export QT_QPA_PLATFORM=offscreen
 export XDG_RUNTIME_DIR=/tmp/runtime-runner
 export MPLBACKEND=agg
 mkdir -p tests/smalltest_output
-wget -O HS5F_Mutability.csv https://bitbucket.org/kleinstein/shazam/raw/ba4b30fc6791e2cfd5712e9024803c53b136e664/data-raw/HS5F_Mutability.csv
-wget -O HS5F_Substitution.csv https://bitbucket.org/kleinstein/shazam/raw/ba4b30fc6791e2cfd5712e9024803c53b136e664/data-raw/HS5F_Substitution.csv
 
-gctree infer tests/small_outfile tests/abundances.csv --outbase tests/smalltest_output/gctree.infer  --root GL --frame 1 --verbose --idlabel
+mutabilities=HS5F_Mutability.csv
+substitutions=HS5F_Substitution.csv
 
-gctree infer tests/small_outfile tests/abundances.csv --outbase tests/smalltest_output/gctree.infer --root GL --frame 1 --verbose --idlabel --idmapfile tests/idmap.txt --isotype_mapfile tests/isotypemap.txt --mutability HS5F_Mutability.csv --substitution HS5F_Substitution.csv --ranking_coeffs 0 1 0 --use_old_mut_parsimony --branching_process_ranking_coeff 0 
+wget -O $mutabilities https://bitbucket.org/kleinstein/shazam/raw/ba4b30fc6791e2cfd5712e9024803c53b136e664/data-raw/HS5F_Mutability.csv
+wget -O $substitutions https://bitbucket.org/kleinstein/shazam/raw/ba4b30fc6791e2cfd5712e9024803c53b136e664/data-raw/HS5F_Substitution.csv
+
+# testing backward compatibility:
 
 gctree infer tests/small_outfile tests/abundances.csv --outbase tests/smalltest_output/gctree.infer --root GL --frame 1 --verbose --idlabel --idmapfile tests/idmap.txt --isotype_mapfile tests/isotypemap.txt --mutability HS5F_Mutability.csv --substitution HS5F_Substitution.csv --ranking_coeffs 1 1 0 --use_old_mut_parsimony --branching_process_ranking_coeff 0 
 
-gctree infer tests/small_outfile tests/abundances.csv --outbase tests/smalltest_output/gctree.infer --root GL --frame 1 --verbose --idlabel --idmapfile tests/idmap.txt --isotype_mapfile tests/isotypemap.txt --mutability HS5F_Mutability.csv --substitution HS5F_Substitution.csv --ranking_coeffs .01 -1 0 --branching_process_ranking_coeff -1 --summarize_forest --tree_stats
+gctree infer tests/smalltest_output/gctree.infer.inference.parsimony_forest.p --outbase tests/smalltest_output/gctree.infer --root GL --frame 1 --verbose --idlabel --idmapfile tests/idmap.txt --isotype_mapfile tests/isotypemap.txt --mutability HS5F_Mutability.csv --substitution HS5F_Substitution.csv --ranking_coeffs 1 -1 0 --use_old_mut_parsimony
+
+gctree infer tests/small_outfile tests/abundances.csv --outbase tests/smalltest_output/gctree.infer --root GL --frame 1 --verbose --idlabel --idmapfile tests/idmap.txt --isotype_mapfile tests/isotypemap.txt --mutability HS5F_Mutability.csv --substitution HS5F_Substitution.csv --use_old_mut_parsimony
+
+gctree infer tests/small_outfile tests/abundances.csv --outbase tests/smalltest_output/gctree.infer --root GL --frame 1 --verbose --idlabel --idmapfile tests/idmap.txt --isotype_mapfile tests/isotypemap.txt --mutability HS5F_Mutability.csv --substitution HS5F_Substitution.csv --ranking_coeffs -1 1 0 --use_old_mut_parsimony --branching_process_ranking_coeff 0 --ranking_strategy "M"
+
+# End testing backward compatibility ^^
+
+gctree infer tests/small_outfile tests/abundances.csv --outbase tests/smalltest_output/gctree.infer1  --root GL --frame 1 --verbose --idlabel
+
+gctree infer tests/small_outfile tests/abundances.csv --outbase tests/smalltest_output/gctree.infer2 --root GL --frame 1 --verbose --idlabel --idmapfile tests/idmap.txt --isotype_mapfile tests/isotypemap.txt --mutability $mutabilities --substitution $substitutions --ranking_strategy "M"
+
+gctree infer tests/small_outfile tests/abundances.csv --outbase tests/smalltest_output/gctree.infer3 --root GL --frame 1 --verbose --idlabel --idmapfile tests/idmap.txt --isotype_mapfile tests/isotypemap.txt --mutability $mutabilities --substitution $substitutions --ranking_strategy "I+M"
+
+gctree infer tests/small_outfile tests/abundances.csv --outbase tests/smalltest_output/gctree.infer4 --root GL --frame 1 --verbose --idlabel --idmapfile tests/idmap.txt --isotype_mapfile tests/isotypemap.txt --mutability $mutabilities --substitution $substitutions --ranking_strategy="-B+0.01I+C" --summarize_forest --tree_stats
+
+
+gctree infer tests/small_outfile tests/abundances.csv --outbase tests/smalltest_output/gctree.infer5 --root GL --frame 1 --verbose --idlabel --idmapfile tests/idmap.txt --isotype_mapfile tests/isotypemap.txt
 
+gctree infer tests/small_outfile tests/abundances.csv --outbase tests/smalltest_output/gctree.infer6 --root GL --frame 1 --verbose --idlabel --mutability $mutabilities --substitution $substitutions
 
-gctree infer tests/small_outfile tests/abundances.csv --outbase tests/smalltest_output/gctree.infer --root GL --frame 1 --verbose --idlabel --idmapfile tests/idmap.txt --isotype_mapfile tests/isotypemap.txt
+gctree infer tests/small_outfile tests/abundances.csv --outbase tests/smalltest_output/gctree.infer7 --root GL --frame 1 --verbose --idlabel --idmapfile tests/idmap.txt --isotype_mapfile tests/isotypemap.txt --mutability $mutabilities --substitution $substitutions
 
-gctree infer tests/small_outfile tests/abundances.csv --outbase tests/smalltest_output/gctree.infer --root GL --frame 1 --verbose --idlabel --mutability HS5F_Mutability.csv --substitution HS5F_Substitution.csv
+gctree infer tests/small_outfile tests/abundances.csv --outbase tests/smalltest_output/gctree.infer8 --root GL --frame 1 --verbose --idlabel --idmapfile tests/idmap.txt --isotype_mapfile tests/isotypemap.txt --mutability $mutabilities --substitution $substitutions --ranking_strategy "R,-1.0B"
 
-gctree infer tests/small_outfile tests/abundances.csv --outbase tests/smalltest_output/gctree.infer --root GL --frame 1 --verbose --idlabel --idmapfile tests/idmap.txt --isotype_mapfile tests/isotypemap.txt --mutability HS5F_Mutability.csv --substitution HS5F_Substitution.csv
+gctree infer tests/smalltest_output/gctree.infer.inference.parsimony_forest.p --outbase tests/smalltest_output/gctree.infer9 --root GL --frame 1 --verbose --idlabel --idmapfile tests/idmap.txt --isotype_mapfile tests/isotypemap.txt --mutability $mutabilities --substitution $substitutions --ranking_strategy "A, 0B, 0R" --tree_stats --summarize_forest