update to documentation
rob-p committed Aug 31, 2016
1 parent 454c7d6 commit bacb376
Showing 2 changed files with 61 additions and 0 deletions.
18 changes: 18 additions & 0 deletions doc/source/file_formats.rst
@@ -67,6 +67,24 @@ particularly important piece of information contained in this file is
the inferred library type. Most of the information recorded in this
file should be self-descriptive.

""""""""""""""""""""""""""""""
Observed library format counts
""""""""""""""""""""""""""""""

When Salmon is run in *mapping-based* mode, the quantification directory
will contain a file called ``lib_format_counts.json``. This JSON file
reports the number of fragments that had at least one mapping compatible
with the designated library format, as well as the number that didn't.
It also records the strand bias, which indicates how strand-specific
the computed mappings were.

Finally, this file contains, for each possible library type, a count of
the computed *mappings* that matched that type. These are counts of
*mappings*, not fragments, so a single fragment that maps to the
transcriptome in more than one way may contribute to multiple library
type counts. **Note**: This file is not currently generated when Salmon
is run in alignment-based mode.
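
As a rough illustration of how such a file might be inspected, the
following Python sketch summarizes a ``lib_format_counts.json``-style
document. The key names (``num_compatible_fragments``,
``num_incompatible_fragments``) and the idea that per-library-type
mapping counts sit at the top level under short library type codes are
assumptions made for this example, not a description of the actual
schema; check them against a file produced by your Salmon version.

```python
import json

# Hypothetical key names; the real file may use different ones.
COMPATIBLE_KEY = "num_compatible_fragments"
INCOMPATIBLE_KEY = "num_incompatible_fragments"


def summarize_lib_format_counts(counts, libtype_codes):
    """Return the compatible fraction and the per-library-type mapping counts."""
    compatible = counts.get(COMPATIBLE_KEY, 0)
    incompatible = counts.get(INCOMPATIBLE_KEY, 0)
    total = compatible + incompatible
    fraction = compatible / total if total else None
    # Mapping counts can sum to more than the number of fragments, because
    # one multi-mapping fragment may be counted under several library types.
    per_type = {code: counts[code] for code in libtype_codes if code in counts}
    return fraction, per_type


# Made-up example data, not real Salmon output.
example = json.loads(
    '{"num_compatible_fragments": 90, "num_incompatible_fragments": 10,'
    ' "ISF": 120, "ISR": 15}'
)
fraction, per_type = summarize_lib_format_counts(example, ["ISF", "ISR", "IU"])
print(fraction, per_type)
```

With a real file, the dictionary would come from ``json.load`` applied
to the opened ``lib_format_counts.json`` rather than from an inline
string.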


""""""""""""""""""""""""""""
Fragment length distribution
43 changes: 43 additions & 0 deletions doc/source/salmon.rst
@@ -478,6 +478,29 @@ done independently, but future versions of Salmon may provide a script to
generate this unmapped FASTA/Q file from the unmapped file and the original
inputs.


"""""""""""""""""""
``--writeMappings``
"""""""""""""""""""

Passing the ``--writeMappings`` argument to Salmon has an effect
only in mapping-based mode and *only when using a quasi-index*. When
executed with the ``--writeMappings`` argument, Salmon will write out
the mapping information that it then processes to quantify transcript
abundances. The mapping information will be written in a
SAM-compatible format. If no value is provided to this argument, the
output will be written to stdout (so that, e.g., it can be piped to
samtools and directly converted into BAM format). Alternatively, this
argument can be given a filename, and the mapping information will be
written to that file.

.. note:: Compatible mappings

   The mapping information is computed and written *before* library
   type compatibility checks take place. Thus, the mapping file will
   contain information about all mappings of the reads considered by
   Salmon, even those that may later be filtered out due to
   incompatibility with the library type.
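
The stdout-to-samtools usage described above can be sketched in Python
as follows. This is an illustrative sketch rather than a recommended
workflow: the helper names ``build_salmon_command`` and ``run_to_bam``
are hypothetical, the paths are placeholders, and attaching the
filename with ``=`` (``--writeMappings=FILE``) is an assumption about
how the optional value is passed.

```python
import subprocess


def build_salmon_command(index, libtype, mates1, mates2, out_dir, mappings_file=None):
    """Build a ``salmon quant`` argument list with ``--writeMappings`` enabled."""
    cmd = ["salmon", "quant", "-i", index, "-l", libtype,
           "-1", mates1, "-2", mates2, "-o", out_dir]
    if mappings_file is None:
        # No filename given: the mappings are written to stdout.
        cmd.append("--writeMappings")
    else:
        # Assumption: the filename is attached to the flag with "=".
        cmd.append("--writeMappings=" + mappings_file)
    return cmd


def run_to_bam(salmon_cmd, bam_path):
    """Pipe Salmon's SAM-compatible stdout into ``samtools view`` to get BAM."""
    salmon = subprocess.Popen(salmon_cmd, stdout=subprocess.PIPE)
    samtools = subprocess.Popen(
        ["samtools", "view", "-b", "-o", bam_path, "-"], stdin=salmon.stdout
    )
    # Close our copy of the pipe so Salmon sees a broken pipe if samtools exits.
    salmon.stdout.close()
    samtools_status = samtools.wait()
    salmon_status = salmon.wait()
    return salmon_status, samtools_status
```

For example, ``run_to_bam(build_salmon_command("index", "ISF",
"reads_1.fq", "reads_2.fq", "quant_out"), "mappings.bam")`` would
launch both tools, provided ``salmon`` and ``samtools`` are on the
``PATH``. Remember that the resulting file holds all mappings,
including those later rejected as incompatible with the library type.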

What's this ``LIBTYPE``?
------------------------
@@ -493,6 +516,26 @@ allow Salmon to infer the library type for you, you should still read
the section below, so that you can interpret how Salmon reports the
library type it discovers.

.. note:: Automatic library type detection in alignment-based mode

   The implementation of this feature involves opening the BAM
   file, peeking at the first record, and then closing it to
   determine if the library should be treated as single-end or
   paired-end. Thus, *in alignment-based mode*, automatic
   library type detection will not work with an input
   stream. If your input is a regular file, everything should
   work as expected; otherwise, you should provide the library
   type explicitly in alignment-based mode.

   Also, automatic library type detection is performed *on the
   basis of the alignments in the file*. Thus, for example, if the
   upstream aligner has been told to perform strand-aware mapping
   (i.e. to ignore potential alignments that don't map in the
   expected manner), but the actual library is unstranded,
   automatic library type detection cannot detect this. It will
   attempt to detect the library type that is most consistent *with
   the alignments that are provided*.
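
A wrapper script could guard against the stream limitation described
in the note above with a check along these lines. This is a minimal
sketch: the function name ``can_autodetect_libtype`` is hypothetical,
and treating ``-`` as standard input is a common command-line
convention assumed here, not something taken from Salmon itself.

```python
import os
import stat


def can_autodetect_libtype(alignment_path):
    """Return True if the path is a regular file that can be opened twice."""
    if alignment_path == "-":
        # Conventional name for standard input, i.e. a stream.
        return False
    try:
        mode = os.stat(alignment_path).st_mode
    except OSError:
        # Missing or inaccessible path: detection cannot proceed.
        return False
    # Pipes, FIFOs and sockets cannot be reopened after peeking at them.
    return stat.S_ISREG(mode)
```

When the function returns ``False``, the caller would pass an explicit
library type instead of requesting automatic detection.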

The library type string consists of three parts: the relative orientation of
the reads, the strandedness of the library, and the directionality of the
reads.
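
The three-part structure can be made concrete with a small parser.
The sketch below assumes the single-letter codes from the rest of this
section (``I``/``O``/``M`` for orientation, ``S``/``U`` for
strandedness, ``F``/``R`` for direction), that single-end libraries
omit the orientation letter, and that unstranded libraries carry no
direction letter. The function name ``parse_libtype`` is hypothetical.

```python
ORIENTATIONS = {"I": "inward", "O": "outward", "M": "matching"}
STRANDEDNESS = {"S": "stranded", "U": "unstranded"}
DIRECTIONS = {"F": "forward", "R": "reverse"}


def parse_libtype(libtype):
    """Split a library type string into (orientation, strandedness, direction)."""
    rest = libtype
    orientation = None
    # The orientation letter is optional (assumed absent for single-end reads).
    if rest and rest[0] in ORIENTATIONS:
        orientation = ORIENTATIONS[rest[0]]
        rest = rest[1:]
    if not rest or rest[0] not in STRANDEDNESS:
        raise ValueError("missing strandedness in library type: " + libtype)
    strandedness = STRANDEDNESS[rest[0]]
    rest = rest[1:]
    direction = None
    if strandedness == "stranded":
        # A stranded library needs exactly one trailing direction letter.
        if len(rest) != 1 or rest not in DIRECTIONS:
            raise ValueError("missing direction in library type: " + libtype)
        direction = DIRECTIONS[rest]
    elif rest:
        raise ValueError("unexpected trailing characters in: " + libtype)
    return orientation, strandedness, direction
```

Under these assumptions, ``parse_libtype("ISF")`` yields an inward,
stranded, forward library, while ``parse_libtype("U")`` describes an
unstranded single-end library.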