
Running

Will Pitchers edited this page Apr 15, 2019 · 7 revisions

SISTR

To perform Salmonella enterica serotyping we use the tool sistr, which combines several approaches to infer serotype from draft assemblies of whole-genome sequencing (WGS) data. For MDU work, we have validated the combination of antigen detection and cgMLST typing:

  • It uses BLAST to identify annotated O- and H-antigen sequences. To that end, it ships with curated multiFASTA files for the wzx and wzy (O antigen) and fliC and fljB (H antigen) genes.
  • It has a cgMLST scheme with 330 loci, and a database of 52,790 genomes (mostly subspecies I) that have been typed at these loci and annotated with a serotype. It uses BLAST to genotype the input assembly at as many of the 330 loci as possible, and then calculates the pairwise distance between the input isolate and each genome in the curated database.
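
The cgMLST step described above can be illustrated with a minimal Python sketch. This is not sistr's own code: the profile representation, the function names, and the choice of distance (the fraction of scheme loci at which the two profiles do not share a called allele) are assumptions made for illustration only.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical representation: locus name -> allele identifier (None if not called).
Profile = Dict[str, Optional[str]]


def cgmlst_distance(query: Profile, reference: Profile, loci: List[str]) -> float:
    """Return the fraction of scheme loci where the profiles do not share a called allele."""
    matching = sum(
        1
        for locus in loci
        if query.get(locus) is not None and query.get(locus) == reference.get(locus)
    )
    return 1.0 - matching / len(loci)


def closest_genome(
    query: Profile,
    database: Dict[str, Tuple[str, Profile]],
    loci: List[str],
) -> Tuple[str, str, float]:
    """Return (genome name, serovar, distance) for the nearest curated genome.

    `database` maps a genome name to (annotated serovar, allele profile);
    this layout is an assumption, not sistr's actual database format.
    """
    name, (serovar, profile) = min(
        database.items(),
        key=lambda item: cgmlst_distance(query, item[1][1], loci),
    )
    return name, serovar, cgmlst_distance(query, profile, loci)
```

Under these assumptions, the serovar annotated on the nearest genome is what a cgMLST-based call would report, and loci that could not be genotyped in the input simply count as non-matching.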

Using the CLI

At a minimum, salmonella_typing can be run with salmonella_typing run mdu_samples.csv.

Further arguments can be passed to specify, for example, the output format.

Welcome to MDU SALMONELLA TYPING WORKFLOW
Running sistr version 1.0.2
All DBs updated on 2018-09-21
stype_cli 0.1.0

Usage:
  command [options] [arguments]

Options:
  -h, --help                      Display this help message
  -q, --quiet                     Do not output any message
  -V, --version                   Display this application version
      --ansi                      Force ANSI output
      --no-ansi                   Disable ANSI output
  -n, --no-interaction            Do not ask any interactive question
  -v|vv|vvv, --verbose[=VERBOSE]  Increase the verbosity of messages: 1 for normal output, 2 for more verbose output and 3 for debug

Available commands:
  clean   Clean up the output from running SISTR workflow
  help    Displays help for a command
  list    Lists commands
  parse   Parse the output from SISTR implementing the MMS136 rules. By default do not output the LIMS Excel.
  run     Run SISTR typing following MMS136
  unlock  If workflow gets interrupted for some reason, unlock it.

The output of SISTR is a spreadsheet comprising three tabs (MMS136, REVIEW, ALL) with the following business logic applied:

MMS136 (final results for acceptance and authorisation):

  • All serovar calls must match (serovar, serovar_antigen and serovar_cgmlst)
  • Must not be Edge case Dublin
  • Must not be Edge case Enteritidis
  • Must not be Edge case monophasic Typhimurium
  • Must not be Edge case Sophia
  • Inferences for all antigens must be present
  • Must be sub-species enterica
  • Must have ≥300 cgMLST matching alleles

REVIEW (results for acceptance but review by Enterics before reporting):

  • All serovar calls that do not match (serovar, serovar_antigen and serovar_cgmlst)
  • Edge case Dublin
  • Edge case Enteritidis
  • Edge case monophasic Typhimurium
  • Edge case Sophia
  • Inferences for where antigens are missing
  • Sub-species other than enterica
  • Samples with >100 but <300 matching cgMLST alleles
  • Samples with <100 matching cgMLST alleles and a defined serogroup
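
The MMS136 and REVIEW rules above can be sketched as a small Python function. This is an illustrative sketch, not the workflow's implementation: the `SistrResult` class, its field names, the `edge_case` flag, and the treatment of "-" as a missing antigen are all assumptions. It also assumes that each REVIEW bullet is an independent trigger, and it returns `None` for samples that neither list covers, since the rules above do not say what happens to them.

```python
from dataclasses import dataclass
from typing import Optional

# Values treated as "no inference" for an antigen or serogroup (an assumption).
MISSING = (None, "", "-")


@dataclass
class SistrResult:
    """Hypothetical per-sample record; field names are illustrative."""
    serovar: str
    serovar_antigen: str
    serovar_cgmlst: str
    o_antigen: Optional[str]
    h1: Optional[str]
    h2: Optional[str]
    serogroup: Optional[str]
    subspecies: str
    matching_alleles: int
    edge_case: Optional[str] = None  # e.g. "Dublin", "Sophia"; None if not an edge case


def assign_tab(r: SistrResult) -> Optional[str]:
    """Return "MMS136", "REVIEW", or None if no listed rule applies."""
    calls_match = r.serovar == r.serovar_antigen == r.serovar_cgmlst
    antigens_present = all(a not in MISSING for a in (r.o_antigen, r.h1, r.h2))
    is_enterica = r.subspecies == "enterica"

    # MMS136: every acceptance condition must hold.
    if (calls_match and r.edge_case is None and antigens_present
            and is_enterica and r.matching_alleles >= 300):
        return "MMS136"

    # REVIEW: any single condition is enough (assumed reading of the list).
    review_triggers = (
        not calls_match,
        r.edge_case is not None,
        not antigens_present,
        not is_enterica,
        100 < r.matching_alleles < 300,
        r.matching_alleles < 100 and r.serogroup not in MISSING,
    )
    return "REVIEW" if any(review_triggers) else None
```

Note that, read literally, the thresholds leave a sample with exactly 100 matching alleles outside both lists; the sketch preserves that gap rather than guessing at the intended behaviour.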