Releases: justincbagley/piranha

PIrANHA version 0.4-alpha-4

18 Dec 07:04
Pre-release

Scripts for file processing and analysis in phylogenomics & phylogeography

This is PIrANHA v0.4a4, a software package that provides a number of utility functions and pipelines for file processing and analysis steps in the (phylo*=) fields of phylogenomics and phylogeography (including population genomics). PIrANHA is fully command line-based and contains a series of functions for automating tasks during evolutionary analyses of genetic data.

PIrANHA v0.4a4 (=v0.4-alpha-4) is a new development pre-release, or 'minor version', that is greatly improved and ready for alpha testing! Development is ongoing and we need alpha testers, so feel free to download this release and try it out. Email all suggestions, feature requests, and bug fix requests to Justin directly at jbagley (at) jsu (dot) edu (also see the Contact page of the wiki).

See full description in the PIrANHA README, Quick Guide, and wiki pages.

What's new?

v0.4a4 (v0.4-alpha-4)

This update builds on the previous development release, v0.4a3, by adding minor bug fixes, major bug fixes, new features and improvements, and new functions.

Phylogenomics

In this release, I have worked to further flesh out contributions of PIrANHA to phylogenomics workflows for analyzing targeted sequence capture data (e.g. from Hyb-Seq) by adding the new function assembleReads, a script that automates de novo assembly of cleaned sequence reads (short reads in FASTQ format) from targeted capture HTS experiments using the ABySS assembler. This is a companion script designed to be run before phaseAlleles and alignAlleles. The overall workflow now assembles HTS read data, and phases and aligns consensus sequences based on reads (re)mapped to a reference assembly FASTA file (i.e. following reference-based assembly). This combination of programs was designed to be run 1) in a custom target capture workflow (“Workflow 1” below) or 2) after first conducting cleaning, assembly, locus selection, and reference-based assembly in the SECAPR sequence capture pipeline (Andermann et al. 2018; “Workflow 2” below, tested using output from SECAPR as input for PIrANHA).

There are two recommended workflows:

Workflow 1 (Recommended, most stable):

  1. Cleaning reads using fastp (or similar software).
  2. Read assembly using assembleReads, followed by sequence phasing (phaseAlleles) and alignment of allelic sequences (alignAlleles) in PIrANHA.
  3. Post-processing and phylogenetic inference.

Workflow 2:

  1. Read cleaning, assembly, locus selection, and reference-based assembly in SECAPR (Andermann et al. 2018).
  2. Sequence phasing (phaseAlleles) and alignment of allelic sequences (alignAlleles) in PIrANHA.
  3. Post-processing and phylogenetic inference.

New features

  • Tab completion. The most important new feature added in this release is dynamic tab completion of function names after piranha -f (e.g. piranha -f <TAB>). See the GitHub repository README for a cool demonstration of this feature!!
  • Simplified Homebrew install (updated formula)
  • New single install_piranha installer script replaces previous system using two separate installer scripts.
  • New handling of large alignment files keeps dropRandomHap function from dying, while still reducing alignments to one phased allele per sample.
  • Added -t option for specifying number of threads when running batchRunFolders.

Bug fixes

  • Fixed version printing for piranha main script and functions: piranha -V, piranha --version, piranha -f <function> -V, and piranha -f <function> --version each now yield the expected behavior (terse output).
  • Bug fixes for bad piping or other minor errors in batchRunFolders, FASTAsummary, and splitFile functions.
  • Bug fixes and updates for assembleReads and phaseAlleles functions of piranha, fixing errors that caused the program to stop due to issues with, among other things, ls.
  • Bug fix for PHYLIP2NEXUS, where a regex test for hexadecimal characters (if produced) in the resulting output NEXUS files was failing. The problem was solved with a POSIX-based solution.
  • Bug fixes for FASTA2PHYLIP function, which in aggregate now completely fix previous issues with the single-FASTA, -f 1 option.
  • Updated trimSeqs function to improve performance after issue discussion with Juan Moreira. This update fixed a POSIX character-class bug: [:space:] should have been [[:space:]].
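
The [:space:] versus [[:space:]] bug in the last item is easy to reproduce. The sketch below is only an analogy in Python, not the trimSeqs code: Python's re module has no POSIX character classes, but it parses a bare [:space:] the same way POSIX tools do, as an ordinary set of the characters ':', 's', 'p', 'a', 'c', and 'e'. The function names and the sample line are hypothetical.

```python
import re

def strip_whitespace_buggy(text: str) -> str:
    # A bare "[:space:]" is just a set of the literal characters
    # ':', 's', 'p', 'a', 'c', 'e', so letters are deleted and spaces are kept.
    return re.sub(r"[:space:]", "", text)

def strip_whitespace_fixed(text: str) -> str:
    # POSIX tools need the class inside a bracket expression: [[:space:]].
    # Python's re has no POSIX classes; \s is its closest counterpart.
    return re.sub(r"\s", "", text)

line = "sample1 acgt"
print(strip_whitespace_buggy(line))  # letters removed, space still present
print(strip_whitespace_fixed(line))  # only whitespace removed
```

In a sequence file this kind of mistake silently strips lowercase a and c bases rather than whitespace.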

PIrANHA version 0.4-alpha-3

31 Jul 22:29
b15e1aa
Pre-release

Scripts for file processing and analysis in phylogenomics & phylogeography

This is PIrANHA v0.4a3, a software package that provides a number of utility functions and pipelines for file processing and analysis in phylogenetics, phylogenomics, and phylogeography. PIrANHA stands for "PhylogenetIcs ANd pHylogeogrAphy," and v0.4a3 (=v0.4-alpha-3) is a new development pre-release, or 'minor version', that is greatly improved and ready for alpha testing! Feel free to download this release and try it out; however, PIrANHA is still under active development. Email all suggestions and bug fix requests to Justin directly at bagleyj (at) umsl (dot) edu (also see the Contact page of the wiki).

See full description in the PIrANHA README, Quick Guide, and wiki pages.

What's new?

v0.4a3 (v0.4-alpha-3)

This update builds on the previous pre-release, v0.4a2, by adding minor bug fixes and improvements to several functions. With the addition of the new function alignAlleles, a companion script meant to be run directly after phaseAlleles, this release establishes a new workflow for phasing and aligning consensus sequences from HTS (e.g. targeted sequence capture data) based on reads (re)mapped to a reference assembly FASTA file (i.e. following reference-based assembly). This combination of programs was designed to be run on target capture data after first conducting cleaning, assembly, locus selection, and reference-based assembly (specifically created with SECAPR (Andermann et al. 2018) in mind, and tested using output from SECAPR).

Additionally, this update introduces several other new functions. These include a new trimSeqs function for trimming DNA sequences in PHYLIP alignments, with custom gap handling options in trimAl, and outputting trimmed results files in FASTA, PHYLIP, or NEXUS formats. There is a geneCounter function that counts and summarizes the number of gene copies per tip taxon label in a set of input gene trees in Newick format, given a taxon-species assignment file (this function was written to handle output from the HybPiper pipeline; see Usage text). And I've also added a new batchRunFolders function that automates splitting a set of input files into different batches (to be run in parallel on a remote supercomputing cluster, or a local machine), starting from a file type or a list of input files; specifically, this function allows you to prep batch analyses in MAFFT, RAxML, and IQ-TREE.
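
To illustrate the kind of tally geneCounter is described as producing, here is a minimal Python sketch. It is not the geneCounter implementation: the function names, the example tree, and the in-memory assignment dictionary (standing in for the taxon-species assignment file) are all hypothetical, and it assumes unquoted tip labels.

```python
import re
from collections import Counter

def tip_labels(newick: str) -> list[str]:
    # Tip labels directly follow "(" or ","; internal node labels and
    # support values follow ")" and are therefore skipped.
    return re.findall(r"[(,]\s*([^():,;\s]+)", newick)

def count_copies(newick: str, tip_to_species: dict[str, str]) -> Counter:
    # Tally how many tips (gene copies) in one gene tree map to each species.
    return Counter(tip_to_species[tip] for tip in tip_labels(newick))

# Hypothetical gene tree and assignments, for illustration only.
tree = "((a1:0.1,a2:0.2):0.3,b1:0.4);"
assignments = {"a1": "spA", "a2": "spA", "b1": "spB"}
print(count_copies(tree, assignments))
```

Running the same count over every gene tree in a set would give the per-species copy numbers that a summary table could be built from.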

PIrANHA version 0.4-alpha-2

17 Apr 19:14
Pre-release

Scripts for file processing and analysis in phylogenomics & phylogeography

This is PIrANHA v0.4a2, a software package that provides a number of utility functions and pipelines for file processing and analysis in phylogenetics, phylogenomics, and phylogeography. PIrANHA stands for "PhylogenetIcs ANd pHylogeogrAphy," and v0.4a2 (=v0.4-alpha-2) is a new development pre-release, or 'minor version', that is greatly improved and ready for alpha testing! Feel free to download this release and try it out; however, PIrANHA is still under active development. Email all suggestions and bug fix requests to Justin directly at bagleyj (at) umsl (dot) edu (also see the Contact page of the wiki).

See full description in the PIrANHA README, Quick Guide, and wiki pages.

What's new?

v0.4a2 (v0.4-alpha-2)

This update builds on the previous pre-release, v0.4a, by updating the main piranha script (including improvements to messaging, function list, and help text); adding a new phaseAlleles function that automates phasing of consensus sequences from HTS (e.g. targeted sequence capture) based on reads (re)mapped to a reference assembly FASTA file; and making minor updates to all functions (improved messaging, minor bug fixes, and minor reformatting).

PIrANHA version 0.4-alpha

13 Apr 16:46
ce6fadb
Pre-release

Scripts for file processing and analysis in phylogenomics & phylogeography

This is PIrANHA v0.4a, a software package that provides a number of utility functions and pipelines for file processing and analysis in standard phylogenetics, phylogenomics, and phylogeography. PIrANHA stands for "PhylogenetIcs ANd pHylogeogrAphy," and v0.4a (=v0.4-alpha) is a new development pre-release, or 'minor version', that is greatly improved and ready for alpha testing! Please feel free to download this release and try it out, as most of the scripts have been verified; however, realize that PIrANHA is still under active development. Please email all suggestions and bug fix requests to Justin directly at bagleyj (at) umsl (dot) edu (also see the Contact page of the wiki).

See full description in the PIrANHA README and wiki pages.

What's new?

v0.4a (v0.4-alpha)

This update builds on the previous pre-release, v0.3-alpha.2, by updating the main piranha script, including improvements to how arguments are passed and fixes to debug mode; updating all functions (improved syntax, bug fixes, and help texts); and adding new functions. The most recent changes include:

PIrANHA v0.4a (official minor pre-release version 0.4-alpha) - April 13, 2020

  • April 12, 2020: Various minor updates to piranha bin/ functions, and important update to options in main piranha script now allows arguments to be passed to the program directly after the function call (after -f flag), without -a|--args flag. This fixes a problem where the previous implementation's reliance on --args='<args>' format (arguments passed in quotes) meant that Bash completion would not work while writing out the arguments.
  • April 6-7, 2020: Major piranha package update, including edits to main script, all functions, dir structure, and other files (e.g. test files). Bug fixes for errors when no arguments and failed rm calls, check and update debug code, plus updates to READMEs and help texts.
  • April 2-3, 2020: Multiple updates. Added new FASTAsummary function that automates summarizing characteristics of one or multiple FASTA files in current working directory. I also modified calcAlignmentPIS to integrate with this new function, and both functions now work well when run separately or together (the function to calculate PIS is now called within FASTAsummary). Also updated PHYLIPsummary function. Also added new splitFASTA function that splits each tip taxon (individual sequence) in a FASTA file into a separate FASTA file. This set of updates also includes a new piranha script with updated -f list function accommodating new functions, and with an attempt at adding debugging code (this needs additional testing and fixing; how best to implement debugging is still an open question).
  • March 30, 2020: Multiple updates. Added new nQuireRunner function that automates running nQuire to estimate ploidy levels for samples based on mapped NGS reads (BAM files); updated FASTA2PHYLIP function to have new options (-f and -i) allowing analysis of a single input FASTA or multiple FASTAs (prev. only did multiple FASTAs in cwd); updated MAGNET with minor fixes to v1.1.1 (updated versioning in README as well); and updated piranha function to have complete list function output. Also added test FASTA file 'test.fasta' to test/ subfolder of repository containing test input files.
  • December 12, 2019: Added new BEAST_logThinner function script that downsizes, or 'thins', BEAST2 .log files to every nth line. Tested and working interactively. Outputs new log file in current working directory, without replacement.
  • October 23, 2019: Added new PHYLIPsummary function script that summarizes no. taxa and no. characters for one or multiple PHYLIP DNA sequence alignments in current directory.
  • October 22, 2019: Made minor edits (e.g. fixing versioning) and bug fixes (fixing sed code that caused failures when user had GNU SED installed instead of BSD SED) to all of the following function scripts: PhyloMapperNullProc, PHYLIPsubsampler, PHYLIPcleaner, PHYLIP2PFSubsets, MLEResultsProc, getBootTrees, fastSTRUCTURE, dropRandomHap, dadiUncertainty, dadiRunner, dadiPostProc, calcAlignmentPIS, BEASTRunner, BEAST_PSPrepper, RAxMLRunChecker, RAxMLRunner, SNAPPRunner, SpeciesIdentifier, AnouraNEXUSPrepper, concatenateSeqs, concatSeqsPartitions, FASTA2VCF, getTaxonNames, makePartitions, MrBayesPostProc, phyNcharSumm, pyRAD2PartitionFinder, pyRADLocusVarSites, renameForStarBeast2, renameTaxa, renameTaxa_v1, splitPHYLIP, taxonCompFilter, treeThinner, vcfSubsampler, completeSeqs, RYcoder, RogueNaRokRunner, PHYLIP2NEXUS, PHYLIP2Mega, NEXUS2PHYLIP, NEXUS2MultiPHYLIP, Mega2PHYLIP, BEASTReset, FASTA2PHYLIP, completeConcatSeqs
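
The BEAST_logThinner entry above describes thinning a BEAST2 .log file to every nth line. A minimal Python sketch of that idea follows. It is not the BEAST_logThinner script: the function name is hypothetical, and it assumes the log starts with optional '#' comment lines and a single column-header row, both kept in full, with only the sample rows thinned.

```python
def thin_log(lines, nth):
    """Keep comment lines, the header row, and every nth sample row."""
    if nth < 1:
        raise ValueError("nth must be a positive integer")
    kept = []
    header_seen = False
    sample_index = 0
    for line in lines:
        if line.startswith("#"):
            kept.append(line)            # assumed comment line, always kept
        elif not header_seen:
            kept.append(line)            # first non-comment line: column header
            header_seen = True
        else:
            if sample_index % nth == 0:  # keep samples 0, nth, 2*nth, ...
                kept.append(line)
            sample_index += 1
    return kept

# Hypothetical log contents, for illustration only.
log = ["# comment", "Sample\tposterior",
       "0\t-10", "1000\t-9", "2000\t-8", "3000\t-7", "4000\t-6"]
for row in thin_log(log, 2):
    print(row)
```

Writing the kept rows to a new file, rather than overwriting the input, matches the "without replacement" behavior noted in that entry.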

PIrANHA version 0.3-alpha.2

26 Jul 19:17
2961389
Pre-release

Scripts for file processing and analysis in phylogenomics & phylogeography

This is PIrANHA v0.3a2, a software package that provides a number of utility functions and pipelines for file processing and analysis in phylogenetics and phylogeography. PIrANHA stands for "PhylogenetIcs ANd pHylogeogrAphy," and v0.3a2 (=v0.3-alpha.2) is a new development pre-release, or 'minor version' that is greatly improved and ready for alpha testing! Please feel free to download this release and try it out, as most of the scripts have been verified; however, realize that PIrANHA is still under active development. Please email all suggestions and bug fix requests to Justin directly at bagleyj (at) umsl (dot) edu (also see Contact page of the wiki).

See full description in the PIrANHA README and wiki pages.

What's new?

v0.3a2 (v0.3-alpha.2)

This update builds on the previous pre-release, v0.3-alpha.1, by adding new functions, rewriting others, and adding updates and bug fixes. The most recent changes include:

PIrANHA v0.3a2 (official minor pre-release version 0.3-alpha.2) - July 26, 2019

  • July 26, 2019: Updated README, repository files, and wiki files for new release.
  • July 25, 2019: Added new RogueNaRokRunner function that reads in a Newick-formatted tree file and runs it through RogueNaRok to identify rogue taxa. Additionally, I conducted a complete rewrite of the NEXUS2PHYLIP function that removes its dependence on N. Takebayashi's Perl script (see previous version, Acknowledgements), and I made minor edits to piranha and edits and bug fixes for other functions including RYcoder.
  • July 24, 2019: Minor updates and bug fixes for PHYLIP2NEXUS function.
  • July 11, 2019: Minor updates and fixes for PHYLIP2Mega function.
  • June 11, 2019: Added new RYcoder function that reads in a PHYLIP or NEXUS DNA sequence alignment and converts it into 'RY'-coded, binary format, with purines (A, G) coded as 0's and pyrimidines (C, T) coded as 1's.
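
The RY-coding rule in the last item (purines A and G become 0, pyrimidines C and T become 1) can be sketched in a few lines of Python. This is an illustration of the coding scheme only, not the RYcoder script; the function name is hypothetical, and leaving gaps and ambiguity codes unchanged is an assumption.

```python
# Translation table: purines (A, G) -> "0", pyrimidines (C, T) -> "1".
RY_TABLE = str.maketrans({
    "A": "0", "G": "0", "a": "0", "g": "0",
    "C": "1", "T": "1", "c": "1", "t": "1",
})

def ry_code(sequence: str) -> str:
    # Characters outside A/C/G/T (gaps, N, other ambiguity codes) are
    # left as they are; that choice is an assumption of this sketch.
    return sequence.translate(RY_TABLE)

print(ry_code("AGCT-N"))
```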

PIrANHA version 0.3-alpha.1

07 May 17:33
1aa7af1
Pre-release

Scripts for file processing and analysis in phylogenomics & phylogeography

This is PIrANHA v0.3a1, a software package that provides a number of utility functions and pipelines for file processing and analysis in phylogenetics and phylogeography. PIrANHA stands for "PhylogenetIcs ANd pHylogeogrAphy," and v0.3a1 (=v0.3-alpha.1) is a new development pre-release, or 'minor version' that is greatly improved but not yet ready for production use.

PLEASE DO NOT DOWNLOAD THIS RELEASE. IT IS UNDER ACTIVE DEVELOPMENT

PIrANHA tools include interactive/non-interactive functions and wrapper scripts focusing on (1) analyses of DNA sequence data and SNPs or RAD loci generated from massively parallel sequencing runs on reduced representation genomic libraries, e.g. from ddRAD-seq (Peterson et al. 2012) or target capture, and (2) automating running these software programs on the user's personal machine (e.g. MAGNET pipeline and pyRAD2PartitionFinder scripts) or a remote supercomputer, and then conducting post-processing of the results. See full description in the README and wiki pages.

What's new?

v0.3a1 (v0.3-alpha.1)

This minor update builds on the previous pre-release, v0.2-alpha.2, by making a variety of changes towards finalizing function rewrites and getting most or all functions working. The most recent changes include:

  • May 7, 2019: Fixed main piranha function so that it correctly reads in all arguments passed with the --args='' flag (should also work with -a), which previously caused several functions to fail and invoke trapExit.
  • April 30 – May 7, 2019: Added bug fixes and updates to dropRandomHap, PHYLIP2NEXUS, PHYLIP2FASTA, PHYLIP2Mega, and splitPHYLIP functions.
  • April 10, 2019: Added new renameTaxa function that renames taxon (sample) names in genetic data files of type FASTA, NEXUS, PHYLIP, and VCF according to user specifications.
  • April 9, 2019: Added updated scripts to fix bugs in FASTA2PHYLIP and getTaxonNames functions.

PIrANHA version 0.2-alpha.2

09 Apr 13:24
Pre-release

Scripts for file processing and analysis in phylogenomics & phylogeography

This is PIrANHA v0.2-alpha.2, a software package that provides a number of utility scripts and pipelines for file processing and analysis in phylogenetics and phylogeography. PIrANHA stands for "PhylogenetIcs ANd pHylogeogrAphy," and v0.2-alpha.2 is a new development pre-release, or 'minor version' that is greatly improved but not yet ready for production use.

PLEASE DO NOT DOWNLOAD THIS RELEASE. IT IS UNDER ACTIVE DEVELOPMENT

PIrANHA tools include interactive/non-interactive shell scripts and wrapper scripts focusing on (1) analyses of DNA sequence data and SNPs or RAD loci generated from massively parallel sequencing runs on reduced representation genomic libraries, e.g. from ddRAD-seq (Peterson et al. 2012) or target capture, and (2) automating running these software programs on the user's personal machine (e.g. MAGNET pipeline and pyRAD2PartitionFinder scripts) or a remote supercomputer, and then conducting post-processing of the results. See full description in the README and wiki pages.

What's new?

v0.2-alpha.2

This is a minor update to the pre-release version that adds a new FASTA2VCF function that converts a sequential FASTA multiple sequence alignment into a variant call format (VCF) file, and allows subsampling 1 SNP per partition/locus. This update also includes edits to the README, index.html, changeLog.md, and travis.yml files. Importantly, I have also now created a successful Homebrew tap for PIrANHA with a formula that is working with v0.2-alpha, and that is now described in the documentation wiki.
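
As a rough illustration of subsampling 1 SNP per locus from an alignment, the Python sketch below finds variable columns among aligned sequences and picks one of them. It is not the FASTA2VCF code: the function names are hypothetical, only unambiguous A/C/G/T bases are compared, and choosing the SNP at random is an assumption, since these notes do not say how the SNP is selected.

```python
import random

def variable_sites(seqs):
    """Return 0-based column indices with more than one unambiguous base."""
    sites = []
    for i, column in enumerate(zip(*seqs)):
        bases = {base.upper() for base in column} & set("ACGT")
        if len(bases) > 1:
            sites.append(i)
    return sites

def pick_one_snp(seqs, rng):
    # Assumption: one variable site per locus is drawn at random.
    sites = variable_sites(seqs)
    return rng.choice(sites) if sites else None

# Hypothetical aligned sequences for a single locus.
locus = ["ACGTAC", "ACGTTC", "ATGTAC"]
print(variable_sites(locus))
print(pick_one_snp(locus, random.Random(0)))
```

Passing a seeded random.Random instance keeps the subsample reproducible between runs.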

PIrANHA version 0.2-alpha.1c

15 Mar 20:01
1f61d3f
Pre-release

Scripts for file processing and analysis in phylogenomics & phylogeography

This is PIrANHA v0.2-alpha.1c, a software package that provides a number of utility scripts and pipelines for file processing and analysis in phylogenetics and phylogeography. PIrANHA stands for "PhylogenetIcs ANd pHylogeogrAphy," and v0.2-alpha.1c is a new development pre-release, or 'minor version' that is greatly improved but not yet ready for production use.

PLEASE DO NOT DOWNLOAD THIS RELEASE. IT IS UNDER ACTIVE DEVELOPMENT

PIrANHA tools include interactive/non-interactive shell scripts and wrapper scripts focusing on 1) analyses of DNA sequence data and SNPs or RAD loci generated from massively parallel sequencing runs on reduced representation genomic libraries (e.g. from ddRAD-seq, Peterson et al. 2012), and 2) automating running these software programs on the user's personal machine (e.g. MAGNET pipeline and pyRAD2PartitionFinder scripts) or a remote supercomputer machine, and then conducting post-processing of the results. See full description in the README.

What's new?

v0.2-alpha.1c

This is a minor update to the pre-release version that includes edits to the README and index.html files, and that adds this slightly updated changeLog.md file back into the repository. Other changes include removing bin/trash function due to conflicts with /usr/local/bin/trash symlink belonging to trash on macOS, which caused homebrew install to fail. After fixing this, I have also now created a successful homebrew tap for PIrANHA that is working with this release (more info soon, to be added to the README).

v0.2-alpha.1b

This is a very minor update to the pre-release version removing some PHYLIP and FASTA DNA sequence alignments that I had previously included in the repo for my own testing purposes, and updating README and index.html files.

v0.2-alpha.1

Since v0.2-alpha, the pre-release version of PIrANHA v0.2-alpha.1 added several updates including redos for the PIrANHA etc/ dir, a README for bin/, and new scripts for the MLEResultsProc, getTaxonNames, taxonCompFilter, and SNAPPRunner functions.

v0.2-alpha

Pre-release version PIrANHA v0.2-alpha.1 involved a virtually complete rewrite and reorganization of PIrANHA (with >1,200 additions and >400 deletions). All scripts were converted to 'function' programs in bin/ or bin/MAGNET-1.0.0/ of the repo, and I wrote a new program, piranha, that is now the main program and runs all functions. (See more on v0.2 changes in v0.2 and v0.2-alpha release notes below.)

PIrANHA version 0.2-alpha.1b

15 Mar 18:26
bc8e176
Pre-release

Scripts for file processing and analysis in phylogenomics & phylogeography

This is PIrANHA v0.2-alpha.1b, a software package that provides a number of utility scripts and pipelines for file processing and analysis in phylogenetics and phylogeography. PIrANHA stands for "PhylogenetIcs ANd pHylogeogrAphy," and v0.2-alpha.1b is a new development pre-release, or 'minor version' that is greatly improved but not yet ready for production use.

PLEASE DO NOT DOWNLOAD THIS RELEASE. IT IS UNDER ACTIVE DEVELOPMENT

PIrANHA tools include interactive/non-interactive shell scripts and wrapper scripts focusing on 1) analyses of DNA sequence data and SNPs or RAD loci generated from massively parallel sequencing runs on reduced representation genomic libraries (e.g. from ddRAD-seq, Peterson et al. 2012), and 2) automating running these software programs on the user's personal machine (e.g. MAGNET pipeline and pyRAD2PartitionFinder scripts) or a remote supercomputer machine, and then conducting post-processing of the results. See full description in the README.

What's new?

v0.2-alpha.1b

This is a very minor update to the pre-release version removing some PHYLIP and FASTA DNA sequence alignments that I had previously included in the repo for my own testing purposes, and updating README and index.html files.

v0.2-alpha.1

Since v0.2-alpha, the pre-release version of PIrANHA v0.2-alpha.1 added several updates including redos for the PIrANHA etc/ dir, a README for bin/, and new scripts for the MLEResultsProc, getTaxonNames, taxonCompFilter, and SNAPPRunner functions.

v0.2-alpha

Pre-release version PIrANHA v0.2-alpha.1 involved a virtually complete rewrite and reorganization of PIrANHA (with >1,200 additions and >400 deletions). All scripts were converted to 'function' programs in bin/ or bin/MAGNET-1.0.0/ of the repo, and I wrote a new program, piranha, that is now the main program and runs all functions. (See more on v0.2 changes in v0.2 and v0.2-alpha release notes below.)

PIrANHA version 0.2-alpha.1

15 Mar 17:23
7fa0c1f
Pre-release

Scripts for file processing and analysis in phylogenomics & phylogeography

This is PIrANHA v0.2-alpha.1, a software package that provides a number of utility scripts and pipelines for file processing and analysis in phylogenetics and phylogeography. PIrANHA stands for "PhylogenetIcs ANd pHylogeogrAphy," and v0.2-alpha.1 is a new development pre-release, or 'minor version' that is greatly improved but not yet ready for production use.

PLEASE DO NOT DOWNLOAD THIS RELEASE. IT IS UNDER ACTIVE DEVELOPMENT

PIrANHA tools include interactive/non-interactive shell scripts and wrapper scripts focusing on 1) analyses of DNA sequence data and SNPs or RAD loci generated from massively parallel sequencing runs on reduced representation genomic libraries (e.g. from ddRAD-seq, Peterson et al. 2012), and 2) automating running these software programs on the user's personal machine (e.g. MAGNET pipeline and pyRAD2PartitionFinder scripts) or a remote supercomputer machine, and then conducting post-processing of the results. See full description in the README.

What's new?

v0.2-alpha.1

Since v0.2-alpha, the current pre-release version, PIrANHA v0.2-alpha.1 adds several updates including redos for the PIrANHA etc/ dir, a README for bin/, and new scripts for the MLEResultsProc, getTaxonNames, taxonCompFilter, and SNAPPRunner functions.

v0.2-alpha

Pre-release version, PIrANHA v0.2-alpha.1, involved a virtually complete rewrite and reorganization of PIrANHA (with >1,200 additions and >400 deletions). All scripts were converted to 'function' programs in bin/ or bin/MAGNET-1.0.0/ of the repo, and I wrote a new program, piranha, that is now the main program and runs all functions. I am still in the process of updating the README and all function scripts, but I decided to do a pre-release ratcheted up to v0.2 due to the great improvements in modularization and efficiency that v0.2 update allows (selecting a function and passing all arguments, all from piranha), and because I wanted a new alpha release to use as a starting point to create Debian and Homebrew distribution releases (i.e. brew tap(s) to update as new versions roll out during development). The current organization of PIrANHA is also going to be much better suited for general use, and for adding other collaborators or developers.

The changeLog.md is not yet up to date (not even for v0.2-alpha.1) and the repository is close but still not ready for a v1.0 major release, but we're getting there!!