Releases: justincbagley/piranha
PIrANHA version 0.4-alpha-4
Scripts for file processing and analysis in phylogenomics & phylogeography
This is PIrANHA v0.4a4, a software package that provides a number of utility functions and pipelines for file processing and analysis steps in the (phylo*=) fields of phylogenomics and phylogeography (including population genomics). PIrANHA is fully command line-based and contains a series of functions for automating tasks during evolutionary analyses of genetic data.
PIrANHA v0.4a4 (=v0.4-alpha-4) is a new development pre-release, or 'minor version', that is greatly improved and ready for alpha testing! Development is ongoing and we need alpha testers, so feel free to download this release and try it out. Email all suggestions, feature requests, and bug fix requests to Justin directly at jbagley (at) jsu (dot) edu (also see the Contact page of the wiki), or submit them using the forms here.
See full description in the PIrANHA README, Quick Guide, and wiki pages.
What's new?
v0.4a4 (v0.4-alpha-4)
This update builds on the previous development release, v0.4a3, with minor and major bug fixes, new features and improvements, and new functions.
Phylogenomics
In this release, I have worked to further flesh out contributions of PIrANHA to phylogenomics workflows for analyzing targeted sequence capture data (e.g. from Hyb-Seq) by adding the new function `assembleReads`, a script that automates de novo assembly of cleaned sequence reads (short reads in FASTQ format) from targeted capture HTS experiments using the ABySS assembler. This is a companion script designed to be run before `phaseAlleles` and `alignAlleles`. The overall workflow now assembles HTS read data, and phases and aligns consensus sequences based on reads (re)mapped to a reference assembly FASTA file (i.e. following reference-based assembly). This combination of programs was designed to be run 1) in a custom target capture workflow (“Workflow 1” below) or 2) after first conducting cleaning, assembly, locus selection, and reference-based assembly in the SECAPR sequence capture pipeline (Andermann et al. 2018; “Workflow 2” below, tested using output from SECAPR as input for PIrANHA).
There are two recommended workflows:
Workflow 1 (Recommended, most stable):
- Cleaning reads using `fastp` (see here; or similar software).
- Read assembly using `assembleReads`, followed by sequence phasing (`phaseAlleles`) and alignment of allelic sequences (`alignAlleles`) in PIrANHA.
- Post-processing and phylogenetic inference.
Workflow 2:
- Read cleaning, assembly, locus selection, and reference-based assembly with SECAPR (Andermann et al. 2018).
- Sequence phasing (`phaseAlleles`) and alignment of allelic sequences (`alignAlleles`) in PIrANHA.
- Post-processing and phylogenetic inference.
New features
- Tab completion. The most important new feature added in this release is dynamic tab completion of function names after `piranha -f` (e.g. `piranha -f <TAB>`). See the GitHub repository README for a cool demonstration of this feature!!
- Simplified Homebrew install (updated formula)
- New single `install_piranha` installer script replaces previous system using two separate installer scripts.
- New handling of large alignment files keeps `dropRandomHap` function from dying, while still reducing alignments to one phased allele per sample.
- Added `-t` option for specifying number of threads when running `batchRunFolders`.
Bug fixes
- Fixed version printing for the `piranha` main script and functions (`piranha -V`, `piranha --version`, `piranha -f <function> -V`, and `piranha -f <function> --version` each now yield the expected behavior: terse output).
- Bug fixes for bad piping or other minor errors in the `batchRunFolders`, `FASTAsummary`, and `splitFile` functions.
- Bug fixes and updates for the `assembleReads` and `phaseAlleles` functions of `piranha`, fixing errors that caused the program to stop due to issues with, among other things, `ls`.
- Bug fix for `PHYLIP2NEXUS`, where the regex test for hexadecimal characters (if produced) in the resulting (output) NEXUS files was failing. The problem was solved with a POSIX solution.
- Bug fixes for the `FASTA2PHYLIP` function, which in aggregate now completely fix previous issues with the single-FASTA, `-f 1` option.
- Updated the `trimSeqs` function to improve performance after issue discussion with Juan Moreira. This update fixed a POSIX space bug: `[:space:]` should be `[[:space:]]`.
PIrANHA version 0.4-alpha-3
Scripts for file processing and analysis in phylogenomics & phylogeography
This is PIrANHA v0.4a3, a software package that provides a number of utility functions and pipelines for file processing and analysis in phylogenetics, phylogenomics, and phylogeography. PIrANHA stands for "Phylogenetics and Phylogeography," and v0.4a3 (=v0.4-alpha-3) is a new development pre-release, or 'minor version' that is greatly improved and ready for alpha testing! Feel free to download this release and try it out; however, PIrANHA is still under active development. Email all suggestions and bug fix requests to Justin directly at bagleyj (at) umsl (dot) edu (also see Contact page of the wiki).
See full description in the PIrANHA README, Quick Guide, and wiki pages.
What's new?
v0.4a3 (v0.4-alpha-3)
This update builds on the previous pre-release, v0.4a2, by adding minor bug fixes and improvements to several functions. With the addition of the new function `alignAlleles`, a companion script meant to be run directly after `phaseAlleles`, this release establishes a new workflow for phasing and aligning consensus sequences from HTS (e.g. targeted sequence capture data) based on reads (re)mapped to a reference assembly FASTA file (i.e. following reference-based assembly). This combination of programs was designed to be run on target capture data after first conducting cleaning, assembly, locus selection, and reference-based assembly (specifically created with SECAPR (Andermann et al. 2018) in mind, and tested using output from SECAPR).
Additionally, this update introduces several other new functions. These include a new `trimSeqs` function for trimming DNA sequences in PHYLIP alignments, with custom gap handling options in trimAl, and outputting trimmed results files in FASTA, PHYLIP, or NEXUS formats. There is a `geneCounter` function that counts and summarizes the number of gene copies per tip taxon label in a set of input gene trees in Newick format, given a taxon-species assignment file (this function was written to handle output from the HybPiper pipeline; see Usage text). And I've also added a new `batchRunFolders` function that automates splitting a set of input files into different batches (to be run in parallel on a remote supercomputing cluster, or a local machine), starting from a file type or list of input files; specifically, this function allows you to prep batch analyses in MAFFT, RAxML, and IQ-TREE.
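The batching idea behind `batchRunFolders` can be sketched roughly as follows. This is a minimal Python illustration of splitting a file list into batches, not the function's actual implementation (PIrANHA itself is run from the command line); the helper name `split_into_batches`, the batch size, and the example file names are all hypothetical.

```python
from typing import List, Sequence


def split_into_batches(files: Sequence[str], batch_size: int) -> List[List[str]]:
    """Split a list of input files into consecutive batches.

    Hypothetical helper for illustration only; it is not part of PIrANHA.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # Sort first so batch membership is reproducible regardless of listing order.
    ordered = sorted(files)
    return [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]


if __name__ == "__main__":
    # Made-up alignment file names standing in for real input files.
    example = [f"locus{n}.fasta" for n in range(1, 8)]
    for index, batch in enumerate(split_into_batches(example, 3), start=1):
        print(f"batch{index}: {' '.join(batch)}")
```

Each resulting batch could then be handed to a separate job (for example one MAFFT, RAxML, or IQ-TREE run per batch folder), which is the kind of preparation the release note describes.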
PIrANHA version 0.4-alpha-2
Scripts for file processing and analysis in phylogenomics & phylogeography
This is PIrANHA v0.4a2, a software package that provides a number of utility functions and pipelines for file processing and analysis in phylogenetics, phylogenomics, and phylogeography. PIrANHA stands for "Phylogenetics and Phylogeography," and v0.4a2 (=v0.4-alpha-2) is a new development pre-release, or 'minor version' that is greatly improved and ready for alpha testing! Feel free to download this release and try it out; however, PIrANHA is still under active development. Email all suggestions and bug fix requests to Justin directly at bagleyj (at) umsl (dot) edu (also see Contact page of the wiki).
See full description in the PIrANHA README, Quick Guide, and wiki pages.
What's new?
v0.4a2 (v0.4-alpha-2)
This update builds on the previous pre-release, v0.4a, by updating the main `piranha` script (including improvements to messaging, function list, and help text); adding a new `phaseAlleles` function that automates phasing of consensus sequences from HTS (e.g. targeted sequence capture) based on reads (re)mapped to a reference assembly FASTA file; and making minor updates to all functions (improved messaging, minor bug fixes, and minor reformatting).
PIrANHA version 0.4-alpha
Scripts for file processing and analysis in phylogenomics & phylogeography
This is PIrANHA v0.4a, a software package that provides a number of utility functions and pipelines for file processing and analysis in standard phylogenetics, phylogenomics, and phylogeography. PIrANHA stands for "Phylogenetics and Phylogeography," and v0.4a (=v0.4-alpha) is a new development pre-release, or 'minor version' that is greatly improved and ready for alpha testing! Please feel free to download this release and try it out, as most of the scripts have been verified; however, realize that PIrANHA is still under active development. Please email all suggestions and bug fix requests to Justin directly at bagleyj (at) umsl (dot) edu (also see Contact page of the wiki).
See full description in the PIrANHA README and wiki pages.
What's new?
v0.4a (v0.4-alpha)
This update builds on the previous pre-release, v0.3-alpha.2, by updating the main `piranha` script (including improvements to how arguments are passed and fixes to debug mode), updating all functions (improved syntax, bug fixes, and help texts), and adding new functions. The most recent changes include:
PIrANHA v0.4a (official minor pre-release version 0.4-alpha) - April 13, 2020
- April 12, 2020: Various minor updates to piranha bin/ functions, and important update to options in main `piranha` script now allows arguments to be passed to the program directly after the function call (after -f flag), without -a|--args flag. This fixes a problem where the previous implementation's reliance on `--args='<args>'` format (arguments passed in quotes) meant that Bash completion would not work while writing out the arguments.
- April 6-7, 2020: Major `piranha` package update, including edits to main script, all functions, dir structure, and other files (e.g. test files). Bug fixes for errors when no arguments and failed `rm` calls, check and update debug code, plus updates to READMEs and help texts.
- April 2-3, 2020: Multiple updates. Added new `FASTAsummary` function that automates summarizing characteristics of one or multiple FASTA files in current working directory, and I also modified `calcAlignmentPIS` to integrate with this new function, and now both functions work well when run separately or together (the function to calculate PIS is now called within `FASTAsummary`). Also updated `PHYLIPsummary` function. Also added new `splitFASTA` function that splits each tip taxon (individual sequence) in a FASTA file into a separate FASTA file. This set of updates also includes a new `piranha` script with updated `-f list` function accommodating new functions, and with an attempt at adding debugging code (but this needs additional testing and fixing (How to best implement debugging?)).
- March 30, 2020: Multiple updates. Added new `nQuireRunner` function that automates running `nQuire` to estimate ploidy levels for samples based on mapped NGS reads (BAM files); updated `FASTA2PHYLIP` function to have new options (-f and -i) allowing analysis of a single input FASTA or multiple FASTAs (prev. only did multiple FASTAs in cwd); updated `MAGNET` with minor fixes to v1.1.1 (updated versioning in README as well); and updated `piranha` function to have complete `list` function output. Also added test FASTA file 'test.fasta' to test/ subfolder of repository containing test input files.
- December 12, 2019: Added new `BEAST_logThinner` function script that downsizes, or 'thins', BEAST2 .log files to every nth line. Tested and working interactively. Outputs new log file in current working directory, without replacement.
- October 23, 2019: Added new `PHYLIPsummary` function script that summarizes no. taxa and no. characters for one or multiple PHYLIP DNA sequence alignments in current directory.
- October 22, 2019: Made minor edits (e.g. fixing versioning) and bug fixes (fixing `sed` code that caused failures when user had GNU SED installed instead of BSD SED) to all of the following function scripts: `PhyloMapperNullProc`, `PHYLIPsubsampler`, `PHYLIPcleaner`, `PHYLIP2PFSubsets`, `MLEResultsProc`, `getBootTrees`, `fastSTRUCTURE`, `dropRandomHap`, `dadiUncertainty`, `dadiRunner`, `dadiPostProc`, `calcAlignmentPIS`, `BEASTRunner`, `BEAST_PSPrepper`, `RAxMLRunChecker`, `RAxMLRunner`, `SNAPPRunner`, `SpeciesIdentifier`, `AnouraNEXUSPrepper`, `concatenateSeqs`, `concatSeqsPartitions`, `FASTA2VCF`, `getTaxonNames`, `makePartitions`, `MrBayesPostProc`, `phyNcharSumm`, `pyRAD2PartitionFinder`, `pyRADLocusVarSites`, `renameForStarBeast2`, `renameTaxa`, `renameTaxa_v1`, `splitPHYLIP`, `taxonCompFilter`, `treeThinner`, `vcfSubsampler`, `completeSeqs`, `RYcoder`, `RogueNaRokRunner`, `PHYLIP2NEXUS`, `PHYLIP2Mega`, `NEXUS2PHYLIP`, `NEXUS2MultiPHYLIP`, `Mega2PHYLIP`, `BEASTReset`, `FASTA2PHYLIP`, `completeConcatSeqs`
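The log thinning described in the December 12, 2019 entry for `BEAST_logThinner` can be sketched along these lines. This is a minimal Python illustration under stated assumptions, not the function's actual code: it assumes comment lines start with `#`, that the first non-comment line is the column header, and that thinning keeps the first sampled row and every nth row after it. The name `thin_log_lines` is hypothetical.

```python
from typing import Iterable, List


def thin_log_lines(lines: Iterable[str], nth: int) -> List[str]:
    """Keep comment and header lines, plus every nth sampled row.

    Assumptions (not confirmed by the release notes): '#' marks comment
    lines, the first non-comment line is the column header, and the first
    sampled row is always retained.
    """
    if nth < 1:
        raise ValueError("nth must be at least 1")
    kept: List[str] = []
    header_seen = False
    row_index = 0
    for line in lines:
        if line.startswith("#"):
            kept.append(line)          # pass comments through unchanged
        elif not header_seen:
            kept.append(line)          # column header
            header_seen = True
        else:
            if row_index % nth == 0:   # rows 0, nth, 2*nth, ...
                kept.append(line)
            row_index += 1
    return kept
```

Returning a new list rather than editing in place mirrors the note that the thinned log is written as a new file, without replacing the original.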
PIrANHA version 0.3-alpha.2
Scripts for file processing and analysis in phylogenomics & phylogeography
This is PIrANHA v0.3a2, a software package that provides a number of utility functions and pipelines for file processing and analysis in phylogenetics and phylogeography. PIrANHA stands for "PhylogenetIcs ANd pHylogeogrAphy," and v0.3a2 (=v0.3-alpha.2) is a new development pre-release, or 'minor version' that is greatly improved and ready for alpha testing! Please feel free to download this release and try it out, as most of the scripts have been verified; however, realize that PIrANHA is still under active development. Please email all suggestions and bug fix requests to Justin directly at bagleyj (at) umsl (dot) edu (also see Contact page of the wiki).
See full description in the PIrANHA README and wiki pages.
What's new?
v0.3a2 (v0.3-alpha.2)
This update builds on the previous pre-release, v0.3-alpha.1, by adding new functions, rewriting others, and adding updates and bug fixes. The most recent changes include:
PIrANHA v0.3a2 (official minor pre-release version 0.3-alpha.2) - July 26, 2019
- July 26, 2019: Updated README, repository files, and wiki files for new release.
- July 25, 2019: Added new `RogueNaRokRunner` function that reads in a Newick-formatted tree file and runs it through RogueNaRok to identify rogue taxa. Additionally, I conducted a complete rewrite of the `NEXUS2PHYLIP` function that removes its dependence on N. Takebayashi's Perl script (see previous version, Acknowledgements), and I made minor edits to `piranha` and edits and bug fixes for other functions including `RYcoder`.
- July 24, 2019: Minor updates and bug fixes for `PHYLIP2NEXUS` function.
- July 11, 2019: Minor updates and fixes for `PHYLIP2Mega` function.
- June 11, 2019: Added new `RYcoder` function that reads in a PHYLIP or NEXUS DNA sequence alignment and converts it into 'RY'-coded, binary format, with purines (A, G) coded as 0's and pyrimidines (C, T) coded as 1's.
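The RY-coding rule stated for `RYcoder` (purines A and G become 0, pyrimidines C and T become 1) can be illustrated with a short Python sketch. It is not the function's implementation; the name `ry_code` is hypothetical, and the handling of gaps (kept as `-`) and of any other symbol (replaced by `?`) is an assumption made for this example.

```python
# Mapping taken from the release note: purines (A, G) -> 0, pyrimidines (C, T) -> 1.
# Keeping '-' as a gap is an assumption of this sketch.
RY_CODES = {"A": "0", "G": "0", "C": "1", "T": "1", "-": "-"}


def ry_code(sequence: str, missing: str = "?") -> str:
    """Recode a DNA sequence as a binary RY string.

    Symbols outside A, C, G, T and '-' (e.g. N or ambiguity codes) are
    replaced by `missing`; this choice is an assumption, not PIrANHA behavior.
    """
    return "".join(RY_CODES.get(base, missing) for base in sequence.upper())


if __name__ == "__main__":
    print(ry_code("ACGTTGCA"))
```

Applying the same function to every sequence in an alignment gives the binary matrix; parsing PHYLIP or NEXUS input is left out of the sketch.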
PIrANHA version 0.3-alpha.1
Scripts for file processing and analysis in phylogenomics & phylogeography
This is PIrANHA v0.3a1, a software package that provides a number of utility functions and pipelines for file processing and analysis in phylogenetics and phylogeography. PIrANHA stands for "PhylogenetIcs ANd pHylogeogrAphy," and v0.3a1 (=v0.3-alpha.1) is a new development pre-release, or 'minor version' that is greatly improved but not yet ready for production use.
PLEASE DO NOT DOWNLOAD THIS RELEASE. IT IS UNDER ACTIVE DEVELOPMENT
PIrANHA tools include interactive/non-interactive functions and wrapper scripts focusing on (1) analyses of DNA sequence data and SNPs or RAD loci generated from massively parallel sequencing runs on reduced representation genomic libraries, e.g. from ddRAD-seq (Peterson et al. 2012) or target capture, and (2) automating running these software programs on the user's personal machine (e.g. MAGNET pipeline and pyRAD2PartitionFinder scripts) or a remote supercomputer, and then conducting post-processing of the results. See full description in the README and wiki pages.
What's new?
v0.3a1 (v0.3-alpha.1)
This minor update builds on the previous pre-release, v0.2-alpha.2, by making a variety of changes towards finalizing function rewrites and getting most or all functions working. The most recent changes include:
- May 7, 2019: Fixed main `piranha` function so that it correctly reads in all arguments passed with the `--args=''` flag (should also work with `-a`), which previously caused several functions to fail and invoke `trapExit`.
- April 30 – May 7, 2019: Added bug fixes and updates to `dropRandomHap`, `PHYLIP2NEXUS`, `PHYLIP2FASTA`, `PHYLIP2Mega`, and `splitPHYLIP` functions.
- April 10, 2019: Added new `renameTaxa` function that renames taxon (sample) names in genetic data files of type FASTA, NEXUS, PHYLIP, and VCF according to user specifications.
- April 9, 2019: Added updated scripts to fix bugs in `FASTA2PHYLIP` and `getTaxonNames` functions.
PIrANHA version 0.2-alpha.2
Scripts for file processing and analysis in phylogenomics & phylogeography
This is PIrANHA v0.2-alpha.2, a software package that provides a number of utility scripts and pipelines for file processing and analysis in phylogenetics and phylogeography. PIrANHA stands for "PhylogenetIcs ANd pHylogeogrAphy," and v0.2-alpha.2 is a new development pre-release, or 'minor version' that is greatly improved but not yet ready for production use.
PLEASE DO NOT DOWNLOAD THIS RELEASE. IT IS UNDER ACTIVE DEVELOPMENT
PIrANHA tools include interactive/non-interactive shell scripts and wrapper scripts focusing on (1) analyses of DNA sequence data and SNPs or RAD loci generated from massively parallel sequencing runs on reduced representation genomic libraries, e.g. from ddRAD-seq (Peterson et al. 2012) or target capture, and (2) automating running these software programs on the user's personal machine (e.g. MAGNET pipeline and pyRAD2PartitionFinder scripts) or a remote supercomputer, and then conducting post-processing of the results. See full description in the README and wiki pages.
What's new?
v0.2-alpha.2
This is a minor update to the pre-release version that adds a new `FASTA2VCF` function that converts a sequential FASTA multiple sequence alignment into a variant call format (VCF) file, and allows subsampling 1 SNP per partition/locus. This update also includes edits to the README, index.html, changeLog.md, and travis.yml files. Importantly, I have also now created a successful homebrew tap for PIrANHA here with a formula that is working with v0.2-alpha, and that is now described in the documentation wiki.
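The "1 SNP per partition/locus" subsampling mentioned for `FASTA2VCF` can be pictured with the following Python sketch. It only finds variable alignment columns and picks one at random for a single locus; it does not write VCF and is not the function's actual code. The names `variable_sites` and `pick_one_snp` are hypothetical, and treating only A, C, G, and T as informative bases is an assumption of this example.

```python
import random
from typing import Dict, List, Optional


def variable_sites(alignment: Dict[str, str]) -> List[int]:
    """Return 0-based column indices showing more than one of A, C, G, T."""
    sequences = list(alignment.values())
    if not sequences:
        return []
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences in an alignment must have equal length")
    sites = []
    for column in range(length):
        # Ignore gaps, N, and ambiguity codes when deciding variability (assumption).
        bases = {seq[column].upper() for seq in sequences} & set("ACGT")
        if len(bases) > 1:
            sites.append(column)
    return sites


def pick_one_snp(alignment: Dict[str, str], rng: random.Random) -> Optional[int]:
    """Pick one variable column at random for a locus, or None if invariant."""
    sites = variable_sites(alignment)
    return rng.choice(sites) if sites else None


if __name__ == "__main__":
    # Made-up three-sample locus used purely for demonstration.
    locus = {"sample1": "ACGT", "sample2": "ACGA", "sample3": "TCGA"}
    print(variable_sites(locus), pick_one_snp(locus, random.Random(42)))
```

Passing in a seeded `random.Random` keeps the subsampling repeatable between runs, which is convenient when the same SNP set needs to be regenerated.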
PIrANHA version 0.2-alpha.1c
Scripts for file processing and analysis in phylogenomics & phylogeography
This is PIrANHA v0.2-alpha.1c, a software package that provides a number of utility scripts and pipelines for file processing and analysis in phylogenetics and phylogeography. PIrANHA stands for "PhylogenetIcs ANd pHylogeogrAphy," and v0.2-alpha.1c is a new development pre-release, or 'minor version' that is greatly improved but not yet ready for production use.
PLEASE DO NOT DOWNLOAD THIS RELEASE. IT IS UNDER ACTIVE DEVELOPMENT
PIrANHA tools include interactive/non-interactive shell scripts and wrapper scripts focusing on 1) analyses of DNA sequence data and SNPs or RAD loci generated from massively parallel sequencing runs on reduced representation genomic libraries (e.g. from ddRAD-seq, Peterson et al. 2012), and 2) automating running these software programs on the user's personal machine (e.g. MAGNET pipeline and pyRAD2PartitionFinder scripts) or a remote supercomputer machine, and then conducting post-processing of the results. See full description in the README.
What's new?
v0.2-alpha.1c
This is a minor update to the pre-release version that includes edits to the README and index.html files, and that adds this slightly updated changeLog.md file back into the repository. Other changes include removing `bin/trash` function due to conflicts with `/usr/local/bin/trash` symlink belonging to trash on macOS, which caused homebrew install to fail. After fixing this, I have also now created a successful homebrew tap for PIrANHA that is working with this release (more info soon, to be added to the README).
v0.2-alpha.1b
This is a very minor update to the pre-release version removing some PHYLIP and FASTA DNA sequence alignments that I had previously included in the repo for my own testing purposes, and updating README and index.html files.
v0.2-alpha.1
Since v0.2-alpha, the pre-release version of PIrANHA v0.2-alpha.1 added several updates including redos for the PIrANHA etc/ dir, a README for bin/, and new scripts for the `MLEResultsProc`, `getTaxonNames`, `taxonCompFilter`, and `SNAPPRunner` functions.
v0.2-alpha
Pre-release version PIrANHA v0.2-alpha.1 involved a virtually complete rewrite and reorganization of PIrANHA (with >1,200 additions and >400 deletions). All scripts were converted to 'function' programs in bin/ or bin/MAGNET-1.0.0/ of the repo, and I wrote a new program, `piranha`, that is now the main program and runs all functions. (See more on v0.2 changes in v0.2 and v0.2-alpha release notes below.)
PIrANHA version 0.2-alpha.1b
Scripts for file processing and analysis in phylogenomics & phylogeography
This is PIrANHA v0.2-alpha.1b, a software package that provides a number of utility scripts and pipelines for file processing and analysis in phylogenetics and phylogeography. PIrANHA stands for "PhylogenetIcs ANd pHylogeogrAphy," and v0.2-alpha.1b is a new development pre-release, or 'minor version' that is greatly improved but not yet ready for production use.
PLEASE DO NOT DOWNLOAD THIS RELEASE. IT IS UNDER ACTIVE DEVELOPMENT
PIrANHA tools include interactive/non-interactive shell scripts and wrapper scripts focusing on 1) analyses of DNA sequence data and SNPs or RAD loci generated from massively parallel sequencing runs on reduced representation genomic libraries (e.g. from ddRAD-seq, Peterson et al. 2012), and 2) automating running these software programs on the user's personal machine (e.g. MAGNET pipeline and pyRAD2PartitionFinder scripts) or a remote supercomputer machine, and then conducting post-processing of the results. See full description in the README.
What's new?
v0.2-alpha.1b
This is a very minor update to the pre-release version removing some PHYLIP and FASTA DNA sequence alignments that I had previously included in the repo for my own testing purposes, and updating README and index.html files.
v0.2-alpha.1
Since v0.2-alpha, the pre-release version of PIrANHA v0.2-alpha.1 added several updates including redos for the PIrANHA etc/ dir, a README for bin/, and new scripts for the `MLEResultsProc`, `getTaxonNames`, `taxonCompFilter`, and `SNAPPRunner` functions.
v0.2-alpha
Pre-release version PIrANHA v0.2-alpha.1 involved a virtually complete rewrite and reorganization of PIrANHA (with >1,200 additions and >400 deletions). All scripts were converted to 'function' programs in bin/ or bin/MAGNET-1.0.0/ of the repo, and I wrote a new program, `piranha`, that is now the main program and runs all functions. (See more on v0.2 changes in v0.2 and v0.2-alpha release notes below.)
PIrANHA version 0.2-alpha.1
Scripts for file processing and analysis in phylogenomics & phylogeography
This is PIrANHA v0.2-alpha.1, a software package that provides a number of utility scripts and pipelines for file processing and analysis in phylogenetics and phylogeography. PIrANHA stands for "PhylogenetIcs ANd pHylogeogrAphy," and v0.2-alpha.1 is a new development pre-release, or 'minor version' that is greatly improved but not yet ready for production use.
PLEASE DO NOT DOWNLOAD THIS RELEASE. IT IS UNDER ACTIVE DEVELOPMENT
PIrANHA tools include interactive/non-interactive shell scripts and wrapper scripts focusing on 1) analyses of DNA sequence data and SNPs or RAD loci generated from massively parallel sequencing runs on reduced representation genomic libraries (e.g. from ddRAD-seq, Peterson et al. 2012), and 2) automating running these software programs on the user's personal machine (e.g. MAGNET pipeline and pyRAD2PartitionFinder scripts) or a remote supercomputer machine, and then conducting post-processing of the results. See full description in the README.
What's new?
v0.2-alpha.1
Since v0.2-alpha, the current pre-release version, PIrANHA v0.2-alpha.1 adds several updates including redos for the PIrANHA etc/ dir, a README for bin/, and new scripts for the `MLEResultsProc`, `getTaxonNames`, `taxonCompFilter`, and `SNAPPRunner` functions.
v0.2-alpha
Pre-release version, PIrANHA v0.2-alpha.1, involved a virtually complete rewrite and reorganization of PIrANHA (with >1,200 additions and >400 deletions). All scripts were converted to 'function' programs in bin/ or bin/MAGNET-1.0.0/ of the repo, and I wrote a new program, `piranha`, that is now the main program and runs all functions. I am still in the process of updating the README and all function scripts, but I decided to do a pre-release ratcheted up to v0.2 due to the great improvements in modularization and efficiency that v0.2 update allows (selecting a function and passing all arguments, all from `piranha`), and because I wanted a new alpha release to use as a starting point to create Debian and Homebrew distribution releases (i.e. brew tap(s) to update as new versions roll out during development). The current organization of PIrANHA is also going to be much better suited for general use, and for adding other collaborators or developers.
The changeLog.md is not yet up to date (not even for v0.2-alpha.1) and the repository is close but still not ready for a v1.0 major release, but we're getting there!!