Releases: nanoporetech/dorado
v0.9.1
[0.9.1] (21 Jan 2025)
This release of Dorado brings significant basecalling speed improvements for Nvidia GPUs with compute capabilities 8.6 (Ampere – e.g., RTX A6000), 8.7 (Ampere – e.g., Orin family), and 8.9 (Ada Lovelace). Additionally, `dorado polish` receives major enhancements, including the introduction of the `--bacteria` flag, which optimizes basecalling for native bacterial and methylated DNA. The updated `dorado polish` is now compatible with data basecalled using v4.3 and v4.2 models and serves as a beta-stage replacement for Medaka.
- d46286e - Upgraded to Koi v0.5.3 for optimised basecalling on CC 8.6, 8.7 & 8.9
- 9c83c49 - Added VCF output to Dorado `polish`
- d859977 - Improve messaging when an unrecognised model complex is specified
- 06f6239 - Support for legacy count models in Dorado `polish`
- e51d64f - Fixed precision issue with Dorado `polish` qscore normalisation
- d50494d - Added `--bacteria` support for Dorado `polish`
- f3fa447 - Added ULK114 kits to adapter/primer trimming LUT
- 1a4316e - Dorado `polish` algorithm improvements and refactoring
- a0fc960 - Dorado `polish` option `-o` now writes to a directory
- 7ae31a8 - Dorado `polish` fix for parsing FASTA file inputs
- b382344 - Dorado `polish` models are now included in Dorado `download` `--list`, `--list-yaml`, and `--list-structured`
- adebcc5 - Ensure Q-score and read length filters are applied to reads after trimming
- 96d574d - Enforce use of `--kit-name` for Dorado `demux` custom barcodes to prevent unclassified reads
- 62c86d0 - Support `CUDA_VISIBLE_DEVICES` for UUIDs or integers
- c39f906 - Prevent crashing in auto batchsize due to short chunksizes
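
To illustrate the `CUDA_VISIBLE_DEVICES` change above: the variable can hold a comma-separated mix of integer device indices and UUID strings. The sketch below is not Dorado's implementation. `parse_visible_devices` is a hypothetical helper, and the `GPU-`/`MIG-` UUID pattern is an assumption based on the usual NVIDIA identifier format.

```python
import re

# Assumption: UUID-style identifiers look like "GPU-<hex and dashes>" or "MIG-<hex and dashes>".
_UUID_RE = re.compile(r"(GPU|MIG)-[0-9a-fA-F-]+")
_INDEX_RE = re.compile(r"[0-9]+")


def parse_visible_devices(value):
    """Split a CUDA_VISIBLE_DEVICES-style string into int indices and UUID strings."""
    devices = []
    for token in value.split(","):
        token = token.strip()
        if not token:
            continue  # ignore empty entries such as a trailing comma
        if _INDEX_RE.fullmatch(token):
            devices.append(int(token))
        elif _UUID_RE.fullmatch(token):
            devices.append(token)
        else:
            raise ValueError(f"unrecognised device identifier: {token!r}")
    return devices
```

A caller would typically pass `os.environ.get("CUDA_VISIBLE_DEVICES", "")` and treat an empty result as "no restriction set".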
v0.9.0
[0.9.0] (16 Dec 2024)
This major release of Dorado introduces several new features and enhancements. The `polish` command, currently experimental, is optimised for refining draft assemblies of human genomes. This release also adds faster DNA modification calling models and an improved 6mA false positive rate (FPR) in native human samples. Barcode demultiplexing accuracy has been significantly enhanced for kits with barcodes at both ends, including `SQK-NBD114`. Note that using custom barcode kits now requires the `--kit-name` option. A feature has been added to enable running `dorado correct` in blocks, allowing work to be divided into smaller pieces for easy submission to a compute cluster. Additional updates include the `qs` tag for mean basecall Q-scores in FASTQ output, an upgrade to POD5 to support systems with large page sizes, improvements to poly(A) tail length estimation, and various bug fixes to enhance stability and functionality.
- 2b96c0b - New Dorado `polish` feature for assembly polishing
- 0bab166 - Faster modified base models for DNA `4mC_5mC`, `5mC_5hmC`, `5mCG_5hmCG`, and `6mA`
- e637166 - Enable running dorado correct in blocks, for easy submission to a compute cluster
- 40296da - Reduced false positive classification rates for kits with barcodes at both ends
- 35da003 - Improve barcode classification when barcodes can be on either end
- cbcdf38 - Only classify barcodes which are present on sample sheet if provided
- 2449d03 - Correct `AF02F_14` and `AH10R_80` barcodes from `TWIST-96A-UDI`
- 631e94c - Prevent Dorado `demux` from stripping alignment information when `--no-trim` is specified
- affea85 - Prevent missing filenames when using `--emit-summary` with Dorado `demux`
- 3dec15a - Improve poly(A) tail estimation accuracy, including with interrupted tails
- df57d34 - Limit poly(A) estimation to reads with plausible signal to prevent stalls in calculation
- 6cf701a - Add `min_primer_separation` option to custom poly(A) configuration
- bf51bd4 - Add `qs` tag with mean basecall Q-score to FASTQ output
- dac076d - Upgrade to POD5 v0.3.23 to support systems with large page sizes for POD5 and .fast5
- c7a7a58 - Prevent silent failure or segfault on Windows with bad custom barcode files
- 1e829d5 - Do not allow basecalling if target directory includes both POD5 and .fast5 files
- 05d0981 - Fix modified base trim for reverse-aligned BAM records
- afdb068 - Fix invalid `MM` tag after trimming when no mods are present
- 0d788d7 - Prevent crash when insufficient permissions to read an input file/folder
- dbece01 - Update custom barcoding documentation to accurately reflect demultiplexing logic
- 6db40ec - Correct model context info shown in `dorado download --list-structured`
- 03acc12 - Use the `-o` short option only for `--output-dir` and not for `--overlap`
- 8d9c017 - Added support for reading gzip-compressed FASTQ files
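
As a rough illustration of two items in this release (a mean basecall Q-score per read, and reading gzip-compressed FASTQ), the sketch below reads a plain or gzipped FASTQ file and computes a mean Q-score from each quality string. It is not Dorado's code. `mean_qscore` and `read_fastq` are hypothetical helpers, and two things are assumptions rather than facts from these notes: that qualities are Phred+33 encoded, and that the mean is taken over error probabilities rather than over the raw Q values. The exact calculation behind the `qs` tag is not stated here.

```python
import gzip
import math


def mean_qscore(quality):
    """Mean Q-score of a Phred+33 quality string, averaged in probability space."""
    if not quality:
        return 0.0
    # Convert each quality character to an error probability, average, convert back.
    probs = [10 ** (-(ord(ch) - 33) / 10) for ch in quality]
    return -10 * math.log10(sum(probs) / len(probs))


def read_fastq(path):
    """Yield (header, sequence, quality) from a plain or gzip-compressed FASTQ file.

    Assumes simple four-line records.
    """
    with open(path, "rb") as fh:
        is_gzip = fh.read(2) == b"\x1f\x8b"  # gzip magic bytes
    opener = gzip.open if is_gzip else open
    with opener(path, "rt") as handle:
        while True:
            header = handle.readline().rstrip("\n")
            if not header:
                break
            sequence = handle.readline().rstrip("\n")
            handle.readline()  # '+' separator line
            quality = handle.readline().rstrip("\n")
            yield header[1:], sequence, quality
```

Averaging in probability space means a few low-quality bases pull the mean down more than a plain arithmetic mean of Q values would.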
v0.7.4
[0.7.4] (11 Dec 2024)
This release of Dorado matches the version included in MinKNOW 24.11 and Dorado Basecall Server 7.6.7. We only recommend this release to users who require matching performance to MinKNOW 24.11. Other users should use the latest version of Dorado (≥0.8.3) to benefit from all available improvements. Changes from 0.7.3 in this release include:
- 40296da - Reduced false positive classifications for NBD barcode kits
- 687f234 - Improved GPU batch submission timeout logic to improve basecall performance with low data throughput
- 03e5eec - Fix incorrect basecalls being emitted from SUP models on Apple Silicon
- 9a81cbc - Fix errors when running `dorado basecaller` on multi-GPU systems
- 3cc4de3 - Prevent "Too many open files" error when using `--sort-bam` with `dorado demux`
- f35c8cc - Fix bug when downloading models for `dorado correct`
- 4a28d58 - Always trim DNA adapter signal before processing RNA reads
- adc60ba - Package `libcupti.so` into ARM Linux builds
- c65b3fb - Added EXP-NBD114-24 alias for SQK-NBD114-24
- 69cb260 - Fix bug causing intermittent crashing with v5 SUP models
- a674dad - Update `--help` documentation for `basecaller`, `duplex`, and `correct`
- 6ec77c8 - Fix errors when performing duplex calling with modified bases
- cb6eee1 - Decouple alignment and inference stages in `dorado correct`
- 762e886 - Cache batch sizes to significantly reduce basecaller startup time on supported GPUs
- 9e5db84 - Fix duplicated alignment tags in re-aligned files
- 966c2ca - Update POD5 version to v0.3.15
- e9281fa - Emit an error message if header from input HTS file cannot be read
- fcb9d53 - Include run name in output files from `dorado demux` even if input files are FASTQ
- 7f42b8f - Warn and exit instead of crashing if a model path does not exist
- 7d74246 - Improve index file error handling
- 022901e - Fix JSON output when using `--list-structured` with `dorado download`
v0.8.3
[0.8.3] (11 Nov 2024)
This release of Dorado adds fixes and improvements to the Dorado 0.8.2 release, including a fix to SUP basecalling on Apple Silicon.
v0.8.2
[0.8.2] (21 Oct 2024)
This release of Dorado includes fixes and improvements to the Dorado 0.8.1 release.
- 9a81cbc - Fix errors when running `dorado basecaller` on multi-GPU systems
- bac4469 - Fix error when passing invalid option to basecaller `--modified-bases` argument
- 7f40154 - Fix bug when running on GPUs with less than the recommended VRAM
- a993a32 - Prevent loss of small numbers of reads from `dorado correct`
- bbbeb9a - Fix error when `dorado demux` is provided an empty input directory
- 3a8094a - Clarify "Unable to find chunk benchmarks" warning to indicate that it is not an error
- 543ccc1 - Improve documentation on running Dorado in PowerShell on Windows
v0.8.1
[0.8.1] (03 Oct 2024)
This release of Dorado includes fixes and improvements to the Dorado 0.8.0 release, including corrected configuration for DNA v5 SUP-compatible 5mC_5hmC and 5mCG_5hmCG models, improved cDNA poly(A) tail estimation for data from MinION flow cells, reduced basecaller startup time on supported GPUs, and more.
- f74d891 - Corrected bug causing the DNA v5 SUP `5mC_5hmC@v2` model to call CpG contexts only and the DNA v5 SUP `5mCG_5hmCG@v2` model to call all contexts
- eb46494 - Improve cDNA poly(A) tail estimation for MinION flow cells
- 762e886 - Cache batch sizes to significantly reduce basecaller startup time on supported GPUs
- 22269a8 - Prevent "Trim interval is invalid for sequence" error when performing trimming
- f156ae6 - Prevent write permission error for model download folder when file write is not required
- fcb9d53 - Include run name in output files from `dorado demux` even if input files are FASTQ
- a4c9649 - BED file handling: only split columns on tabs, not spaces; load files with spaces in region names
- e62cbc8 - Allow comment lines in the middle of the BED file
- f15c0b3 - Fix compilation in AppleClang 16
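
The two BED-handling changes above (splitting columns on tabs only, and allowing comment lines anywhere in the file) can be sketched as follows. This is an illustration, not Dorado's parser. `parse_bed` is a hypothetical helper, and treating lines that start with `#` as comments is an assumption.

```python
def parse_bed(lines):
    """Parse BED lines into (chrom, start, end, name) tuples.

    Columns are split on tabs only, so region names may contain spaces,
    and comment lines are skipped wherever they appear in the input.
    """
    regions = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        # Assumption: blank lines and lines starting with '#' carry no regions.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")  # tabs only; spaces stay inside a field
        if len(fields) < 3:
            raise ValueError(f"expected at least 3 tab-separated columns: {line!r}")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        name = fields[3] if len(fields) > 3 else ""
        regions.append((chrom, start, end, name))
    return regions
```

Splitting with `line.split()` instead would break a name such as `my region` into two columns, which is the failure mode the tab-only rule avoids.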
v0.8.0
[0.8.0] (16 Sept 2024)
This release of Dorado adds v5.1 RNA models with new `inosine_m6A` and `m5C` RNA modified base models, updates existing modified base models, improves the speed of v5 SUP basecalling models on A100/H100 GPUs, and enhances the flexibility and stability of `dorado correct`. It also introduces per-barcode configuration for poly(A) estimation with interrupted tails, adds new `--output-dir` and `--bed-file` arguments to Dorado basecalling commands, and includes a variety of other improvements for stability and usability.
- a69c0a2 - Add v5.1.0 RNA basecalling models, including new `inosine_m6A` and `m5C` modified base models, and updated existing DNA and RNA modified base models
- 8e3a870 - Improve speed of v5 SUP basecalling models on A100 and H100 GPUs
- 6ee9018 - Reduce false positive calls from v5 DNA modified base models
- 69cb260 - Fix bug causing intermittent crashing with v5 SUP models
- e9dec49 - Add `--resume-from` functionality to `dorado correct`
- cb6eee1 - Decouple alignment and inference stages in `dorado correct`
- df861db - Prevent segfaults in `dorado correct`
- f35c8cc - Fix bug when downloading models for `dorado correct`
- 6646701 - Add per-barcode poly(A) configuration for interrupted tails
- 0b79407 - Improve poly(A) length estimation for RNA and DNA
- df614ab - Add `--output-dir` argument to `dorado basecaller` and `dorado duplex`
- f9beb39 - Add `--bed-file` argument to `dorado basecaller` and `dorado duplex`
- 1fc6f1e - Add `--models-directory` option to `basecaller`, `duplex`, and `download` to download and reuse models
- 966c2ca - Update POD5 version to v0.3.15
- 6ec77c8 - Fix errors when performing duplex calling with modified bases
- 4a28d58 - Always trim DNA adapter signal before processing RNA reads
- a90fbf9 - Fix loading of FASTQ files containing RNA with U bases
- 9e5db84 - Fix duplicated alignment tags in re-aligned files
- 3cc4de3 - Prevent "Too many open files" error when using `--sort-bam` with `dorado demux`
- b531918 - Prevent `dorado basecaller` crash when signal-space trimming removes all raw data
- adc60ba - Package `libcupti.so` into ARM Linux builds
- 667d160 - Remove kit name requirement in custom barcode configuration
- e9281fa - Emit an error message if header from input HTS file cannot be read
- 7f42b8f - Warn and exit instead of crashing if a model path does not exist
- 7d74246 - Improve index file error handling
- c77733a - Add a mechanism to cache auto batch size calculations
- a674dad - Update `--help` documentation for `basecaller`, `duplex`, and `correct`
- 022901e - Fix JSON output when using `--list-structured` with `dorado download`
- db73e5d - Add `run_id` to filenames output by `demux`
v0.7.3
[0.7.3] (1 Aug 2024)
This release of Dorado updates `dorado correct` to fix handling of high copy repeats and avoid shutdown hanging. It also includes `dorado demux` improvements to reduce false matches in midstrand barcode detection and ensure correct file naming, along with other fixes.
- 5dc78ab - Remove limit on number of overlaps considered during all-vs-all alignment in `dorado correct`
- 2741de7 - Prevent hang during shutdown of `dorado correct` and prevent out of memory errors
- 37d316c - Remove unused `--read-ids` and `--threads` parameters from `dorado correct`
- ddb13de - Increase the threshold for midstrand barcode detection to reduce false matches
- 845a3ad - Fix misnaming by `dorado demux` of barcode file for barcodes ending in a letter (e.g., `12a`)
- 56d3e8e - Fix seq/qual orientation when demultiplexing aligned BAMs
- 5ddfc2f - Fix bug causing CUDA illegal memory access with v5 RNA SUP and mods
v0.6.3
[0.6.3] (31 July 2024)
This release matches the version of Dorado in MinKNOW 24.06 and Dorado Basecall Server 7.4.12.
v0.7.2
[0.7.2] (18 June 2024)
This release of Dorado resolves basecalling failures when running v5 SUP models on CPU-only devices or v5 RNA HAC on Apple silicon. It also fixes bugs in `dorado demux` and `dorado correct`, and corrects `sm` and `sd` tags to match the Dorado SAM specification.
- 3835272 - Fix bug causing v5 SUP models to fail when running on CPU-only devices
- c36f444 - Fix bug causing RNA v5 HAC basecalling to fail on Apple silicon
- 3621800 - Fix bug causing segfault in `dorado demux`
- 3b51c1b - Fix sub-par alignments in `dorado correct`
- d0df79c - Correct shift and scale (`sm` and `sd`) SAM tags to match SAM specification