Showing 8 changed files with 186 additions and 49 deletions.
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,11 +1,13 @@ | ||
--- | ||
sort: 6 | ||
permalink: /about | ||
sort: 3 | ||
permalink: /notes | ||
--- | ||
|
||
# About SeqFu | ||
|
||
This page contains a small selection of examples for getting started using **seqfu**. | ||
|
||
See the [tools]({{site.baseurl}}/tools) page for the detailed documentation of each | ||
tool. | ||
The main parsing library is `klib.nim` by Heng Li ([lh3/biofast](https://github.com/lh3/biofast)), which provides good performance. | ||
|
||
Some utilities use the *readfq* library ([andreas-wilm/nimreadfq](https://github.com/andreas-wilm/nimreadfq)). This is based on the | ||
C version of Heng Li's parser, wrapped in an object-oriented module. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
--- | ||
sort: 3 | ||
--- | ||
# seqfu count | ||
|
||
*count* (or *cnt*) is one of the core subprograms of *SeqFu*. | ||
It counts the sequences in FASTA/FASTQ files and is _paired-end_ aware: | ||
the two files of a pair are reported on a single line, after checking that both | ||
files contain the same number of sequences. | ||
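
The paired-end check described above can be illustrated with a short Python sketch. This is not SeqFu's implementation (SeqFu is written in Nim); the function names `open_text`, `count_records`, and `count_pair` are hypothetical, and the sketch assumes FASTQ records span exactly four lines with no blank lines.

```python
import gzip
from pathlib import Path


def open_text(path):
    """Open a plain or gzip-compressed file in text mode."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt")
    return open(path, "rt")


def count_records(path):
    """Count records in a FASTA or FASTQ file.

    Assumption: FASTA headers start with '>' and FASTQ records
    span exactly four lines (no wrapped sequences, no blank lines).
    """
    with open_text(path) as handle:
        first = handle.readline()
        if not first:
            return 0
        if first.startswith(">"):
            # FASTA: one record per header line (the first is already read).
            return 1 + sum(1 for line in handle if line.startswith(">"))
        # FASTQ: add back the line consumed above, then divide by four.
        lines = 1 + sum(1 for _ in handle)
        return lines // 4


def count_pair(forward, reverse):
    """Return the shared record count, or raise if the two files disagree."""
    n_fwd, n_rev = count_records(forward), count_records(reverse)
    if n_fwd != n_rev:
        raise ValueError(
            f"Different counts in {forward} and {reverse}: {n_fwd} vs {n_rev}"
        )
    return n_fwd
```

Raising an exception on a mismatch is a choice made for this sketch; the tool itself reports the problem as a message, as shown further down.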
|
||
```text | ||
Usage: count [options] [<inputfile> ...] | ||
Options: | ||
-a, --abs-path Print absolute paths | ||
-b, --basename Print only filenames | ||
-u, --unpair Print separate records for paired end files | ||
-f, --for-tag R1 Forward tag [default: auto] | ||
-r, --rev-tag R2 Reverse tag [default: auto] | ||
-v, --verbose Verbose output | ||
-h, --help Show this help | ||
``` | ||
|
||
|
||
### Streaming | ||
|
||
Input from a stream is supported. | ||
|
||
### Example output | ||
|
||
Output is TSV text with three columns: sample name, number of reads, and type ("SE" for single-end, "Paired" for paired-end). | ||
|
||
```text | ||
data/test.fastq 3 SE | ||
data/comments.fastq 5 SE | ||
data/test2.fastq 3 SE | ||
data/qualities.fq 5 SE | ||
data/illumina_1.fq.gz 7 Paired | ||
``` | ||
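
For downstream scripting, the three-column output can be loaded with a few lines of Python. This is a minimal sketch, assuming the columns are tab-separated as described above; `CountRecord` and `parse_count_output` are hypothetical names introduced for illustration, and the sample text is a shortened copy of the example output.

```python
import csv
import io
from typing import NamedTuple


class CountRecord(NamedTuple):
    sample: str
    reads: int
    library: str  # "SE" or "Paired"


def parse_count_output(text):
    """Parse tab-separated count output into CountRecord tuples."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    records = []
    for row in reader:
        if len(row) != 3:
            continue  # skip blank lines and anything that is not a data row
        sample, reads, library = row
        records.append(CountRecord(sample, int(reads), library))
    return records


example = "data/test.fastq\t3\tSE\ndata/illumina_1.fq.gz\t7\tPaired\n"
for record in parse_count_output(example):
    print(record.sample, record.reads, record.library)
```

Rows that do not have exactly three fields are skipped, so stray message lines mixed into the captured text do not break the parser.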
|
||
If an error occurs (for example, the two files of a pair have different counts), a message is printed: | ||
```text | ||
ERROR: Different counts in data/longerone_R1.fq.gz and data/longerone_R2.fq.gz | ||
# data/longerone_R1.fq.gz: 7 | ||
# data/longerone_R2.fq.gz: 2 | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,41 @@ | ||
--- | ||
sort: 4 | ||
--- | ||
# seqfu stats | ||
|
||
*stats* is one of the core subprograms of *SeqFu*. | ||
|
||
```text | ||
Usage: stats [options] [<inputfile> ...] | ||
Options: | ||
-a, --abs-path Print absolute paths | ||
-b, --basename Print only filenames | ||
-n, --nice Print nice terminal table | ||
--csv Separate with commas (default: tabs) | ||
-v, --verbose Verbose output | ||
-h, --help Show this help | ||
``` | ||
|
||
|
||
### Example output | ||
|
||
Output is TSV text with nine columns (or CSV when using `--csv`): | ||
```text | ||
File,#Seq,Sum,Avg,N50,N75,N90,Min,Max | ||
data/filt.fa.gz,78730,24299931,308.6,316,316,220,180,485 | ||
``` | ||
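
The N50, N75, and N90 columns follow the usual Nx idea: the length L such that sequences of length L or longer cover at least x% of the total bases. The Python sketch below shows that common definition; whether SeqFu handles ties and rounding in exactly the same way is an assumption, and the function name `nx` and the sample lengths are hypothetical.

```python
def nx(lengths, fraction):
    """Return the length L such that sequences of length >= L
    cover at least `fraction` of the total bases."""
    if not lengths:
        raise ValueError("no sequence lengths given")
    threshold = sum(lengths) * fraction
    running = 0
    # Walk from the longest sequence down, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= threshold:
            return length


lengths = [2, 3, 4, 5, 6, 10]  # hypothetical sequence lengths, 30 bp in total
print(nx(lengths, 0.50))
print(nx(lengths, 0.75))
print(nx(lengths, 0.90))
```

With this definition the values can only stay equal or decrease from N50 to N90, which matches the example row above (316, 316, 220).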
|
||
### Screen friendly output | ||
|
||
With `-n` (`--nice`), the output is printed as a table formatted for the terminal: | ||
|
||
```text | ||
seqfu stats data/filt.fa.gz -n | ||
┌─────────────────┬───────┬──────────┬───────┬─────┬─────┬─────┬─────┬─────┐ | ||
│ File │ #Seq │ Total bp │ Avg │ N50 │ N75 │ N90 │ Min │ Max │ | ||
├─────────────────┼───────┼──────────┼───────┼─────┼─────┼─────┼─────┼─────┤ | ||
│ data/filt.fa.gz │ 78730 │ 24299931 │ 308.6 │ 316 │ 316 │ 220 │ 180 │ 485 │ | ||
└─────────────────┴───────┴──────────┴───────┴─────┴─────┴─────┴─────┴─────┘ | ||
``` | ||
|