Merge pull request #13 from COMBINE-lab/dev
merge dev into main
rob-p authored Jan 24, 2025
2 parents 8ed7edc + f68bcf0 commit 2f7b913
Showing 4 changed files with 170 additions and 861 deletions.
50 changes: 25 additions & 25 deletions Cargo.toml
@@ -1,11 +1,8 @@
 [package]
 name = "roers"
-version = "0.3.0"
+version = "0.4.0"
 edition = "2021"
-authors = [
-    "Dongze He <[email protected]>",
-    "Rob Patro <[email protected]>",
-]
+authors = ["Dongze He <[email protected]>", "Rob Patro <[email protected]>"]
 description = "A tool to prepare augmented annotations for single-cell RNA-seq analysis."
 license-file = "LICENSE"
 readme = "README.md"
@@ -19,13 +16,7 @@ include = [
     "/README.md",
     "/LICENSE",
 ]
-keywords = [
-    "genomics",
-    "GTF-GFF",
-    "splici",
-    "scRNA-seq",
-    "augmented-reference",
-]
+keywords = ["genomics", "GTF-GFF", "splici", "scRNA-seq", "augmented-reference"]
 categories = ["science", "data-structures"]

 # See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
@@ -38,16 +29,25 @@ name = "roers"
 path = "src/lib/lib.rs"

 [dependencies]
-anyhow = "1.0.75"
-clap = { version = "4.4.7", features = ["derive", "wrap_help","cargo"] }
-grangers = { git = "https://github.com/COMBINE-lab/grangers.git", branch="main", version = "0.4.0" }
-polars = { version = "0.34.2", features = ["lazy","dataframe_arithmetic","sort_multiple", "checked_arithmetic","rows","dtype-struct", "dtype-categorical", "list_eval","concat_str", "strings"]}
-peak_alloc = "0.2.0"
-tracing = "0.1.40"
-tracing-subscriber = { version = "0.3.17", features = ["env-filter"] }
-noodles = { version = "0.56.0", features = ["gtf","gff","fasta", "core"] }
-serde = {version = "1.0.190", features = ["derive"]}
-serde_json = "1.0.107"
-itertools = "0.11.0"
-oomfi = "0.1.2"
-xxhash-rust = { version = "0.8.7", features = ["xxh3"] }
+anyhow = "1.0.95"
+clap = { version = "4.5.27", features = ["derive", "wrap_help", "cargo"] }
+grangers = { git = "https://github.com/COMBINE-lab/grangers.git", branch = "dev", version = "0.5.0" }
+polars = { version = "0.45.1", features = [
+    "lazy",
+    "dataframe_arithmetic",
+    "checked_arithmetic",
+    "rows",
+    "dtype-struct",
+    "dtype-categorical",
+    "list_eval",
+    "concat_str",
+    "strings",
+] }
+peak_alloc = "0.2.1"
+tracing = "0.1.41"
+tracing-subscriber = { version = "0.3.19", features = ["env-filter"] }
+noodles = { version = "0.90.0", features = ["gtf", "gff", "fasta", "core"] }
+serde = { version = "1.0.217", features = ["derive"] }
+serde_json = "1.0.137"
+itertools = "0.14.0"
+xxhash-rust = { version = "0.8.15", features = ["xxh3"] }
49 changes: 49 additions & 0 deletions README.md
@@ -1,3 +1,52 @@
# roers

A Rust library for preparing expanded transcriptome references for quantification with [`alevin-fry`](https://alevin-fry.readthedocs.io/en/latest/).

To use `roers` outside of simpleaf, refer to the following help message:

```bash
build the (expanded) reference index

Usage: roers make-ref [OPTIONS] <GENOME> <GENES> <OUT_DIR>

Arguments:
  <GENOME>   The path to a genome fasta file
  <GENES>    The path to a gene annotation gtf/gff3 file
  <OUT_DIR>  The path to the output directory (will be created if it doesn't exist)

Options:
  -a, --aug-type <AUG_TYPE>
          Comma separated types of augmented sequences to include in the output FASTA file on
          top of spliced transcripts. Available options are `intronic` (or `i` for short),
          `gene-body` (or `g`), and `transcript-body` (or `t`)
      --dedup
          Indicates whether identical sequences will be deduplicated
  -p, --filename-prefix <FILENAME_PREFIX>
          The file name prefix of the generated output files [default: roers_ref]
      --no-transcript
          A flag of not including spliced transcripts in the output FASTA file. (usually there
          should be a good reason to do so)
      --gff3
          Denotes that the input annotation is a GFF3 (instead of GTF) file
  -h, --help
          Print help
  -V, --version
          Print version

Intronic Sequence Options:
  -r, --read-length <READ_LENGTH>
          The read length of the single-cell experiment being processed (determines flank size)
          [default: 91]
      --flank-trim-length <FLANK_TRIM_LENGTH>
          Determines the length of sequence subtracted from the read length to obtain the flank
          length [default: 5]
      --no-flanking-merge
          Indicates whether flank lengths will be considered when merging introns

Extra Spliced Sequence File:
      --extra-spliced <EXTRA_SPLICED>  The path to an extra spliced sequence fasta file

Extra Unspliced Sequence File:
      --extra-unspliced <EXTRA_UNSPLICED>  The path to an extra unspliced sequence fasta file
```
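
The help message says the flank length is obtained by subtracting `--flank-trim-length` from `--read-length`. A minimal sketch of that arithmetic follows. The function name `flank_length` is hypothetical and is not part of the `roers` API, and returning `None` when the trim length exceeds the read length is an assumption, not documented `roers` behavior.

```rust
/// Hypothetical helper (not part of the roers API): derives the flank length
/// from the two values described in the help message, i.e.
/// `read_length - flank_trim_length`.
///
/// Assumption: a trim length larger than the read length is treated as
/// invalid and yields `None`; how roers itself handles this case is not
/// stated in the help text.
pub fn flank_length(read_length: u32, flank_trim_length: u32) -> Option<u32> {
    // `checked_sub` avoids an unsigned underflow when the trim is too large.
    read_length.checked_sub(flank_trim_length)
}
```

With the documented defaults (`--read-length 91`, `--flank-trim-length 5`), `flank_length(91, 5)` gives a flank of 86 bases.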