TENNIS is an evolution-based model to predict unannotated isoforms and refine existing transcriptome annotations without requiring additional data.
The easiest way to install TENNIS is with pip.
pip install tennis-transcriptome
- Python >= 3.7
- PySAT
The only dependency of TENNIS is PySAT, which can be installed with pip. Manual compilation of PySAT is described in the PySAT GitHub repository.
Alternatively, TENNIS can be installed manually by cloning this repository.
# install PySAT
pip install 'python-sat[aiger,approxmc,cryptosat,pblib]'
# install TENNIS
git clone https://github.com/Shao-Group/TENNIS
cd TENNIS
chmod +x src/tennis
This repository also redistributes a modified copy of GTF.py (retrieved from here), originally developed by Kamil Slowikowski. Users do not need to download it separately.
# display help message
./src/tennis -h
# run TENNIS on an example dataset
mkdir test
cd test
../src/tennis -o tennis_example ../example/example.gtf
If installed with conda or pip, the tennis executable should be ready to use in $PATH.
If installed manually, the tennis executable is in the ./src/ directory.
tennis [options] -o <output_prefix> <gtf_file>
The program outputs two files: output_prefix.stats and output_prefix.pred.gtf.
More about the output format is available here.
gtf_file: str
Input GTF file in standard format containing transcript annotations.
-h, --help
-o, --output_prefix: str
Prefix of the two output files.
Default: "tennis"
-p, --PctIn_threshold: float
A threshold in the range [0, 1]. Predicted isoforms with a PctIn value lower than this threshold are filtered out. With -p 0.0, all isoforms are retained.
Default: 0.5
-x, --exclude_group_size: int
Skip analysis of transcript groups that have more isoforms than this threshold.
Default: 100
-m, --max_novel_isoform: int
Maximum number of novel isoforms to predict per transcript group.
Default: 4
--time_out: int
Time limit in seconds for each SAT solver instance.
Default: 900 (15 minutes)
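For scripted or batch runs, the options above can be assembled programmatically. The following is a minimal sketch, not part of TENNIS itself: the helper names `build_tennis_command` and `expected_outputs` are hypothetical, and only the flags and defaults documented above are used.

```python
def build_tennis_command(gtf_file, output_prefix="tennis", pctin_threshold=0.5,
                         exclude_group_size=100, max_novel_isoform=4,
                         time_out=900, executable="tennis"):
    """Return the argument list for one TENNIS run (hypothetical helper)."""
    # The README documents the PctIn threshold as a value in [0, 1].
    if not 0.0 <= pctin_threshold <= 1.0:
        raise ValueError("PctIn threshold must be in the range [0, 1]")
    return [
        executable,
        "-o", str(output_prefix),
        "-p", str(pctin_threshold),
        "-x", str(exclude_group_size),
        "-m", str(max_novel_isoform),
        "--time_out", str(time_out),
        str(gtf_file),
    ]


def expected_outputs(output_prefix="tennis"):
    """Names of the two files TENNIS writes for a given prefix."""
    return (f"{output_prefix}.stats", f"{output_prefix}.pred.gtf")


# Keep every predicted isoform (-p 0.0) for the bundled example dataset.
cmd = build_tennis_command("example/example.gtf",
                           output_prefix="tennis_example",
                           pctin_threshold=0.0)
print(" ".join(cmd))
print(expected_outputs("tennis_example"))

# To actually launch the run (requires TENNIS to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

For a manual installation, pass `executable="./src/tennis"` (or the appropriate relative path) instead of relying on $PATH.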
output_prefix.stats:
Statistical summary. T1 (T2, T3, ...) is the collection of transcript groups that need 1 (2, 3, ...) novel isoforms to satisfy the evolution model.
output_prefix.pred.gtf:
GTF format file with predicted novel isoforms.
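Because the predictions are written in GTF format, they can be inspected with a few lines of Python. The sketch below assumes only the generic GTF layout (nine tab-separated columns, with `gene_id` and `transcript_id` in the attribute column); the exact attributes and feature types that TENNIS writes are not described here, and the sample records are made up for illustration.

```python
import re
from collections import defaultdict

# Matches attribute pairs such as: gene_id "g1";
ATTRIBUTE = re.compile(r'(\w+)\s+"([^"]*)"')


def transcripts_per_gene(gtf_lines):
    """Map each gene_id to the set of transcript_id values seen in GTF lines."""
    groups = defaultdict(set)
    for line in gtf_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        attributes = dict(ATTRIBUTE.findall(fields[8]))
        gene_id = attributes.get("gene_id")
        transcript_id = attributes.get("transcript_id")
        if gene_id and transcript_id:
            groups[gene_id].add(transcript_id)
    return dict(groups)


# Made-up records for illustration only; not real TENNIS output.
sample = [
    "# example header",
    'chr1\tsrc\ttranscript\t100\t900\t.\t+\t.\tgene_id "g1"; transcript_id "g1.novel.1";',
    'chr1\tsrc\texon\t100\t300\t.\t+\t.\tgene_id "g1"; transcript_id "g1.novel.1";',
    'chr1\tsrc\ttranscript\t100\t900\t.\t+\t.\tgene_id "g1"; transcript_id "g1.novel.2";',
    'chr2\tsrc\ttranscript\t50\t400\t.\t-\t.\tgene_id "g2"; transcript_id "g2.novel.1";',
]

for gene, transcripts in sorted(transcripts_per_gene(sample).items()):
    print(gene, len(transcripts))

# With a real prediction file:
# with open("tennis_example.pred.gtf") as handle:
#     groups = transcripts_per_gene(handle)
```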
For bug reports or feature requests, please open an issue on the GitHub repository here.
TENNIS is freely available under the BSD 3-Clause License.
Copyright (c) 2024, Xiaofei Carl Zang, Ke Chen, Mingfu Shao, and The Pennsylvania State University.
The preprint of TENNIS is available on bioRxiv here.
@article {TENNIS,
author = {Zang, Xiaofei Carl and Chen, Ke and Khan, Irtesam Mahmud and Shao, Mingfu},
title = {Augmenting Transcriptome Annotations through the Lens of Splicing Evolution},
year = {2024},
doi = {10.1101/2024.11.04.621892},
journal = {bioRxiv}
}