Merge Bakta refactor into Dev #102

Closed
wants to merge 76 commits into from
76 commits
69d7bdd
parameter for running the bakta annotation and the docker file for this
ankushkgupta2 Aug 10, 2023
3455f49
entrypoint for running bakta annotation added
ankushkgupta2 Aug 10, 2023
52174fa
checks for the bakta annotation parameter, docker parameter for this …
ankushkgupta2 Aug 10, 2023
92604ea
same parameters within test_parameters set added with default params
ankushkgupta2 Aug 10, 2023
52519cd
module/process for calling BAKTA annotation directly created
ankushkgupta2 Aug 10, 2023
39df7db
module/process for calling BAKTA annotation directly created
ankushkgupta2 Aug 10, 2023
cdec147
entrypoint subworkflow for BAKTA created
ankushkgupta2 Aug 10, 2023
2be515c
latest changes to gitignore and some changes for wget/yum aka downloa…
ankushkgupta2 Aug 25, 2023
e227920
new diphtheria data to test with BAKTA
ankushkgupta2 Aug 25, 2023
d43af05
relevant variola files - references fasta/gff + sequence fasta file
Sep 7, 2023
1e8e307
bakta integration
Sep 18, 2023
2a79916
Merge branch '48-internal-bakta' into master
Swarnali3 Sep 19, 2023
8e8c63a
Merge pull request #74 from Swarnali3/master
Swarnali3 Sep 19, 2023
9768d5e
test:changes to bakta modules
Sep 19, 2023
de7c2f3
changes for metadata validation
Sep 20, 2023
e41e61b
test_params.config modified
Sep 21, 2023
af40a25
splitting multi-fasta file upstream bakta
Oct 3, 2023
ee7afdb
splitting test config file into test_virus_params and test_bacteria_p…
Oct 11, 2023
39cab38
bakta re-programmed to run with single fasta file
Oct 16, 2023
cd1e756
changes made to module validate params to run bakta using bacterial t…
Oct 17, 2023
2e6f55e
gff post-transformation and integration in script
Oct 30, 2023
ab7a80d
change output gff3 file to gff file
Nov 2, 2023
a414969
changed the name of the workflow to main.nf vs mpxv.nf before and mad…
ankushkgupta2 Nov 6, 2023
24b70f7
adding comments to python script for bakta
Nov 10, 2023
7deea8e
adding module for repeatmasker and connecting to module for liftoff
Nov 14, 2023
8cba15f
condition for variola or mpox specific repeat library added
Nov 17, 2023
1780996
added liftoff cli and concat modules; implemented new python script t…
Nov 17, 2023
8391e97
added more instructions to config files
Nov 17, 2023
a834240
changed the name of the file from mpxv.nf to main.nf + moved out the …
ankushkgupta2 Dec 6, 2023
a1f4357
separate module for params help print out under general utility direc…
ankushkgupta2 Dec 6, 2023
c116867
manually adding bc scicomp wont push
kyleoconnell Dec 6, 2023
a3e6034
Add files via upload
kyleoconnell Dec 6, 2023
8076972
Add files via upload
kyleoconnell Dec 6, 2023
7d3a9b1
updated docker container for pandas on concat_gff
Dec 6, 2023
cbd3314
Merge pull request #91 from CDCgov/65-external-feature-running-variol…
kyleoconnell Dec 6, 2023
15bbff8
updated repeatmasker
Dec 6, 2023
74b25e9
Merge pull request #84 from CDCgov/65-external-feature-running-variol…
kyleoconnell Dec 6, 2023
de4405c
removed .ipynb checkpoints
Dec 6, 2023
48e30b0
removed ipynb checkpoints;
Dec 6, 2023
b683160
updated bakta download module
Dec 8, 2023
312df46
updated bakta workflow
Dec 8, 2023
cbbbb09
added new bakta functionality
Dec 8, 2023
6318a46
added two new parameters within metadata validation python script: cu…
ankushkgupta2 Sep 20, 2023
81b8f45
added new parameter: samples_to_apply_custom, which is a list type an…
ankushkgupta2 Sep 20, 2023
3a7132e
latest code for metadata validation dynamic fields + new / modified i…
ankushkgupta2 Oct 6, 2023
6b7f937
changes to the sample metadata sheet continaing the custom fields, fo…
ankushkgupta2 Oct 20, 2023
9b03e87
refined / refactored some of the code + fixed issues with passing cer…
ankushkgupta2 Oct 20, 2023
9b0eb34
placed validation_outputs directory output within the base dir, aka /…
ankushkgupta2 Oct 20, 2023
bd93790
created a separate directory for custom meta fields related files, in…
ankushkgupta2 Dec 11, 2023
47b6ded
general refactoring + based on tests, made changes to handle if a col…
ankushkgupta2 Dec 11, 2023
3eb3484
moved files to separate directory
ankushkgupta2 Dec 11, 2023
bf3b322
updated the parameter sets for standard and test to include validate_…
ankushkgupta2 Dec 11, 2023
0ccc472
added a default path for the custom fields file
ankushkgupta2 Dec 11, 2023
5731ab0
added new nextflow params for dynamic metadata fields to remaining pa…
ankushkgupta2 Dec 11, 2023
676dd41
merged latest variola/repeat masker updates that were pushed into dev
ankushkgupta2 Dec 11, 2023
35f24a6
old information / commits added back to file
ankushkgupta2 Dec 11, 2023
d774c84
Merge pull request #88 from CDCgov/77-internal-move-help-message
ankushkgupta2 Dec 11, 2023
2d15096
Merge remote-tracking branch 'origin/dev' into 58-internal-dynamic-me…
ankushkgupta2 Dec 11, 2023
3eca13b
removed lingering piece of code for checking outputs
ankushkgupta2 Dec 11, 2023
33e1dec
had the wrong file name for the default path value for the custom_fie…
ankushkgupta2 Dec 11, 2023
231a2e5
removed uncessary explicit listing of .nextflow.log files and just us…
ankushkgupta2 Dec 11, 2023
f52f276
under some comments, put some encoding information, might potentially…
ankushkgupta2 Dec 11, 2023
7970afd
changed the default path for custom_fields_file to the correct one, w…
ankushkgupta2 Dec 11, 2023
895acba
added parameters for custom fields
ankushkgupta2 Dec 11, 2023
f7db7d1
latest changes
ankushkgupta2 Dec 11, 2023
867bd62
refactoring bakta
Dec 11, 2023
e2204a7
Merge pull request #92 from CDCgov/58-internal-dynamic-metadata
kyleoconnell Dec 11, 2023
ae9b4ff
added samplesheet
Dec 11, 2023
aac7eb1
added input channel for bakta db
Dec 11, 2023
760154f
Merge branch 'dev' of github.com:CDCgov/tostadas into 48-internal-bakta
kyleoconnell Dec 11, 2023
28f6916
Merge pull request #93 from CDCgov/48-internal-bakta
kyleoconnell Dec 11, 2023
4a4603e
patched bakta entrypoint
kyleoconnell Dec 11, 2023
f8d1162
Merge branch 'dev' into bakta_refactor_kao
kyleoconnell Dec 12, 2023
e6e86e3
Merge pull request #94 from CDCgov/bakta_refactor_kao
ankushkgupta2 Dec 13, 2023
de7ef72
fixed formatting error
kyleoconnell Dec 15, 2023
5b7d470
syncing local changes
kyleoconnell Jan 3, 2024
57 changes: 16 additions & 41 deletions .gitignore
@@ -11,6 +11,7 @@
/vcs.xml
/webServers.xml
/workspace.xml
/assets/tostadas.code-workspace

# Editor-based HTTP Client requests
/httpRequests/
@@ -22,41 +23,19 @@

# nextflow
/.nextflow/
/.nextflow
/.nextflow.log
/.nextflow.log.1
/.nextflow.log.2
/.nextflow.log.3
/.nextflow.log.4
/.nextflow.log.5
/.nextflow.log.6
/.nextflow.log.7
/.nextflow.log.8
/.nextflow.log*
/nf_test_results
/tests/work
/var_work/
/var_work
app/environment.yml
/tests/.nextflow
/tests/.nextflow
/tests/.nextflow.log
/tests/.nextflow.log.1
/tests/.nextflow.log.2
/tests/.nextflow.log.3
/tests/.nextflow.log.4
/tests/.nextflow.log.5
/tests/.nextflow.log.6
/tests/.nextflow.log.7
/tests/.nextflow.log.8
/tests/.nextflow.log*
/submission_scripts/upload_log.csv

/bin/.nextflow
/bin/.nextflow.log
/bin/.nextflow.log.1
/bin/.nextflow.log.2
/bin/.nextflow.log.3
/bin/.nextflow.log.4
/bin/.nextflow.log.5
/bin/.nextflow.log.6
/bin/.nextflow.log.7
/bin/.nextflow.log.8
/bin/.nextflow.log*
/bin/upload_log.csv

/submission_scripts/__pycache__/
@@ -77,6 +56,12 @@ tests/__pycache__
/bin/vadr_outputs
/bin/vadr_outputs/

# bakta related
/assets
/assets/bakta_database
/assets/bakta_database/db_light
/assets/bakta_database/db_light/

# add singularity related stuff
/app/singularity/containers/

@@ -91,18 +76,7 @@ tests/__pycache__
/submission_scripts/config_files

# add aspen related stuff
/aspen/.nextflow.log.1
/aspen/.nextflow.log.2
/aspen/.nextflow.log.3
/aspen/.nextflow.log.4
/aspen/.nextflow.log.5
/aspen/.nextflow.log.6
/aspen/.nextflow.log.7
/aspen/.nextflow.log.8
/aspen/.nextflow.log.9
/aspen/.nextflow.log.10
/aspen/.nextflow.log.11
/aspen/.nextflow.log.12
/aspen/.nextflow.log*

/tests/nf_test_results

@@ -127,4 +101,5 @@ bin/.DS_Store
/test_submission

# Testing related
/test_main_workflow
/test_main_workflow
/validation_outputs/
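
The .gitignore hunks above replace the enumerated `.nextflow.log.N` entries with a single `.nextflow.log*` wildcard per directory. A minimal sketch of why the wildcard covers every removed entry, assuming Python's `fnmatch` as a stand-in for gitignore matching (it only approximates gitignore semantics, which is adequate for a single path component with `*`); the helper name `unmatched` is hypothetical:

```python
from fnmatch import fnmatchcase

# Entries the diff removes: the base log plus rotated logs .1 through .8.
enumerated = [".nextflow.log"] + [f".nextflow.log.{i}" for i in range(1, 9)]

# The single wildcard entry that replaces them.
pattern = ".nextflow.log*"


def unmatched(names, pattern):
    """Return the names that the wildcard pattern does NOT match."""
    return [name for name in names if not fnmatchcase(name, pattern)]


# Every previously enumerated file is still ignored.
missed = unmatched(enumerated, pattern)

# Rotated logs beyond the old explicit list are now covered as well.
covers_higher_rotation = fnmatchcase(".nextflow.log.12", pattern)

# The wildcard does not swallow the separate ".nextflow" entry.
covers_cache_dir = fnmatchcase(".nextflow", pattern)

print(missed, covers_higher_rotation, covers_cache_dir)
```

The separate `/.nextflow` and `/.nextflow/` entries are still needed because the wildcard only extends names that begin with `.nextflow.log`.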
15 changes: 10 additions & 5 deletions app/singularity/singularity_docker_boot.def
@@ -6,15 +6,20 @@ From: continuumio/miniconda3

%post
# install mamba and create .yml file
/opt/conda/bin/conda install -c conda-forge mamba
/opt/conda/bin/mamba env create -f environment.yml
wget https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-$(uname)-$(uname -m).sh
bash Mambaforge-$(uname)-$(uname -m).sh -b -p $HOME/mambaforge
export PATH="$HOME/mambaforge/bin:$PATH"
# source $HOME/mambaforge/etc/profile.d/conda.sh
# conda activate
mamba env create -f environment.yml

# change the singularity environment variable to the conda env (start env by default)
echo ". /opt/conda/etc/profile.d/conda.sh" >> $SINGULARITY_ENVIRONMENT
echo "conda activate tostadas" >> $SINGULARITY_ENVIRONMENT
# echo ". /opt/conda/etc/profile.d/conda.sh" >> $SINGULARITY_ENVIRONMENT
# echo "conda activate tostadas" >> $SINGULARITY_ENVIRONMENT

# check if it worked
. /opt/conda/etc/profile.d/conda.sh
. $HOME/mambaforge/etc/profile.d/conda.sh
conda activate
conda activate tostadas
printf "\n\n******** LIST OF PACKAGES IN TOSTADAS ENV ********\n\n"
conda list
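
The `wget` line in this hunk builds the installer name from `$(uname)` and `$(uname -m)`. A small sketch of how that URL resolves, assuming `platform.system()` and `platform.machine()` return the same strings as the two `uname` calls (true on typical Linux hosts); the function name `mambaforge_installer_url` is hypothetical and not part of the repository:

```python
import platform

# Base path copied from the wget line in the definition file.
BASE = "https://github.com/conda-forge/miniforge/releases/latest/download"


def mambaforge_installer_url(system=None, machine=None):
    """Build the installer URL the same way the shell substitution does."""
    system = system or platform.system()      # counterpart of $(uname)
    machine = machine or platform.machine()   # counterpart of $(uname -m)
    return f"{BASE}/Mambaforge-{system}-{machine}.sh"


# Explicit values for a typical x86_64 Linux build host.
url = mambaforge_installer_url("Linux", "x86_64")
print(url)
```

Whether an installer is actually published under the resulting name depends on the miniforge release, so a build script may want to check the download before running it.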
Binary file added assets/Diphtheria_test_1.xlsx
Binary file not shown.
2 changes: 2 additions & 0 deletions assets/VARV_RZ10_3587.fasta

Large diffs are not rendered by default.

13,663 changes: 13,663 additions & 0 deletions assets/VARVs_fixed.fasta

Large diffs are not rendered by default.

Binary file not shown.
20 changes: 20 additions & 0 deletions assets/custom_meta_fields/example_custom_fields.json
@@ -0,0 +1,20 @@
{
    "test_field_1": {
        "type": "String ",
        "samples": ["Fl0004", "IL0005", "FL0015", "FL00234", 8],
        "replace_empty_with": "not populated",
        "new_field_name": "new_field_name"
    },
    "test_field_2": {
        "type": "float",
        "samples": ["Fl0004"],
        "replace_empty_with": "",
        "new_field_name": "new_field_name2"
    },
    "test_field_3": {
        "type": "Boolean",
        "samples": ["All ", "any random sample name"],
        "replace_empty_with": "",
        "new_field_name": ""
    }
}
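
The example file deliberately contains untidy values: a type with trailing whitespace (`"String "`), an integer among the sample names (`8`), `"All "` with a trailing space, and an empty `new_field_name`. The sketch below shows one way a validator could normalize such entries. It is not the repository's metadata validation script; the alias table, the reading of `"All"` as "every sample", and the fallback to the original field name are all assumptions made for illustration.

```python
import json

# Two of the three fields from the example file, copied verbatim.
EXAMPLE = """
{
    "test_field_1": {
        "type": "String ",
        "samples": ["Fl0004", "IL0005", "FL0015", "FL00234", 8],
        "replace_empty_with": "not populated",
        "new_field_name": "new_field_name"
    },
    "test_field_3": {
        "type": "Boolean",
        "samples": ["All ", "any random sample name"],
        "replace_empty_with": "",
        "new_field_name": ""
    }
}
"""

# Hypothetical alias table; the real validator may accept other spellings.
TYPE_ALIASES = {
    "string": str, "str": str,
    "float": float,
    "integer": int, "int": int,
    "boolean": bool, "bool": bool,
}


def normalize_field(name, spec):
    """Return a cleaned copy of one custom-field definition."""
    type_key = str(spec.get("type", "")).strip().lower()
    if type_key not in TYPE_ALIASES:
        raise ValueError(f"{name}: unsupported type {spec.get('type')!r}")
    # Coerce every sample identifier to a stripped string (handles the bare 8).
    samples = [str(s).strip() for s in spec.get("samples", [])]
    # Assumption: an "All" entry means the field applies to every sample.
    applies_to_all = any(s.lower() == "all" for s in samples)
    return {
        "type": TYPE_ALIASES[type_key],
        "samples": None if applies_to_all else samples,
        "replace_empty_with": spec.get("replace_empty_with", ""),
        # Assumption: an empty new name keeps the original column name.
        "new_field_name": spec.get("new_field_name") or name,
    }


fields = {n: normalize_field(n, s) for n, s in json.loads(EXAMPLE).items()}
print(fields["test_field_1"]["samples"])
print(fields["test_field_3"]["new_field_name"])
```

Returning `None` for the sample list keeps "applies to all" distinct from an empty list, which would otherwise read as "applies to none".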
34,091 changes: 34,091 additions & 0 deletions assets/diphtheria.fasta

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions assets/feature_types.txt
@@ -0,0 +1 @@
misc_feature
4 changes: 4 additions & 0 deletions assets/lib/MPOX_repeats_lib.fasta
@@ -0,0 +1,4 @@
>MPOX_ITR1#UNKOWN
GATAGATCGAAAAAAGCCACTAATAAAAACAGTTGGTTAATTACTTCAGGCTTTAGACTACAAAAATGGTTCGATAGCGAAGATTGTATAATTTATCTCAGATCTTTAGTTAGAAGAATGGAAGACAGTAACAAAAACAGTAAAAAAACTTAGTACTTAGATATCGAAAAAATATATTTTTGTAGACTCTTGAGAATAGAAGGAAAACATGTACATAATTATAAAAAATGAAAATCAATGGCGAATAAGACAGTGCGATTCGCACCATGGAGTCGGTAGATTTCATGGCTGTCGATGAGCAGTTTCACGACGACCTCGATCTTTGGTCATTATCTTTGGTAGATGATTATAAAAAACATGGATTAGGTGTTGACTGTTATGTTCTAGAACCAGTTGTTGACAGGAAAATATTTGATAGATTTCTCCTTGAACCAATTTGTGATCCTGTAGATGTTCTGTATGATTATTTTAGGATTCATAGAGATAATATTGATCAGTATATAGTAGATAGACTGTTTGCATATATTACATATAAAGATATTATATCTGCATTAGTGTCAAAGAATTATATGGAAGATATTTTCTCTATAATTATTAAGAATTGTAATTCTGTGCAAGATCTCTTACTTTACTATCTATCTAATGCATATGTAGAAATAGACATTGTTGATCTTATGGTAGATCATGGGGCTGTAATATATAAAATAGAATGCTTGAATGCCTATTTTAGGGGAATATGTAAAAAGGAAAGTAGTGTTGTTGAGTTTATTTTGAATTGTGGTATCCCAGATGAAAATGATGTTAAATTAGATCTATATAAAATAATTCAGTATACTAGGGGATTCCTTGTAGATGAACCCACAGTATTAGAAATTTATAAGCTTTGTATCCCATATATTGAAGATATCAATCAACTAGATGCTGGTGGAAGGACCTTGCTTTATCGCGCTATCTATGCAGGTTATATAGATTTAGTATCATGGCTATTAGAAAATGGAGCAAATGTCAACGCAGTAATGAGTAATGGATATACATGTCTTGACGTGGCCGTGGATAGGGGATCTGTCATCGCCCGTAGGGAAGCACATCTTAAAATATTAGAAATATTGCTTAGAGAACCATTGTCTATTGACTGTATAAAATTAGCTATACTTAATAATACAATTGAAAACCATGATGTGATAAAGCTCTGTATCAAGTATTTTATGATGGTAGATTATTCACTTTGTAATGTGTATGCATCATCACTCTTTGATTATATAATTGATTGTAAACAAGAATTGGAGTACATTAGGCAGATGAAAATTCATAATACAACCATGTATGAGTTAATCTATAATAGAGACAAAAACAAGCATGCTTCCCATATTCTACATAGGTATTCTAAACATCCAGTTTTGACACAGTGTATCACTAAAGGATTCAAGATTTACACAGAAGTAACCGAGCAGGTCACTAAAGCTCTAAACAGACGTGCTCTAATAGATGAGATAATAAACAATGTATCAACTGATGACAATCTCCTATCAAAACTTCCATTAGAAATTAGGGATCTAATTGTTTCACAAGCTGTCATATAGAGTTCTATCCACCCACCTTTCTTGAAATGAGTTAATAGTCATAAGTTAGTTAAGTCATAAGTTAGTTAAGTCATAAGTTAGTTAAGTCATAAGTTAGTTAAGTCATAAGTTAGTTAAGTCATAAGTTAGTTAAGTCATAAGTTAGTTAAGTCATAAGTTAGTTTATAGTCTAACACTTCTAATTTTTATACCTTGATCTTTTTCTCTAATTATGAAAAAGTAAATCATTATGAAGATGGATGAAATGGACGAGATTGTGCGCATCGTTAACGATAGTATGTGGTACGTACCTAACGCATTTATGGACGACGGTGATAATGAAGGTCACATTTCTGTCAATAATGTCTGTCATATGTATCTCGCATTCTTTGATGTGGATATATCATCTCATCTGTTTAAATTAGTTATTAAACACTGCGA
TCTGAATAAACGACTAAAATGTGGTAACTCTCCATTACATTGCTATACGATGAATACACGATTTAATCCATCTGTATTAAAGATATTGTTACGCCACGGCATGCGTAACTTTGATAGCAAGGATAAAAAAGGACATATTCCTCTACACCACTATCTGATTCATTCACTATCAATCGATAACAAGATCTTTGATATACTAACGGACCCCATTGATGACTTTAGTAAATCATCCGATCTATTGCTGTGTTATCTTAGATATAAATTCAATGGGAGCTTAAACTATTACGTTCTGTACAAATTATTGACTAAAGGATCTGACCCTAATTGCGTCGATGAGGATGGACTCACTTCTCTTCATTACTACTGTAAACACATATCCGCGTTCCACGAAAGCAATTATTACAAGTCAAAGAGTCACACTAAGATGCGAGCTGAGAAGCGATTCATCTACGCGATAATAGATCATGGAGCAAACATTAACGCGGTTACGAAAATCGGAAATACGCCGTTACACACTTACCTTCAACAGTATACCAAACATAGTCCTCGTGTGGTGTATGCTCTTTTATCTCGAGGAGCCGATACGAGGATACGTAATAATCTTGATTGTACACCCATCATGGAATACATAAAGAACGATTGTGCAACAGGTCATATTCTCATAATGTTACTCAATTGGCACGAACAAAAATACGGGAAATTACAAAAGGAAGAAGGACAACATCTACTTTATCTATTCATAAAACATAATCAAGGATATGGAAGTCGCTCTCTCAATATACTACGGTATCTACTAGATAGATTCGACATTCAGAAAGACGAATACTATAATACAATGACTCCTCTTCATACCGCCTTCCAGAATTGCAATAACAATGTTGCCTCATACCTCGTATACATCGGATACGACATCAACCTTCCGACTAAAGACGATAAGACAGTATTCGACTTGGTGTTTGAAAACAGAAACATTATATACAAGGCGGATGTCGTTAATGACATTATCCACCACAGACTGAAAGTATCTCTACCTATGATTAAATCGTTGTTCTACAAGATGTCGGAGTTCTCTCCCTACGACGATCACTACGTAAAGAAGATAATAGCCTACTGCCTATTAAGGGACGAGTCATTTGCGGAACTACATACTAAATTCTGTTTAAACGAGGACTATAAAAGTGTATTTATGAAAAATATATCATTCGATAAGATAGATTCCATCATCGAAAAATGTAGTCGTGACATAAGTCTCCTCAAAGAGATTCGAATCTCAGACACCGACTTGTATACGGTATTGAGAACAGAAGACATCCGGTATCACACATATCTCGAAGCCATACATTCAGACAAACGCATTTCATTTCCCATGTACGACGATCTCATAGAACAGTGTCATCTATCGATGGAGCATAAAAGTAAACTCGTCGACAAAGCACTCAATAAATTAGAGTCTACCATCGATAGTCAATCTAGACTATCGTATTTGCCTCCGGAAATTATGCGCAATATCATAACCAAGCTAAGCGACTACCATCTAAACAGTATGTTGTACGGAAAGAACCATTACAAATATTATCCATGATAGAAAGAAAATATTTAAAAAATAATCTATATGATTGGAGAAGTAGGAAACAAACAGTAACAAGACGACGATTACTACATTATTAAATCATGAGGTCCGTATTATACTCGTATATATTGTTTCTCTCATGTATAATAATAAACGGAAGAGATATAGCACCACATGCACCATCCAATGGAAAGTGTAAAGACAACGAATACAGAAGCCGTAATCTATGTTGTCTATCGTGTCCTCCGGGAACTTACGCTTCCAGATTATGTGATAGCAAGACTAATACACAATGTACACCGTGTGGTTCGGATACCTTTACATCTCACAATAATCATTTACAGGCTTGTCTAAGTTGTAACGGAAGATGTGATAGTAATCAGGTAGAGACGCGATCGTGTAACACGACTCACAATAGAATCTGTGAATG
CTCTCCAGGATATTATTGTCTTCTCAAAGGAGCATCAGGGTGTAGAACATGTATTTCTAAAACAAAGTGTGGAATAGGATACGGAGTATCCGGATACACGTCTACCGGAGACGTCATCTGTTCTCCGTGTGGTCCCGGAACATATTCTCACACCGTCTCTTCCACAGATAAATGCGAACCCGTCGTAACCAGCAATACATTTAACTATATCGATGTGGAAATTAACCTGTATCCAGTCAACGACACATCGTGTACTCGGACGACCACTACCGGTCTCAGCGAATCCATCTCAACGTCGGAACTAACTATTACCATGAATCATAAAGATTGTGATCCAGTCTTTCGTGCAGAATACTTCTCTGTCCTTAATAATGTAGCAACTTCAGGATTCTTTACAGGAGAAAATAGATATCAGAATACTTCAAAGATATGTACTCTGAATTTCGAGATTAAATGTAACAACAAAGATTCATCTTCCAAACAGTTAACGAAAACAAAGAATGATACTATCATGCCGCATTCAGAGACGGTAACTCTAGTGGGCGACTGTCTATCTAGCGTCGACATCTACATACTATATAGTAATACCAATACTCAAGACTACGAAAATGATACAATCTCTTATCATATGGGTAATGTTCTCGATGTCAATAGCCATATGCCCGCTAGTTGCGATATACATAAACTGATCACTAATTCCCAGAATCCCACCCACTTATAGTAAGTTTTTTTACCTATAAATAATAAATACAATAATTAATTTCTCGTAAAAGTAGAAAATATATTCTAATTTATTATATGGTAAGAAAGTAGAATCATCTAGAACAGTAATCAATCAATAGCAATCATGAAACAATATATTGTCCTGGCATGCATGTGCCTAGTGGCAGCTGCTATGCCTACTAGTCTTCAACAATCTTCATCCTCGTGTACTGAAGAAGAAAACAAACATCATATGGGAATCGATGTTATTATCAAAGTCACAAAGCAAGACCAAACACCGACCAATGATAAGATTTGTCAATCCGTAACGGAAGTTACAGAGACCGAAGATGATGAGGTATCCGAAGAAGTTGTAAAAGGAGATCCCACCACTTATTACACTATCGTCGGTGCGGGTCTTAACATGAACTTTGGATTCACCAAATGCCCAAAGATTTCATCCATCTCCGAATCCTCTGATGGAAACACTGTGAATACTAGATTGTCCAGCGTGTCACCGGGACAAGGTAAGGACTCTCCCGCGATCACGCGTGAAGAAGCTCTGGCTATGATCAAAGACTGTGAGATGTCTATCGACATCAGATGTAGCGAAGAAGAGAAAGACAGTGACATCAAGACCCATCCAGTACTTGGGTCTAACATCTCACATAAGAAAGTGAGTTACAAAGATATCATCGGTTCAACGATCGTTGATACAAAATGTGTCAAGAACCTAGAGTTTAGCGTACGTATCGGAGACATGTGTGAGGAATCATCTGAACTTGAAGTCAAGGATGGATTCAAGTATGTCGACGGATCGGCATCTGAAGGTGCAACCGATGATACTTCACTCATCGATTCAACAAAACTCAAAGCATGTGTCTGAATCGATAACTCTATTCATCTGAAAATGGATGAGTTGGGTTAATCGAACGATTCAGACACCGCACCACGAATTAAAAAAGACCGGGCACTATATTCCGGTTTGCAAAACAAAAATATTTAACTACATTCACAAAAAGTTACCTCTCGTTACTTCTTCTTTCTGTTTCAATATGTGATACGATATGATCACTATTCGTATTCTCTTGGTCTCATAAAAAAGTTTTACAAAAAAAAAAAAAAAAATATTTTTATTCTCTTTCTCTCTTCGATGGTCTCACAAAAATATTAAACCTCTTTCTGATGTCTCAACTATTTCGTAAACGATAACGTCCAACAATATATTCTCGTAGAGCTTATCAACATCCTTATACCAATCTAGGTTGTCAGACAATTGCATCATAAAATAATGTTTA
TAATTTACACGTTAACATCATATAATAAACGTATATAGTTAATATTTTTGGAATATAAATGATCTGTAAAATCCATGTAGGGGACACTGCTCACGTTTTTTCTCTAGTACATAATTTCACACAAGTTTTTATACAGACAAATTAATTCTCGTCCATATATTTTAAAACATTGACTTTTGTACTAAGAAAAATATCTTGACTAACCATCTCTTTCTCTCTTCGATGGGTCTCACAAAAATATTAAACCTCTTTCTGATGGAGTCGTAAAAAGTTTTTATCCTTTCTCTCTTCGATAGGTCTCACAAAAATATTAAACCTCTTTCTGATGGTCTCTATAAACGATTGATTTTTCTTACCCTCTAGAGTTTCCTACGGTCGTGGGTCACACATTTTTTTCTAGACACTAAATAAAATAGTAAAAT
>MPOX_ITR2#UNKNOWN
GATAGATCGAAAAAAGCCACTAATAAAAACAGTTGGTTAATTACTTCAGGCTTTAGACTACAAAAATGGTTCGATAGCGAAGATTGTATAATTTATCTCAGATCTTTAGTTAGAAGAATGGAAGACAGTAACAAAAACAGTAAAAAAACTTAGTACTTAGATATCGAAAAAATATATTTTTGTAGACTCTTGAGAATAGAAGGAAAACATGTACATAATTATAAAAAATGAAAATCAATGGCGAATAAGACAGTGCGATTCGCACCATGGAGTCGGTAGATTTCATGGCTGTCGATGAGCAGTTTCACGACGACCTCGATCTTTGGTCATTATCTTTGGTAGATGATTATAAAAAACATGGATTAGGTGTTGACTGTTATGTTCTAGAACCAGTTGTTGACAGGAAAATATTTGATAGATTTCTCCTTGAACCAATTTGTGATCCTGTAGATGTTCTGTATGATTATTTTAGGATTCATAGAGATAATATTGATCAGTATATAGTAGATAGACTGTTTGCATATATTACATATAAAGATATTATATCTGCATTAGTGTCAAAGAATTATATGGAAGATATTTTCTCTATAATTATTAAGAATTGTAATTCTGTGCAAGATCTCTTACTTTACTATCTATCTAATGCATATGTAGAAATAGACATTGTTGATCTTATGGTAGATCATGGGGCTGTAATATATAAAATAGAATGCTTGAATGCCTATTTTAGGGGAATATGTAAAAAGGAAAGTAGTGTTGTTGAGTTTATTTTGAATTGTGGTATCCCAGATGAAAATGATGTTAAATTAGATCTATATAAAATAATTCAGTATACTAGGGGATTCCTTGTAGATGAACCCACAGTATTAGAAATTTATAAGCTTTGTATCCCATATATTGAAGATATCAATCAACTAGATGCTGGTGGAAGGACCTTGCTTTATCGCGCTATCTATGCAGGTTATATAGATTTAGTATCATGGCTATTAGAAAATGGAGCAAATGTCAACGCAGTAATGAGTAATGGATATACATGTCTTGACGTGGCCGTGGATAGGGGATCTGTCATCGCCCGTAGGGAAGCACATCTTAAAATATTAGAAATATTGCTTAGAGAACCATTGTCTATTGACTGTATAAAATTAGCTATACTTAATAATACAATTGAAAACCATGATGTGATAAAGCTCTGTATCAAGTATTTTATGATGGTAGATTATTCACTTTGTAATGTGTATGCATCATCACTCTTTGATTATATAATTGATTGTAAACAAGAATTGGAGTACATTAGGCAGATGAAAATTCATAATACAACCATGTATGAGTTAATCTATAATAGAGACAAAAACAAGCATGCTTCCCATATTCTACATAGGTATTCTAAACATCCAGTTTTGACACAGTGTATCACTAAAGGATTCAAGATTTACACAGAAGTAACCGAGCAGGTCACTAAAGCTCTAAACAGACGTGCTCTAATAGATGAGATAATAAACAATGTATCAACTGATGACAATCTCCTATCAAAACTTCCATTAGAAATTAGGGATCTAATTGTTTCACAAGCTGTCATATAGAGTTCTATCCACCCACCTTTCTTGAAATGAGTTAATAGTCATAAGTTAGTTAAGTCATAAGTTAGTTAAGTCATAAGTTAGTTAAGTCATAAGTTAGTTAAGTCATAAGTTAGTTAAGTCATAAGTTAGTTAAGTCATAAGTTAGTTAAGTCATAAGTTAGTTTATAGTCTAACACTTCTAATTTTTATACCTTGATCTTTTTCTCTAATTATGAAAAAGTAAATCATTATGAAGATGGATGAAATGGACGAGATTGTGCGCATCGTTAACGATAGTATGTGGTACGTACCTAACGCATTTATGGACGACGGTGATAATGAAGGTCACATTTCTGTCAATAATGTCTGTCATATGTATCTCGCATTCTTTGATGTGGATATATCATCTCATCTGTTTAAATTAGTTATTAAACACTGCGA
TCTGAATAAACGACTAAAATGTGGTAACTCTCCATTACATTGCTATACGATGAATACACGATTTAATCCATCTGTATTAAAGATATTGTTACGCCACGGCATGCGTAACTTTGATAGCAAGGATAAAAAAGGACATATTCCTCTACACCACTATCTGATTCATTCACTATCAATCGATAACAAGATCTTTGATATACTAACGGACCCCATTGATGACTTTAGTAAATCATCCGATCTATTGCTGTGTTATCTTAGATATAAATTCAATGGGAGCTTAAACTATTACGTTCTGTACAAATTATTGACTAAAGGATCTGACCCTAATTGCGTCGATGAGGATGGACTCACTTCTCTTCATTACTACTGTAAACACATATCCGCGTTCCACGAAAGCAATTATTACAAGTCAAAGAGTCACACTAAGATGCGAGCTGAGAAGCGATTCATCTACGCGATAATAGATCATGGAGCAAACATTAACGCGGTTACGAAAATCGGAAATACGCCGTTACACACTTACCTTCAACAGTATACCAAACATAGTCCTCGTGTGGTGTATGCTCTTTTATCTCGAGGAGCCGATACGAGGATACGTAATAATCTTGATTGTACACCCATCATGGAATACATAAAGAACGATTGTGCAACAGGTCATATTCTCATAATGTTACTCAATTGGCACGAACAAAAATACGGGAAATTACAAAAGGAAGAAGGACAACATCTACTTTATCTATTCATAAAACATAATCAAGGATATGGAAGTCGCTCTCTCAATATACTACGGTATCTACTAGATAGATTCGACATTCAGAAAGACGAATACTATAATACAATGACTCCTCTTCATACCGCCTTCCAGAATTGCAATAACAATGTTGCCTCATACCTCGTATACATCGGATACGACATCAACCTTCCGACTAAAGACGATAAGACAGTATTCGACTTGGTGTTTGAAAACAGAAACATTATATACAAGGCGGATGTCGTTAATGACATTATCCACCACAGACTGAAAGTATCTCTACCTATGATTAAATCGTTGTTCTACAAGATGTCGGAGTTCTCTCCCTACGACGATCACTACGTAAAGAAGATAATAGCCTACTGCCTATTAAGGGACGAGTCATTTGCGGAACTACATACTAAATTCTGTTTAAACGAGGACTATAAAAGTGTATTTATGAAAAATATATCATTCGATAAGATAGATTCCATCATCGAAAAATGTAGTCGTGACATAAGTCTCCTCAAAGAGATTCGAATCTCAGACACCGACTTGTATACGGTATTGAGAACAGAAGACATCCGGTATCACACATATCTCGAAGCCATACATTCAGACAAACGCATTTCATTTCCCATGTACGACGATCTCATAGAACAGTGTCATCTATCGATGGAGCATAAAAGTAAACTCGTCGACAAAGCACTCAATAAATTAGAGTCTACCATCGATAGTCAATCTAGACTATCGTATTTGCCTCCGGAAATTATGCGCAATATCATAACCAAGCTAAGCGACTACCATCTAAACAGTATGTTGTACGGAAAGAACCATTACAAATATTATCCATGATAGAAAGAAAATATTTAAAAAATAATCTATATGATTGGAGAAGTAGGAAACAAACAGTAACAAGACGACGATTACTACATTATTAAATCATGAGGTCCGTATTATACTCGTATATATTGTTTCTCTCATGTATAATAATAAACGGAAGAGATATAGCACCACATGCACCATCCAATGGAAAGTGTAAAGACAACGAATACAGAAGCCGTAATCTATGTTGTCTATCGTGTCCTCCGGGAACTTACGCTTCCAGATTATGTGATAGCAAGACTAATACACAATGTACACCGTGTGGTTCGGATACCTTTACATCTCACAATAATCATTTACAGGCTTGTCTAAGTTGTAACGGAAGATGTGATAGTAATCAGGTAGAGACGCGATCGTGTAACACGACTCACAATAGAATCTGTGAATG
CTCTCCAGGATATTATTGTCTTCTCAAAGGAGCATCAGGGTGTAGAACATGTATTTCTAAAACAAAGTGTGGAATAGGATACGGAGTATCCGGATACACGTCTACCGGAGACGTCATCTGTTCTCCGTGTGGTCCCGGAACATATTCTCACACCGTCTCTTCCACAGATAAATGCGAACCCGTCGTAACCAGCAATACATTTAACTATATCGATGTGGAAATTAACCTGTATCCAGTCAACGACACATCGTGTACTCGGACGACCACTACCGGTCTCAGCGAATCCATCTCAACGTCGGAACTAACTATTACCATGAATCATAAAGATTGTGATCCAGTCTTTCGTGCAGAATACTTCTCTGTCCTTAATAATGTAGCAACTTCAGGATTCTTTACAGGAGAAAATAGATATCAGAATACTTCAAAGATATGTACTCTGAATTTCGAGATTAAATGTAACAACAAAGATTCATCTTCCAAACAGTTAACGAAAACAAAGAATGATACTATCATGCCGCATTCAGAGACGGTAACTCTAGTGGGCGACTGTCTATCTAGCGTCGACATCTACATACTATATAGTAATACCAATACTCAAGACTACGAAAATGATACAATCTCTTATCATATGGGTAATGTTCTCGATGTCAATAGCCATATGCCCGCTAGTTGCGATATACATAAACTGATCACTAATTCCCAGAATCCCACCCACTTATAGTAAGTTTTTTTACCTATAAATAATAAATACAATAATTAATTTCTCGTAAAAGTAGAAAATATATTCTAATTTATTATATGGTAAGAAAGTAGAATCATCTAGAACAGTAATCAATCAATAGCAATCATGAAACAATATATTGTCCTGGCATGCATGTGCCTAGTGGCAGCTGCTATGCCTACTAGTCTTCAACAATCTTCATCCTCGTGTACTGAAGAAGAAAACAAACATCATATGGGAATCGATGTTATTATCAAAGTCACAAAGCAAGACCAAACACCGACCAATGATAAGATTTGTCAATCCGTAACGGAAGTTACAGAGACCGAAGATGATGAGGTATCCGAAGAAGTTGTAAAAGGAGATCCCACCACTTATTACACTATCGTCGGTGCGGGTCTTAACATGAACTTTGGATTCACCAAATGCCCAAAGATTTCATCCATCTCCGAATCCTCTGATGGAAACACTGTGAATACTAGATTGTCCAGCGTGTCACCGGGACAAGGTAAGGACTCTCCCGCGATCACGCGTGAAGAAGCTCTGGCTATGATCAAAGACTGTGAGATGTCTATCGACATCAGATGTAGCGAAGAAGAGAAAGACAGTGACATCAAGACCCATCCAGTACTTGGGTCTAACATCTCACATAAGAAAGTGAGTTACAAAGATATCATCGGTTCAACGATCGTTGATACAAAATGTGTCAAGAACCTAGAGTTTAGCGTACGTATCGGAGACATGTGTGAGGAATCATCTGAACTTGAAGTCAAGGATGGATTCAAGTATGTCGACGGATCGGCATCTGAAGGTGCAACCGATGATACTTCACTCATCGATTCAACAAAACTCAAAGCATGTGTCTGAATCGATAACTCTATTCATCTGAAAATGGATGAGTTGGGTTAATCGAACGATTCAGACACCGCACCACGAATTAAAAAAGACCGGGCACTATATTCCGGTTTGCAAAACAAAAATATTTAACTACATTCACAAAAAGTTACCTCTCGTTACTTCTTCTTTCTGTTTCAATATGTGATACGATATGATCACTATTCGTATTCTCTTGGTCTCATAAAAAAGTTTTACAAAAAAAAAAAAAAAAATATTTTTATTCTCTTTCTCTCTTCGATGGTCTCACAAAAATATTAAACCTCTTTCTGATGTCTCAACTATTTCGTAAACGATAACGTCCAACAATATATTCTCGTAGAGCTTATCAACATCCTTATACCAATCTAGGTTGTCAGACAATTGCATCATAAAATAATGTTTA
TAATTTACACGTTAACATCATATAATAAACGTATATAGTTAATATTTTTGGAATATAAATGATCTGTAAAATCCATGTAGGGGACACTGCTCACGTTTTTTCTCTAGTACATAATTTCACACAAGTTTTTATACAGACAAATTAATTCTCGTCCATATATTTTAAAACATTGACTTTTGTACTAAGAAAAATATCTTGACTAACCATCTCTTTCTCTCTTCGATGGGTCTCACAAAAATATTAAACCTCTTTCTGATGGAGTCGTAAAAAGTTTTTATCCTTTCTCTCTTCGATAGGTCTCACAAAAATATTAAACCTCTTTCTGATGGTCTCTATAAACGATTGATTTTTCTTACCCTCTAGAGTTTCCTACGGTCGTGGGTCACACATTTTTTTCTAGACA
6 changes: 6 additions & 0 deletions assets/lib/varv_repeats_lib.fasta
@@ -0,0 +1,6 @@
>VARV_ITR#UNKNOWN
CATTCTGATGCATCAACTATTTCTTAAACAATAACGTTCAACAACATATACTCTCGAGCTTATCAACATCCCCTATGTCCCAACTAGGTTACCAAACAATTGTATATCATAAAATAATGTTTATAATTTACACGTTAAAATCATATAATAAAACGTAGATCGTATAATATTTTTTGGTATATAAATGATCTAGTAAAATCCATGTAGGGGATACTGTTCACGTTTTTTGGTACAAAATTTCTCACAAGTTTTTATACAGACAAATTCTTGTCCATATATTTTAAAACATTGACTTTTGCACTAAGAAAAATATATAGACTAACTATCTCTTTCTCTCTTCGATGGTCTCACAAAAATATTAAACCTCTTTCTGATGGAGTCGTAAAAAGTTTTATCTCTTTCTCTCTTCGATGGTCTCACAAAAATATTAAACCTCTTTCTGATGGAGTCGTAAAAAGTTTTATCTCTTTCTCTCTTCGATGGTCTCACAAAAATATTAAACCTCTTTCTGATGGAGTCGTAAAAAGTTTTATCTCTTTCTCTCTTCGATGGTCTCACAAAAATATTAAACCTCTTTCTGATGGAGTCGTAAAAAGTTTTATCTCTTTCTCTCTTCGATGGTCTCACAAAAATATTAAACCTCTTTCTGATGGTCTCTATAAAGCGATTGATTTTTTTACCCTCTAGAGTTTCCTACAGTCGTGGGTCACACATTTTTTTCTAGACAC
>VARV_AT_repeat_region#UNKNOWN
ATATATATATATATATATATATATATATATATATATATAT
>VARV_T_repeat_region#UNKNOWN
TTTTTTTTTT
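
Both repeat libraries use `>name#class` headers, the naming convention RepeatMasker custom libraries follow. In the MPOX library the first record's class reads `UNKOWN` while every other record reads `UNKNOWN`. A minimal sketch of a header check that surfaces such inconsistencies, assuming the `name#class` convention; the sequences are shortened to their first few bases, and the helper `header_classes` is hypothetical:

```python
import re

# RepeatMasker-style header: ">name#class" (the class may carry "/subclass").
HEADER = re.compile(r"^>(?P<name>[^#\s]+)#(?P<cls>\S+)")


def header_classes(fasta_text):
    """Return (name, class) pairs, one per FASTA header line."""
    pairs = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            match = HEADER.match(line)
            if match is None:
                raise ValueError(f"header lacks a '#class' suffix: {line!r}")
            pairs.append((match.group("name"), match.group("cls")))
    return pairs


# Headers copied from the two libraries in this diff; sequences truncated.
sample = """>MPOX_ITR1#UNKOWN
GATAGATCG
>MPOX_ITR2#UNKNOWN
GATAGATCG
>VARV_ITR#UNKNOWN
CATTCTGAT
>VARV_AT_repeat_region#UNKNOWN
ATATATATAT
>VARV_T_repeat_region#UNKNOWN
TTTTTTTTTT
"""

pairs = header_classes(sample)

# Report records whose class differs from the spelling used by the majority.
odd = [name for name, cls in pairs if cls != "UNKNOWN"]
print(odd)
```

How RepeatMasker labels hits from a record with an unrecognized class string is not something this sketch checks; it only flags the inconsistency for review.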