A public Remus instance is available at http://remus.btm.umed.pl
Remus is a web tool that helps identify regulatory regions relevant to a given monogenic disease phenotype. Starting from a small set of genes implicated in the disease, Remus allows iterative building of a tissue-specific set of regions that likely play a role in regulating expression of the input genes. Once the list is finalized, it can be downloaded as a BED file with genomic coordinates or used directly to filter variants in a VCF file.
The growing inventory of regulatory data in Remus currently includes coordinates of:
- tissue-specific enhancers from FANTOM5 and ENCODE repositories, including the SCREEN datasets
- promoters (SCREEN) and transcription start sites (SlideBase / FANTOM5)
- regions of accessible chromatin (ENCODE and SCREEN)
- microRNA - mRNA interactions from miRTarBase and miRWalk
In upcoming releases Remus will also enable inclusion of other regulatory features.
- Select the genome build (hg19 or GRCh38). Regulatory features available in the primary sources in only one genome build (e.g. hg19 for FANTOM5 data) have been lifted over to the other build with liftOver.
- Type in the symbols of genes relevant to the phenotype.
- Choose organs, tissues and/or cell types relevant to the phenotype. In parentheses next to the name of each organ/tissue/cell type, symbols for the available data types are shown, e.g. ENH_E (enhancers from ENCODE), PR_F5 (FANTOM5 promoters), CHR_S (accessible chromatin from SCREEN).
- Select the types of regulatory features to include. Set the maximal distance upstream and downstream from the transcript start (RefSeq transcript coordinates are used). Choose whether to use regulatory features present in any of the selected tissues (permissive) or in all of them (strict).
  Please note that miRNA-gene interactions are filtered against accessible chromatin regions in the selected tissues, i.e. only miRNAs encoded in accessible parts of the genome will be included.
- Download the results as a BED file or an Excel file, view them in Genome Browser, or filter your own VCF file with variants (for details, see below). Note that the VCF file is filtered in your browser; it is NOT sent or uploaded anywhere.
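
The permissive and strict options from the feature-selection step can be pictured as a union versus an intersection of per-tissue region sets. The sketch below is only one possible reading, written for a single chromosome with BED-style half-open intervals; the function names (`merge`, `intersect`, `combine`) are hypothetical, and this is not taken from the Remus code base.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end), as in BED


def merge(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals into a sorted, disjoint list."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Intersect two sorted, disjoint interval lists with a two-pointer sweep."""
    out: List[Interval] = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever interval finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def combine(per_tissue: List[List[Interval]], mode: str) -> List[Interval]:
    """Assumed semantics: 'permissive' = union, 'strict' = intersection."""
    if not per_tissue:
        return []
    if mode == "permissive":
        return merge([iv for tissue in per_tissue for iv in tissue])
    if mode == "strict":
        result = merge(per_tissue[0])
        for tissue in per_tissue[1:]:
            result = intersect(result, merge(tissue))
        return result
    raise ValueError(f"unknown mode: {mode!r}")
```

With enhancers at `[(100, 200), (300, 400)]` in one tissue and `[(150, 350)]` in another, the permissive mode in this sketch keeps everything covered by either tissue, while the strict mode keeps only the parts covered by both.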
Remus allows in-browser filtering of a VCF file using the output BED file with regulatory regions. Variants falling into the regions are selected and returned in a plain-text VCF file. The input must be a sorted plain-text VCF; filtering large files takes only a few seconds (~5s on a 500M VCF). Filtering BGZipped & Tabix'ed files was considerably slower in tests and, although implemented, has been disabled for the time being.
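
For readers who want to reproduce the same kind of filtering outside the browser, here is a minimal Python sketch of selecting VCF records that fall into BED regions. It is an illustration under stated assumptions, not the JavaScript that Remus runs: the function names `load_bed` and `filter_vcf` are hypothetical, and only the POS column is tested, so variant length is ignored.

```python
import bisect
from collections import defaultdict


def load_bed(lines):
    """Build {chrom: (starts, ends)} from BED lines, merging overlapping regions."""
    regions = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue  # skip blank lines, comments and browser/track headers
        chrom, start, end = line.rstrip("\n").split("\t")[:3]
        regions[chrom].append((int(start), int(end)))
    index = {}
    for chrom, intervals in regions.items():
        merged = []
        for start, end in sorted(intervals):
            if merged and start <= merged[-1][1]:
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        index[chrom] = ([s for s, _ in merged], [e for _, e in merged])
    return index


def filter_vcf(vcf_lines, index):
    """Yield header lines plus records whose POS lies inside a BED region."""
    for line in vcf_lines:
        if line.startswith("#"):
            yield line  # keep the VCF header intact
            continue
        chrom, pos = line.split("\t", 2)[:2]
        if chrom not in index:
            continue
        starts, ends = index[chrom]
        pos0 = int(pos) - 1  # VCF POS is 1-based; BED is 0-based, half-open
        k = bisect.bisect_right(starts, pos0) - 1  # last region starting at or before pos0
        if k >= 0 and pos0 < ends[k]:
            yield line
```

A typical call would pass two open file handles, e.g. `filter_vcf(open("variants.vcf"), load_bed(open("regions.bed")))`, and write the yielded lines to a new file; both file names are placeholders.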
In-browser filtering means that the variant file does not leave your computer, which is a great feature if you are working with sensitive data. The downside is that the functionality can be somewhat limited. Currently the VCF file is read in one piece (to be changed), and the empirically tested size limits were as follows:
- plain-text VCF in Chrome 68: up to 1GB
- plain-text VCF in Firefox 64.0: ~250MB
Please note that Remus neither annotates nor filters variants based on population frequency, evolutionary conservation, or pathogenicity scores. This type of filtering is advised before (or after) using Remus, and can be done with tools such as VEP, SnpEff or Annovar.
- In the Remus repo (REMUS_DIR), build the docker image:
docker build -t remus .
- To prepare data for Remus, either:
(shorter version)
Download archive with Remus data files from here, and extract the archive in REMUS_DIR.
or (longer version):
Start Remus container interactively with write access to the repo directory (REMUS_DIR):
docker run --rm --name remus_databuild -v REMUS_DIR:/var/www/remus:rw -ti remus
After reading external_resources/README.md, download liftOver and chain files by:
cd external_resources && ./download.sh && cd ..
Next, launch
./make_data_tree.sh
This will download necessary files and fill REMUS_DIR/data with all Remus data. Now you can exit the container.
- Start the docker container with the app available at http://localhost:LOCAL_PORT:
docker run --rm -d --name remus_app -v REMUS_DIR:/var/www/remus -p LOCAL_PORT:80 remus apachectl -D FOREGROUND
pip install -r requirements.txt
or, for development mode:
pip install -r requirements-dev.txt
After reading external_resources/README.md, download liftOver and chain files.
In application root run:
bash external_resources/download.sh
Next, download the data. This step can take a long time because of the large amount of data that has to be downloaded.
bash make_data_tree.sh
In application root run:
python3 app.py
The application is available at 127.0.0.1:5000
Remus has been developed at BTM, Medical University of Lodz, Poland. The application's UI and initial work on its internals were done by Damian Skrzypczak as part of his MSc project. Since then, it has been extended by me.
Code for in-browser filtering of tabixed VCF files was adapted from js-local-vcf written by Jon Anthony. Liftover of genome coordinates is done using the liftOver tool developed by Jim Kent. Note that liftOver is free only for academic use. Data used in Remus is downloaded from public databases and primary sources attributed in the description at the top of the page.
This project is funded by NCN Polonez grant no 2016/23/P/NZ2/04251. This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 665778.