Merge pull request #7 from a-slide/dev

Dev
a-slide · Jan 13, 2020 · 9cd41f3
2 parents 336819c + f4bbc3f
commit 9cd41f3

Showing 9 changed files with 34 additions and 36 deletions.
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -11,7 +11,7 @@ First of all, thanks for considering contributing to `pycoMeth`! 👍 It's peopl
 
 ## Code of conduct
 
-Please note that this project is released with a [Contributor Code of Conduct][code_of_conduct.md]. By participating in this project you agree to abide by its terms.
+Please note that this project is released with a [Contributor Code of Conduct][code_of_conduct]. By participating in this project you agree to abide by its terms.
 
 ## How you can contribute
 

diff --git a/README.md b/README.md
@@ -21,12 +21,12 @@
 
 ### pycoMeth workflow
 
-![Workflow](pictures/pycoMeth_package.png)
+![Workflow](docs/pictures/pycoMeth_package.png)
 
 
 ### pycoMeth example output IGV rendering
 
-![](pictures/pycoMeth_all.png)
+![](docs/pictures/pycoMeth_all.png)
 
 ### Authors
 

diff --git a/docs/CGI_Finder/usage.md b/docs/CGI_Finder/usage.md
@@ -7,9 +7,13 @@
 * [Python API usage](https://a-slide.github.io/pycoMeth/CGI_Finder/API_usage/)
 * [Shell CLI usage](https://a-slide.github.io/pycoMeth/CGI_Finder/CLI_usage/)
 
-## Output format
+## Input file
 
-CGI_Finder can generates 2 files, a standard BED file and a tabulated file containing extra information
+### Reference FASTA file
+
+FASTA reference file containing sequences in which CpG islands needs to be found.
+
+## Output files
 
 ### Tabulated TSV file
 
@@ -18,13 +22,13 @@ This tabulated file contains the following fields for each CpG island found:
 * chromosome / start / end : Genomic coordinates
 * length: Length of the interval
 * num_CpG: Number of CpGs found
-* CG_freq: G+C nucleotide frequency 
+* CG_freq: G+C nucleotide frequency
 * obs_exp_freq: Observed versus expected CpG frequency
 
 ### BED file
 
-Standard genomic BED3 (https://genome.ucsc.edu/FAQ/FAQformat.html#format1) format indicating the coordinates of putative CpG islands.
+Minimal standard genomic [BED3](https://genome.ucsc.edu/FAQ/FAQformat.html#format1) format listing the coordinates of putative CpG islands.
 
-The picture below shows the putative CpG islands found (grey boxes) in an example sequence overlayed with C+G frequency and observed versus expected CpG frequency 
+The picture below shows the putative CpG islands found (grey boxes) in an example sequence, overlaid with C+G frequency and observed/expected CpG frequency
 
 ![Example](../pictures/CGI_Finder.png)
diff --git a/docs/CpG_Aggregate/usage.md b/docs/CpG_Aggregate/usage.md
@@ -15,11 +15,9 @@
 
 ### Reference FASTA file
 
-FASTA reference file used for read alignment and Nanopolish. This file is required and used to sort the CpG sites by coordinates 
+FASTA reference file used for read alignment and Nanopolish. This file is required and used to sort the CpG sites by coordinates
 
-## Output format
-
-CpG_Aggregate can generates 2 files, a standard BED file and a tabulated file containing extra information
+## Output files
 
 ### Tabulated TSV file
 
@@ -33,15 +31,15 @@ This tabulated file contains the following fields:
 
 ### BED file
 
-Standard genomic [BED6](https://genome.ucsc.edu/FAQ/FAQformat.html#format1). The score correspond to the median log likelyhood ratio.
+Standard genomic [BED9 format](https://genome.ucsc.edu/FAQ/FAQformat.html#format1) including an RGB color field. The score correspond to the median log likelihood ratio.
 The file is already sorted by coordinates and can be rendered with a genome browser such as IGV
 
 The sites are color-coded as follow:
 
-- Median log likelihood ratio higher than 2 (Methylated):  Colorscale from orange (llr = 2) to deep red (llr >=6) 
-- Median log likelihood ratio lower than 2 (Unmethylated):  Colorscale from green (llr = -2) to deep blue (llr <= -6) 
+- Median log likelihood ratio higher than 2 (Methylated):  Colorscale from orange (llr = 2) to deep red (llr >=6)
+- Median log likelihood ratio lower than 2 (Unmethylated):  Colorscale from green (llr = -2) to deep blue (llr <= -6)
 - Grey: Median log likelihood ration between -2 and 2 (ambiguous methylation status)
 
 Here is an example of multiple methylation bed files rendered with IGV
 
-![Example Bed Files](../pictures/CpG_Aggregate_2.png)
+![Example Bed Files](../pictures/CpG_Aggregate_2.png)
diff --git a/docs/Interval_Aggregate/usage.md b/docs/Interval_Aggregate/usage.md
@@ -15,15 +15,13 @@
 
 ### Reference FASTA file
 
-FASTA reference file used for read alignment and Nanopolish. This file is required and used to sort the CpG sites by coordinates 
+FASTA reference file used for read alignment and Nanopolish. This file is required and used to sort the CpG sites by coordinates
 
 ### BED file containing intervals
 
-Optional **sorted** and BED file containing **non-overlapping** intervals to bin CpG data into. If this file is not provided, then the program use a sliding customizable window to bim data along the entire genome.
+Optional **sorted** and BED file containing **non-overlapping** intervals to bin CpG data into. If this file is not provided, then the program use a sliding customizable window to bin data along the entire genome.
 
-## Output format
-
-CpG_Aggregate can generates 2 files, a standard BED file and a tabulated file containing extra information
+## Output files
 
 ### Tabulated TSV file
 
@@ -36,13 +34,13 @@ This tabulated file contains the following fields:
 
 ### BED file
 
-Standard genomic [BED6](https://genome.ucsc.edu/FAQ/FAQformat.html#format1). The score correspond to the median log likelyhood ratio.
+Standard genomic [BED9 format](https://genome.ucsc.edu/FAQ/FAQformat.html#format1) including an RGB color field. The score correspond to the median log likelihood ratio.
 The file is already sorted by coordinates and can be rendered with a genome browser such as IGV
 
 The sites are color-coded as follow:
 
-* Median log likelihood ratio higher than 2 (Methylated):  Colorscale from orange (llr = 2) to deep red (llr >=6) 
-* Median log likelihood ratio lower than 2 (Unmethylated):  Colorscale from green (llr = -2) to deep blue (llr <= -6) 
+* Median log likelihood ratio higher than 2 (Methylated):  Colorscale from orange (llr = 2) to deep red (llr >=6)
+* Median log likelihood ratio lower than 2 (Unmethylated):  Colorscale from green (llr = -2) to deep blue (llr <= -6)
 * Grey: Median log likelihood ration between -2 and 2 (ambiguous methylation status)
 
 Here is an example of multiple methylation bed files rendered with IGV
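
Reviewer sketch (not part of the commit): the color-coding rule described in the two `usage.md` hunks above can be illustrated roughly as below. This assumes the lower threshold is -2, as implied by the "between -2 and 2" grey range, and that the boundaries are inclusive. The function names and RGB values are hypothetical placeholders, not pycoMeth's actual API or color scale. Note that the documented score is the median llr, whereas the standard BED score is an integer from 0 to 1000.

```python
# Thresholds taken from the documentation text; -2 is assumed for the lower bound.
POS_LLR = 2.0
NEG_LLR = -2.0

# Placeholder RGB strings for illustration only, NOT the colors pycoMeth writes.
COLORS = {
    "methylated": "255,0,0",
    "unmethylated": "0,0,255",
    "ambiguous": "128,128,128",
}


def classify_llr(median_llr):
    """Return the methylation status implied by a median log likelihood ratio."""
    if median_llr >= POS_LLR:
        return "methylated"
    if median_llr <= NEG_LLR:
        return "unmethylated"
    return "ambiguous"


def bed9_line(chrom, start, end, median_llr, name="."):
    """Format one tab-separated BED9 record with the status color in itemRgb."""
    status = classify_llr(median_llr)
    # BED9 columns: chrom, chromStart, chromEnd, name, score, strand,
    # thickStart, thickEnd, itemRgb
    fields = [chrom, start, end, name, f"{median_llr:.3f}", ".",
              start, end, COLORS[status]]
    return "\t".join(str(field) for field in fields)


if __name__ == "__main__":
    print(bed9_line("chr1", 100, 101, 3.5))
```

A continuous color scale (orange to deep red, green to deep blue) would replace the three fixed colors in a real implementation; the sketch only shows where the RGB field sits in a BED9 record.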

diff --git a/docs/Meth_Comp/usage.md b/docs/Meth_Comp/usage.md
@@ -17,9 +17,7 @@ A list of `pycoMeth CpG_Aggregate` or `pycoMeth Interval_Aggregate` **tsv** outp
 
 FASTA reference file used for read alignment and Nanopolish. This file is required and used to sort the CpG sites by coordinates.
 
-## Output format
-
-Meth_Comp can generates 2 files, a standard BED file and a tabulated file containing extra information.
+## Output files
 
 ### Tabulated TSV file
 
@@ -29,18 +27,18 @@ This tabulated file contains the following fields:
 * n_samples: Number of valid samples compared for position
 * pvalue / statistic: pvalue /statistic for positions obtained by Kruskal Wallis or Mann_Withney test
 * adj_pvalue: FDR adjusted pValue using the Benjamini & Hochberg procedure  
-* neg_med / pos_med / ambiguous_med: Number of samples with a median below the negative llr threshold / above the positive llr threshold or with and ambiguous median between the 2 thresholds 
+* neg_med / pos_med / ambiguous_med: Number of samples with a median below the negative llr threshold / above the positive llr threshold or with and ambiguous median between the 2 thresholds
 * labels: labels of the samples tested, matching the order of values in med_llr_list and raw_llr_list
 * med_llr_list: List of median llr values for each samples compared.
 * raw_llr_list: List of the list of raw llr values for each samples compared
 
 ### BED file
 
-Standard genomic BED6 (https://genome.ucsc.edu/FAQ/FAQformat.html#format1). The score correspond to the -log10(Adjusted Pvalue) capped to 1000. The file is sorted by coordinates and can be rendered with a genome browser such as IGV
+Standard genomic [BED9 format](https://genome.ucsc.edu/FAQ/FAQformat.html#format1) including an RGB color field. The score correspond to the -log10(Adjusted Pvalue) capped to 1000. The file is sorted by coordinates and can be rendered with a genome browser such as IGV
 
 The sites are color-coded as follow:
 
-* Significant differential methylation Adjusted pValue:  Colorscale from orange (pValue=0.01) to deep purple (pValue<=0.000001) 
+* Significant differential methylation Adjusted pValue:  Colorscale from orange (pValue=0.01) to deep purple (pValue<=0.000001)
 * Non-significant: Grey
 
 Here is an example of multiple methylation bed files with  rendered with IGV
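
Reviewer sketch (not part of the commit): the BED score rule stated in the hunk above, "-log10(Adjusted Pvalue) capped to 1000", could be computed roughly as follows. The function names are hypothetical, and the 0.01 significance cutoff is an assumption read from the start of the documented color scale, not a confirmed pycoMeth default.

```python
import math


def bed_score(adj_pvalue, cap=1000.0):
    """Return -log10(adj_pvalue), capped at `cap`; a p-value of 0 maps to the cap."""
    if not 0.0 <= adj_pvalue <= 1.0:
        raise ValueError("adjusted p-value must be in [0, 1]")
    if adj_pvalue == 0.0:
        # log10(0) is undefined, so treat it as the most significant score.
        return cap
    return min(-math.log10(adj_pvalue), cap)


def is_significant(adj_pvalue, alpha=0.01):
    """Assumed cutoff: sites at or below `alpha` get a color, others are grey."""
    return adj_pvalue <= alpha


if __name__ == "__main__":
    for p in (0.5, 0.01, 1e-6, 0.0):
        print(p, bed_score(p), is_significant(p))
```

With double-precision floats the cap is only reached when the adjusted p-value underflows to exactly 0, since the smallest positive value still gives a score of a few hundred.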

diff --git a/docs/installation.md b/docs/installation.md
@@ -17,13 +17,13 @@ With [conda](https://conda.io/projects/conda/en/latest/user-guide/install/index.
 conda create -n pycoMeth python=3.6
 ```
 
-You might also want to install [Nanopolish](https://github.com/jts/nanopolish) in the same virtual environment so you can pipe nanopolish output directly into pycoMeth
+You might also want to install [Nanopolish](https://github.com/jts/nanopolish) in the same virtual environment so you can pipe nanopolish output directly into `pycoMeth`
 
 ## Dependencies
 
 [Nanopolish 0.10+](https://github.com/jts/nanopolish) is not a direct dependency but is required to generate the files used by several commands from this package
 
-Nanocompore relies on a the following robustly maintained third party python libraries:
+`pycoMeth` relies on a the following robustly maintained third party python libraries:
 
 * numpy>=1.14.0
 * tqdm>=4.23.4
@@ -43,7 +43,7 @@ pip install pycoMeth
 pip install pycoMeth --upgrade
 ```
 
-If you feel adventurous you can install the development version from test.pypi
+If you feel more adventurous you can install the development version from test.pypi
 
 ```bash
 pip install --index-url https://test.pypi.org/simple/ pycoMeth

diff --git a/pycoMeth/__init__.py b/pycoMeth/__init__.py
@@ -1,5 +1,5 @@
 # -*- coding: utf-8 -*-
 
 # Define self package variable
-__version__ = "0.2.6"
+__version__ = "0.2.7"
 __description__ = 'Python package for nanopore DNA methylation analysis downstream to Nanopolish'
diff --git a/setup.py b/setup.py
@@ -5,7 +5,7 @@
 
 # Define package info
 name = "pycoMeth"
-version = "0.2.6"
+version = "0.2.7"
 description = 'Python package for nanopore DNA methylation analysis downstream to Nanopolish'
 with open("README.md", "r") as fh:
     long_description = fh.read()
@@ -26,7 +26,7 @@
         'Development Status :: 3 - Alpha',
         'Intended Audience :: Science/Research',
         'Topic :: Scientific/Engineering :: Bio-Informatics',
-        'License :: OSI Approved :: MIT License',
+        'License :: OSI Approved :: GNU General Public License v3 (GPLv3)',
         'Programming Language :: Python :: 3'],
     install_requires = [
         'numpy>=1.14.0',