Improve vignettes and README
adrientaudiere committed Oct 25, 2023
1 parent 367d497 commit 8fb8d6c
Showing 7 changed files with 89 additions and 11 deletions.
6 changes: 3 additions & 3 deletions R/krona.R
@@ -104,9 +104,9 @@ krona <-
#' \dontrun{
#' data("GlobalPatterns")
#' GA <- subset_taxa(GlobalPatterns, Phylum == "Acidobacteria")
#' # krona(GA, "Number.of.sequences.html")
#' # krona(GA, "Number.of.ASVs.html", nb_seq = FALSE)
#' # merge_krona(c("Number.of.sequences.html", "Number.of.ASVs.html"))
#' krona(GA, "Number.of.sequences.html")
#' krona(GA, "Number.of.ASVs.html", nb_seq = FALSE)
#' merge_krona(c("Number.of.sequences.html", "Number.of.ASVs.html"))
#' }
#' @return A html file
#' @seealso \code{\link{krona}}
7 changes: 6 additions & 1 deletion README.Rmd
@@ -1,6 +1,7 @@
---
output: github_document
always_allow_html: yes
bibliography: paper/bibliography.bib
---

![R](https://img.shields.io/badge/r-%23276DC3.svg?style=for-the-badge&logo=r&logoColor=white)
@@ -23,9 +24,13 @@ knitr::opts_chunk$set(

# MiscMetabar

The goal of MiscMetabar is to complete the great packages [dada2](https://benjjneb.github.io/dada2/index.html), [phyloseq](https://joey711.github.io/phyloseq/) and [targets](https://books.ropensci.org/targets/).
See the pkgdown site [here](https://adrientaudiere.github.io/MiscMetabar/).

Biological studies, especially in ecology, health sciences and taxonomy, need to describe the biological composition of samples. During the last twenty years, (i) the development of DNA sequencing, (ii) reference databases, (iii) high-throughput sequencing (HTS), and (iv) bioinformatics resources have allowed the description of biological communities through metabarcoding. Metabarcoding involves the sequencing of millions (*meta*-) of short regions of specific DNA (*-barcoding*, @valentini2009) often from environmental samples (eDNA, @taberlet2012) such as human stomach contents, lake water, soil and air.

`MiscMetabar` aims to facilitate the **description**, **transformation**, **exploration** and **reproducibility** of metabarcoding analysis using R. The development of `MiscMetabar` relies heavily on the R packages [`dada2`](https://benjjneb.github.io/dada2/index.html), [`phyloseq`](https://joey711.github.io/phyloseq/) and [`targets`](https://books.ropensci.org/targets/).


## Installation

There is no CRAN or bioconductor version of MiscMetabar for now (work in progress).
44 changes: 40 additions & 4 deletions README.md
@@ -12,12 +12,28 @@ v3](https://img.shields.io/badge/License-GPL%20v3-blue.svg)](http://www.gnu.org/

# MiscMetabar

The goal of MiscMetabar is to complete the great packages
[dada2](https://benjjneb.github.io/dada2/index.html),
[phyloseq](https://joey711.github.io/phyloseq/) and
[targets](https://books.ropensci.org/targets/). See the pkgdown site
See the pkgdown site
[here](https://adrientaudiere.github.io/MiscMetabar/).

Biological studies, especially in ecology, health sciences and taxonomy,
need to describe the biological composition of samples. During the last
twenty years, (i) the development of DNA sequencing, (ii) reference
databases, (iii) high-throughput sequencing (HTS), and (iv)
bioinformatics resources have allowed the description of biological
communities through metabarcoding. Metabarcoding involves the sequencing
of millions (*meta*-) of short regions of specific DNA (*-barcoding*,
Valentini, Pompanon, and Taberlet (2009)) often from environmental
samples (eDNA, Taberlet et al. (2012)) such as human stomach contents,
lake water, soil and air.

`MiscMetabar` aims to facilitate the **description**,
**transformation**, **exploration** and **reproducibility** of
metabarcoding analysis using R. The development of `MiscMetabar` relies
heavily on the R packages
[`dada2`](https://benjjneb.github.io/dada2/index.html),
[`phyloseq`](https://joey711.github.io/phyloseq/) and
[`targets`](https://books.ropensci.org/targets/).

## Installation

There is no CRAN or bioconductor version of MiscMetabar for now (work in
@@ -111,3 +127,23 @@ sudo apt-get install ncbi-blast+
``` sh
sudo apt-get install vsearch
```

<div id="refs" class="references csl-bib-body hanging-indent">

<div id="ref-taberlet2012" class="csl-entry">

Taberlet, Pierre, Eric Coissac, Mehrdad Hajibabaei, and Loren H
Rieseberg. 2012. “Environmental DNA.” *Molecular Ecology*. Wiley Online
Library. <https://doi.org/10.1002/(issn)2637-4943>.

</div>

<div id="ref-valentini2009" class="csl-entry">

Valentini, Alice, François Pompanon, and Pierre Taberlet. 2009. “DNA
Barcoding for Ecologists.” *Trends in Ecology & Evolution* 24 (2):
110–17. <https://doi.org/10.1016/j.tree.2008.09.011>.

</div>

</div>
6 changes: 6 additions & 0 deletions _pkgdown.yml
@@ -28,8 +28,14 @@ navbar:
- text: Tengeler
href: articles/tengeler.html
- text: -------
- text: "R ecosystem for metabarcoding"
- text: "Metabarcoding with R"
href: articles/states_of_fields_in_R.html
- text: -------
- text: "For developers"
- text: Rules of code
href: articles/Rules.html


development:
mode: auto
4 changes: 2 additions & 2 deletions paper/paper.md
@@ -26,7 +26,7 @@ Describing communities of living organisms increasingly relies on massive DNA se

Biological studies, especially in ecology, health sciences and taxonomy, need to describe the biological composition of samples. During the last twenty years, the development of (i) high-throughput DNA sequencing, (ii) reference databases and (iii) bioinformatics resources have allowed the description of biological communities through metabarcoding. Metabarcoding involves the sequencing of millions (*meta*-) of short regions of specific DNA (*-barcoding*, @valentini2009) often from environmental samples (eDNA, @taberlet2012) such as human stomach contents, lake water, soil and air.

Several plateforms (referenced in @tedersoo2022) such as QIIME2 (@bolyen2019), mothur (@schloss2020), and Galaxy (@jalili2020) allow complete analysis from raw fastq sequences to statistical analysis and visualization. However, the R ecosystem (@rcran), is very rich (Table 1) and more flexible than these platforms.
Several platforms (referenced in @tedersoo2022) such as QIIME2 [@bolyen2019], mothur [@schloss2020], and Galaxy [@jalili2020] allow complete analysis from raw fastq sequences to statistical analysis and visualization. However, the R ecosystem [@rcran] is very rich (Table 1) and more flexible than these platforms.

`MiscMetabar` aims to facilitate the **description**, **transformation**, **exploration** and **reproducibility** of metabarcoding analysis using R. The development of `MiscMetabar` relies heavily on the R packages `dada2`, `phyloseq` and `targets`.

@@ -36,7 +36,7 @@ The metabarcoding ecosystem in the R language is mature, well-constructed, and r

The R package [`dada2`](http://bioconductor.org/packages/release/bioc/html/dada2.html) [@callahan2016] provides a highly cited and recommended clustering method [@pauvert2019]. [`phyloseq`](http://bioconductor.org/packages/release/bioc/html/phyloseq.html) [@mcmurdie2013] facilitates metagenomics analysis by providing a way to store data (the `phyloseq` class) together with graphical and statistical functions. `MiscMetabar` is based on the `phyloseq` class from `phyloseq`, the most cited package in metagenomics [@wen2023]. For a description and comparison of other integrated packages competing with `phyloseq`, see @wen2023. Some packages already extend the `phyloseq` package, in particular the [`microbiome`](https://microbiome.github.io/) package collection [@ernst2023], the `speedyseq` package [@mclaren2020] and the package [phylosmith](https://schuyler-smith.github.io/phylosmith/) [@smith2019].

![Table 1 : Important functions of MiscMetabar with their equivalent when available in other R packages: 1. Mia [@ernst2023]; 2. microViz [@Barnett2021]; 3. MicrobiotaProcess [@xu2023]; 4 Phylosmith [@smith2019].](figures_svg/table1.svg)
![Table 1 : Important functions of MiscMetabar with their equivalent when available in other R packages: 1. Mia [@ernst2023]; 2. microViz [@Barnett2021]; 3. MicrobiotaProcess [@xu2023]; 4 Phylosmith [@smith2019].](figures_svg/table1.svg){width="100%"}

`MiscMetabar` enriches this R ecosystem by providing functions to (i) **describe** your dataset visually, (ii) **transform** your data, (iii) **explore** biological diversity (alpha, beta, and taxonomic diversity), and (iv) simplify **reproducibility**. `MiscMetabar` is already used by the scientific community in several teams [@Vieira2021; @Pleic2022; @McCauley2022; @McCauley2023; @bouilloud2023; @vieira2023].

2 changes: 1 addition & 1 deletion paper/paper_old.md
@@ -16,7 +16,7 @@ affiliations:
index: 1

date: 23 October 2023
bibliography: paper_old.bib
bibliography: ../paper/bibliography.bib
---

# Summary
31 changes: 31 additions & 0 deletions vignettes/states_of_fields_in_R.Rmd
@@ -0,0 +1,31 @@
---
title: "Metabarcoding with R"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{Metabarcoding with R}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
bibliography: ../paper/bibliography.bib
---

This is a short introduction to other R packages in the field of metabarcoding analysis.

# State of the Field in R

The metabarcoding ecosystem in the R language is mature, well-constructed, and relies on a very active community in both the [Bioconductor](https://www.bioconductor.org/) and [CRAN](https://cran.r-project.org/) projects. [Bioconductor](https://www.bioconductor.org/) even maintains dedicated package views for [Metagenomics](http://bioconductor.org/packages/release/BiocViews.html#___Metagenomics) and [Microbiome](http://bioconductor.org/packages/release/BiocViews.html#___Microbiome).

The R package [`dada2`](http://bioconductor.org/packages/release/bioc/html/dada2.html) [@callahan2016] provides a highly cited and recommended clustering method [@pauvert2019]. `dada2` also provides tools to complete the metabarcoding analysis pipeline, including chimera detection and taxonomic assignment. [`phyloseq`](http://bioconductor.org/packages/release/bioc/html/phyloseq.html) [@mcmurdie2013] facilitates metagenomics analysis by providing a way to store data (the `phyloseq` class) together with both graphical and statistical functions.

The `phyloseq` package introduces an S4 class (the `phyloseq` class), which contains (i) an OTU-by-sample abundance matrix, (ii) a taxonomic table, (iii) a sample metadata table, and two optional slots for (iv) a phylogenetic tree and (v) reference sequences.
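
As a minimal sketch of the components listed above, the following builds a toy `phyloseq` object from hand-made tables. The taxa names, sample names and abundance values are hypothetical and only serve to illustrate the structure; the tree and reference-sequence slots are left empty.

```r
library(phyloseq)

# Hypothetical identifiers for 3 taxa (rows) and 2 samples (columns)
asv_ids <- c("ASV1", "ASV2", "ASV3")
sample_ids <- c("S1", "S2")

# (i) OTU/ASV abundance table
otu <- otu_table(
  matrix(c(10, 0, 3, 7, 2, 5),
    nrow = 3,
    dimnames = list(asv_ids, sample_ids)
  ),
  taxa_are_rows = TRUE
)

# (ii) Taxonomic table (one row per taxon, one column per rank)
tax <- tax_table(
  matrix(
    c(
      "Fungi", "Fungi", "Fungi",
      "Ascomycota", "Basidiomycota", "Ascomycota"
    ),
    nrow = 3,
    dimnames = list(asv_ids, c("Kingdom", "Phylum"))
  )
)

# (iii) Sample metadata (row names must match the sample names)
sam <- sample_data(
  data.frame(Habitat = c("soil", "water"), row.names = sample_ids)
)

# Assemble the three mandatory components into one object
physeq <- phyloseq(otu, tax, sam)

ntaxa(physeq)
nsamples(physeq)

# (iv) and (v) are optional: both accessors return NULL here
phy_tree(physeq, errorIfNULL = FALSE)
refseq(physeq, errorIfNULL = FALSE)
```

Building the object through the `phyloseq()` constructor lets the package check that taxa and sample names agree across the tables.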

Some packages already extend the `phyloseq` package. For example, the [`microbiome`](https://microbiome.github.io/) package collection [@ernst2023] provides some scripts and functions for manipulating microbiome datasets. The `speedyseq` package [@mclaren2020] provides faster versions of phyloseq's plotting and taxonomic merging functions, some of which are used in `MiscMetabar`. The [phylosmith](https://schuyler-smith.github.io/phylosmith/) [smith2023](https://joss.theoj.org/papers/10.21105/joss.01442) package already provides some functions to extend and simplify the use of the `phyloseq` package.

Other packages, such as [`mia`](https://github.com/microbiome/mia/) (part of the [`microbiome`](https://microbiome.github.io/) package collection) and [`MicrobiotaProcess`](https://github.com/YuLab-SMU/MicrobiotaProcess) [@xu2023], build on a different data structure from the `SummarizedExperiment` family of the Bioconductor ecosystem.

`MiscMetabar` enriches this R ecosystem by providing functions to (i) **describe** your dataset visually, (ii) **transform** your data, (iii) **explore** biological diversity (alpha, beta, and taxonomic diversity), and (iv) simplify **reproducibility**. `MiscMetabar` is designed to complement, not compete with, the other R packages mentioned above. For example, the `mia` package is recommended for studies focusing on phylogenetic trees, and `phylosmith` allows easy visualization of co-occurrence networks. Using the `MicrobiotaProcess::as.MPSE` function, most of the utilities in the `MicrobiotaProcess` package are available alongside functions from `MiscMetabar`.
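
The conversion mentioned above might look like the sketch below. It assumes that `MicrobiotaProcess` is installed and uses the `GlobalPatterns` example dataset shipped with `phyloseq`; the variable name `mpse` is only illustrative.

```r
library(phyloseq)

# Example phyloseq object distributed with the phyloseq package
data("GlobalPatterns")

# Convert only when MicrobiotaProcess is available (assumption: it is an
# optional dependency, so the chunk is skipped otherwise)
if (requireNamespace("MicrobiotaProcess", quietly = TRUE)) {
  mpse <- MicrobiotaProcess::as.MPSE(GlobalPatterns)
  print(class(mpse))
}
```

The same `phyloseq` object can therefore feed both `MiscMetabar` functions and, after conversion, the `MicrobiotaProcess` utilities.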

I do not try to reinvent the wheel and prefer to rely on existing packages and classes rather than building a new framework. `MiscMetabar` is based on the `phyloseq` class from `phyloseq`, the most cited package in metagenomics [@wen2023]. For a description and comparison of the integrated packages competing with `phyloseq` (e.g. [microeco](https://github.com/ChiLiubio/microeco) by @liu2020, [EasyAmplicon](https://github.com/YongxinLiu/EasyAmplicon) by @liu2023 and [MicrobiomeAnalystR](https://www.microbiomeanalyst.ca) by @lu2023), see @wen2023. Note that some limitations of the `phyloseq` package are circumvented by [phylosmith](https://schuyler-smith.github.io/phylosmith/) [@smith2023], [`microViz`](https://david-barnett.github.io/microViz/) [@Barnett2021] and [`MiscMetabar`](https://adrientaudiere.github.io/MiscMetabar/).

Some packages provide an interactive interface, useful for rapid exploration and for biologists who are new to coding. [Animalcules](https://github.com/compbiomed/animalcules) [@zhao2021] and [`microViz`](https://david-barnett.github.io/microViz/) [@Barnett2021] provide interactive Shiny interfaces, whereas [MicrobiomeAnalystR](https://www.microbiomeanalyst.ca) [@lu2023] is a web-based platform.

# References
