This repository contains scripts and workflows for conducting microbiome analyses using QIIME2 and R. The workflows are designed to handle metagenomic and taxonomic data, performing tasks such as sequence processing, differential abundance analysis, and visualization.
This repository supports two primary analysis pipelines:
- QIIME2 Workflow: Focused on sequence processing, taxonomic classification, and phylogenetic diversity analysis.
- R Workflow: Performs differential abundance analysis using ANCOMBC2 and creates detailed taxonomic visualizations.
These workflows are complementary and provide end-to-end support for microbiome data analysis.
- QIIME2 (v2021.8 or later)
  - Ensure that plugins such as `dada2`, `feature-classifier`, `phylogeny`, and `taxa` are installed.
- R (v4.0 or later)
  - Required R packages: `tidyverse`, `phyloseq`, `qiime2R`, `microbiome`, `ANCOMBC`, `microViz`, `ggplot2`, `lme4`.
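
One way to check the QIIME2 plugin requirement is to scan the output of `qiime info`. The sketch below assumes that `qiime info` lists each installed plugin on its own line as `name: version`; the helper name `missing_plugins` is hypothetical and not part of this repository.

```shell
#!/usr/bin/env bash
# Sketch: report which of the plugins named above are not installed.
required_plugins="dada2 feature-classifier phylogeny taxa"

# missing_plugins INFO_TEXT
# Prints every required plugin that has no "name: version" line in INFO_TEXT.
missing_plugins() {
    info_text="$1"
    for plugin in $required_plugins; do
        if ! printf '%s\n' "$info_text" | grep -q "^[[:space:]]*${plugin}:"; then
            printf '%s\n' "$plugin"
        fi
    done
}

if command -v qiime >/dev/null 2>&1; then
    missing=$(missing_plugins "$(qiime info)")
    if [ -n "$missing" ]; then
        echo "Missing QIIME2 plugins:" >&2
        printf '%s\n' "$missing" >&2
    else
        echo "All required QIIME2 plugins were found."
    fi
else
    echo "qiime not found on PATH; activate the QIIME2 environment first." >&2
fi
```

The check only looks at plugin names, not versions, so it does not confirm the v2021.8 minimum.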
- QIIME2 Workflow:
  - Raw paired-end sequences in `CasavaOneEightSingleLanePerSampleDirFmt` format.
  - A pre-trained classifier for taxonomic classification (e.g., Silva database).
  - Metadata file in `.tsv` format.
- R Workflow:
  - Exported `.qza` and `.qzv` files from QIIME2.
  - Metadata file with relevant grouping columns.
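
Import errors in the QIIME2 workflow often come from file names that do not follow the Casava 1.8 convention (`<sample-id>_<barcode>_L<lane>_R<1|2>_001.fastq.gz`). The following is a minimal sketch of a pre-flight check. The directory name `raw_reads` and the helper `is_casava_name` are hypothetical, and the pattern is an approximation of what QIIME2 accepts rather than its exact validation logic.

```shell
#!/usr/bin/env bash
# Sketch: flag FASTQ files whose names do not look like Casava 1.8 names.

# is_casava_name NAME
# Succeeds if NAME looks like <sample-id>_<barcode>_L###_R1|R2_001.fastq.gz
is_casava_name() {
    printf '%s\n' "$1" |
        grep -Eq '^.+_.+_L[0-9]{3}_R[12]_001\.fastq\.gz$'
}

# raw_reads is a placeholder; pass your own directory as the first argument.
reads_dir="${1:-raw_reads}"
if [ -d "$reads_dir" ]; then
    for path in "$reads_dir"/*.fastq.gz; do
        [ -e "$path" ] || continue          # skip when the glob matched nothing
        name=$(basename "$path")
        is_casava_name "$name" || echo "Unexpected file name: $name" >&2
    done
fi
```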
The QIIME2 pipeline includes the following steps:
- Data Import: Imports raw paired-end sequence data as a QIIME2 artifact of type `SampleData[PairedEndSequencesWithQuality]`.
- Denoising with DADA2: Removes noise and generates a feature table and representative sequences.
- Taxonomic Classification: Classifies sequences using a pre-trained classifier (e.g., Silva database).
- Diversity Analysis: Computes alpha and beta diversity metrics based on phylogenetic trees.
- Visualization: Produces bar plots, tables, and diversity summaries.
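
The steps above can be sketched as a sequence of QIIME2 commands. This is an illustration under stated assumptions, not the contents of `qiime2_workflow.sh`: every file name, the truncation lengths (`0` disables truncation), and the sampling depth are placeholders, and the diversity step additionally assumes that the `diversity` plugin is installed. By default the sketch only prints the commands; set `DRY_RUN=0` to execute them.

```shell
#!/usr/bin/env bash
# Sketch of the pipeline steps. All paths and numeric values are placeholders.

DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = run them

# run CMD [ARGS...]: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

qiime_pipeline() {
    # 1. Data import
    run qiime tools import \
        --type 'SampleData[PairedEndSequencesWithQuality]' \
        --input-path raw_reads \
        --input-format CasavaOneEightSingleLanePerSampleDirFmt \
        --output-path demux.qza || return 1

    # 2. Denoising with DADA2 (choose truncation lengths from quality plots)
    run qiime dada2 denoise-paired \
        --i-demultiplexed-seqs demux.qza \
        --p-trunc-len-f 0 --p-trunc-len-r 0 \
        --o-table table.qza \
        --o-representative-sequences rep-seqs.qza \
        --o-denoising-stats denoising-stats.qza || return 1

    # 3. Taxonomic classification with a pre-trained classifier
    run qiime feature-classifier classify-sklearn \
        --i-classifier classifier.qza \
        --i-reads rep-seqs.qza \
        --o-classification taxonomy.qza || return 1

    # 4a. Phylogenetic tree needed for the diversity metrics
    run qiime phylogeny align-to-tree-mafft-fasttree \
        --i-sequences rep-seqs.qza \
        --o-alignment aligned-rep-seqs.qza \
        --o-masked-alignment masked-aligned-rep-seqs.qza \
        --o-tree unrooted-tree.qza \
        --o-rooted-tree rooted-tree.qza || return 1

    # 4b. Alpha and beta diversity (sampling depth is a placeholder)
    run qiime diversity core-metrics-phylogenetic \
        --i-phylogeny rooted-tree.qza \
        --i-table table.qza \
        --p-sampling-depth 1000 \
        --m-metadata-file metadata.tsv \
        --output-dir core-metrics-results || return 1

    # 5. Visualization: taxonomic bar plot
    run qiime taxa barplot \
        --i-table table.qza \
        --i-taxonomy taxonomy.qza \
        --m-metadata-file metadata.tsv \
        --o-visualization taxa-bar-plots.qzv || return 1
}

qiime_pipeline
```

Printing the commands first makes it easier to review paths and parameters before committing to a long denoising run.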
The R pipeline is designed for downstream statistical and visualization tasks:
- Import Data: Converts exported QIIME2 `.qza` files into `phyloseq` objects.
- Differential Abundance Analysis:
  - Performs ANCOMBC2 at various taxonomic levels (e.g., Phylum, Genus).
  - Identifies significant taxa with adjusted p-values and log fold changes.
- Visualization:
  - Generates bar plots with error bars for log fold changes.
  - Outputs results as `.tsv` files and high-resolution images.
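
Once the `.tsv` results exist, they can be screened from the command line. The sketch below keeps only rows whose adjusted p-value is below a threshold. The file name `ancombc2_genus.tsv`, the column name `q_val`, and the helper `filter_significant` are assumptions for illustration; the actual column names depend on how the R scripts write the ANCOMBC2 output.

```shell
#!/usr/bin/env bash
# Sketch: keep the header plus rows with an adjusted p-value below ALPHA.

# filter_significant FILE [ALPHA] [COLUMN]
# COLUMN is the header name of the adjusted p-value column (default: q_val).
filter_significant() {
    awk -F '\t' -v alpha="${2:-0.05}" -v name="${3:-q_val}" '
        NR == 1 {
            # Locate the adjusted p-value column by its header name
            for (i = 1; i <= NF; i++) if ($i == name) col = i
            print
            next
        }
        # Skip non-numeric entries such as NA, then apply the threshold
        col && $col ~ /^[-+]?[0-9.]+([eE][-+]?[0-9]+)?$/ && ($col + 0) < alpha
    ' "$1"
}

# ancombc2_genus.tsv is a placeholder file name.
results="ancombc2_genus.tsv"
if [ -f "$results" ]; then
    filter_significant "$results" 0.05
fi
```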
- Update the QIIME2 paths in the Bash script (`qiime2_workflow.sh`).
- Execute the script: `bash qiime2_workflow.sh`