Scripts used in the Imperial College MRes Ecology, Evolution and Conservation Programme summer project "Identifying hub genes by weighted gene co-expression network analysis to better understand molecular functional responses to insecticide exposure in bumblebee (Bombus terrestris)".
A summary of the bash scripts submitted to the HPC for bioinformatic analysis can be found in bash_commands.sh, which includes the following sections:
- download RNA-seq raw data from NCBI database using SRA-toolkit
- quality assessment of the .fastq files using fastqc
- align the fastq files to the reference genome iyBomTerr1.2
- sort .sam files and turn them into .bam format using samtools
- check the quality of alignments using qualimap
- count reads for each gene using HTSeq
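
The steps above can be sketched for a single sample as follows. This is a minimal dry-run outline, not a copy of bash_commands.sh: the accession `SRR0000000`, the annotation file name `annotation.gtf`, the output directory names, and the assumption of paired-end reads are all hypothetical placeholders, and the alignment command is left out because the aligner is not named here. By default the script only prints the commands; setting `DRY_RUN=0` would execute them, provided the tools are installed.

```shell
#!/usr/bin/env sh
# Dry-run sketch of the per-sample steps; commands are only printed unless DRY_RUN=0.
set -eu

DRY_RUN="${DRY_RUN:-1}"        # 1 = print the commands, 0 = execute them
ANNOTATION="annotation.gtf"    # hypothetical annotation file for iyBomTerr1.2

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

process_sample() {
    acc="$1"   # SRA run accession (the value used below is a placeholder)
    run mkdir -p qc qualimap
    # Download the run and convert it to .fastq (paired-end layout assumed).
    run prefetch "$acc"
    run fasterq-dump --split-files "$acc"
    # Quality assessment of the raw reads.
    run fastqc -o qc "${acc}_1.fastq" "${acc}_2.fastq"
    # Alignment to iyBomTerr1.2 would go here and produce ${acc}.sam.
    # Sort the alignments, write them as .bam, and index the result.
    run samtools sort -o "${acc}.sorted.bam" "${acc}.sam"
    run samtools index "${acc}.sorted.bam"
    # Alignment quality report.
    run qualimap bamqc -bam "${acc}.sorted.bam" -outdir "qualimap/${acc}"
    # htseq-count writes its table to stdout; redirect it to a file in a real run.
    run htseq-count -f bam -r pos "${acc}.sorted.bam" "$ANNOTATION"
}

process_sample "SRR0000000"
```

Keeping every call behind the small `run` wrapper makes it easy to review the full command list before submitting the job to the HPC scheduler.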
R code can be found in R_codes.R, which includes the following sections:
- normalize the HTSeq count matrix using DESeq2
- perform weighted gene co-expression network analysis using the WGCNA package
- calculate module-wise relations in WGCNA
- correlate modules with treatments in WGCNA
- export the network to Cytoscape
- identify hub genes using dhga
- gene ontology (GO) enrichment analysis using topGO, for both modules and hub genes
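
The DESeq2 normalization step listed above starts from a gene-by-sample count matrix. As a hedged sketch of one way to build that matrix from per-sample htseq-count outputs before loading it into R, the helper below merges the files with awk and drops the `__` summary rows that htseq-count appends. The function name `merge_counts`, the file names, and the gene IDs are hypothetical, and R_codes.R may build the matrix differently (DESeq2 can also read the per-sample files directly with `DESeqDataSetFromHTSeqCount`).

```shell
#!/usr/bin/env sh
set -eu

# merge_counts FILE...
# Print a tab-separated gene x sample matrix from two-column htseq-count files.
# Column names are the file names without directory and extension.
merge_counts() {
    awk -F'\t' '
        FNR == 1 {                      # first line of each input file
            nfiles++
            name = FILENAME
            sub(/.*\//, "", name)       # strip the directory
            sub(/\.[^.]*$/, "", name)   # strip the extension
            header = header "\t" name
        }
        /^__/ { next }                  # skip summary rows such as __no_feature
        {
            if (!($1 in seen)) { seen[$1] = 1; order[++n] = $1 }
            count[$1, nfiles] = $2
        }
        END {
            print "gene" header
            for (i = 1; i <= n; i++) {
                line = order[i]
                for (j = 1; j <= nfiles; j++) {
                    key = order[i] SUBSEP j
                    # genes missing from a file are filled with 0
                    line = line "\t" ((key in count) ? count[key] : 0)
                }
                print line
            }
        }
    ' "$@"
}

# Small demonstration with made-up counts.
tmp="$(mktemp -d)"
printf 'geneA\t10\ngeneB\t0\n__no_feature\t5\n' > "$tmp/control_1.txt"
printf 'geneA\t7\ngeneB\t3\n__no_feature\t2\n' > "$tmp/treated_1.txt"
merge_counts "$tmp/control_1.txt" "$tmp/treated_1.txt"
rm -r "$tmp"
```

Genes keep the order in which they are first seen, so the rows stay aligned with the annotation order of the first input file.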