Some biosample metadata scraping efforts from the Pond lab for the ArgosDB project.
We recommend miniconda and mamba. Choose your environment (currently m1 or linux), set ENVIRONMENT to that name, then run:
mamba env create -f environment-$ENVIRONMENT.yml
conda activate fdaargos
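If you would rather not pick the environment name by hand, a small helper can guess it from the platform. This is only a sketch: the function name pick_environment is hypothetical, and the mapping (Apple Silicon macOS to m1, any Linux to linux) is an assumption about what the two environment files target, not something this repository states.

```shell
# Hypothetical helper: map an OS name and CPU architecture to an environment name.
# Assumption: "m1" means macOS on arm64 and "linux" means any Linux machine.
pick_environment() {
  case "$1-$2" in
    Darwin-arm64) echo "m1" ;;
    Linux-*)      echo "linux" ;;
    *)            return 1 ;;
  esac
}

# Print the command rather than running it, so the guess can be checked first.
if ENVIRONMENT="$(pick_environment "$(uname -s)" "$(uname -m)")"; then
  echo "Would run: mamba env create -f environment-$ENVIRONMENT.yml"
else
  echo "No environment file is known for this platform; pick one by hand." >&2
fi
```

On any other platform the helper returns a non-zero status and nothing is chosen for you.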
It is also recommended that you set the environment variables ENTREZ_EMAIL and ENTREZ_API_KEY, substituting your own NCBI email address and API key on the right-hand sides:
export ENTREZ_EMAIL=$ENTREZ_EMAIL
export ENTREZ_API_KEY=$ENTREZ_API_KEY
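Before starting a long run, it may help to confirm that both variables are actually set in the current shell. The check below is a minimal sketch; the function name check_entrez_env is hypothetical and is not part of this repository.

```shell
# Hypothetical helper: warn about each Entrez variable that is unset or empty.
# Returns 0 when both are set, 1 otherwise.
check_entrez_env() {
  missing=0
  for name in ENTREZ_EMAIL ENTREZ_API_KEY; do
    # Look up the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "warning: $name is not set" >&2
      missing=1
    fi
  done
  return "$missing"
}

check_entrez_env || echo "Continuing without full Entrez credentials." >&2
```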
Fetch all biosamples from a given bioproject accession:
snakemake -j 1 bioprojects/$BP_ACCESSION/biosample_links.txt
snakemake -j 1 bioprojects/$BP_ACCESSION/biosamples/all.txt
For instance, try PRJNA231221, the original ArgosDB bioproject.
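Spelled out for that accession, the two targets can be built from BP_ACCESSION as follows. This is a sketch only: the helper name biosample_targets is hypothetical, and the loop prints the commands instead of running them so they can be inspected first.

```shell
# Hypothetical helper: print the two snakemake targets for one bioproject accession,
# in the order they are requested above.
biosample_targets() {
  printf '%s\n' \
    "bioprojects/$1/biosample_links.txt" \
    "bioprojects/$1/biosamples/all.txt"
}

BP_ACCESSION=PRJNA231221
for target in $(biosample_targets "$BP_ACCESSION"); do
  # Drop the echo to actually run snakemake.
  echo "snakemake -j 1 $target"
done
```

Adding snakemake's -n (dry-run) flag to the real commands previews the jobs without fetching anything.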