Running AlphaFold on the Della cluster at Princeton University

This repo contains a SLURM job submission script to run AlphaFold on the Della cluster at Princeton University. The script loads the AlphaFold module and then calls the run_alphafold.sh wrapper, which in turn executes the run_alphafold.py script. These scripts were kindly written by Matthew Cahn (MOL) and can be found at /tigress/MOLBIO/local/src/alphafold-<version>/run_alphafold.sh. He also provides SLURM script examples at example-alphafold-<version>-norelax.cmd.

AlphaFold version: 2.3.1
Database date: 2022-12-20

Check that you have access to the Della computing cluster and the Della cryo-EM compute nodes

Open a Terminal window on your computer (Mac) or a wsl window (Windows Subsystem for Linux), or open an internet browser and go to http://mydella.princeton.edu

For Terminal or wsl, try to log onto Della: ssh <netid>@della.princeton.edu. If there are no errors, you have access to Della.
For MyDella, click on the tabs at the top for 'Clusters' and then 'Della Cluster Shell Access'. If there are no errors, you have access to Della.
After connecting to Della, check your group association: groups <netid>. If you see 'cryoem', you likely have access to the compute nodes.
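The group check above can also be scripted. The following is a minimal sketch, not part of this repo: has_group is a hypothetical helper name, and it reads the group list from the standard id -Gn command (which prints only group names) rather than from groups <netid>.

```shell
#!/usr/bin/env bash
# has_group GROUP "LIST": succeed if GROUP appears as a whole word in the
# space-separated LIST (the format printed by `id -Gn`).
# (hypothetical helper, not provided by this repo)
has_group() {
    local wanted="$1" list="$2" g
    for g in $list; do
        if [ "$g" = "$wanted" ]; then
            return 0
        fi
    done
    return 1
}

# Report whether the current user is in the 'cryoem' group.
if has_group cryoem "$(id -Gn)"; then
    echo "cryoem: yes, you likely have access to the compute nodes"
else
    echo "cryoem: no, ask Research Computing to add you"
fi
```

Matching whole words avoids a false positive from a group whose name merely contains 'cryoem'.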

If you don't have access to the Della computing cluster or the Della cryo-EM compute nodes

Email Research Computing: [email protected]

Send them your name, netid, sponsor/PI
Ask for Della access if you don't already have it, and provide your primary group affiliation ('molbio' for MOL/Ploss, 'genomics' for LSI/Adamson)
If you are in Ploss lab, also ask to be added to the 'ploss' group
Let them know that you will be using AlphaFold, and ask them to add you to the 'cryoem' group and for access to QOS della-cryoem

Dear Research Computing,

I am a <student/postdoc/staff> (<netid>, sponsor: Alexander Ploss). I will be working with AlphaFold on the Della cluster, so could you please create an account for me on Della, with the primary group 'molbio', additional groups 'ploss' and 'cryoem', and permit me access to the QOS della-cryoem?

Thank you,
<name>

Log onto the Della cluster and create a project directory

Connect to the Princeton VPN using GlobalProtect (follow 'Installing GlobalProtect Software': https://workcontinuity.princeton.edu/remoteaccess) even if you are on campus
Log onto Della with ssh <netid>@della.princeton.edu from Terminal (Mac) or wsl (Windows Subsystem for Linux), or through http://mydella.princeton.edu
Navigate to your personal directory on the scratch volume, typically cd /scratch/gpfs/<netid>
Create a new project directory with mkdir <project-directory>; cd <project-directory>
Clone this GitHub repo into the directory with git clone https://github.com/09aaronl/alphafold_princeton-della/ .
List the contents of your project directory with ls. You should see 6 files:

alphafold.slurm  EXAMPLE_protein1.fa  EXAMPLE_proteins12.fa  EXAMPLE_proteins.txt  LICENSE  README.md
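If you would rather not compare the listing by eye, the check can be sketched as below. missing_files is a hypothetical helper written for this illustration; the file names are the six listed above.

```shell
#!/usr/bin/env bash
# missing_files DIR: print each expected repo file that is absent from DIR.
# Prints nothing and returns 0 when all six files are present.
# (hypothetical helper, not provided by this repo)
missing_files() {
    local dir="$1" f status=0
    for f in alphafold.slurm EXAMPLE_protein1.fa EXAMPLE_proteins12.fa \
             EXAMPLE_proteins.txt LICENSE README.md; do
        if [ ! -f "$dir/$f" ]; then
            echo "$f"
            status=1
        fi
    done
    return "$status"
}

# Check the current directory (run this from your project directory).
missing_files . || echo "The files listed above are missing; the clone may be incomplete."
```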

Set up your proteins to fold

RECOMMENDED: As a test run, you can use the EXAMPLE files provided. Enter the command cp EXAMPLE_proteins.txt proteins.txt and then skip the next two steps (writing your own proteins.txt and FASTA files) until the test run completes successfully.
Using a text editor like nano or vim, create a text file named proteins.txt with one entry per line: a FASTA file name, a comma, and the mode to use. An entry can be an individual protein ('monomer' mode) or multiple proteins to co-fold together ('multimer' mode). For example:

protein1.fa,monomer
protein2.fa,monomer
proteins12.fa,multimer
...
proteinN.fa,monomer
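A quick format check on proteins.txt can catch typos before a job is queued. The sketch below assumes a strict reading of the format described above (a non-empty name, one comma, then exactly 'monomer' or 'multimer'); how alphafold.slurm itself parses the file is not shown here, and check_proteins_list is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# check_proteins_list FILE: print every line that is not of the form
# "<name>,monomer" or "<name>,multimer"; return non-zero if any were printed.
# (hypothetical helper, not provided by this repo)
check_proteins_list() {
    local file="$1" line n=0 bad=0
    # The "|| [ -n "$line" ]" keeps a final line that lacks a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        n=$((n + 1))
        case "$line" in
            ?*,monomer|?*,multimer) ;;
            *) echo "line $n: $line"; bad=1 ;;
        esac
    done < "$file"
    return "$bad"
}

# Check proteins.txt in the current directory, if it exists.
if [ -f proteins.txt ]; then
    check_proteins_list proteins.txt && echo "proteins.txt looks well-formed"
fi
```

Because the match is exact, lines with trailing spaces, blank lines, or Windows line endings are reported too.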

For each fold, create a FASTA file matching the file name in proteins.txt. EMBOSS has a nice tool for converting text to FASTA format: https://www.ebi.ac.uk/jdispatcher/sfc/emboss_seqret

A monomer (protein1.fa) should look like:

>protein1
MPRTEINSEQWITHTESTWRAPPINGAFTEREIGHTYCHARACTERSPRTEINSEQWITH
TESTWRAPPINGAFTEREIGHTYCHARACTERSPRTEINSEQWITHTESTWRAPPINGAF
TEREIGHTYCHARACTERS

For multimers (proteins12.fa), add all FASTA headers and sequences in the same file:

>protein1
MPRTEINSEQWITHTESTWRAPPINGAFTEREIGHTYCHARACTERSPRTEINSEQWITH
TESTWRAPPINGAFTEREIGHTYCHARACTERSPRTEINSEQWITHTESTWRAPPINGAF
TEREIGHTYCHARACTERS
>protein2
MSECNDPRTEINSEQFRCFLDINGSECNDPRTEINSEQFRCFLDINGSECNDPRTEINSE
QFRCFLDINGSECNDPRTEINSEQFRCFLDING
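To confirm that a FASTA file matches the mode given for it in proteins.txt, you can count its header lines. This is a sanity check sketched under an assumption drawn from the examples above: a 'monomer' file holds exactly one record and a 'multimer' file holds two or more. Both function names are hypothetical.

```shell
#!/usr/bin/env bash
# count_fasta_records FILE: print the number of header lines (starting with '>').
# (hypothetical helper, not provided by this repo)
count_fasta_records() {
    # grep -c exits non-zero when the count is 0; "|| true" hides that status.
    grep -c '^>' "$1" || true
}

# mode_matches FILE MODE: succeed when a 'monomer' file has exactly one record
# or a 'multimer' file has two or more. Returns 2 for an unknown mode.
mode_matches() {
    local n
    n="$(count_fasta_records "$1")"
    n="${n:-0}"
    case "$2" in
        monomer)  [ "$n" -eq 1 ] ;;
        multimer) [ "$n" -ge 2 ] ;;
        *)        return 2 ;;
    esac
}

# Example: check one of the provided test files, if present.
if [ -f EXAMPLE_proteins12.fa ]; then
    mode_matches EXAMPLE_proteins12.fa multimer && echo "EXAMPLE_proteins12.fa: OK for multimer"
fi
```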

Edit the SLURM job submission script

Edit the file alphafold.slurm with a text editor:

Change the line #SBATCH --mail-user=<netid>@princeton.edu to include your netid
Change the line #SBATCH --time=06:00:00 to the maximum expected runtime, typically 1 h per 250 residues
- This does not need to be edited for the test run with the EXAMPLE files
Change the line #SBATCH --array=1-2 to specify which lines to fold from the proteins.txt file, 1-indexed, inclusive
- This does not need to be edited for the test run with the EXAMPLE files

Submit the job with sbatch alphafold.slurm, then monitor the job status periodically with squeue --me
When the job ends, whether in failure or success, you will receive an email notification, and squeue --me should return nothing
List the contents of your project directory with ls. You should see a new folder for each fold, containing PDB files with your predicted structures!
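For the --time and --array edits described above, the values can be estimated from your input files. The sketch below applies the 1 h per 250 residues rule of thumb, rounded up to whole hours, and assumes the rule applies to the total residue count of a FASTA file, which the text above does not spell out for multimers. All three function names are hypothetical.

```shell
#!/usr/bin/env bash
# count_residues FILE: total sequence characters in a FASTA file,
# ignoring header lines and all whitespace.
# (hypothetical helper, not provided by this repo)
count_residues() {
    grep -v '^>' "$1" | tr -d '[:space:]' | wc -c | tr -d ' '
}

# hours_for_residues N: print N/250 rounded up (minimum 1 hour) as HH:00:00.
hours_for_residues() {
    local n="$1" h
    h=$(( (n + 249) / 250 ))
    if [ "$h" -lt 1 ]; then
        h=1
    fi
    printf '%02d:00:00\n' "$h"
}

# array_range FILE: print the --array value that covers every line of FILE.
# grep -c '' also counts a final line that lacks a trailing newline.
array_range() {
    printf '1-%s\n' "$(grep -c '' "$1")"
}

# Example: print suggested values for the current directory, if the files exist.
if [ -f proteins.txt ]; then
    echo "#SBATCH --array=$(array_range proteins.txt)"
fi
if [ -f EXAMPLE_proteins12.fa ]; then
    echo "#SBATCH --time=$(hours_for_residues "$(count_residues EXAMPLE_proteins12.fa)")"
fi
```

Since the time limit applies to each array task separately, base --time on the largest fold in the array rather than on the sum of all folds.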
