Lazy Launcher
If you are not interested in how HPC operates, and just want to set up a python environment to run your code, then use the following steps to get started.
An interactive session allows you to run your python code in the terminal just like you do on your local machine. This is ideal for testing and debugging your code. However, the session will be terminated if you close the terminal or vscode.
Remember to connect to NYU VPN/Wifi.
ssh <netid>@greene.hpc.nyu.edu
Replace `<netid>` with your own netid.
You should always save your data in the scratch directory.
cd /scratch/<netid>
git clone https://github.com/RicercarG/NYU-Greene-HPC-Cheatsheet.git
Change to the cheatsheet directory
cd NYU-Greene-HPC-Cheatsheet
Grant execute permission to the `run_setup.sh` script:
chmod +rx run_setup.sh
Run the script to add some essential commands to your `~/.bashrc` file, which will make your life easier:
./run_setup.sh
It's always good practice to request a CPU/GPU node before running any code; check Best Practice on Greene for more.
Download the shell script I wrote for requesting CPU/GPU nodes.
Run the script to request a CPU/GPU node. (fun fact: chs stands for cheatsheet. Typing the full name is too tiring.)
chsdevice.sh
or use the alias shortcut
cdv
Here `chsdevice.sh` and `cdv` call the same script. Note that you can run this script from any directory.
What runtime configuration shall I use?
- `CPU number`: In most cases, 1 or 2 is sufficient.
- `GPU number`: Depends on your project. If you are not familiar with multi-GPU parallel computing, request 1 (or 0 for no GPU).
- `GPU Type`: A100 40GB is the fastest, but you may wait a long time to get allocated; V100 32GB is in the middle; RTX8000 48GB is the slowest but the easiest to get access to.
- `Memory (GB)`: This is the memory for CPUs. 64 works in most cases.
- `Time (hours)`: This is the maximum time you can use the CPU/GPU node. I recommend 4 or 6 hours.
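For reference, a request script like `chsdevice.sh` typically boils down to a single `srun` call. The sketch below only composes and prints such a command (dry run); the variable values mirror the prompts above and are illustrative assumptions, not the script's actual contents:

```shell
#!/bin/bash
# Dry-run sketch: build the srun command a node-request script might execute.
# All values are illustrative defaults, not chsdevice.sh's real internals.
CPUS=2
GPUS=1
GPU_TYPE=rtx8000   # or v100 / a100
MEM=64             # GB of CPU memory
HOURS=4

CMD="srun --cpus-per-task=$CPUS --gres=gpu:$GPU_TYPE:$GPUS --mem=${MEM}GB --time=$HOURS:00:00 --pty /bin/bash"
echo "$CMD"   # on Greene, you would run this command itself instead of printing it
```

For a CPU-only node, drop the `--gres` flag entirely rather than passing `gpu:0`.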
You can think of `singularity` as a container that wraps all the small files of your python libraries into one large file. This way, you won't be bothered by errors caused by exceeding your file-number quota.
The good news is that you no longer have to set up singularity with conda installed from scratch.
Download the shell script for setting and launching singularity.
Run the script to setup singularity and conda.
chslauncher.sh
or use the alias shortcut
clc
Here `chslauncher.sh` and `clc` call the same script, and you can run it from any directory. For convenience, I will use `clc` in the following steps.
What do these prompted options mean during installation?
- `Name Your Singularity Folder`: Since you can have multiple singularity environments, give each singularity folder a unique name. You will use this name to activate the environment. It's good practice to set up a new singularity environment for each project.
- `cuda version`: Depends on your project. If not specified, cuda 11.8 works in most cases.
- `Size of overlay`: This decides how large the environment can grow and how many python libraries you can install. For LLM or Diffusers projects, I empirically recommend `overlay-50G-10M`.
- `Open on demand jupyter notebook?`: Jupyter notebooks run on hpc through `open on demand`. Some extra steps are needed for the notebook to recognize your conda environment; type 'y' to let the script do the work for you.
Run the same script again, and type in the singularity folder name you created in the previous step.
clc
What's the difference between Read and Write mode?
- `Read and Write`: You can add files to the singularity. This is useful when you are setting up your conda environment. However, one singularity overlay can only be written by one process at a time.
- `Read only`: You can only read the files in the singularity environment. This is useful when you want to use a pre-built singularity environment.
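Concretely, the mode maps to a suffix on singularity's `--overlay` flag. The sketch below only prints the two command variants (the overlay and image file names are placeholders; the launcher script supplies your real paths):

```shell
#!/bin/bash
# Placeholder paths -- substitute your actual overlay and image files.
OVERLAY=my_env.ext3
IMAGE=cuda.sif

# Read-write: you can install packages, but only one process may hold
# the overlay in rw mode at a time.
RW_CMD="singularity exec --overlay $OVERLAY:rw $IMAGE /bin/bash"

# Read-only: no installs, but any number of jobs can share the overlay.
RO_CMD="singularity exec --overlay $OVERLAY:ro $IMAGE /bin/bash"

echo "$RW_CMD"
echo "$RO_CMD"
```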
If your terminal prompt changes to `singularity:~$`, you have successfully activated the singularity environment.
When this script finishes, you will see a new folder in your scratch directory named after your singularity folder name. That's where everything used for running python is stored.
Now you can activate conda by typing
source /ext3/env.sh
or use the alias shortcut
se
Then check your conda path:
which conda
- If you see `/ext3/miniconda3/bin/conda`, you are good to go.
- If you get a message like `Illegal option -- Usage: /usr/bin/which [-a] args`, don't panic: run `unset -f which`, then type `which conda` again.
- If nothing is printed after typing `which conda`, check this part of troubleshooting for help.
You can also check python and pip using `which python` and `which pip`. Their paths should be `/ext3/miniconda3/bin/python` and `/ext3/miniconda3/bin/pip` respectively.
Now you are all set. Install your python libraries, and run python using `python file.py`, just like you do in the terminal on your local machine.
Note that the vscode python debugger won't work on HPC, so you have to test your code in the vscode integrated terminal.
If you want to quit the singularity, or run into any other problems, check the troubleshooting guide.
The next time you login to HPC after setting up, all you need to do are:
[1] Change to your scratch directory: `cd /scratch/<netid>`
[2] Request a CPU/GPU node: `cdv` or `chsdevice.sh`
[3] Activate/create the singularity environment: `clc` or `chslauncher.sh`
[4] Activate conda inside singularity (and also activate/create your conda environment if necessary): `se` or `source /ext3/env.sh`
Then you can start testing your python scripts. Just that easy.
Tips: If you are new to using conda on linux, you can google or prompt chatgpt with `How to create conda environment on linux?` for help.
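As a quick reference, creating and activating a project environment inside the singularity (after activating conda with `se`) usually looks like the sketch below. The environment name, python version, and requirements file are placeholders, not something this cheatsheet's scripts create for you:

```shell
# Run these inside the singularity, after `se` has put conda on your PATH.
# "myproject", the python version, and requirements.txt are placeholders.
conda create -n myproject python=3.10
conda activate myproject
pip install -r requirements.txt   # installs land in /ext3, not your home quota
```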
You need to create a python environment that can be recognized by the notebook. Everything is the same as setting up/opening singularity environment for interactive sessions.
The only thing to note is to type 'y' if prompted `Do you want to use this python environment in open on demand jupyter notebook?` when setting up a new singularity, or to select `setup this environment for jupyter notebook in OOD` when launching an existing singularity environment.
This should only be done once for each singularity.
With the singularity environment activated, activate conda using
se
or source /ext3/env.sh
Then activate the conda base environment before installing packages:
conda activate base
You might find this redundant. However, empirically, if you don't, your packages will not be recognized in the jupyter notebook. (I'm not sure why this happens.)
Also, never use `!pip install` in a jupyter notebook, as it will install packages into your home directory rather than into the singularity, which will exceed your quota.
To run a jupyter notebook on HPC, you have to use open on demand and start an interactive jupyter notebook session.
The official ood guide has a nice illustration of how to use the gui.
You don't need to bother with the steps for setting up singularity and conda environments in the official guide, since we've already done that in Step 0.
SLURM is a job scheduler that allows you to run your code in the background, and will keep running even if you close the terminal or vscode. This is ideal for running large-scale experiments that take a long time to finish. However the session will be terminated if your code has bugs, so make sure to test your code in interactive sessions first.
Same as all above, you should have your singularity environment ready before submitting a batch job.
I can't set everything up for you automatically this time, since you are actually writing an automation script yourself. But I did write a template for you to start with.
wget https://raw.githubusercontent.com/RicercarG/NYU-Greene-HPC-Cheatsheet/main/sbatch_template.slurm
Download this to whatever folder is convenient for you, and rename it as you wish.
Open the template with your favorite text editor, and modify the following lines:
[1] Replace `YourEXT3PATH.ext3` with the path to your singularity overlay.
[2] Replace `YourSIFPATH.sif` with the path to your singularity image.
These two paths are printed when you activate the singularity environment using this cheatsheet's launcher script: `clc` or `chslauncher.sh`. After entering your singularity folder name, you will see the paths to your overlay and image printed in green. Paste them into the template accordingly.
[3] Replace `REPLACE THIS WITH YOUR COMMANDS` with your actual python commands. All commands should be the same as what you type in the terminal in an interactive session. For example, to run `mycode.py`, replace it with `python mycode.py`.
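For orientation, a batch template of this shape typically looks like the sketch below. The resource values are illustrative assumptions, `YourEXT3PATH.ext3` and `YourSIFPATH.sif` are the same placeholders as above, and the repo's real `sbatch_template.slurm` may differ in its details:

```shell
#!/bin/bash
# Write an illustrative sbatch template to a file; values are assumptions,
# not the repo's actual sbatch_template.slurm.
cat > sbatch_sketch.slurm <<'EOF'
#!/bin/bash
#SBATCH --job-name=myjob
#SBATCH --cpus-per-task=2
#SBATCH --mem=64GB
#SBATCH --gres=gpu:1
#SBATCH --time=04:00:00

singularity exec --nv \
    --overlay YourEXT3PATH.ext3:ro \
    YourSIFPATH.sif \
    /bin/bash -c "source /ext3/env.sh; python mycode.py"
EOF
echo "directives: $(grep -c '^#SBATCH' sbatch_sketch.slurm)"
```

Mounting the overlay with `:ro` here lets several jobs share one overlay at a time; see the Read/Write discussion earlier in this cheatsheet.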
To submit the job, run:
sbatch sbatch_template.slurm
After you submit the job, you can check its status with
squeue -u $USER
Anything your code prints will be written to `slurm-******.out`.
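A few follow-up commands that are handy once the job is queued (`<jobid>` is a placeholder for whatever job number `sbatch` printed):

```shell
squeue -u $USER            # list your pending/running jobs
scancel <jobid>            # cancel a job you no longer need
tail -f slurm-<jobid>.out  # follow your job's printed output live
```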