From 95c40e61db52eb5269aa71d30324261010b5420b Mon Sep 17 00:00:00 2001
From: Mats Rynge
Date: Thu, 1 Aug 2024 10:41:18 -0700
Subject: [PATCH 1/2] Software exercises - 2024 updates

---
 docs/materials/software/part1-ex1-run-apptainer.md | 4 ++--
 docs/materials/software/part1-ex2-apptainer-jobs.md | 4 +++-
 docs/materials/software/part1-ex3-docker-jobs.md | 6 ++++--
 .../materials/software/part1-ex4-apptainer-build.md | 6 +++---
 .../software/part2-ex1-build-executable.md | 6 +++---
 .../software/part3-ex1-apptainer-recipes.md | 10 +++++-----
 docs/materials/software/part3-ex2-docker-build.md | 6 +++---
 docs/materials/software/part4-ex1-download.md | 6 +++---
 docs/materials/software/part4-ex2-wrapper.md | 4 ++--
 docs/materials/software/part4-ex3-arguments.md | 10 +++++-----
 docs/materials/software/part5-ex1-prepackaged.md | 13 ++++++-------
 docs/materials/software/part5-ex2-python.md | 4 ++--
 12 files changed, 41 insertions(+), 38 deletions(-)

diff --git a/docs/materials/software/part1-ex1-run-apptainer.md b/docs/materials/software/part1-ex1-run-apptainer.md
index 13424e1..bb51033 100644
--- a/docs/materials/software/part1-ex1-run-apptainer.md
+++ b/docs/materials/software/part1-ex1-run-apptainer.md
@@ -33,12 +33,12 @@ Exploring Apptainer Containers
 First, let's try to run a container from the [OSG-Supported List](https://portal.osg-htc.org/documentation/htc_workloads/using_software/available-containers-list/).
-1. Find the full path for the `ubuntu 20.04` container image.
+1. Find the full path for the `ubuntu 22.04` container image.
 1. To run it, use this command:
 :::console
- $ apptainer shell /cvmfs/singularity.opensciencegrid.org/opensciencegrid/osgvo-ubuntu-20.04:latest
+ $ apptainer shell /cvmfs/singularity.opensciencegrid.org/htc/ubuntu:22.04
 It may take a few minutes to start - don't worry if this happens.
diff --git a/docs/materials/software/part1-ex2-apptainer-jobs.md b/docs/materials/software/part1-ex2-apptainer-jobs.md
index 59423ff..bd37c7b 100644
--- a/docs/materials/software/part1-ex2-apptainer-jobs.md
+++ b/docs/materials/software/part1-ex2-apptainer-jobs.md
@@ -60,7 +60,9 @@ Now, let's try running that same script inside a container.
 :::file
 universe = container
- container_image = /cvmfs/singularity.opensciencegrid.org/opensciencegrid/osgvo-ubuntu-20.04:latest
+ container_image = /cvmfs/singularity.opensciencegrid.org/htc/ubuntu:22.04
+
+1. If the submit file you copied has something like `requirements = (OSGVO_OS_STRING == "RHEL 9")`, remove that. When you use containers, you should not specify an OS in the requirements as that will unnecessarily limit the number of resources you can run on.
 1. Submit the job and read the standard output file when it completes. What version of Linux was used for the job? What is the version of `gcc`? or Python?
diff --git a/docs/materials/software/part1-ex3-docker-jobs.md b/docs/materials/software/part1-ex3-docker-jobs.md
index c929a33..bb4366b 100644
--- a/docs/materials/software/part1-ex3-docker-jobs.md
+++ b/docs/materials/software/part1-ex3-docker-jobs.md
@@ -41,7 +41,7 @@ Submit File and Executable
 1. Use the same executable as the [previous exercise](../part1-ex2-apptainer-jobs).
-1. Once these steps are done, submit the job.
+1. Once these steps are done, submit the job. You might get a warning about using OSDF for contaier transfers - ignore this warning for now.
 Finding Docker Containers
 -------------
@@ -54,9 +54,11 @@ created equal. Anyone can create an account on Docker Hub and share container im
 - There is a Dockerfile or other listing of what has been installed to the container image.
 - The container image page has documentation on how to use the container image. [^1]
+Given these indicators:
+
 1.
Can you find a container on [Docker Hub](https://hub.docker.com/) that would be useful for running Jupyter notebooks that use tensorflow?
 1. Does your chosen image meet at least 2 of the criteria above?
-[^1]: This list and previous text taken from [Introduction to Docker](https://carpentries-incubator.github.io/docker-introduction/)
\ No newline at end of file
+[^1]: This list and previous text taken from [Introduction to Docker](https://carpentries-incubator.github.io/docker-introduction/)
diff --git a/docs/materials/software/part1-ex4-apptainer-build.md b/docs/materials/software/part1-ex4-apptainer-build.md
index fed1f7e..db83555 100644
--- a/docs/materials/software/part1-ex4-apptainer-build.md
+++ b/docs/materials/software/part1-ex4-apptainer-build.md
@@ -53,7 +53,7 @@ when it builds the container image.
 :::file
 Bootstrap: docker
- From: opensciencegrid/osgvo-ubuntu-20.04:latest
+ From: hub.opensciencegrid.org/htc/ubuntu:22.04
 %post
 apt-get update -y
@@ -93,12 +93,12 @@ allow us to test our new container.
 1. Try running:
 :::console
- $ singularity shell py-cowsay.sif
+ $ apptainer shell py-cowsay.sif
 1. Then try running the `hello-cow.py` script:
 :::console
- Singularity< ./hello-cow.py
+ apptainer> ./hello-cow.py
 1. If it produces an output, our container works! We can now exit (by typing `exit`) and submit a job.
diff --git a/docs/materials/software/part2-ex1-build-executable.md b/docs/materials/software/part2-ex1-build-executable.md
index 9ebc29b..a022222 100644
--- a/docs/materials/software/part2-ex1-build-executable.md
+++ b/docs/materials/software/part2-ex1-build-executable.md
@@ -18,7 +18,7 @@ Setup
 1. Download and unzip a set of Protein Data Bank (PDB) files:
 :::console
- $ wget http://proxy.chtc.wisc.edu/SQUID/osg-school-2023/alkanes.tar.gz
+ $ wget http://proxy.chtc.wisc.edu/SQUID/osg-school-2024/alkanes.tar.gz
 $ tar -xzf alkanes.tar.gz
 1.
For these exercises, we are going to run a command that counts the number of
@@ -43,7 +43,7 @@ run the script, we will add the following header on the first line: ``#!/bin/bas
 The "header" of `#!/bin/bash` will tell the computer that this is a bash shell script and can be run in the same way that you would run individual commands on the command line. We use `/bin/bash` instead of just `bash` because that is the full path to the `bash`
- software file. (Run `which bash` to check!)
+ software file.
 !!! note "Other languages"
 We can use the same principle for any scripting language. For example, the header for a Python script
@@ -147,4 +147,4 @@ Your Work
 If so, what should it be?
 1. What items in your main code or commands are changing? Do you need to add arguments
-to your code?
\ No newline at end of file
+to your code?
diff --git a/docs/materials/software/part3-ex1-apptainer-recipes.md b/docs/materials/software/part3-ex1-apptainer-recipes.md
index 8b64afe..30c1efa 100644
--- a/docs/materials/software/part3-ex1-apptainer-recipes.md
+++ b/docs/materials/software/part3-ex1-apptainer-recipes.md
@@ -25,12 +25,12 @@ Where to start
 :::file
 Bootstrap: docker
- From: opensciencegrid/osgvo-ubuntu-20.04:latest
+ From: hub.opensciencegrid.org/htc/ubuntu:22.04
 A custom container always is always built on an existing container. It is common
-to use a container on Docker Hub. These lines tell Apptainer to pull the
-pre-existing image from Docker Hub, and to use it as the base for the
-container that will be built using this definition file.
+to use a container on Docker Hub, or in this case, hub.opensciencegrid.org.
+These lines tell Apptainer to pull the pre-existing image from the hub, and to
+use it as the base for the container that will be built using this definition file.
 When choosing a base container, try to find one that has most of what you need - for example, if you want to install R packages, try to find a container that already
@@ -95,4 +95,4 @@ Environment
 To set environment variables (especially useful for software in a custom location), use the `%environment` section of the definition file.
-[^1]: This text and previous list taken from [Introduction to Docker](https://carpentries-incubator.github.io/docker-introduction/)
\ No newline at end of file
+[^1]: This text and previous list taken from [Introduction to Docker](https://carpentries-incubator.github.io/docker-introduction/)
diff --git a/docs/materials/software/part3-ex2-docker-build.md b/docs/materials/software/part3-ex2-docker-build.md
index 57ffa61..c9872d7 100644
--- a/docs/materials/software/part3-ex2-docker-build.md
+++ b/docs/materials/software/part3-ex2-docker-build.md
@@ -91,14 +91,14 @@ new container, we will want to use a similar naming scheme of:
 In what follows, you will want to replace `USERNAME` with your DockerHub user name. The `CONTAINER` name and `VERSIONTAG` are your choice; in what follows, we will
- use `py3-numpy` as the container name and `2021-08` as the version tag.
+ use `py3-numpy` as the container name and `2024-08` as the version tag.
 1. To build and name the new container, open a command line window on your computer where you can run Docker commands. Use the `cd` command to change your working directory to the build directory with the `Dockerfile` inside.
 :::console
- $ docker build -t USERNAME/py3-numpy:2021-08 .
+ $ docker build -t USERNAME/py3-numpy:2024-08 .
 Note the `.` at the end of the command! This indicates that we're using the current directory as our build environment, including the `Dockerfile` inside.
@@ -113,7 +113,7 @@ elsewhere, it needs to be added to a public registry like Docker Hub.
 command line:
 :::console
- $ docker push USERNAME/py3-numpy:2021-08
+ $ docker push USERNAME/py3-numpy:2024-08
 If the push doesn't work, you may need to run `docker login` first, enter your Docker Hub username and password and then try the push again.
diff --git a/docs/materials/software/part4-ex1-download.md b/docs/materials/software/part4-ex1-download.md
index 5819f6e..dc28efa 100644
--- a/docs/materials/software/part4-ex1-download.md
+++ b/docs/materials/software/part4-ex1-download.md
@@ -63,7 +63,7 @@ it.
 If you want to do this all from the command line, the sequence will look like this (using `wget` as the download command.)
 :::console
- user@login $ wget https://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/ncbi-blast-2.15.0+-x64-linux.tar.gz
+ user@login $ wget https://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/2.15.0/ncbi-blast-2.15.0+-x64-linux.tar.gz
 user@login $ tar -xzf ncbi-blast-2.15.0+-x64-linux.tar.gz
 1. We're going to be using the `blastx` binary in our job. Where is it
@@ -80,8 +80,8 @@ we'll use an abbreviated fasta file with mouse genome information.
 1. Download these files to your current directory:
 :::console
- username@login $ wget http://proxy.chtc.wisc.edu/SQUID/osg-school-2023/pdbaa.tar.gz
- username@login $ wget http://proxy.chtc.wisc.edu/SQUID/osg-school-2023/mouse.fa
+ username@login $ wget http://proxy.chtc.wisc.edu/SQUID/osg-school-2024/pdbaa.tar.gz
+ username@login $ wget http://proxy.chtc.wisc.edu/SQUID/osg-school-2024/mouse.fa
 1. Untar the `pdbaa` database:
diff --git a/docs/materials/software/part4-ex2-wrapper.md b/docs/materials/software/part4-ex2-wrapper.md
index c297ed4..dfbc8e8 100644
--- a/docs/materials/software/part4-ex2-wrapper.md
+++ b/docs/materials/software/part4-ex2-wrapper.md
@@ -34,7 +34,7 @@ Our wrapper script will be a bash script that runs several commands.
 :::bash
 #!/bin/bash
- ncbi-blast-2.15.0+/bin/blastx -db pdbaa/pdbaa -query mouse.fa -out results.txt
+ ./ncbi-blast-2.15.0+/bin/blastx -db pdbaa/pdbaa -query mouse.fa -out results.txt
 Submit File Changes
@@ -76,7 +76,7 @@ Now that our database and BLAST software are being transferred to the job as `ta
 tar -xzf ncbi-blast-2.15.0+-x64-linux.tar.gz
 tar -xzf pdbaa.tar.gz
- ncbi-blast-2.15.0+/bin/blastx -db pdbaa/pdbaa -query mouse.fa -out results2.txt
+ ./ncbi-blast-2.15.0+/bin/blastx -db pdbaa/pdbaa -query mouse.fa -out results2.txt
 1. While not strictly necessary, it's a good idea to enable executable permissions on the wrapper script, like so:
diff --git a/docs/materials/software/part4-ex3-arguments.md b/docs/materials/software/part4-ex3-arguments.md
index 6d71fbd..43a907c 100644
--- a/docs/materials/software/part4-ex3-arguments.md
+++ b/docs/materials/software/part4-ex3-arguments.md
@@ -50,7 +50,7 @@ and third arguments, respectively. Thus, in the main command of the script,
 replace the various names with these variables:
 :::bash
- ncbi-blast-2.15.0+/bin/blastx -db $1/$1 -query $2 -out $3
+ ./ncbi-blast-2.15.0+/bin/blastx -db $1/$1 -query $2 -out $3
 > If your wrapper script is in a different language, you should use that language's syntax for reading in variables from the command line.
@@ -71,12 +71,12 @@ One of the downsides of this approach, is that our command has become harder to
 read. The original script contains all the information at a glance:
 :::bash
- ncbi-blast-2.15.0+/bin/blastx -db pdbaa/pdbaa -query mouse.fa -out results2.txt
+ ./ncbi-blast-2.15.0+/bin/blastx -db pdbaa/pdbaa -query mouse.fa -out results2.txt
 But our new version is more cryptic -- what is `$1`?:
 :::bash
- ncbi-blast-2.15.1+/bin/blastx -db $1 -query $2 -out $3
+ ./ncbi-blast-2.15.0+/bin/blastx -db $1 -query $2 -out $3
 One way to overcome this is to create our own variable names inside the wrapper script and assign the argument values to them.
Here is an example for our
@@ -89,10 +89,10 @@ BLAST script:
 INFILE=$2
 OUTFILE=$3
- tar -xzf ncbi-blast-2.15.1+-x64-linux.tar.gz
+ tar -xzf ncbi-blast-2.15.0+-x64-linux.tar.gz
 tar -xzf pdbaa.tar.gz
- ncbi-blast-2.15.1+/bin/blastx -db $DATABASE/$DATABASE -query $INFILE -out $OUTFILE
+ ./ncbi-blast-2.15.0+/bin/blastx -db $DATABASE/$DATABASE -query $INFILE -out $OUTFILE
 Here, we are assigning the input arguments (`$1`, `$2` and `$3`) to new variable names, and then using **those** names (`$DATABASE`, `$INFILE`, and `$OUTFILE`) in the command,
diff --git a/docs/materials/software/part5-ex1-prepackaged.md b/docs/materials/software/part5-ex1-prepackaged.md
index afe8eb1..09c8168 100644
--- a/docs/materials/software/part5-ex1-prepackaged.md
+++ b/docs/materials/software/part5-ex1-prepackaged.md
@@ -23,8 +23,7 @@ For this exercise, we will be using the bioinformatics package HMMER. HMMER is a
 1. Do an internet search to find the HMMER software downloads page and the installation instructions page. On the installation page, there are short instructions for how to install HMMER. There are two options shown for installation -- which should we use?
-1. For the purposes of this example, we are going to use the instructions under the heading "...to obtain and compile from source." Download the HMMER source as shown in
-these instructions (command should start with `wget`)
+1. For the purposes of this example, we are going to use the instructions under the "Current version" heading, with the "Source" link. Download the HMMER source using wget.
 1. Go back to the installation documentation page and look at the steps for compiling from source. This process
@@ -44,7 +43,7 @@ for this example, we are going to compile directly on the Access Point
 1. Now run the commands to unpack the source code:
 :::console
- username@host $ tar -zxf hmmer.tar.gz
+ username@host $ tar -zxf hmmer-3.4.tar.gz
 username@host $ cd hmmer-3.4
 1. Now we can follow the second set of installation instructions.
For the prefix, we'll use the variable `$PWD` to capture the name of our current working directory and then a relative path to the `hmmer-build` directory we created in step 1:
@@ -75,7 +74,7 @@ Note that we now have two tarballs in our directory -- the *source* tarball (`hm
 Wrapper Script
 --------------
-Now that we've created our portable installation, we need to write a script that opens and uses the installation, similar to the process we used in a [previous exercise](../part4-ex2-wrapper). These steps should be performed back on the submit server (`ap1.facility.path-cc.io`).
+Now that we've created our portable installation, we need to write a script that opens and uses the installation, similar to the process we used in a [previous exercise](../part4-ex2-wrapper). These steps should be performed back on the access point.
 1. Create a script called `run_hmmer.sh`.
@@ -100,7 +99,7 @@ Now that we've created our portable installation, we need to write a script that
 1. Make sure the wrapper script has executable permissions:
 :::console
- username@ap1 $ chmod u+x run_HMMER.sh
+ username@login $ chmod u+x run_hmmer.sh
 Run a HMMER job
@@ -112,7 +111,7 @@ We're almost ready! We need two more pieces to run a HMMER job.
 run the job. You already have these files back in the directory where you unpacked the source code:
 :::console
- username@ap1 $ ls hmmer-3.4/tutorial
+ username@login $ ls hmmer-3.4/tutorial
 7LESS_DROME fn3.hmm globins45.fa globins4.sto MADE1.hmm Pkinase.hmm
 dna_target.fa fn3.sto globins4.hmm HBB_HUMAN MADE1.sto Pkinase.sto
@@ -139,4 +138,4 @@ run the job. You already have these files back in the directory where you unpack
 !!!
note
 For a very similar compiling example, see this guide on how to
- compile `samtools`: [Example Software Compilation](https://support.opensciencegrid.org/support/solutions/articles/12000074984-example-software-compilation)
+ compile `samtools`: [Example Software Compilation](https://portal.osg-htc.org/documentation/htc_workloads/using_software/example-compilation/)
diff --git a/docs/materials/software/part5-ex2-python.md b/docs/materials/software/part5-ex2-python.md
index acb5417..849ae39 100644
--- a/docs/materials/software/part5-ex2-python.md
+++ b/docs/materials/software/part5-ex2-python.md
@@ -17,7 +17,7 @@ Background
 **Why learn this?**: This is very similar to the [previous exercise](part5-ex1-prepackaged.md).
-Interactive Job for Pre-Building
+Pre-Building
 --------------------------------
 The first step in our job process is building a Python installation that we can package up.
@@ -26,7 +26,7 @@ The first step in our job process is building a Python installation that we can
 1. Download the Python source code from .
 :::console
- username@ap1 $ wget https://www.python.org/ftp/python/3.10.5/Python-3.10.5.tgz
+ username@login $ wget https://www.python.org/ftp/python/3.10.5/Python-3.10.5.tgz
 1. First, we have to determine how to install Python to a specific location in our working directory.
 1. Untar the Python source tarball (`tar -xzf Python-3.10.5.tgz`) and look at the `README.rst` file in the `Python-3.10.5` directory (`cd Python-3.10.5`). You'll want to look for the "Build Instructions" header. What will the main installation steps be? What command is required for the final installation? Once you've tried to answer these questions, move to the next step.
From 73375a4420e942e1dbcc3285380743f0de081f17 Mon Sep 17 00:00:00 2001
From: Amber Lim <59936462+xamberl@users.noreply.github.com>
Date: Thu, 1 Aug 2024 13:11:09 -0500
Subject: [PATCH 2/2] Update part1-ex3-docker-jobs.md

minor typo
---
 docs/materials/software/part1-ex3-docker-jobs.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/materials/software/part1-ex3-docker-jobs.md b/docs/materials/software/part1-ex3-docker-jobs.md
index bb4366b..ab1aaf8 100644
--- a/docs/materials/software/part1-ex3-docker-jobs.md
+++ b/docs/materials/software/part1-ex3-docker-jobs.md
@@ -41,7 +41,7 @@ Submit File and Executable
 1. Use the same executable as the [previous exercise](../part1-ex2-apptainer-jobs).
-1. Once these steps are done, submit the job. You might get a warning about using OSDF for contaier transfers - ignore this warning for now.
+1. Once these steps are done, submit the job. You might get a warning about using OSDF for container transfers - ignore this warning for now.
 Finding Docker Containers
 -------------