Merge pull request #25 from rynge/rynge-software
Software exercises - 2024 updates
xamberl authored Aug 1, 2024
2 parents 0f63781 + 73375a4 commit 239f005
Showing 12 changed files with 41 additions and 38 deletions.
4 changes: 2 additions & 2 deletions docs/materials/software/part1-ex1-run-apptainer.md
@@ -33,12 +33,12 @@ Exploring Apptainer Containers

First, let's try to run a container from the [OSG-Supported List](https://portal.osg-htc.org/documentation/htc_workloads/using_software/available-containers-list/).

1. Find the full path for the `ubuntu 20.04` container image.
1. Find the full path for the `ubuntu 22.04` container image.

1. To run it, use this command:

:::console
$ apptainer shell /cvmfs/singularity.opensciencegrid.org/opensciencegrid/osgvo-ubuntu-20.04:latest
$ apptainer shell /cvmfs/singularity.opensciencegrid.org/htc/ubuntu:22.04

It may take a few minutes to start - don't worry if this happens.

4 changes: 3 additions & 1 deletion docs/materials/software/part1-ex2-apptainer-jobs.md
@@ -60,7 +60,9 @@ Now, let's try running that same script inside a container.

:::file
universe = container
container_image = /cvmfs/singularity.opensciencegrid.org/opensciencegrid/osgvo-ubuntu-20.04:latest
container_image = /cvmfs/singularity.opensciencegrid.org/htc/ubuntu:22.04

1. If the submit file you copied has something like `requirements = (OSGVO_OS_STRING == "RHEL 9")`, remove that. When you use containers, you should not specify an OS in the requirements as that will unnecessarily limit the number of resources you can run on.

1. Submit the job and read the standard output file when it completes.
What version of Linux was used for the job? What is the version of `gcc`? Of Python?
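
One way to spot and drop an OS requirement from the command line is sketched below. The file name `container-job.sub` and its abbreviated contents are made up for illustration; substitute your own submit file.

```shell
# Hypothetical, abbreviated submit file created in a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/container-job.sub" <<'EOF'
universe = container
container_image = /cvmfs/singularity.opensciencegrid.org/htc/ubuntu:22.04
requirements = (OSGVO_OS_STRING == "RHEL 9")
queue 1
EOF

# Show any line (with its line number) that pins the job to a specific OS.
grep -n 'OSGVO_OS_STRING' "$workdir/container-job.sub"

# Write a copy without that line; the original file is left untouched.
grep -v 'OSGVO_OS_STRING' "$workdir/container-job.sub" > "$workdir/container-job-fixed.sub"
cat "$workdir/container-job-fixed.sub"
```

Writing to a new file rather than editing in place keeps the original around in case the `requirements` line held other conditions you still need.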
6 changes: 4 additions & 2 deletions docs/materials/software/part1-ex3-docker-jobs.md
@@ -41,7 +41,7 @@ Submit File and Executable

1. Use the same executable as the [previous exercise](../part1-ex2-apptainer-jobs).

1. Once these steps are done, submit the job.
1. Once these steps are done, submit the job. You might get a warning about using OSDF for container transfers - ignore this warning for now.

Finding Docker Containers
-------------
@@ -54,9 +54,11 @@ created equal. Anyone can create an account on Docker Hub and share container im
- There is a Dockerfile or other listing of what has been installed to the container image.
- The container image page has documentation on how to use the container image. [^1]

Given these indicators:

1. Can you find a container on [Docker Hub](https://hub.docker.com/) that would be
useful for running Jupyter notebooks that use tensorflow?

1. Does your chosen image meet at least 2 of the criteria above?

[^1]: This list and previous text taken from [Introduction to Docker](https://carpentries-incubator.github.io/docker-introduction/)
6 changes: 3 additions & 3 deletions docs/materials/software/part1-ex4-apptainer-build.md
@@ -53,7 +53,7 @@ when it builds the container image.

:::file
Bootstrap: docker
From: opensciencegrid/osgvo-ubuntu-20.04:latest
From: hub.opensciencegrid.org/htc/ubuntu:22.04

%post
apt-get update -y
@@ -93,12 +93,12 @@ allow us to test our new container.
1. Try running:

:::console
$ singularity shell py-cowsay.sif
$ apptainer shell py-cowsay.sif

1. Then try running the `hello-cow.py` script:

:::console
Singularity> ./hello-cow.py
Apptainer> ./hello-cow.py

1. If it produces an output, our container works! We can now exit (by typing `exit`)
and submit a job.
6 changes: 3 additions & 3 deletions docs/materials/software/part2-ex1-build-executable.md
@@ -18,7 +18,7 @@ Setup
1. Download and unzip a set of Protein Data Bank (PDB) files:

:::console
$ wget http://proxy.chtc.wisc.edu/SQUID/osg-school-2023/alkanes.tar.gz
$ wget http://proxy.chtc.wisc.edu/SQUID/osg-school-2024/alkanes.tar.gz
$ tar -xzf alkanes.tar.gz

1. For these exercises, we are going to run a command that counts the number of
@@ -43,7 +43,7 @@ run the script, we will add the following header on the first line: ``#!/bin/bas
The "header" of `#!/bin/bash` will tell the computer that this is a bash shell script
and can be run in the same way that you would run individual commands on the command line.
We use `/bin/bash` instead of just `bash` because that is the full path to the `bash`
software file. (Run `which bash` to check!)
software file.

!!! note "Other languages"
We can use the same principle for any scripting language. For example, the header for a Python script
@@ -147,4 +147,4 @@ Your Work
If so, what should it be?

1. What items in your main code or commands are changing? Do you need to add arguments
to your code?
10 changes: 5 additions & 5 deletions docs/materials/software/part3-ex1-apptainer-recipes.md
@@ -25,12 +25,12 @@ Where to start

:::file
Bootstrap: docker
From: opensciencegrid/osgvo-ubuntu-20.04:latest
From: hub.opensciencegrid.org/htc/ubuntu:22.04

A custom container is always built on an existing container. It is common
to use a container on Docker Hub. These lines tell Apptainer to pull the
pre-existing image from Docker Hub, and to use it as the base for the
container that will be built using this definition file.
to use a container on Docker Hub, or in this case, hub.opensciencegrid.org.
These lines tell Apptainer to pull the pre-existing image from the hub, and to
use it as the base for the container that will be built using this definition file.

When choosing a base container, try to find one that has most of what you need - for
example, if you want to install R packages, try to find a container that already
@@ -95,4 +95,4 @@ Environment
To set environment variables (especially useful for software in a custom location),
use the `%environment` section of the definition file.

[^1]: This text and previous list taken from [Introduction to Docker](https://carpentries-incubator.github.io/docker-introduction/)
6 changes: 3 additions & 3 deletions docs/materials/software/part3-ex2-docker-build.md
@@ -91,14 +91,14 @@ new container, we will want to use a similar naming scheme of:

In what follows, you will want to replace `USERNAME` with your DockerHub user name.
The `CONTAINER` name and `VERSIONTAG` are your choice; in what follows, we will
use `py3-numpy` as the container name and `2021-08` as the version tag.
use `py3-numpy` as the container name and `2024-08` as the version tag.

1. To build and name the new container, open a command line window on your computer
where you can run Docker commands. Use the `cd` command to change your working directory
to the build directory with the `Dockerfile` inside.

:::console
$ docker build -t USERNAME/py3-numpy:2021-08 .
$ docker build -t USERNAME/py3-numpy:2024-08 .
Note the `.` at the end of the command! This indicates that we're using the current
directory as our build environment, including the `Dockerfile` inside.
@@ -113,7 +113,7 @@ elsewhere, it needs to be added to a public registry like Docker Hub.
command line:

:::console
$ docker push USERNAME/py3-numpy:2021-08
$ docker push USERNAME/py3-numpy:2024-08

If the push doesn't work, you may need to run `docker login` first, enter your
Docker Hub username and password and then try the push again.
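
Because the same `USERNAME/CONTAINER:VERSIONTAG` string has to appear in both the build and the push command, it can help to compose it once in shell variables. The sketch below only prints the two commands instead of running them, so it works even where Docker is not installed; the variable names are illustrative choices, not something Docker requires.

```shell
# Placeholder values: replace USERNAME with your Docker Hub user name.
USERNAME="USERNAME"
CONTAINER="py3-numpy"
VERSIONTAG="2024-08"

# Compose the full image name once so build and push always agree.
IMAGE="${USERNAME}/${CONTAINER}:${VERSIONTAG}"

# Print the commands rather than executing them.
echo "docker build -t ${IMAGE} ."
echo "docker push ${IMAGE}"
```

Once the printed commands look right, they can be run directly from the build directory.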
6 changes: 3 additions & 3 deletions docs/materials/software/part4-ex1-download.md
@@ -63,7 +63,7 @@ it. If you want to do this all from the command line, the sequence will
look like this (using `wget` as the download command):

:::console
user@login $ wget https://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/ncbi-blast-2.15.0+-x64-linux.tar.gz
user@login $ wget https://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/2.15.0/ncbi-blast-2.15.0+-x64-linux.tar.gz
user@login $ tar -xzf ncbi-blast-2.15.0+-x64-linux.tar.gz

1. We're going to be using the `blastx` binary in our job. Where is it
@@ -80,8 +80,8 @@ we'll use an abbreviated fasta file with mouse genome information.
1. Download these files to your current directory:

:::console
username@login $ wget http://proxy.chtc.wisc.edu/SQUID/osg-school-2023/pdbaa.tar.gz
username@login $ wget http://proxy.chtc.wisc.edu/SQUID/osg-school-2023/mouse.fa
username@login $ wget http://proxy.chtc.wisc.edu/SQUID/osg-school-2024/pdbaa.tar.gz
username@login $ wget http://proxy.chtc.wisc.edu/SQUID/osg-school-2024/mouse.fa

1. Untar the `pdbaa` database:

4 changes: 2 additions & 2 deletions docs/materials/software/part4-ex2-wrapper.md
@@ -34,7 +34,7 @@ Our wrapper script will be a bash script that runs several commands.
:::bash
#!/bin/bash
ncbi-blast-2.15.0+/bin/blastx -db pdbaa/pdbaa -query mouse.fa -out results.txt
./ncbi-blast-2.15.0+/bin/blastx -db pdbaa/pdbaa -query mouse.fa -out results.txt


Submit File Changes
@@ -76,7 +76,7 @@ Now that our database and BLAST software are being transferred to the job as `ta
tar -xzf ncbi-blast-2.15.0+-x64-linux.tar.gz
tar -xzf pdbaa.tar.gz

ncbi-blast-2.15.0+/bin/blastx -db pdbaa/pdbaa -query mouse.fa -out results2.txt
./ncbi-blast-2.15.0+/bin/blastx -db pdbaa/pdbaa -query mouse.fa -out results2.txt

1. While not strictly necessary, it's a good idea to enable executable permissions on the wrapper script, like so:

10 changes: 5 additions & 5 deletions docs/materials/software/part4-ex3-arguments.md
@@ -50,7 +50,7 @@ and third arguments, respectively. Thus, in the main command of the script,
replace the various names with these variables:

:::bash
ncbi-blast-2.15.0+/bin/blastx -db $1/$1 -query $2 -out $3
./ncbi-blast-2.15.0+/bin/blastx -db $1/$1 -query $2 -out $3

> If your wrapper script is in a different language, you should use
that language's syntax for reading in variables from the command line.
@@ -71,12 +71,12 @@ One of the downsides of this approach, is that our command has become
harder to read. The original script contains all the information at a glance:

:::bash
ncbi-blast-2.15.0+/bin/blastx -db pdbaa/pdbaa -query mouse.fa -out results2.txt
./ncbi-blast-2.15.0+/bin/blastx -db pdbaa/pdbaa -query mouse.fa -out results2.txt

But our new version is more cryptic -- what is `$1`?:

:::bash
ncbi-blast-2.15.1+/bin/blastx -db $1 -query $2 -out $3
./ncbi-blast-2.15.0+/bin/blastx -db $1 -query $2 -out $3

One way to overcome this is to create our own variable names inside the wrapper
script and assign the argument values to them. Here is an example for our
@@ -89,10 +89,10 @@ BLAST script:
INFILE=$2
OUTFILE=$3

tar -xzf ncbi-blast-2.15.1+-x64-linux.tar.gz
tar -xzf ncbi-blast-2.15.0+-x64-linux.tar.gz
tar -xzf pdbaa.tar.gz

ncbi-blast-2.15.1+/bin/blastx -db $DATABASE/$DATABASE -query $INFILE -out $OUTFILE
./ncbi-blast-2.15.0+/bin/blastx -db $DATABASE/$DATABASE -query $INFILE -out $OUTFILE

Here, we are assigning the input arguments (`$1`, `$2` and `$3`) to new variable names, and
then using **those** names (`$DATABASE`, `$INFILE`, and `$OUTFILE`) in the command,
Expand Down
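
The named-variable pattern in the wrapper script above can be tried out without BLAST installed. The sketch below wraps it in a function, `run_blast`, which is a hypothetical name used only here. The argument-count check and the `echo` (which prints the command instead of running it) are illustrative additions, not part of the exercise's script.

```shell
#!/bin/bash
# Sketch only: run_blast, the usage check, and the echo are illustrative.
run_blast() {
    # Refuse to continue unless exactly three arguments were given.
    if [ "$#" -ne 3 ]; then
        echo "Usage: run_blast DATABASE INFILE OUTFILE" >&2
        return 1
    fi

    # Give the positional arguments readable names.
    local DATABASE=$1
    local INFILE=$2
    local OUTFILE=$3

    # Print the command that the real wrapper would execute.
    echo "./ncbi-blast-2.15.0+/bin/blastx -db $DATABASE/$DATABASE -query $INFILE -out $OUTFILE"
}

run_blast pdbaa mouse.fa results2.txt
```

Checking the argument count up front makes a mistake in the submit file's `arguments` line show up as a clear usage message rather than a confusing BLAST error.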
13 changes: 6 additions & 7 deletions docs/materials/software/part5-ex1-prepackaged.md
@@ -23,8 +23,7 @@ For this exercise, we will be using the bioinformatics package HMMER. HMMER is a
1. Do an internet search to find the HMMER software downloads page and the
installation instructions page. On the installation page, there are short instructions for how to install HMMER. There are two options shown for installation -- which should we use?

1. For the purposes of this example, we are going to use the instructions under the heading "...to obtain and compile from source." Download the HMMER source as shown in
these instructions (command should start with `wget`)
1. For the purposes of this example, we are going to use the instructions under the "Current version" heading, with the "Source" link. Download the HMMER source using `wget`.

1. Go back to the installation
documentation page and look at the steps for compiling from source. This process
@@ -44,7 +43,7 @@ for this example, we are going to compile directly on the Access Point
1. Now run the commands to unpack the source code:

:::console
username@host $ tar -zxf hmmer.tar.gz
username@host $ tar -zxf hmmer-3.4.tar.gz
username@host $ cd hmmer-3.4

1. Now we can follow the second set of installation instructions. For the prefix, we'll use the variable `$PWD` to capture the name of our current working directory and then a relative path to the `hmmer-build` directory we created in step 1:
@@ -75,7 +74,7 @@ Note that we now have two tarballs in our directory -- the *source* tarball (`hm
Wrapper Script
--------------

Now that we've created our portable installation, we need to write a script that opens and uses the installation, similar to the process we used in a [previous exercise](../part4-ex2-wrapper). These steps should be performed back on the submit server (`ap1.facility.path-cc.io`).
Now that we've created our portable installation, we need to write a script that opens and uses the installation, similar to the process we used in a [previous exercise](../part4-ex2-wrapper). These steps should be performed back on the access point.

1. Create a script called `run_hmmer.sh`.

@@ -100,7 +99,7 @@ Now that we've created our portable installation, we need to write a script that
1. Make sure the wrapper script has executable permissions:

:::console
username@ap1 $ chmod u+x run_HMMER.sh
username@login $ chmod u+x run_hmmer.sh


Run a HMMER job
@@ -112,7 +111,7 @@ We're almost ready! We need two more pieces to run a HMMER job.
run the job. You already have these files back in the directory where you unpacked the source code:

:::console
username@ap1 $ ls hmmer-3.4/tutorial
username@login $ ls hmmer-3.4/tutorial
7LESS_DROME fn3.hmm globins45.fa globins4.sto MADE1.hmm Pkinase.hmm
dna_target.fa fn3.sto globins4.hmm HBB_HUMAN MADE1.sto Pkinase.sto

@@ -139,4 +138,4 @@ run the job. You already have these files back in the directory where you unpack

!!! note
For a very similar compiling example, see this guide on how to
compile `samtools`: [Example Software Compilation](https://support.opensciencegrid.org/support/solutions/articles/12000074984-example-software-compilation)
compile `samtools`: [Example Software Compilation](https://portal.osg-htc.org/documentation/htc_workloads/using_software/example-compilation/)
4 changes: 2 additions & 2 deletions docs/materials/software/part5-ex2-python.md
@@ -17,7 +17,7 @@ Background
**Why learn this?**: This is very similar to the [previous exercise](part5-ex1-prepackaged.md).


Interactive Job for Pre-Building
Pre-Building
--------------------------------

The first step in our job process is building a Python installation that we can package up.
@@ -26,7 +26,7 @@ The first step in our job process is building a Python installation that we can
1. Download the Python source code from <https://www.python.org/>.

:::console
username@ap1 $ wget https://www.python.org/ftp/python/3.10.5/Python-3.10.5.tgz
username@login $ wget https://www.python.org/ftp/python/3.10.5/Python-3.10.5.tgz

1. First, we have to determine how to install Python to a specific location in our working directory.
1. Untar the Python source tarball (`tar -xzf Python-3.10.5.tgz`) and look at the `README.rst` file in the `Python-3.10.5` directory (`cd Python-3.10.5`). You'll want to look for the "Build Instructions" header. What will the main installation steps be? What command is required for the final installation? Once you've tried to answer these questions, move to the next step.