2024 update #31

Merged: 3 commits, Sep 6, 2024
12 changes: 8 additions & 4 deletions README.md
@@ -25,15 +25,19 @@ ______________________________________________________________________
## Introduction

This is a set of tutorials for the CMS Machine Learning Hands-on Advanced Tutorial Session (HATS).
They are intended to show you how to build machine learning models in python, using `Keras`, `TensorFlow`, and `PyTorch`, and use them in your `ROOT`-based analyses.
We will build event-level classifiers for differentiating VBF Higgs and standard model background 4 muon events and jet-level classifiers for differentiating boosted W boson jets from QCD jets using dense and convolutional neural networks.
They are intended to show you how to build machine learning models in python, using `xgboost`, `Keras`, `TensorFlow`, and `PyTorch`, and use them in your `ROOT`-based analyses.
We will build event-level classifiers for differentiating VBF Higgs events from standard model background 4-muon events, and jet-level classifiers for differentiating boosted W boson jets from QCD jets, using BDTs and dense and convolutional neural networks.
We will also explore more advanced models such as graph neural networks (GNNs), variational autoencoders (VAEs), and generative adversarial networks (GANs) on simple datasets.

## Setup

### Vanderbilt Jupyterhub (Recommended!)
### Purdue Analysis Facility (New and recommended!)

The recommended method for running the tutorials live is the Vanderbilt Jupyterhub, follow the instructions [here](https://fnallpc.github.io/machine-learning-hats/setup/vanderbilt-jupyterhub/vanderbilt.html).
The recommended method for running the tutorials live is the Purdue AF; follow the instructions [here](https://fnallpc.github.io/machine-learning-hats/setup/purdue/purdue.html).

### Vanderbilt Jupyterhub

Another option is the Vanderbilt Jupyterhub; instructions are [here](https://fnallpc.github.io/machine-learning-hats/setup/vanderbilt-jupyterhub/vanderbilt.html).

### FNAL LPC

18 changes: 10 additions & 8 deletions machine-learning-hats/_toc.yml
@@ -6,6 +6,7 @@ root: index
parts:
- caption: Setup
chapters:
- file: setup/purdue/purdue
- file: setup/vanderbilt-jupyterhub/vanderbilt
- file: setup/lpc
- file: setup-libraries
@@ -14,12 +15,13 @@ parts:
maxdepth: 2
chapters:
- file: notebooks/1-datasets-uproot
- file: notebooks/2-dense
- file: notebooks/2-boosted-decision-tree
- file: notebooks/3-dense
sections:
- file: notebooks/2.1-dense-keras
- file: notebooks/2.2-dense-pytorch
- file: notebooks/2.3-dense-bayesian-optimization
- file: notebooks/3-conv2d
- file: notebooks/4-gnn-cora
- file: notebooks/5-vae-mnist
- file: notebooks/6-gan-mnist
- file: notebooks/3.1-dense-keras
- file: notebooks/3.2-dense-pytorch
- file: notebooks/3.3-dense-bayesian-optimization
- file: notebooks/4-conv2d
- file: notebooks/5-gnn-cora
- file: notebooks/6-vae-mnist
- file: notebooks/7-gan-mnist
468 changes: 15 additions & 453 deletions machine-learning-hats/notebooks/1-datasets-uproot.ipynb

Large diffs are not rendered by default.

556 changes: 556 additions & 0 deletions machine-learning-hats/notebooks/2-boosted-decision-tree.ipynb

Large diffs are not rendered by default.

@@ -55,14 +55,13 @@
"### Convolution Operation\n",
"Two-dimensional convolutional layer for image height $H$, width $W$, number of input channels $C$, number of output kernels (filters) $N$, and kernel height $J$ and width $K$ is given by:\n",
"\n",
"\\begin{align}\n",
"\\label{convLayer}\n",
"\\boldsymbol{Y}[v,u,n] &= \\boldsymbol{\\beta}[n] + \\sum_{c=1}^{C} \\sum_{j=1}^{J} \\sum_{k=1}^{K} \\boldsymbol{X}[v+j,u+k,c]\\, \\boldsymbol{W}[j,k,c,n]\\,,\n",
"\\end{align}\n",
"$$\n",
"\\boldsymbol{Y}[v,u,n] = \\boldsymbol{\\beta}[n] + \\sum_{c=1}^{C} \\sum_{j=1}^{J} \\sum_{k=1}^{K} \\boldsymbol{X}[v+j,u+k,c]\\, \\boldsymbol{W}[j,k,c,n]\\,,\n",
"$$\n",
"\n",
"where $Y$ is the output tensor of size $V \\times U \\times N$, $W$ is the weight tensor of size $J \\times K \\times C \\times N$, and $\\beta$ is the bias vector of length $N$.\n",
"\n",
"The example below has $C=1$ input channel and $N=1$ ($J\\times K=3\\times 3$) kernel [credit](https://towardsdatascience.com/types-of-convolution-kernels-simplified-f040cb307c37):\n",
"The example below has $C=1$ input channel and $N=1$ ($J\\times K=3\\times 3$) kernel ([credit](https://towardsdatascience.com/types-of-convolution-kernels-simplified-f040cb307c37)):\n",
"\n",
"![convolution](https://miro.medium.com/v2/resize:fit:780/1*Eai425FYQQSNOaahTXqtgg.gif)"
]
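The convolution formula above can be transcribed directly into NumPy as a deliberately slow reference implementation. This is a sketch for understanding the index pattern, not how Keras or PyTorch actually compute convolutions, and the helper name `conv2d` is ours, not from the notebook:

```python
# Direct loop transcription of Y[v,u,n] = beta[n] + sum_{c,j,k} X[v+j,u+k,c] W[j,k,c,n]
import numpy as np

def conv2d(X, W, beta):
    """X: (H, W, C) input; W: (J, K, C, N) kernels; beta: (N,) biases."""
    H, Wd, C = X.shape
    J, K, C2, N = W.shape
    assert C == C2
    V, U = H - J + 1, Wd - K + 1          # "valid" (no padding) output size
    Y = np.zeros((V, U, N))
    for v in range(V):
        for u in range(U):
            for n in range(N):
                Y[v, u, n] = beta[n] + np.sum(X[v:v+J, u:u+K, :] * W[:, :, :, n])
    return Y

X = np.arange(25.0).reshape(5, 5, 1)      # one 5x5 input channel (C=1)
W = np.ones((3, 3, 1, 1)) / 9.0           # one 3x3 averaging kernel (N=1)
Y = conv2d(X, W, beta=np.zeros(1))
print(Y.shape)  # (3, 3, 1)
```

With an averaging kernel, each output entry is the mean of the 3x3 input patch beneath it, which makes the result easy to check by hand.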
@@ -84,7 +83,7 @@
"source": [
"### Pooling\n",
"\n",
"We also add pooling layers to reduce the image size between layers. For example, max pooling: (also from [here]([page](https://cs231n.github.io/convolutional-networks/))\n",
"We also add pooling layers to reduce the image size between layers. For example, max pooling (also from [here](https://cs231n.github.io/convolutional-networks/)):\n",
"\n",
"![maxpool](https://cs231n.github.io/assets/cnn/maxpool.jpeg)"
]
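Max pooling as illustrated above (2x2 windows, stride 2) can be sketched in a few lines of NumPy; this reshaping trick is an illustration for a single-channel image, not the framework implementation:

```python
# 2x2 max pooling with stride 2: group each axis into (blocks, 2) and
# take the max over the two within-block axes.
import numpy as np

def max_pool_2x2(X):
    """X: (H, W) with H and W even; returns (H//2, W//2)."""
    H, W = X.shape
    return X.reshape(H // 2, 2, W // 2, 2).max(axis=(1, 3))

X = np.array([[1, 1, 2, 4],
              [5, 6, 7, 8],
              [3, 2, 1, 0],
              [1, 2, 3, 4]], dtype=float)
print(max_pool_2x2(X))
# [[6. 8.]
#  [3. 4.]]
```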

Large diffs are not rendered by default.

Binary file added machine-learning-hats/setup/purdue/folders.png
Binary file added machine-learning-hats/setup/purdue/git.png
33 changes: 33 additions & 0 deletions machine-learning-hats/setup/purdue/purdue.md
@@ -0,0 +1,33 @@
# Purdue Analysis Facility

## 1. Sign-in

See the [Getting Started](https://analysis-facility.physics.purdue.edu/en/latest/doc-getting-started.html) guide and the rest of the documentation for details.

Point your browser to https://cms.geddes.rcac.purdue.edu/hub and log in with your CERN or FNAL account.

Create an instance with the default resources.


## 2. Clone this repository

1. Once the session starts, open the Git sidebar menu:

![git menu](git.png)

2. Click "Clone a Repository".

3. Copy and paste the repository URL https://github.com/FNALLPC/machine-learning-hats.git, then click "Clone" with the default options.

4. You should now see the `machine-learning-hats` directory in your file browser:

![folders](folders.png)

Open it and navigate to `machine-learning-hats` -> `notebooks`.


## 3. Notebooks

Open up a notebook and select the `Python3 kernel (default)` kernel. You can now run the notebook by pressing `Shift + Enter`, one cell at a time.


3 changes: 2 additions & 1 deletion requirements.txt
@@ -10,4 +10,5 @@ pandas
torch
ipykernel
tqdm
jupyter
jupyter
xgboost