pytorch.cpp

Running PyTorch Models for Inference using GGML

Directory Structure

conversion.py - Converts weights of a PyTorch model to GGML format
model.py - Sample PyTorch model that trains a neural network to learn a 2-input truth table
main.cpp - Main driver program for running inference using ggml

To train the sample model, run model.py with the name of the truth table to learn:

python3 model.py xor
OR
python3 model.py and
OR
python3 model.py or

The GGML format is a binary format designed for fast loading and saving of models, and for ease of reading.
Typically, each layer's weights are stored as the number of dimensions, followed by the size of each dimension, followed by the weight values, tightly packed.
The packing must use a binary layout that the GGML C/C++ loading code can read.
Run the following command to convert your PyTorch model weights stored in assets/model.pth to GGML format:

python3 conversion.py
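
The per-layer layout described above can be sketched in a few lines of Python. This is a minimal illustration, not the code in conversion.py: the function names write_layer and read_layer, the little-endian int32/float32 field types, and the sample values are all assumptions made for the example, and the real file may carry extra fields such as a header.

```python
import io
import struct


def write_layer(f, shape, values):
    """Write one tensor: dimension count, each dimension size, then float32 values.

    Hypothetical layout for illustration; conversion.py may differ.
    """
    count = 1
    for d in shape:
        count *= d
    if count != len(values):
        raise ValueError("shape does not match the number of values")
    f.write(struct.pack("<i", len(shape)))              # number of dimensions
    f.write(struct.pack(f"<{len(shape)}i", *shape))     # size of each dimension
    f.write(struct.pack(f"<{len(values)}f", *values))   # weights, tightly packed


def read_layer(f):
    """Read back one tensor written by write_layer."""
    (n_dims,) = struct.unpack("<i", f.read(4))
    shape = struct.unpack(f"<{n_dims}i", f.read(4 * n_dims))
    count = 1
    for d in shape:
        count *= d
    values = struct.unpack(f"<{count}f", f.read(4 * count))
    return list(shape), list(values)


# Round trip through an in-memory buffer instead of a file on disk.
buf = io.BytesIO()
write_layer(buf, [2, 2], [0.5, -1.0, 2.0, 0.25])   # a 2x2 weight matrix
write_layer(buf, [2], [0.0, 1.5])                  # a bias vector
buf.seek(0)
print(read_layer(buf))
print(read_layer(buf))
```

Because the dimension count and sizes come before the data, a reader always knows how many bytes to consume for each tensor, which is what lets the C/C++ side load the weights in a single sequential pass.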

See main.cpp for the load and predict functions.
load reads the GGML-format file, uses the weights to initialize the GGML parameters for each layer of the model, and initializes the context.
predict uses the initialized model to perform the vector calculations of a forward pass.
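
To make the forward pass concrete, here is a rough Python sketch of the kind of arithmetic involved. It assumes a network with one hidden layer of two units and sigmoid activations, which may not match the architecture in model.py. The weights W1, B1, W2, B2 are hand-picked for illustration so that the network computes XOR; they are not trained values from this project, and the dense and predict helpers here are hypothetical stand-ins, not the functions in main.cpp.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(weights, bias, inputs):
    """One fully connected layer: sigmoid(weights @ inputs + bias)."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, bias)
    ]


# Hand-picked (not trained) weights, assumed for this example only.
W1 = [[20.0, 20.0], [-20.0, -20.0]]   # hidden unit 0 ~ OR, hidden unit 1 ~ NAND
B1 = [-10.0, 30.0]
W2 = [[20.0, 20.0]]                   # output unit ~ AND of the two hidden units
B2 = [-30.0]


def predict(x1, x2):
    """Forward pass: input -> hidden layer -> output."""
    hidden = dense(W1, B1, [x1, x2])
    return dense(W2, B2, hidden)[0]


for a in (0, 1):
    for b in (0, 1):
        print(a, b, round(predict(a, b)))
```

Each layer is a matrix-vector product, a bias add, and an activation. An inference program performs the same sequence of operations on the tensors it loaded from the converted file.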
Run the following command to clone GGML, which provides the required headers:

git clone https://github.com/ggerganov/ggml

mkdir build && cd build
cmake ..
make

./bin/pytorch.cpp

This project is licensed under the MIT License.