Updates to documentation and iGPU example #110

Merged (11 commits) on Aug 8, 2024
2 changes: 2 additions & 0 deletions README.md
@@ -6,6 +6,8 @@ AMD Ryzen™ AI Software includes the tools and runtime libraries for optimizing

This repository contains the demos, examples and tutorials, demonstrating usage and capabilities of the Ryzen™ AI Software. It is a subset of the Ryzen™ AI Software release.

Follow the instructions at [Ryzen™ AI Software](https://ryzenai.docs.amd.com/en/latest/inst.html) for installation.

## Git LFS and Instructions to clone:

Due to the presence of large files in some examples/tutorials, Git Large File Storage (LFS) has been configured in this repository. Follow the instructions below to ensure Git LFS is properly set up:
5 changes: 3 additions & 2 deletions example/transformers/models/llm/docs/README.md
@@ -37,6 +37,7 @@ The above list is a just representative collection of models supported using the
Create conda environment:
```powershell
cd <transformers>
set TRANSFORMERS_ROOT=%CD%
conda env create --file=env.yaml
conda activate ryzenai-transformers
build_dependencies.bat
@@ -45,7 +46,7 @@ build_dependencies.bat
The AWQ Model zoo has precomputed scales, clips and zeros for various LLMs, including OPT and Llama. Get the precomputed results:
```powershell
git lfs install
cd <transformers>\ext
cd %TRANSFORMERS_ROOT%\ext
git clone https://huggingface.co/datasets/mit-han-lab/awq-model-zoo awq_cache
```

@@ -100,7 +101,7 @@ pip install ops\torch_cpp --force-reinstall
When using locally downloaded weights, pass the model directory name as the argument to model_name. Only certain model names are supported by default, so make sure the model directory name matches a supported model name.

```powershell
cd <transformers>\models\llm
cd %TRANSFORMERS_ROOT%\models\llm
```
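
The directory-name requirement above can be checked before launching a run. The following is a minimal sketch, not part of the repository: `SUPPORTED_MODEL_NAMES` is a hypothetical subset used for illustration, and the authoritative list is whatever the scripts in `models/llm` accept.

```python
from pathlib import Path

# Hypothetical subset of supported names, for illustration only;
# the authoritative list is defined by the scripts in models/llm.
SUPPORTED_MODEL_NAMES = {"opt-6.7b", "llama-2-7b", "llama-2-7b-chat"}


def model_name_from_dir(model_dir: str) -> str:
    """Return the final path component, which is what gets passed as model_name."""
    return Path(model_dir).name


def is_supported(model_dir: str) -> bool:
    """True when the directory name matches one of the assumed supported names."""
    return model_name_from_dir(model_dir) in SUPPORTED_MODEL_NAMES
```

A directory such as `weights/llama-2-7b` passes the check, while a renamed copy such as `weights/my-llama` does not and would need to be renamed first.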

### Recipe 1: Smoothquant with w8a8/w8a16
36 changes: 18 additions & 18 deletions example/transformers/models/llm_assisted_generation/README.md
@@ -10,13 +10,13 @@ Complete the setup instructions for [LLMs on RyzenAI with Pytorch](../llm/docs/R
Generate the quantized checkpoints by running ```models/llm/run_awq.py``` for target models.

* To generate the quantized target model for OPT-6.7b run the below command from ```models/llm``` directory

```python run_awq.py --model_name facebook/opt-6.7b --task quantize```

```
python run_awq.py --model_name facebook/opt-6.7b --task quantize
```
* To generate the quantized target model for llama-2-7b run the below command from ```models/llm``` directory

```python run_awq.py --model_name llama-2-7b --task quantize```

```
python run_awq.py --model_name llama-2-7b --task quantize
```
## Step 2: Get draft models
* For OPT-6.7b, ```facebook/opt-125m``` is used as the draft model.
* For llama-2-7b, ```JackFram/llama-160m``` is used as the draft model.
@@ -31,22 +31,22 @@ Generate the quantized checkpoints by running ```models/llm/run_awq.py``` for ta
# Instructions to run the LLM models
## 1. OPT-6.7b
### a. OPT-6.7b without assisted generation

```python assisted_generation.py --model_name opt-6.7b --task benchmark ```

```
python assisted_generation.py --model_name opt-6.7b --task benchmark
```
### b. OPT-6.7b with OPT-125M assistant model

```python assisted_generation.py --model_name opt-6.7b --task benchmark --assisted_generation```

```
python assisted_generation.py --model_name opt-6.7b --task benchmark --assisted_generation
```
## 2. llama-2-7b
### a. llama-2-7b without assisted generation, with fast attention

```python assisted_generation.py --model_name llama-2-7b --task benchmark --fast_attention```

```
python assisted_generation.py --model_name llama-2-7b --task benchmark --fast_attention
```
### b. llama-2-7b with llama-160m assistant model, with fast attention

```python assisted_generation.py --model_name llama-2-7b --task benchmark --assisted_generation --fast_attention```

```
python assisted_generation.py --model_name llama-2-7b --task benchmark --assisted_generation --fast_attention
```
**Note:**
- The fast_attention argument is only supported with the llama-2-7b and llama-2-7b-chat models in this release.
- A known issue related to the kernel driver shows up when using assisted_generation with Llama-2-7b.
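
The four benchmark commands above differ only in the model name and two optional flags. As an illustration, a small helper (hypothetical, not part of the repository) could assemble them and reject `--fast_attention` for models the note does not list:

```python
import shlex

# Models for which the note above says --fast_attention is supported in this release.
FAST_ATTENTION_MODELS = {"llama-2-7b", "llama-2-7b-chat"}


def build_benchmark_command(model_name, assisted=False, fast_attention=False):
    """Assemble the assisted_generation.py benchmark command as an argument list."""
    if fast_attention and model_name not in FAST_ATTENTION_MODELS:
        raise ValueError(f"--fast_attention is not supported for {model_name}")
    cmd = ["python", "assisted_generation.py",
           "--model_name", model_name, "--task", "benchmark"]
    if assisted:
        cmd.append("--assisted_generation")
    if fast_attention:
        cmd.append("--fast_attention")
    return cmd


if __name__ == "__main__":
    # Prints the command for llama-2-7b with the assistant model and fast attention.
    print(shlex.join(build_benchmark_command("llama-2-7b", assisted=True,
                                             fast_attention=True)))
```

The returned list can be passed directly to `subprocess.run` from the `models/llm_assisted_generation` directory.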
20 changes: 19 additions & 1 deletion example/transformers/models/llm_gguf/docs/README.md
@@ -13,14 +13,32 @@ Assumes Windows CMD shell
### Activate ryzenai-transformers conda environment
```console
cd <transformers>
set TRANSFORMERS_ROOT=%CD%
conda env create --file=env.yaml
conda activate ryzenai-transformers
```

⚠️ **Warning:** Windows has a path length limit that you may hit when building the project or installing the wheels, resulting in cryptic errors.
To work around it, use a virtual drive to shorten the path the repository is cloned to:

*On Command Prompt*
```batch
@REM use any unused drive letter, Z: for example
subst Z: %cd%
@REM switch to the Z: drive
Z:
```

You can remove the virtual drive with:

*On Command Prompt*
```batch
subst /d Z:
```
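
To see whether a checkout is at risk before building, the file paths can be scanned for length. This is a rough sketch that assumes the limit in question is the classic Windows `MAX_PATH` of 260 characters; the function names are illustrative and not part of the repository.

```python
import os

# Classic Windows MAX_PATH limit (the count includes the terminating null character).
MAX_PATH = 260


def is_too_long(path, limit=MAX_PATH):
    """True when the path string is long enough to hit the limit."""
    return len(path) >= limit


def paths_over_limit(root, limit=MAX_PATH):
    """Yield absolute file paths under root whose length reaches the limit."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.abspath(os.path.join(dirpath, name))
            if is_too_long(full, limit):
                yield full
```

Build steps create deeper paths than the source tree contains, so a clean scan does not guarantee the build will stay under the limit; the `subst` workaround remains the safer choice for deeply nested clones.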

### Build and Install RyzenAI
```console
setup_phx.bat # or setup_stx.bat
set TRANSFORMERS_ROOT=%PYTORCH_AIE_PATH%

cd %TRANSFORMERS_ROOT%\ops\cpp
cmake -B build\ -DCMAKE_INSTALL_PREFIX=%CONDA_PREFIX%
16 changes: 11 additions & 5 deletions example/transformers/models/llm_onnx/docs/README.md
@@ -17,9 +17,15 @@ and accept [Huggingface license](https://huggingface.co/meta-llama/Llama-2-7b-hf

:pushpin: Conda environment with python 3.10

Set up the environment variable:
```powershell
cd <transformers>
set TRANSFORMERS_ROOT=%CD%
```
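
Helper scripts can resolve locations from the same variable instead of hard-coding the clone path. A minimal sketch, assuming `TRANSFORMERS_ROOT` has been set as shown above; `transformers_path` is a hypothetical helper, not part of the repository.

```python
import os
from pathlib import PureWindowsPath


def transformers_path(*parts, env=os.environ):
    """Join path components onto TRANSFORMERS_ROOT, failing clearly if it is unset."""
    root = env.get("TRANSFORMERS_ROOT")
    if not root:
        raise RuntimeError("TRANSFORMERS_ROOT is not set; run 'set TRANSFORMERS_ROOT=%CD%' first")
    return PureWindowsPath(root, *parts)
```

For example, `transformers_path("models", "llm_onnx")` corresponds to `%TRANSFORMERS_ROOT%\models\llm_onnx` in the commands that follow.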

Create conda environment:
```powershell
cd <transformers/models/llm_onnx>
cd %TRANSFORMERS_ROOT%\models\llm_onnx
conda update -n base -c defaults conda -y
conda env create --file=env.yaml
conda activate llm_onnx
@@ -40,13 +46,13 @@ pip install numpy==1.26.4

##### For PHX
```
cd <transformers>
cd %TRANSFORMERS_ROOT%
.\setup_phx.bat
```

##### For STX
```
cd <transformers>
cd %TRANSFORMERS_ROOT%
.\setup_stx.bat
```

@@ -58,6 +64,7 @@ Use "prepare_model.py" script to export, optimize and quantize the LLMs. You can

Check script usage
```powershell
cd %TRANSFORMERS_ROOT%\models\llm_onnx
python prepare_model.py --help

usage: prepare_model.py [-h]
@@ -86,7 +93,6 @@ optional arguments:
#### Export, Optimize and quantize the model

```powershell

python .\prepare_model.py --model_name <model_name> --output_model_dir <output directory> --export --optimize --quantize
```
#### Optimize and quantize existing model
@@ -149,7 +155,7 @@ as `float32` model. The quantizer used is `MatMul4BitsQuantizer` from onnxruntim
> As for the optimizer, the ORT optimizer is used.
### Using ONNX Runtime Interface

Copy the 'model.onnx.data' file from output_model_dir to the models/llm_onnx/ folder.
**Note:** Copy the 'model.onnx.data' file from output_model_dir to the models/llm_onnx/ folder.
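
The copy step can also be scripted. The sketch below is illustrative and not part of the repository; it assumes the external-data file is named `model.onnx.data` as stated in the note, and the directory arguments are placeholders for your own paths.

```python
import shutil
from pathlib import Path


def copy_external_data(output_model_dir, dest_dir, filename="model.onnx.data"):
    """Copy the external-data file into the folder the model will be run from."""
    src = Path(output_model_dir) / filename
    if not src.is_file():
        raise FileNotFoundError(f"{src} not found; run prepare_model.py first")
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # copy2 preserves timestamps and returns the destination path.
    return Path(shutil.copy2(src, dest / filename))
```

Calling it with the directory passed as `--output_model_dir` and the `models/llm_onnx` folder reproduces the manual copy.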

**Note:** Each run generates a log file in `./logs` directory with name `log_<model_name>.log`.
```powershell
51 changes: 48 additions & 3 deletions iGPU/getting_started/README.md
@@ -4,10 +4,10 @@ This is an example showing how to run the ResNet50 model from PyTorch on AMD's i

## Activate Ryzen AI conda environment

Activate the conda environment created by the automatic installer:
Activate the conda environment created by the MSI installer:

```powershell
conda activate ryzenai-1.2
conda activate ryzen-ai-1.2.0
```

## Install Olive
@@ -33,8 +33,53 @@ The optimized models will be available in `./torch_to_onnx-float16_conversion-pe

## Run the generated model on the iGPU

### Deployment in Python

```powershell
python predict.py
```

**_NOTE:_** In predict.py, line 15, the iGPU device ID is enumerated as 0. For PCs with multiple GPUs, you may adjust the device_id to target a specific iGPU.
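
For reference, selecting a DirectML device in ONNX Runtime is typically done through the `device_id` provider option. The following sketch shows the general pattern and is not the contents of predict.py; the helper names and the model path argument are placeholders.

```python
def dml_providers(device_id=0):
    """Build the providers argument selecting a DirectML device by index."""
    return [("DmlExecutionProvider", {"device_id": device_id})]


def create_session(model_path, device_id=0):
    """Create an ONNX Runtime session on the chosen DirectML device."""
    # Imported lazily so the helper above works without onnxruntime installed;
    # a DirectML-enabled onnxruntime build is assumed here.
    import onnxruntime as ort

    return ort.InferenceSession(model_path, providers=dml_providers(device_id))
```

On a machine with a discrete GPU and an iGPU, trying `device_id=1` is a reasonable next step if device 0 turns out to be the discrete card; the enumeration order depends on the system.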

### Deployment in C++

#### Prerequisites

1. Visual Studio 2022 Community edition, with the “Desktop Development with C++” workload installed
2. CMake (version >= 3.26)
3. OpenCV (version 4.6.0), required for the ResNet50 example
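
The CMake requirement can be verified up front. This is a small sketch, assuming `cmake` is on `PATH` and prints its usual `cmake version X.Y.Z` banner; the function names are illustrative.

```python
import re
import subprocess

# Minimum (major, minor) version listed in the prerequisites above.
MIN_CMAKE = (3, 26)


def parse_cmake_version(text):
    """Extract (major, minor, patch) from `cmake --version` output."""
    match = re.search(r"cmake version (\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("could not find a CMake version in the given text")
    return tuple(int(part) for part in match.groups())


def cmake_is_new_enough(minimum=MIN_CMAKE):
    """Run `cmake --version` and compare against the minimum (major, minor)."""
    result = subprocess.run(["cmake", "--version"],
                            capture_output=True, text=True, check=True)
    return parse_cmake_version(result.stdout)[:2] >= minimum
```

Comparing tuples avoids the string-comparison pitfall where "3.9" would sort after "3.26".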

#### Install OpenCV from source

It is recommended to build OpenCV from source as a static build. The following instructions install OpenCV to "C:\\opencv" as an example; to use a different location, change `CMAKE_INSTALL_PREFIX` (and the matching `CMAKE_PREFIX_PATH`) in the cmake command below. You may first change to the directory where you want to clone the OpenCV repository.

```powershell
git clone https://github.com/opencv/opencv.git -b 4.6.0
cd opencv

cmake -DCMAKE_EXPORT_COMPILE_COMMANDS=ON -DBUILD_SHARED_LIBS=OFF -DCMAKE_POSITION_INDEPENDENT_CODE=ON -DCMAKE_CONFIGURATION_TYPES=Release -A x64 -T host=x64 -G "Visual Studio 17 2022" "-DCMAKE_INSTALL_PREFIX=C:\opencv" "-DCMAKE_PREFIX_PATH=C:\opencv" -DCMAKE_BUILD_TYPE=Release -DBUILD_opencv_python2=OFF -DBUILD_opencv_python3=OFF -DBUILD_WITH_STATIC_CRT=OFF -B build

cmake --build build --config Release
cmake --install build --config Release
```
The build files will be written to ``build\``.

#### Run Olive-optimized ResNet50 model on the iGPU

Build the given ResNet50 C++ example:

```powershell
cd cpp
compile.bat "path/to/your/opencv/build"
```

Run inference:

```powershell
run.bat
```




20 changes: 20 additions & 0 deletions iGPU/getting_started/cpp/compile.bat
@@ -0,0 +1,20 @@
@echo off

if "%RYZEN_AI_INSTALLATION_PATH%" == "" echo RYZEN_AI_INSTALLATION_PATH not set. This script requires the RYZEN_AI_INSTALLATION_PATH env var to be set to the RyzenAI 1.2 installation folder. & exit /b 1

REM Check if the first argument is provided
if "%1"=="" (
echo Usage: %0 [OpenCV_DIR]
exit /b 1
)

set "OpenCV_DIR=%~1"

echo OpenCV_DIR is set to: %OpenCV_DIR%

cmake -DCMAKE_EXPORT_COMPILE_COMMANDS=ON -DBUILD_SHARED_LIBS=OFF -DCMAKE_POSITION_INDEPENDENT_CODE=ON -DCMAKE_CONFIGURATION_TYPES=Release -A x64 -T host=x64 -DCMAKE_INSTALL_PREFIX=. -DCMAKE_PREFIX_PATH=. -B build -S resnet50 -DOpenCV_DIR="%OpenCV_DIR%" -G "Visual Studio 17 2022"

cmake --build .\build --config Release --target ALL_BUILD

:error
exit /b %errorlevel%
38 changes: 38 additions & 0 deletions iGPU/getting_started/cpp/resnet50/CMakeLists.txt
@@ -0,0 +1,38 @@
cmake_minimum_required(VERSION 3.5)

project(resnet50 VERSION 1.0.0 LANGUAGES C CXX)
set(CMAKE_CXX_STANDARD 17)
find_package(Python 3.10 EXACT COMPONENTS Interpreter Development)

set (RYZEN_AI_INSTALLATION_PATH $ENV{RYZEN_AI_INSTALLATION_PATH})


set(ONNXRUNTIME_ROOTDIR "${RYZEN_AI_INSTALLATION_PATH}/onnxruntime")

include_directories("${ONNXRUNTIME_ROOTDIR}/include/onnxruntime/core/session")

link_directories("${ONNXRUNTIME_ROOTDIR}/lib")

link_directories("${CMAKE_INSTALL_PREFIX}/lib")



find_package(OpenCV COMPONENTS core highgui imgproc REQUIRED)

add_executable(resnet50 resnet50.cpp util/getopt.c)
target_include_directories(resnet50
PRIVATE
${CMAKE_CURRENT_SOURCE_DIR}/util
)
target_link_libraries(resnet50 ${ORT_LIBRARY} ${OpenCV_LIBS} onnxruntime)
install(TARGETS resnet50 RUNTIME DESTINATION bin)

add_custom_command(
TARGET ${PROJECT_NAME} POST_BUILD COMMAND ${CMAKE_COMMAND} -E copy
${RYZEN_AI_INSTALLATION_PATH}/onnxruntime/bin/DirectML.dll
${CMAKE_CURRENT_BINARY_DIR}/$<CONFIGURATION>/DirectML.dll)

add_custom_command(
TARGET ${PROJECT_NAME} POST_BUILD COMMAND ${CMAKE_COMMAND} -E copy
${RYZEN_AI_INSTALLATION_PATH}/onnxruntime/bin/onnxruntime.dll
${CMAKE_CURRENT_BINARY_DIR}/$<CONFIGURATION>/onnxruntime.dll)