Add support for generating models ops test (#998)
1. **Restructure the model analysis script:**

Fixes [#1001](#1001)

Instead of having multiple classes (e.g., MarkDownWriter,
MatchingExceptionRule, ModelVariantInfo) and functions in a single
Python file, created a Python package named model_analysis and declared
each function and class in a separate file.

E.g., the MarkDownWriter class, which is used for creating and writing
markdown files, now lives in markdown.py, and
common_failure_matching_rules_list has its own Python module,
exception_rules.py.
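
A hypothetical view of the resulting imports (module names are assumed from the description above, not verified against the repo):

```python
# Each class/function now lives in its own module inside the
# model_analysis package (module names assumed).
from model_analysis.markdown import MarkDownWriter
from model_analysis.exception_rules import common_failure_matching_rules_list
```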
     
     
2. **Created a script for generating models ops tests from the unique ops
configurations extracted from all the models present in the
`forge/test/models` directory.**

Fixes [#874](#761)

 Workflow:
1. Collect all the tests that don't contain the skip_model_analysis marker
in the directory path specified by the user (e.g., `forge/test/models`);
see the first sketch after this list.
2. Run all the collected tests to extract the unique ops configurations
and export each model's unique ops configuration as an Excel file and a
metadata JSON file (second sketch below).
Note: unlike the model analysis pipeline, this step does not generate
unique op tests.
3. After extracting the unique ops configuration for each test, extract
the unique ops configurations across all the tests (i.e., model
variants).
4. Using the unique ops configurations extracted across all the model
variants, create models ops tests with Forge modules in the directory path
specified by the user.
5. Black formatting and SPDX headers are also applied automatically to the
generated nightly/push tests (third sketch below).
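
A minimal sketch of step 1, assuming the collection is done through pytest itself (this is not the actual script; the helper name is illustrative):

```python
import subprocess

def collect_tests(test_directory_or_file_path: str) -> list[str]:
    """Collect pytest node IDs, excluding tests marked skip_model_analysis."""
    result = subprocess.run(
        [
            "pytest",
            test_directory_or_file_path,
            "-m", "not skip_model_analysis",  # filter out marked tests
            "--collect-only", "-q",           # list node IDs without running them
        ],
        capture_output=True,
        text=True,
    )
    # Node IDs look like "forge/test/models/.../test_foo.py::test_bar".
    return [line for line in result.stdout.splitlines() if "::" in line]

# Example: collect_tests("forge/test/models")
```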
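
Step 2's export could look like the following (assumed shape of the output files, not the actual implementation; requires pandas with an Excel writer such as openpyxl):

```python
import json
import pandas as pd

def export_unique_ops(records: list[dict], out_dir: str, model_name: str) -> None:
    """Export one model's unique ops configurations as Excel plus metadata JSON."""
    df = pd.DataFrame(records)  # one row per unique op configuration
    df.to_excel(f"{out_dir}/{model_name}_unique_ops.xlsx", index=False)
    with open(f"{out_dir}/{model_name}_metadata.json", "w") as f:
        json.dump({"model": model_name, "num_unique_ops": len(df)}, f, indent=2)
```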
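
And a sketch of step 5 (assumed helper, not the project's code), formatting a generated test with Black and prepending the same SPDX header used elsewhere in this commit:

```python
import black

SPDX_HEADER = (
    "# SPDX-FileCopyrightText: (c) 2024 Tenstorrent AI ULC\n"
    "#\n"
    "# SPDX-License-Identifier: Apache-2.0\n"
)

def format_generated_test(source: str) -> str:
    """Run Black on generated test source and prepend the SPDX header."""
    formatted = black.format_str(source, mode=black.Mode())
    return SPDX_HEADER + "\n" + formatted
```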
    
**Note:**
The ForgeModules in the generated models ops tests don't use the actual
parameter/constant tensor values from the model parameters/buffers via
the process_framework_parameter function. Instead, random tensors are
generated inside the test function based on the constant/parameter
tensor shapes and dtypes.
<img width="937" alt="Screenshot 2025-01-07 at 6 32 35 PM"
src="https://github.com/user-attachments/assets/851ae7ef-da53-407a-a344-4656dd3e92e5"
/>
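
A hedged sketch of what the note describes (helper name and dtype handling are illustrative, not the generated code):

```python
import torch

def create_random_tensor(shape, dtype=torch.float32):
    """Build a random stand-in for a recorded parameter/constant tensor."""
    if dtype.is_floating_point:
        return torch.rand(shape, dtype=dtype)
    # Integer dtypes (e.g., token ids) need randint instead of rand.
    return torch.randint(0, 1000, shape, dtype=dtype)

# E.g., a stand-in for an embedding weight recorded as shape (30000, 128):
weight = create_random_tensor((30000, 128))
```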

    
3. **Break down the Model Analysis Weekly workflow:**

Fixes [#1002](#1002)
1. model-analysis.yml -> Common workflow file used for running the model
analysis pipeline for both markdown generation and models ops test
generation.
2. model-analysis-weekly.yml -> Workflow that triggers the Model Analysis
workflow (i.e., model-analysis.yml) to run the model analysis and
generate markdown files.
3. model-analysis-config.sh -> Shell script containing the script and PR
configuration/environment variables for markdown generation and models
ops test generation.
              
The generated models ops test PR for the albert model: #1014
chandrasekaranpradeep authored Jan 10, 2025
1 parent 640718a commit d9f3cfb
Showing 116 changed files with 2,254 additions and 1,639 deletions.
49 changes: 49 additions & 0 deletions .github/model-analysis-config.sh
@@ -0,0 +1,49 @@
#!/bin/bash
# SPDX-FileCopyrightText: (c) 2024 Tenstorrent AI ULC
#
# SPDX-License-Identifier: Apache-2.0

# If set to true, it will set the environment variables for models ops test generation; otherwise, the markdown generation env variables will be set
GENERATE_MODELS_OPS_TEST=$1

# Declare an associative array to store environment variables
declare -A env_vars

# Markdown Generation
# 1) PR config
env_vars["BRANCH_NAME"]="model_analysis"
env_vars["COMMIT_MESSAGE"]="Update model analysis documentation"
env_vars["TITLE"]="Update model analysis documentation"
env_vars["BODY"]="This PR will update model analysis documentation."
env_vars["OUTPUT_PATH"]="model_analysis_docs/"

# 2) Script config
env_vars["MARDOWN_DIR_PATH"]="./model_analysis_docs"
env_vars["SCRIPT_OUTPUT_LOG"]="model_analysis.log"


# Model ops test generation
# 1) Script config
env_vars["MODELS_OPS_TEST_OUTPUT_DIR_PATH"]="forge/test"
env_vars["MODELS_OPS_TEST_PACKAGE_NAME"]="models_ops"


# Common Config for markdown generation and model ops test generation
env_vars["TEST_DIR_OR_FILE_PATH"]="forge/test/models"
env_vars["UNIQUE_OPS_OUTPUT_DIR_PATH"]="./models_unique_ops_output"


# If GENERATE_MODELS_OPS_TEST is set to true, modify the PR config for models ops test generation.
if [[ "$GENERATE_MODELS_OPS_TEST" == "true" ]]; then
env_vars["BRANCH_NAME"]="generate_models_ops_test"
env_vars["COMMIT_MESSAGE"]="Generate and update models ops tests"
env_vars["TITLE"]="Generate and update models ops tests"
env_vars["BODY"]="This PR will generate models ops tests by extracting the unique ops configurations across all the pytorch models present inside the forge/test/models directory path."
env_vars["OUTPUT_PATH"]="forge/test/models_ops/"
env_vars["SCRIPT_OUTPUT_LOG"]="generate_models_ops_test.log"
fi


for key in "${!env_vars[@]}"; do
echo "$key=${env_vars[$key]}"
done
116 changes: 4 additions & 112 deletions .github/workflows/model-analysis-weekly.yml
@@ -6,116 +6,8 @@ on:
- cron: '0 23 * * 5' # 11:00 PM UTC Friday (12:00 AM Saturday Serbia)

jobs:

docker-build:
uses: ./.github/workflows/build-image.yml
model-analysis-weekly:
uses: ./.github/workflows/model-analysis.yml
secrets: inherit

model-analysis:
needs: docker-build
runs-on: runner
timeout-minutes: 10080 # Set job execution time to 7 days(default: 6 hours)

container:
image: ${{ needs.docker-build.outputs.docker-image }}
options: --device /dev/tenstorrent/0
volumes:
- /dev/hugepages:/dev/hugepages
- /dev/hugepages-1G:/dev/hugepages-1G
- /etc/udev/rules.d:/etc/udev/rules.d
- /lib/modules:/lib/modules
- /opt/tt_metal_infra/provisioning/provisioning_env:/opt/tt_metal_infra/provisioning/provisioning_env

env:
GITHUB_TOKEN: ${{ secrets.GH_TOKEN }}

steps:

- name: Set reusable strings
id: strings
shell: bash
run: |
echo "work-dir=$(pwd)" >> "$GITHUB_OUTPUT"
echo "build-output-dir=$(pwd)/build" >> "$GITHUB_OUTPUT"
- name: Git safe dir
run: git config --global --add safe.directory ${{ steps.strings.outputs.work-dir }}

- uses: actions/checkout@v4
with:
submodules: recursive
fetch-depth: 0 # Fetch all history and tags
token: ${{ env.GITHUB_TOKEN }}

# Clean everything from submodules (needed to avoid issues
# with cmake generated files leftover from previous builds)
- name: Cleanup submodules
run: |
git submodule foreach --recursive git clean -ffdx
git submodule foreach --recursive git reset --hard
- name: ccache
uses: hendrikmuhs/[email protected]
with:
create-symlink: true
key: model-analysis-${{ runner.os }}

- name: Build
shell: bash
run: |
source env/activate
cmake -G Ninja \
-B ${{ steps.strings.outputs.build-output-dir }} \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_C_COMPILER=clang \
-DCMAKE_CXX_COMPILER=clang++ \
-DCMAKE_C_COMPILER_LAUNCHER=ccache \
-DCMAKE_CXX_COMPILER_LAUNCHER=ccache
cmake --build ${{ steps.strings.outputs.build-output-dir }}
- name: Run Model Analysis Script
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
HF_HUB_DISABLE_PROGRESS_BARS: 1
shell: bash
run: |
source env/activate
apt-get update
apt install -y libgl1 libglx-mesa0
set -o pipefail # Ensures that the exit code reflects the first command that fails
python scripts/model_analysis.py \
--test_directory_or_file_path forge/test/models/pytorch \
--dump_failure_logs \
--markdown_directory_path ./model_analysis_docs \
--unique_ops_output_directory_path ./models_unique_ops_output \
2>&1 | tee model_analysis.log
- name: Upload Model Analysis Script Logs
uses: actions/upload-artifact@v4
if: success() || failure()
with:
name: model-analysis-outputs
path: model_analysis.log

- name: Upload Models Unique Ops test Failure Logs
uses: actions/upload-artifact@v4
if: success() || failure()
with:
name: unique-ops-logs
path: ./models_unique_ops_output

- name: Create Pull Request
uses: peter-evans/create-pull-request@v7
with:
branch: model_analysis
committer: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
author: ${{ github.actor }} <${{ github.actor }}@users.noreply.github.com>
base: main
commit-message: "Update model analysis docs"
title: "Update model analysis docs"
body: "This PR will update model analysis docs"
labels: automatic_model_analysis
delete-branch: true
token: ${{ env.GITHUB_TOKEN }}
add-paths: |
model_analysis_docs/
with:
generate_models_ops_test: false
155 changes: 155 additions & 0 deletions .github/workflows/model-analysis.yml
@@ -0,0 +1,155 @@
name: Model Analysis

on:
workflow_dispatch:
inputs:
generate_models_ops_test:
description: 'If set to True, it will generate models ops tests by extracting the unique ops config across all the models; otherwise, it will run the model analysis and generate markdown files'
required: false
type: boolean
default: false
workflow_call:
inputs:
generate_models_ops_test:
description: 'If set to True, it will generate models ops tests by extracting the unique ops config across all the models; otherwise, it will run the model analysis and generate markdown files'
required: false
type: boolean
default: false

jobs:

docker-build:
uses: ./.github/workflows/build-image.yml
secrets: inherit

model-analysis:
needs: docker-build
runs-on: runner
timeout-minutes: 4320 # Set job execution time to 3 days (default: 6 hours)

container:
image: ${{ needs.docker-build.outputs.docker-image }}
options: --device /dev/tenstorrent/0
volumes:
- /dev/hugepages:/dev/hugepages
- /dev/hugepages-1G:/dev/hugepages-1G
- /etc/udev/rules.d:/etc/udev/rules.d
- /lib/modules:/lib/modules
- /opt/tt_metal_infra/provisioning/provisioning_env:/opt/tt_metal_infra/provisioning/provisioning_env

env:
GITHUB_TOKEN: ${{ secrets.GH_TOKEN }}
HF_TOKEN: ${{ secrets.HF_TOKEN }}
HF_HUB_DISABLE_PROGRESS_BARS: 1

steps:

- name: Set reusable strings
id: strings
shell: bash
run: |
echo "work-dir=$(pwd)" >> "$GITHUB_OUTPUT"
echo "build-output-dir=$(pwd)/build" >> "$GITHUB_OUTPUT"
- name: Git safe dir
run: git config --global --add safe.directory ${{ steps.strings.outputs.work-dir }}

- uses: actions/checkout@v4
with:
submodules: recursive
fetch-depth: 0 # Fetch all history and tags
token: ${{ env.GITHUB_TOKEN }}

# Clean everything from submodules (needed to avoid issues
# with cmake generated files leftover from previous builds)
- name: Cleanup submodules
run: |
git submodule foreach --recursive git clean -ffdx
git submodule foreach --recursive git reset --hard
- name: ccache
uses: hendrikmuhs/[email protected]
with:
create-symlink: true
key: model-analysis-${{ runner.os }}

- name: Set environment variables
shell: bash
run: |
OUTPUT=$(bash .github/model-analysis-config.sh ${{ inputs.generate_models_ops_test }})
# Assign the script output to GitHub environment variables
echo "$OUTPUT" | while IFS= read -r line; do
echo "$line" >> $GITHUB_ENV
done
- name: Build
shell: bash
run: |
source env/activate
cmake -G Ninja \
-B ${{ steps.strings.outputs.build-output-dir }} \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_C_COMPILER=clang \
-DCMAKE_CXX_COMPILER=clang++ \
-DCMAKE_C_COMPILER_LAUNCHER=ccache \
-DCMAKE_CXX_COMPILER_LAUNCHER=ccache
cmake --build ${{ steps.strings.outputs.build-output-dir }}
- name: Run Model Analysis Script
if: ${{ !inputs.generate_models_ops_test }}
shell: bash
run: |
source env/activate
apt-get update
apt install -y libgl1 libglx-mesa0
set -o pipefail # Ensures that the exit code reflects the first command that fails
python scripts/model_analysis/run_analysis_and_generate_md_files.py \
--test_directory_or_file_path ${{ env.TEST_DIR_OR_FILE_PATH }} \
--dump_failure_logs \
--markdown_directory_path ${{ env.MARDOWN_DIR_PATH }} \
--unique_ops_output_directory_path ${{ env.UNIQUE_OPS_OUTPUT_DIR_PATH }} \
2>&1 | tee ${{ env.SCRIPT_OUTPUT_LOG }}
- name: Generate Models Ops test
if: ${{ inputs.generate_models_ops_test }}
shell: bash
run: |
source env/activate
apt-get update
apt install -y libgl1 libglx-mesa0
set -o pipefail # Ensures that the exit code reflects the first command that fails
python scripts/model_analysis/generate_models_ops_test.py \
--test_directory_or_file_path ${{ env.TEST_DIR_OR_FILE_PATH }} \
--unique_ops_output_directory_path ${{ env.UNIQUE_OPS_OUTPUT_DIR_PATH }} \
--models_ops_test_output_directory_path ${{ env.MODELS_OPS_TEST_OUTPUT_DIR_PATH }} \
--models_ops_test_package_name ${{ env.MODELS_OPS_TEST_PACKAGE_NAME }} \
2>&1 | tee ${{ env.SCRIPT_OUTPUT_LOG }}
- name: Upload Script Output Logs
uses: actions/upload-artifact@v4
if: success() || failure()
with:
name: script-outputs
path: ${{ env.SCRIPT_OUTPUT_LOG }}

- name: Upload Models Unique Ops test Failure Logs
uses: actions/upload-artifact@v4
if: success() || failure()
with:
name: unique-ops-logs
path: ${{ env.UNIQUE_OPS_OUTPUT_DIR_PATH }}

- name: Create Pull Request
uses: peter-evans/create-pull-request@v7
with:
branch: ${{ env.BRANCH_NAME }}
committer: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
author: ${{ github.actor }} <${{ github.actor }}@users.noreply.github.com>
base: main
commit-message: ${{ env.COMMIT_MESSAGE }}
title: ${{ env.TITLE }}
body: ${{ env.BODY }}
delete-branch: true
token: ${{ env.GITHUB_TOKEN }}
add-paths: |
${{ env.OUTPUT_PATH }}
17 changes: 10 additions & 7 deletions forge/forge/config.py
@@ -184,16 +184,19 @@ class CompilerConfig:
# Number of patterns to match for each module
tvm_module_to_num_patterns: Dict[str, int] = field(default_factory=lambda: dict())

# If enabled, for given test, it generates Forge Modules in form of PyTest for each unique operation configuration within the given module.
# If enabled, for a given test, it only extracts the unique operation configuration.
extract_tvm_unique_ops_config: bool = False

# If enabled, for a given test, it extracts the unique operation configuration and generates Forge Modules in the form of PyTest for each unique operation configuration within the given module.
# Each configuration is based on:
# - Operand Type (e.g., Activation, Parameter, Constant)
# - Operand Shape
# - Operand DataType
# - Operation Arguments (if any)
tvm_generate_unique_op_tests: bool = False
tvm_generate_unique_ops_tests: bool = False

# Export the generated unique operations configurations information with test file path to the excel file
export_tvm_generated_unique_op_tests_details: bool = False
# Export the unique operations configuration details to an Excel file
export_tvm_unique_ops_config_details: bool = False

# Enables a transform for conv that directly reads input, such that it goes from stride > 1 to stride = 1
# This usually translates to lower DRAM BW and less math as the input better populates tiles
@@ -359,9 +362,9 @@ def apply_env_config_overrides(self):
os.environ["FORGE_OVERRIDE_DEVICE_YAML"]
)

if "FORGE_EXPORT_TVM_GENERATED_UNIQUE_OP_TESTS_DETAILS" in os.environ:
self.export_tvm_generated_unique_op_tests_details = bool(
int(os.environ["FORGE_EXPORT_TVM_GENERATED_UNIQUE_OP_TESTS_DETAILS"])
if "FORGE_EXPORT_TVM_UNIQUE_OPS_CONFIG_DETAILS" in os.environ:
self.export_tvm_unique_ops_config_details = bool(
int(os.environ["FORGE_EXPORT_TVM_UNIQUE_OPS_CONFIG_DETAILS"])
)

def __post_init__(self):
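
A hypothetical usage of the renamed CompilerConfig flags from the diff above (import path assumed from the file location forge/forge/config.py):

```python
from forge.config import CompilerConfig

cfg = CompilerConfig()
# Only extract the unique ops configuration for a test...
cfg.extract_tvm_unique_ops_config = True
# ...and export the extracted configurations to an Excel file.
cfg.export_tvm_unique_ops_config_details = True
```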
