This test suite includes small-scale versions of Large Language Models (LLMs) and other Generative AI (GenAI) programs, exported using the sharktank package built as part of the shark-ai project.
- Download files through git lfs as needed:

      git lfs install
      git lfs pull --include="*"
      git lfs ls-files
      # 37f90b4754 * sharktank_models/llama3.1/assets/toy_llama.irpa
      # 7172acdf43 * sharktank_models/llama3.1/assets/toy_llama.mlir
      # e997647ecc * sharktank_models/llama3.1/assets/toy_llama_tp2.irpa
      # b7b2f5a206 * sharktank_models/llama3.1/assets/toy_llama_tp2.mlir
      # 917845c887 * sharktank_models/llama3.1/assets/toy_llama_tp2.rank0.irpa
      # 9ab51093c4 * sharktank_models/llama3.1/assets/toy_llama_tp2.rank1.irpa
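If the repository was cloned before `git lfs` was set up, the asset files may still be small LFS pointer stubs rather than real model data. The sketch below is one way to detect that, relying on the fact that Git LFS pointer files begin with a fixed `version` line. The helper names and the `sharktank_models` search root are assumptions for illustration and are not part of the test suite.

```python
from pathlib import Path

# First line of every Git LFS pointer file (Git LFS pointer spec, v1).
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path: Path) -> bool:
    """Return True if `path` still holds an LFS pointer instead of real content."""
    with path.open("rb") as f:
        return f.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


def find_unfetched_assets(root: Path) -> list[Path]:
    """List .irpa and .mlir files under `root` that are still LFS pointers."""
    candidates = sorted(p for p in root.rglob("*") if p.suffix in {".irpa", ".mlir"})
    return [p for p in candidates if p.is_file() and is_lfs_pointer(p)]


if __name__ == "__main__":
    # Assumed asset location, taken from the `git lfs ls-files` listing.
    for path in find_unfetched_assets(Path("sharktank_models")):
        print(f"not fetched yet: {path}")
```

Any path this prints would need another `git lfs pull` before the tests can load it.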
- Set up your virtual environment and install requirements:

      cd sharktank_models
      python -m venv .venv
      source .venv/bin/activate
      python -m pip install -r requirements.txt
- To use IREE from nightly pre-release Python packages:

      python -m pip install -r requirements-iree.txt
- To use a custom version of IREE, follow the instructions for building the IREE Python packages from source.
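Before running the tests, it can help to confirm that the virtual environment is active and that the IREE Python packages are importable. The following is a minimal sketch using only the standard library. It assumes the IREE packages expose the `iree.compiler` and `iree.runtime` modules, and the helper names are hypothetical.

```python
import importlib.util
import sys


def in_virtualenv() -> bool:
    """True when the interpreter runs inside a venv (prefix differs from base)."""
    return sys.prefix != sys.base_prefix


def module_available(name: str) -> bool:
    """True if `name` can be located without importing the module itself."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. `iree`) is itself missing.
        return False


if __name__ == "__main__":
    print("virtualenv active:", in_virtualenv())
    # Assumed module names; adjust if your IREE build differs.
    for name in ("pytest", "iree.compiler", "iree.runtime"):
        status = "found" if module_available(name) else "MISSING"
        print(f"{name}: {status}")
```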
- Run pytest using typical flags:

      pytest \
        -rA \
        -m "target_cpu" \
        --timeout=300 \
        --durations=0 \
        --log-cli-level=info

  See https://docs.pytest.org/en/stable/how-to/usage.html for other options.
- The `--log-cli-level` option can also be set to `debug`, `warning`, or `error`. See https://docs.pytest.org/en/stable/how-to/logging.html.
- Run only tests matching a name pattern:

      pytest -k llama
- Run tests that require an AMD GPU (https://docs.pytest.org/en/stable/example/markers.html):

      pytest -m "target_hip"
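The `-m` flag selects tests by marker. As a rough illustration of how a test opts in to a target, the sketch below decorates two placeholder tests with the `target_cpu` and `target_hip` markers named in this README. The test names and bodies are hypothetical, and the suite's real tests may apply their markers differently (for example, through parametrization or configuration files).

```python
import pytest


@pytest.mark.target_cpu
def test_toy_model_on_cpu():
    # Hypothetical placeholder body; selected by `pytest -m "target_cpu"`.
    assert True


@pytest.mark.target_hip
def test_toy_model_on_amd_gpu():
    # Hypothetical placeholder body; selected by `pytest -m "target_hip"`.
    assert True
```

Marker expressions can also be combined, as in `pytest -m "target_cpu or target_hip"` or `pytest -m "not target_hip"`.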
- Ignore xfail marks (https://docs.pytest.org/en/stable/how-to/skipping.html#ignoring-xfail):

      pytest --runxfail
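To make the effect of `--runxfail` concrete, here is a small sketch with two hypothetical tests marked `xfail`. Neither the test names nor the reasons come from this suite, and the outcomes in the comments assume pytest's default non-strict xfail behavior.

```python
import pytest


@pytest.mark.xfail(reason="hypothetical known issue", strict=False)
def test_expected_to_fail():
    # Default run: reported as XFAIL. With --runxfail: reported as FAILED.
    assert sum([1, 2]) == 4


@pytest.mark.xfail(reason="hypothetical issue that has since been fixed")
def test_unexpectedly_passing():
    # Default run: reported as XPASS. With --runxfail: reported as PASSED.
    assert sum([1, 2]) == 3
```

Running with `--runxfail` is a quick way to check whether tests marked as expected failures still fail.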
- Run tests in parallel using https://pytest-xdist.readthedocs.io/ (note that this swallows some logging):

      # Run with an automatic number of worker processes (usually one per CPU core).
      pytest -n auto

      # Run with an explicit number of worker processes.
      pytest -n 4
- Create an HTML report using https://pytest-html.readthedocs.io/en/latest/index.html:

      pytest --html=report.html --self-contained-html --log-cli-level=info

  See also https://docs.pytest.org/en/latest/how-to/output.html#creating-junitxml-format-files.
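A JUnit XML report, written with `pytest --junitxml=report.xml`, is easier to process in scripts than the HTML report. The sketch below shows one possible way to total the counters in such a file using only the standard library. The function name and the `report.xml` path are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def summarize_junit(xml_text: str) -> dict[str, int]:
    """Sum the counters on every <testsuite> element of a JUnit XML report."""
    root = ET.fromstring(xml_text)
    totals = {"tests": 0, "failures": 0, "errors": 0, "skipped": 0}
    # iter() also visits the root element, so this handles both a
    # <testsuites> wrapper and a bare <testsuite> root.
    for suite in root.iter("testsuite"):
        for key in totals:
            totals[key] += int(suite.get(key, 0))
    return totals


if __name__ == "__main__":
    # Assumes a report written by: pytest --junitxml=report.xml
    with open("report.xml", encoding="utf-8") as f:
        print(summarize_junit(f.read()))
```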