Releases: cli99/llm-analysis
Bug fixes
v0.2.1
Bug fixes and MoE training analysis support
This release fixes a few bugs in memory usage calculations (e.g., activation and optimizer states) and adds support for analyzing MoE training.
Bug fixes and Llama 2 inference support
This release:
- adds group query attention (GQA) support
- changes the inference activation memory calculation to assume the maximum tensor buffer
- fixes the KV cache size calculation
- adds a GPU cost analysis for inference
- adds a Llama 2 inference case study
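The GQA and KV cache changes above interact: with grouped-query attention, the KV cache is sized by the number of key/value heads rather than the number of query heads. A minimal sketch of that calculation (the function name and signature are illustrative, not llm-analysis's actual API, and the formula assumes a standard transformer with an fp16 cache):

```python
def kv_cache_bytes(batch_size: int, seq_len: int, num_layers: int,
                   num_kv_heads: int, head_dim: int,
                   bytes_per_elem: int = 2) -> int:
    """Estimate KV cache memory in bytes.

    The leading factor of 2 accounts for storing both keys and values.
    With GQA, num_kv_heads is smaller than the number of query heads,
    which shrinks the cache proportionally.
    """
    return 2 * batch_size * seq_len * num_layers * num_kv_heads * head_dim * bytes_per_elem


# Example with Llama 2 70B-style GQA settings (80 layers, 8 KV heads,
# head_dim 128) at a 4096-token context, fp16:
print(kv_cache_bytes(1, 4096, 80, 8, 128))  # ~1.34e9 bytes per sequence
```

With full multi-head attention (64 KV heads instead of 8), the same call would report 8x the memory, which is why GQA support matters for the inference memory analysis.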