v2.11.0

@KodiaqQ released this 17 Jun 11:02

Post-training Quantization:

Features:

  • (OpenVINO) Added Scale Estimation algorithm for 4-bit data-aware weights compression. The optional scale_estimation parameter was introduced to nncf.compress_weights() and can be used to minimize accuracy degradation of compressed models (note that this algorithm increases the compression time).
  • (OpenVINO) Added GPTQ algorithm for 8/4-bit data-aware weights compression, supporting INT8, INT4, and NF4 data types. The optional gptq parameter was introduced to nncf.compress_weights() to enable the GPTQ algorithm.
  • (OpenVINO) Added support for models with BF16 weights in the weights compression method, nncf.compress_weights().
  • (PyTorch) Added support for quantization and weight compression of custom modules.
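
The new data-aware options might be enabled roughly as follows. This is a minimal sketch, not the project's reference usage: the helper names `build_compress_kwargs` and `compress_int4` are hypothetical, and the `ratio` and `group_size` values are placeholders chosen for illustration. Only the `scale_estimation` and `gptq` parameter names come from the notes above; both algorithms are data-aware, so a calibration dataset is assumed to be required.

```python
from typing import Any, Dict, Iterable


def build_compress_kwargs(use_scale_estimation: bool, use_gptq: bool) -> Dict[str, Any]:
    """Collect the optional data-aware flags (hypothetical helper)."""
    kwargs: Dict[str, Any] = {}
    if use_scale_estimation:
        kwargs["scale_estimation"] = True
    if use_gptq:
        kwargs["gptq"] = True
    return kwargs


def compress_int4(
    ov_model: Any,
    calibration_items: Iterable[Any],
    use_scale_estimation: bool = True,
    use_gptq: bool = False,
) -> Any:
    """Compress weights of an OpenVINO model to 4 bits (illustrative only)."""
    # Imported lazily so the helper above works without NNCF installed.
    import nncf

    return nncf.compress_weights(
        ov_model,
        mode=nncf.CompressWeightsMode.INT4_SYM,
        ratio=0.8,        # assumed share of layers compressed to 4 bits
        group_size=128,   # assumed quantization group size
        dataset=nncf.Dataset(calibration_items),  # calibration data for the data-aware algorithms
        **build_compress_kwargs(use_scale_estimation, use_gptq),
    )
```

Both flags are off by default, so existing calls to nncf.compress_weights() should behave as before; enabling either one trades longer compression time for potentially lower accuracy degradation.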

Fixes:

  • (OpenVINO) Fixed incorrect detection of nodes with bias in the Fast-/BiasCorrection and ChannelAlignment algorithms.
  • (OpenVINO, PyTorch) Fixed incorrect behaviour of nncf.compress_weights() when the input model is already compressed.
  • (OpenVINO, PyTorch) Fixed SmoothQuant algorithm to work with Split ports correctly.

Improvements:

  • (OpenVINO) Aligned the resulting compression subgraphs of nncf.compress_weights() across different FP precisions.
  • Aligned the 8-bit scheme for the NPU target device with the one used for CPU.

Examples:

  • (OpenVINO, ONNX) Updated the ignored scope in the YOLOv8 examples to use the subgraphs approach.

Tutorials:

Compression-aware training:

Features:

  • (PyTorch) The nncf.quantize method is now the recommended way to initialize quantization for Quantization-Aware Training.
  • (PyTorch) The placement of compression modules in the model can now be serialized and restored with new API functions: compressed_model.nncf.get_config() and nncf.torch.load_from_config. Documentation for saving and loading a quantized model is available, and the ResNet18 example was updated to use the new API.
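
The save/restore flow described above can be sketched as follows. This is an illustration under stated assumptions rather than the documented recipe: the helper names `init_qat`, `make_checkpoint`, and `restore_quantized` and the checkpoint keys are hypothetical, and `load_from_config` is assumed to take the original model, the saved config, and an example input.

```python
from typing import Any, Dict


def init_qat(model: Any, calibration_loader: Any) -> Any:
    """Initialize quantization before fine-tuning (hypothetical helper)."""
    import nncf  # lazy import: only needed when actually quantizing

    # Assumes each loader item can be fed to the model as-is; otherwise pass
    # a transform function as the second argument of nncf.Dataset.
    return nncf.quantize(model, nncf.Dataset(calibration_loader))


def make_checkpoint(quantized_model: Any) -> Dict[str, Any]:
    """Bundle the module placement config with the trained weights."""
    return {
        "nncf_config": quantized_model.nncf.get_config(),
        "model_state_dict": quantized_model.state_dict(),
    }


def restore_quantized(fresh_model: Any, checkpoint: Dict[str, Any], example_input: Any) -> Any:
    """Rebuild the quantized model from a non-quantized instance."""
    import nncf.torch  # lazy import

    # Assumed argument order: model, config, example input.
    restored = nncf.torch.load_from_config(
        fresh_model, checkpoint["nncf_config"], example_input
    )
    restored.load_state_dict(checkpoint["model_state_dict"])
    return restored
```

The dictionary returned by `make_checkpoint` could be written with torch.save and read back with torch.load; keeping the config next to the state dict means the compression modules can be re-inserted before the weights are loaded.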

Fixes:

  • (PyTorch) Fixed compatibility with torch.compile.

Improvements:

  • (PyTorch) Base parameters were extended for the EvolutionOptimizer (LeGR algorithm part).
  • (PyTorch) Improved wrapping for parameters which are not tensors.

Examples:

  • (PyTorch) Added an example for the STFPM model from Anomalib.

Tutorials:

Deprecations/Removals:

  • Removed the extra dependencies for installing backends from setup.py (such as [torch], [tf], [onnx], and [openvino]).
  • Removed openvino-dev dependency.

Requirements:

  • Updated PyTorch (2.3.0) and Torchvision (0.18.0) versions.

Acknowledgements

Thanks for contributions from the OpenVINO developer community:
@DaniAffCH
@UsingtcNower
@anzr299
@AdiKsOnDev
@Viditagarwal7479
@truhinnm