Releases: deiteris/voice-changer

b2332

07 Dec 16:10
7ca1147

This is a minor release that addresses a few issues with the dark theme found in b2329.

  1. Fix "echo" icon color in dark theme for "file" input.
  2. Fix folder icon in dark theme for "file" input.
  3. Use a less dark shade of black for backgrounds.

b2329

07 Dec 14:59
8fa48e9

Improvements

  1. Minor overall UI style improvements.
  2. Merge Lab now downloads the merged model instead of saving it into a slot.
  3. Merge Lab now saves the merged model in PyTorch format instead of safetensors for compatibility.
  4. When no index is present for the model, the Index slider is disabled.
  5. Added automatic dark theme. The light or dark theme is selected based on the system or browser preferences.

Fixes

  1. Fixed an error that occurred when no model is selected and "SIO rec." is set to "start".
  2. Fixed an error when attempting to load an empty index.
  3. Server Audio no longer creates a separate thread, and the app no longer hangs due to a running server audio thread.
  4. Server Audio no longer spams error messages for invalid devices.

Misc

  1. Added an icon to the executable.
  2. Updated Python to 3.12 for Windows and macOS distributions.
  3. Updated PyTorch to 2.5.1 for all distributions.
  4. Updated onnxruntime to 1.20.1 for all distributions.
  5. Updated torch-directml to 0.2.5.dev240914 for the DirectML distribution.
  6. Updated ROCm to 6.2.3 for the Linux ROCm distribution.

fixes-refactoring-b2320-d8f2474

05 Dec 20:28
Pre-release
Bump python release version

b2309

24 Aug 22:28
8b35e1d

This is a hotfix for b2307.

Fixes voice changer loading with the CUDA version on Windows.

b2307

24 Aug 12:07
f25d4fb

This is a minor maintenance release.

Changes

  • For PyTorch models, PyTorch has been updated to v2.4.0 in the following versions:
    • CUDA version. Now also uses CUDA 12+ and cuDNN 9+, which may offer better performance. This has also reduced the size of the prebuilt version.
    • CPU version.
    • macOS version (ARM only).
  • In the DirectML version, torch-directml has been updated to 0.2.4.dev240815. No performance change is expected since the new operators are not used by the voice changer.
  • For ONNX models, ONNX Runtime has been updated to 1.19 in the following versions:
    • CUDA version. Now also uses CUDA 12+ and cuDNN 9+, which may offer better performance.
    • DirectML version. DirectML runtime and opset support have been updated, which may offer better performance.
    • CPU version.
    • macOS version. The CoreML runtime now supports more of the operators used by the voice changer, which may offer better performance.

b2300

11 Aug 16:45
edd2341

This is a minor maintenance release.

Fixes FP16 support detection on older NVIDIA GPUs (Maxwell GPUs and earlier).

b2298

11 Aug 13:15
958a25f

Changes

  • The "export to onnx" button has been replaced with the new "Convert to ONNX" setting in Advanced settings, and its behavior has changed. When this option is checked, uploaded or selected models that have not been converted before are automatically converted to ONNX in the same slot. When unchecked, the original PyTorch model is used. Note that this option does not replace the original model but adds an ONNX variant alongside it. If the uploaded model is already in ONNX format, it is loaded as is.
  • Performance monitor now shows the model type, including the runtime used (e.g., onnxRVC for ONNX RVC models and pyTorchRVCv2 for original RVC v2 models).
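
The selection rules of the "Convert to ONNX" setting can be sketched roughly as follows. This is only an illustration of the behavior described above, not the voice changer's code: the `Slot` class, its fields, and `select_model` are hypothetical names, and the actual conversion step is replaced by setting a flag.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    """Hypothetical model slot; the field names are illustrative only."""
    original_format: str           # "pytorch" or "onnx"
    has_onnx_variant: bool = False


def select_model(slot: Slot, convert_to_onnx: bool) -> str:
    """Return which model variant would be used for the slot."""
    if slot.original_format == "onnx":
        return "onnx"              # uploaded ONNX models are loaded as is
    if not convert_to_onnx:
        return "pytorch"           # unchecked: use the original PyTorch model
    if not slot.has_onnx_variant:
        # Stand-in for the real conversion; the original model is kept.
        slot.has_onnx_variant = True
    return "onnx"
```

In this sketch, unchecking the option later still returns the original PyTorch model, because the ONNX variant is added next to it rather than replacing it.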

Improvements

  • WASAPI no longer requires matching the sample rate of input and output devices; the audio will be automatically resampled by the system audio mixer. However, matching the sample rates is still recommended if possible.

Fixes

  • ASIO channel selection now correctly selects an input/output channel.
  • "Operation in progress" dialog will now appear when changing long-running options in Advanced settings.
  • Fixed a potential bug where an FP16 ONNX model could fail to load if it had been generated previously and then removed.
  • Model settings (Pitch, Index, etc.) no longer reset to the last saved settings after changing GPU.

Experimental

  • When using inference on CPU, the contentvec embedder model will be quantized using INT8 precision. This significantly reduces RAM usage and slightly reduces CPU usage. Currently, there's no option to opt out of this behavior.
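
As a rough illustration of why INT8 quantization reduces memory use, the sketch below applies symmetric per-tensor quantization to a list of weights in plain Python. It is a toy under stated assumptions: these notes do not say which quantization scheme is used for the contentvec embedder, and `quantize_int8` and `dequantize_int8` are hypothetical helpers, not functions from the voice changer.

```python
def quantize_int8(weights: list[float]) -> tuple[list[int], float]:
    """Symmetric per-tensor quantization to the integer range [-127, 127]."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Avoid dividing by zero when all weights are zero (or the list is empty).
    scale = max_abs / 127 if max_abs > 0 else 1.0
    return [round(w / scale) for w in weights], scale


def dequantize_int8(values: list[int], scale: float) -> list[float]:
    """Map the stored integers back to approximate floating-point weights."""
    return [v * scale for v in values]
```

Each quantized value fits in one byte instead of the four bytes of a 32-bit float, at the cost of a rounding error of about half the scale per weight.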

Miscellaneous

  • Updated WebUI npm dependencies.

Known issues

  • WDM-KS and true ASIO devices produce crackling audio. For the lowest delay, the current workaround is to use WASAPI or FlexASIO.

b2277

06 Aug 18:31
270a673

Improvements

  • Input and output channels can now be specified for ASIO devices. For FlexASIO and ASIO4ALL, usually no changes are required since they normally work on default channels.

Fixes

  • Fix model merging.
  • Exclude the GeForce MX series from FP16, since these GPUs do not support it.
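
A check of this kind might look like the sketch below. It is a guess at the shape of the logic, not the voice changer's implementation: `fp16_allowed` is a hypothetical function, matching on the device name is an assumption, and the compute-capability cut-off assumes that FP16 is simply not used on Maxwell and earlier GPUs (Maxwell is compute capability 5.x), as suggested by the b2300 fix.

```python
def fp16_allowed(device_name: str, compute_capability: tuple[int, int]) -> bool:
    """Decide whether FP16 inference should be used on an NVIDIA GPU."""
    major, _minor = compute_capability
    if major < 6:
        # Assumed cut-off: Maxwell (5.x) and earlier generations.
        return False
    if "GeForce MX" in device_name:
        # The MX series is excluded regardless of its generation.
        return False
    return True
```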

b2271

04 Aug 14:08
8bcfcdb

Important change

The voice changer will now constantly utilize CPU/GPU when voice conversion is enabled. This addresses an issue where the voice changer could lag after a short period of silence or show inconsistent performance, as observed with NVIDIA GPUs in previous versions.

New

  • "perf" metric now includes a graph. The graph shows data points over the last 5 seconds (if the chunk size is less than 100ms) or more. The performance graph lets you see whether there are performance fluctuations and adjust the chunk size depending on the usage over time.
    performance_graph
  • Most settings now include tooltips with an explanation. Just hover over text with a dotted underline.
  • When changing settings that may have a negative impact on performance in the DirectML version, a notification will appear near GPU, asking you to switch between CPU and GPU.
    dml_gpu_warning
  • Introduced Just-in-Time (JIT) compilation for PyTorch models. This slightly improves performance (faster response on the first start, slightly lower latency) and reduces memory usage in some cases. However, it currently increases model loading time. To opt out of JIT compilation, set the "Disable JIT compilation" option in Advanced settings to "on". Note that JIT is currently not available for DirectML devices, so this option won't have any effect for them.

Changes

  • Increased the input volume slider range to 250%.

Experimental

  • The version for NVIDIA includes 2 bat files to lock or reset GPU core and memory clocks to address possible inconsistent performance issues. Note that frequency locking is not guaranteed to be supported by all NVIDIA GPUs; you can verify that the option took effect with GPU-Z. Both scripts must be run with administrator rights to take effect:
    • force_gpu_clocks.bat - queries GPU core and memory clocks and locks them to the reported maximum clocks.
    • reset_gpu_clocks.bat - resets GPU core and memory frequencies.
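
The contents of the two bat files are not shown in these notes. One way to get the described behavior is through `nvidia-smi`, sketched here in Python under the assumption that the scripts work along these lines; the function names are hypothetical, and the commands need administrator rights and `nvidia-smi` on the PATH.

```python
import subprocess

# Query the maximum graphics and memory clocks (in MHz) as plain CSV.
QUERY_CMD = [
    "nvidia-smi",
    "--query-gpu=clocks.max.graphics,clocks.max.memory",
    "--format=csv,noheader,nounits",
]


def parse_max_clocks(query_output: str) -> tuple[int, int]:
    """Parse 'graphics, memory' MHz values from the first GPU's line."""
    first_line = query_output.strip().splitlines()[0]
    graphics, memory = (int(field.strip()) for field in first_line.split(","))
    return graphics, memory


def lock_commands(graphics_mhz: int, memory_mhz: int) -> list[list[str]]:
    """Build commands that pin both clocks to the given values (min = max)."""
    return [
        ["nvidia-smi", f"--lock-gpu-clocks={graphics_mhz},{graphics_mhz}"],
        ["nvidia-smi", f"--lock-memory-clocks={memory_mhz},{memory_mhz}"],
    ]


def reset_commands() -> list[list[str]]:
    """Build commands that return both clocks to their default behavior."""
    return [
        ["nvidia-smi", "--reset-gpu-clocks"],
        ["nvidia-smi", "--reset-memory-clocks"],
    ]


def force_gpu_clocks() -> None:
    """Query the maximum clocks and lock the GPU to them."""
    result = subprocess.run(QUERY_CMD, capture_output=True, text=True, check=True)
    for cmd in lock_commands(*parse_max_clocks(result.stdout)):
        subprocess.run(cmd, check=True)
```

On GPUs that do not support clock locking, `nvidia-smi` reports an error instead, which is why checking the result with a tool such as GPU-Z remains useful.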

b2245

31 Jul 11:59
45063ad

Changes

  • Changed the background color of performance stats block.

  • buf metric has been moved to perf.

  • perf now indicates the performance by highlighting the inference time in green, yellow or red when a certain condition is met. The following table shows an example of 3 conditions that it may indicate:

    Stable  | Potentially unstable / High usage | Unstable
    [image] | [image]                           | [image]
    [image] | [image]                           | [image]

    The logic of these conditions is the following:

    • Stable - your inference speed is sufficient for the selected chunk size. Usually, no action is required.
    • Potentially unstable - your inference speed is sufficient but audio may be unstable when other processes run concurrently. Operation in this range will also incur high GPU usage. Increasing Chunk size or reducing Extra is recommended.
    • Unstable - your inference speed is insufficient for the selected chunk size. Increase Chunk size or reduce Extra.
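
These conditions can be sketched as a comparison of inference time against chunk duration. The sketch assumes that "insufficient" means the inference time reaches or exceeds the chunk duration, and the 0.8 boundary for the middle band is a hypothetical value chosen for illustration; these notes do not state the actual thresholds, and `classify_perf` is not a function from the voice changer.

```python
def classify_perf(inference_ms: float, chunk_ms: float, warn_ratio: float = 0.8) -> str:
    """Classify inference time relative to the chunk duration.

    warn_ratio is an assumed threshold, not taken from the voice changer.
    """
    if chunk_ms <= 0:
        raise ValueError("chunk_ms must be positive")
    ratio = inference_ms / chunk_ms
    if ratio >= 1.0:
        return "unstable"              # inference cannot keep up with the chunk size
    if ratio >= warn_ratio:
        return "potentially unstable"  # keeps up, but with little headroom
    return "stable"
```

Under these assumptions, increasing Chunk size lowers the ratio directly, while reducing Extra lowers it by shortening the inference time.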

Experimental

  • In client audio mode, the additional audio buffering used to compensate for lag was removed. This should reduce audio latency in this mode (up to 25% less latency) and should not cause any issues during stable operation. Please report any issues you encounter.