Embedding Matrix Size Not Resized Properly - Bug Report #1483

sumukshashidhar · 2024-12-29T05:38:35Z

(Continued) Pre-training a model with Unsloth works perfectly without special tokens, but when I add special tokens I get the following error:

🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
🦥 Unsloth Zoo will now patch everything to make training faster!
==((====))==  Unsloth 2024.12.11: Fast Qwen2 patching. Transformers: 4.47.1.
   \\   /|    GPU: NVIDIA H100 80GB HBM3. Max memory: 79.109 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.5.1+cu124. CUDA: 9.0. CUDA Toolkit: 12.4. Triton: 3.1.0
\        /    Bfloat16 = TRUE. FA [Xformers = 0.0.29. FA2 = False]
 "-____-"     Free Apache license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:07<00:00,  1.25s/it]
Traceback (most recent call last):
  File "/shared/storage-01/users/sumuks2/foundry/paper-reviews-finetuning/src/_experimental/finetune_14_special_toks.py", line 27, in <module>
    add_new_tokens(model, tokenizer, new_tokens = ["<review>", "</review>", "<paper_title>", "</paper_title>", "<paper_abstract>", "</paper_abstract>", "<paper_keywords>", "</paper_keywords>", "<review_title>", "</review_title>", "<review_text>", "</review_text>", "<review_rating>", "</review_rating>", "<review_confidence>", "</review_confidence>"])
  File "/shared/storage-01/users/sumuks2/foundry/paper-reviews-finetuning/.venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/shared/storage-01/users/sumuks2/foundry/paper-reviews-finetuning/.venv/lib/python3.10/site-packages/unsloth_zoo/tokenizer_utils.py", line 132, in add_new_tokens
    raise RuntimeError(
RuntimeError: Unsloth: Embedding matrix size did not get resized properly. Please file a bug report!
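
For context, the message suggests that after new tokens are added, the embedding matrix is expected to have one row per tokenizer entry. The sketch below is only an illustration of that invariant on plain Python lists; it is not the code in `unsloth_zoo/tokenizer_utils.py`. The helper names `resize_embeddings` and `check_resized`, and the choice to initialise new rows to the mean of the existing rows, are assumptions made for this example.

```python
from typing import List

Matrix = List[List[float]]


def resize_embeddings(matrix: Matrix, new_vocab_size: int) -> Matrix:
    """Return a copy of `matrix` with exactly `new_vocab_size` rows.

    Hypothetical helper: rows added beyond the old size are set to the
    column-wise mean of the existing rows (one common initialisation choice).
    """
    if not matrix:
        raise ValueError("embedding matrix must have at least one row")
    old_size = len(matrix)
    dim = len(matrix[0])
    if new_vocab_size <= old_size:
        # Shrinking (or no change): keep only the first new_vocab_size rows.
        return [row[:] for row in matrix[:new_vocab_size]]
    mean_row = [sum(row[j] for row in matrix) / old_size for j in range(dim)]
    extra = [mean_row[:] for _ in range(new_vocab_size - old_size)]
    return [row[:] for row in matrix] + extra


def check_resized(matrix: Matrix, tokenizer_length: int) -> None:
    """Raise if the row count does not match the tokenizer length."""
    if len(matrix) != tokenizer_length:
        raise RuntimeError(
            f"Embedding matrix has {len(matrix)} rows, "
            f"expected {tokenizer_length}"
        )


# Toy example: 3 existing tokens with 2-dimensional embeddings, 2 new tokens.
old = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
new = resize_embeddings(old, new_vocab_size=5)
check_resized(new, tokenizer_length=5)  # passes: 5 rows for 5 tokens
```

If the check is run against the matrix before resizing (`check_resized(old, 5)`), it raises a `RuntimeError`, which is the kind of mismatch the error above appears to report.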

johnpaulbin · 2025-01-06T00:02:40Z

Same error here for Llama 3 8B.

danielhanchen · 2025-01-10T12:53:37Z

Apologies for the delay. I'm working on making adding new tokens much better, hopefully in a few days.
