`mario_lm = MarioLM(lm=BASE, tokenizer=BASE)` — these parameters raise an error #20
Hey! What version of mario-gpt are you running? Can you try `pip install mario-gpt --upgrade`? |
Can I see the full stack trace? From what I see above, it looks like the error is coming from `mario_lm = MarioLM(lm_path=BASE, tokenizer_path=BASE)`, but below it looks like it's working? It doesn't really look like an issue with the trainer / training config. I ran the code in a new, clean workspace:

```python
>>> import torch
>>> from mario_gpt import MarioDataset, MarioLM, TrainingConfig, MarioGPTTrainer
>>> BASE = "distilgpt2"
>>> mario_lm = MarioLM(lm_path=BASE, tokenizer_path=BASE)
Using distilgpt2 lm
/home/shyam/miniconda3/envs/py39/lib/python3.9/site-packages/transformers/models/auto/modeling_auto.py:1352: FutureWarning: The class `AutoModelWithLMHead` is deprecated and will be removed in a future version. Please use `AutoModelForCausalLM` for causal language models, `AutoModelForMaskedLM` for masked language models and `AutoModelForSeq2SeqLM` for encoder-decoder models.
  warnings.warn(
Some weights of GPT2LMHeadModel were not initialized from the model checkpoint at distilgpt2 and are newly initialized: ['transformer.h.0.crossattention.c_attn.weight', 'transformer.h.3.crossattention.c_attn.weight', 'transformer.h.4.crossattention.bias', 'transformer.h.5.crossattention.bias', 'transformer.h.2.crossattention.q_attn.weight', 'transformer.h.3.ln_cross_attn.weight', 'transformer.h.2.crossattention.c_proj.weight', 'transformer.h.2.crossattention.c_proj.bias', 'transformer.h.2.ln_cross_attn.weight', 'transformer.h.5.crossattention.c_proj.bias', 'transformer.h.3.crossattention.c_proj.bias', 'transformer.h.0.crossattention.c_proj.bias', 'transformer.h.5.crossattention.c_proj.weight', 'transformer.h.5.ln_cross_attn.weight', 'transformer.h.3.crossattention.masked_bias', 'transformer.h.1.crossattention.c_proj.weight', 'transformer.h.5.crossattention.c_attn.weight', 'transformer.h.1.crossattention.masked_bias', 'transformer.h.1.crossattention.c_proj.bias', 'transformer.h.3.crossattention.c_proj.weight', 'transformer.h.0.ln_cross_attn.weight', 'transformer.h.1.crossattention.bias', 'transformer.h.3.crossattention.bias', 'transformer.h.5.crossattention.masked_bias', 'transformer.h.5.crossattention.q_attn.weight', 'transformer.h.1.crossattention.q_attn.weight', 'transformer.h.1.crossattention.c_attn.weight', 'transformer.h.4.crossattention.q_attn.weight', 'transformer.h.0.crossattention.bias', 'transformer.h.3.crossattention.q_attn.weight', 'transformer.h.0.crossattention.masked_bias', 'transformer.h.4.crossattention.c_proj.bias', 'transformer.h.4.crossattention.c_attn.weight', 'transformer.h.2.crossattention.bias', 'transformer.h.0.crossattention.c_proj.weight', 'transformer.h.4.crossattention.c_proj.weight', 'transformer.h.2.crossattention.masked_bias', 'transformer.h.1.ln_cross_attn.weight', 'transformer.h.0.crossattention.q_attn.weight', 'transformer.h.4.ln_cross_attn.weight', 'transformer.h.4.crossattention.masked_bias', 'transformer.h.2.crossattention.c_attn.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Using distilgpt2 tokenizer
```

Can you try doing a […] |
I will redeploy according to your suggestion. The main problem before was `mario_lm = MarioLM(lm=BASE, tokenizer=BASE)`. |
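For reference, the mismatch above is ordinary Python keyword-argument checking: a constructor that declares `lm_path` / `tokenizer_path` rejects the unknown names `lm` / `tokenizer` with a `TypeError` before any model loading happens. A minimal sketch, using a hypothetical stand-in class (not the real `MarioLM`, which the transcript above shows accepts `lm_path` and `tokenizer_path`):

```python
# Hypothetical stand-in for MarioLM, for illustration only.
# Per the transcript above, the real constructor takes lm_path / tokenizer_path.
class FakeMarioLM:
    def __init__(self, lm_path=None, tokenizer_path=None):
        self.lm_path = lm_path
        self.tokenizer_path = tokenizer_path

BASE = "distilgpt2"

# Wrong keyword names -> TypeError, mirroring the reported error.
try:
    FakeMarioLM(lm=BASE, tokenizer=BASE)
except TypeError as err:
    print("raises:", type(err).__name__)  # prints "raises: TypeError"

# Correct keyword names construct the object normally.
model = FakeMarioLM(lm_path=BASE, tokenizer_path=BASE)
print(model.lm_path)  # prints "distilgpt2"
```

So the fix on the caller's side is simply renaming the keywords to match the constructor's signature.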
Ah, looks like accelerate changed their API. I'll update it! |
Has the relevant modification been completed? I look forward to your revision and hope to continue debugging with it. Thank you. |
Should be fixed now! Let me know if you still have errors. |
Can you give some specific suggestions?