[BUG/Help] ValueError: 130001 is not in list #596
Comments
Both the code and the model are the latest versions.
#432
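`mask_token = gMASK if gMASK in input_ids else MASK` causes the line after it to raise an error (suppose one sample in input_ids contains gMASK and another contains MASK). Write it as shown below and comment out the original two lines.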
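The commenter's replacement code is not preserved in this thread. As an independent illustration of the failure described above, here is a minimal sketch using plain Python lists rather than the repository's tensors. The function names are hypothetical, and the token ids are assumptions: 130001 is taken from the error message and treated as gMASK, while 130000 for MASK is made up for the example.

```python
# Assumed ids for illustration only; 130001 is the id in the error message.
MASK, GMASK = 130000, 130001


def mask_positions_batchwide(input_ids):
    """Pick ONE mask token for the whole batch, then look it up per sample."""
    flat = [tok for seq in input_ids for tok in seq]
    mask_token = GMASK if GMASK in flat else MASK
    # Fails if any sample lacks the token chosen for the whole batch.
    return [seq.index(mask_token) for seq in input_ids]


def mask_positions_per_sample(input_ids):
    """Pick the mask token separately for each sample."""
    positions = []
    for seq in input_ids:
        mask_token = GMASK if GMASK in seq else MASK
        positions.append(seq.index(mask_token))
    return positions


# One sample contains gMASK, the other contains only MASK.
batch = [[5, 6, GMASK, 7], [8, MASK, 9, 10]]

try:
    mask_positions_batchwide(batch)
except ValueError as err:
    print(err)  # 130001 is not in list

print(mask_positions_per_sample(batch))  # [2, 1]
```

The per-sample variant is only one way to avoid the exception. As the replies below note, the implementation expects gMASK in every sample, so removing stray [MASK] tokens from the data may be the more appropriate fix.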
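The current implementation always uses gMASK; if gMASK is not used, something has gone wrong.
If the data itself contains a mask, in
So if the data contains a mask, does the mask need to be removed from the data?
Oh, so that's how it is. I finally understand which situations produce this error.
This should be fixed now: whether or not the data contains [MASK], the tokenizer appends [gMASK] at the end. It is still recommended to remove [MASK] from the data. Currently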
Is there an existing issue for this?
Current Behavior
During ptuning, increasing max_steps even slightly triggers this error. Did I get something wrong?
Expected Behavior
。
Steps To Reproduce
。
Environment
Anything else?
。