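After downloading the 7B model, I tested it on a few Chinese questions and found that the answers contain many unrecognizable characters. Is the model's Chinese vocabulary especially small? How can I extend the Chinese vocabulary, and then continue pretraining on additional Chinese corpora on top of that?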
You could try BLOOM.
See https://github.com/ymcui/Chinese-LLaMA-Alpaca ; that project extends the vocabulary.
That link doesn't seem to include the code for extending the vocabulary or for pretraining. Are there any other recommendations?
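For anyone looking for the general idea behind vocabulary extension, here is a minimal, library-free sketch. It is not the code from any project mentioned in this thread. The toy `vocab` dict, the tiny `embeddings` list, and the helper name `extend_vocab` are all hypothetical, and initializing new rows with the mean of the existing rows is just one common heuristic, assumed here for illustration. With Hugging Face `transformers`, the corresponding calls would be `tokenizer.add_tokens(...)` followed by `model.resize_token_embeddings(len(tokenizer))`, after which the new rows still have to be learned by continued pretraining on a Chinese corpus.

```python
def extend_vocab(vocab, embeddings, new_tokens):
    """Add unseen tokens to a toy vocab and append one embedding row per new token.

    vocab:      dict mapping token -> id, with ids assumed contiguous from 0
    embeddings: list of rows (lists of floats), one row per token id
    new_tokens: iterable of candidate tokens; ones already in vocab are skipped
    """
    dim = len(embeddings[0])
    # Assumption: start new rows at the mean of the existing rows.
    mean_row = [
        sum(row[i] for row in embeddings) / len(embeddings)
        for i in range(dim)
    ]
    for tok in new_tokens:
        if tok in vocab:
            continue  # already covered, keep its existing id and row
        vocab[tok] = len(vocab)            # next free id
        embeddings.append(list(mean_row))  # row index matches the new id
    return vocab, embeddings


# Toy data: three existing tokens with 2-dimensional embeddings.
vocab = {"<unk>": 0, "hello": 1, "world": 2}
embeddings = [[0.0, 0.0], [1.0, 2.0], [3.0, 4.0]]

# "hello" is already present, so only the two Chinese characters are added.
extend_vocab(vocab, embeddings, ["你", "好", "hello"])
print(len(vocab), len(embeddings))
```

The point of the sketch is that the embedding table must grow in step with the vocabulary, so that each new token id has a matching row. The same applies to the output projection in a real model.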