mistralai/Mixtral-8x22B-Instruct-v0.1 / tokenizer #69

Open
ggbetz opened this issue Oct 4, 2024 · 0 comments
ggbetz commented Oct 4, 2024

Same error as in #66:

```
2024-10-04:15:42:36,551 INFO     [__main__.py:364] Passed `--trust_remote_code`, setting environment variable `HF_DATASETS_TRUST_REMOTE_CODE=true`
2024-10-04:15:42:36,551 INFO     [__main__.py:376] Selected Tasks: ['logiqa2_base', 'logiqa_base', 'lsat-ar_base', 'lsat-lr_base', 'lsat-rc_base']
2024-10-04:15:42:36,553 INFO     [evaluator.py:161] Setting random seed to 0 | Setting numpy seed to 1234 | Setting torch manual seed to 1234
2024-10-04:15:42:36,553 INFO     [evaluator.py:198] Initializing local-completions model, with arguments: {'base_url': 'http://localhost:8080/v1/completions', 'num_concurrent': 1, 'max_retries': 3, 'tokenized_requests': False, 'model': 'mistralai/Mixtral-8x22B-Instruct-v0.1', 'trust_remote_code': True}
2024-10-04:15:42:36,553 INFO     [api_models.py:108] Using max length 2048 - 1
2024-10-04:15:42:36,553 INFO     [api_models.py:111] Concurrent requests are disabled. To enable concurrent requests, set `num_concurrent` > 1.
2024-10-04:15:42:36,553 INFO     [api_models.py:121] Using tokenizer huggingface
Traceback (most recent call last):
  File "/scratch/slurm_tmpdir/job_2674890/venv-cot-eval/lib64/python3.11/site-packages/transformers/tokenization_utils_base.py", line 2450, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/scratch/slurm_tmpdir/job_2674890/venv-cot-eval/lib64/python3.11/site-packages/transformers/models/llama/tokenization_llama_fast.py", line 157, in __init__
    super().__init__(
  File "/scratch/slurm_tmpdir/job_2674890/venv-cot-eval/lib64/python3.11/site-packages/transformers/tokenization_utils_fast.py", line 107, in __init__
    raise ValueError(
ValueError: Cannot instantiate this tokenizer from a slow version. If it's based on sentencepiece, make sure you have sentencepiece installed.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/scratch/slurm_tmpdir/job_2674890/venv-cot-eval/bin/lm-eval", line 8, in <module>
    sys.exit(cli_evaluate())
             ^^^^^^^^^^^^^^
  File "/scratch/slurm_tmpdir/job_2674890/lm-evaluation-harness/lm_eval/__main__.py", line 382, in cli_evaluate
    results = evaluator.simple_evaluate(
              ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/scratch/slurm_tmpdir/job_2674890/lm-evaluation-harness/lm_eval/utils.py", line 397, in _wrapper
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/scratch/slurm_tmpdir/job_2674890/lm-evaluation-harness/lm_eval/evaluator.py", line 201, in simple_evaluate
    lm = lm_eval.api.registry.get_model(model).create_from_arg_string(
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/scratch/slurm_tmpdir/job_2674890/lm-evaluation-harness/lm_eval/api/model.py", line 147, in create_from_arg_string
    return cls(**args, **args2)
           ^^^^^^^^^^^^^^^^^^^^
  File "/scratch/slurm_tmpdir/job_2674890/lm-evaluation-harness/lm_eval/models/openai_completions.py", line 18, in __init__
    super().__init__(
  File "/scratch/slurm_tmpdir/job_2674890/lm-evaluation-harness/lm_eval/models/api_models.py", line 130, in __init__
    self.tokenizer = transformers.AutoTokenizer.from_pretrained(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/scratch/slurm_tmpdir/job_2674890/venv-cot-eval/lib64/python3.11/site-packages/transformers/models/auto/tokenization_auto.py", line 907, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/scratch/slurm_tmpdir/job_2674890/venv-cot-eval/lib64/python3.11/site-packages/transformers/tokenization_utils_base.py", line 2216, in from_pretrained
    return cls._from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^^
  File "/scratch/slurm_tmpdir/job_2674890/venv-cot-eval/lib64/python3.11/site-packages/transformers/tokenization_utils_base.py", line 2451, in _from_pretrained
    except import_protobuf_decode_error():
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/scratch/slurm_tmpdir/job_2674890/venv-cot-eval/lib64/python3.11/site-packages/transformers/tokenization_utils_base.py", line 87, in import_protobuf_decode_error
    raise ImportError(PROTOBUF_IMPORT_ERROR.format(error_message))
ImportError:
 requires the protobuf library but it was not found in your environment. Checkout the instructions on the
installation page of its repo: https://github.com/protocolbuffers/protobuf/tree/master/python#installation and follow the ones
that match your environment. Please note that you may need to restart your runtime after installation.
```
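The two chained errors point to missing optional dependencies in the evaluation venv: converting the slow SentencePiece-based Llama tokenizer to a fast one needs `sentencepiece`, and the fallback error-handling path needs `protobuf`. A minimal preflight check is sketched below; the package/module names are taken from the traceback, and `missing_deps` is a hypothetical helper for illustration, not part of lm-evaluation-harness:

```python
import importlib.util

# Import names for the packages the traceback complains about.
# Note: the pip package "protobuf" installs the "google.protobuf" module.
DEPS = {"sentencepiece": "sentencepiece", "protobuf": "google.protobuf"}

def missing_deps(find_spec=importlib.util.find_spec):
    """Return pip package names whose modules cannot be imported."""
    return [pkg for pkg, module in DEPS.items() if find_spec(module) is None]

if __name__ == "__main__":
    missing = missing_deps()
    if missing:
        raise SystemExit("Missing dependencies, try: pip install " + " ".join(missing))
```

If both packages are reported missing, installing them in the venv (`pip install sentencepiece protobuf`) should let `AutoTokenizer.from_pretrained` build the fast tokenizer.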