
input truncated | n_ctx=2048 #24

Open
Qualzz opened this issue Jul 18, 2024 · 0 comments

Comments


Qualzz commented Jul 18, 2024

No matter the settings (chunk size, etc.) I'm using, during the entity extraction verb the Ollama logs look like this:
INFO [update_slots] input truncated | n_ctx=2048 n_erase=1090 n_keep=4 n_left=2044 n_shift=1022 tid="130623106379776" timestamp=1721345155

The issue is that, no matter what, Ollama defaults to 2048 for num_ctx, and this option cannot be set through the OpenAI-compatible API endpoint. That means if you want a larger context window, you NEED to create a new modelfile for your model.

You can do so by dumping your model's modelfile into a temp file:

ollama show MODELNAME --modelfile > settings.txt

Then add a new line PARAMETER num_ctx 8192 (or PARAMETER num_ctx 4096, as you wish) at the end of this file.
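For illustration, the tail of settings.txt might then look like this (the FROM line and base model name here are made up; yours will be whatever `ollama show` dumped):

```
FROM llama3
# ...parameters and template dumped by `ollama show MODELNAME --modelfile`...
PARAMETER num_ctx 8192
```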

Then you can create a new "model" from that settings file:

ollama create YOURNEWMODELNAME -f settings.txt

Now it should work without your inputs being truncated by Ollama.
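The three steps above can be sketched as one shell sequence. MODELNAME and MODELNAME-8k are placeholder names, and the `ollama` steps are guarded so the sketch is safe to paste even where the CLI is missing:

```shell
MODEL=MODELNAME          # placeholder: your existing model
NEW_MODEL=${MODEL}-8k    # placeholder: name for the larger-context variant

# 1) Dump the existing modelfile (requires the Ollama CLI):
if command -v ollama >/dev/null 2>&1; then
  ollama show "$MODEL" --modelfile > settings.txt
fi

# 2) Append the larger context window (the only change the file needs):
echo "PARAMETER num_ctx 8192" >> settings.txt

# 3) Register the new model from the edited settings file:
if command -v ollama >/dev/null 2>&1; then
  ollama create "$NEW_MODEL" -f settings.txt
fi
```

Afterwards, point your client at MODELNAME-8k instead of MODELNAME and the 2048-token default no longer applies.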
