
input truncated | n_ctx=2048 #24

Open
Qualzz opened this issue Jul 18, 2024 · 0 comments

Comments


Qualzz commented Jul 18, 2024

No matter the settings (chunk size, etc.) I'm using, during the entity extraction verb the Ollama logs look like this:
INFO [update_slots] input truncated | n_ctx=2048 n_erase=1090 n_keep=4 n_left=2044 n_shift=1022 tid="130623106379776" timestamp=1721345155

The issue is that, no matter what, Ollama defaults to 2048 for num_ctx, and this option cannot be set through the OpenAI-compatible API endpoint. That means if you want a larger context window, you NEED to create a new modelfile for your model.

You can do so by dumping your model's modelfile into a temp file:

ollama show MODELNAME --modelfile > settings.txt

Then add a new line PARAMETER num_ctx 8192 (or PARAMETER num_ctx 4096, as you wish) at the end of this file.
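For illustration, the tail of settings.txt might then look like this (the FROM line and base model name here are made up; yours will be whatever `ollama show` dumped):

```
FROM llama3
# ...parameters and template dumped by `ollama show MODELNAME --modelfile`...
PARAMETER num_ctx 8192
```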

Then you can create a new "model" from that settings file:

ollama create YOURNEWMODELNAME -f settings.txt

Now it should work without your inputs being truncated by Ollama.
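The three steps above can be sketched as one shell sequence. MODELNAME and MODELNAME-8k are placeholder names, and the `ollama` steps are guarded so the sketch is safe to paste even where the CLI is missing:

```shell
MODEL=MODELNAME          # placeholder: your existing model
NEW_MODEL=${MODEL}-8k    # placeholder: name for the larger-context variant

# 1) Dump the existing modelfile (requires the Ollama CLI):
if command -v ollama >/dev/null 2>&1; then
  ollama show "$MODEL" --modelfile > settings.txt
fi

# 2) Append the larger context window (the only change the file needs):
echo "PARAMETER num_ctx 8192" >> settings.txt

# 3) Register the new model from the edited settings file:
if command -v ollama >/dev/null 2>&1; then
  ollama create "$NEW_MODEL" -f settings.txt
fi
```

Afterwards, point your client at MODELNAME-8k instead of MODELNAME and the 2048-token default no longer applies.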
