Adding Caching to OpenAI chat #603
Conversation
Thanks @adamgordonbell ! I am just about to merge some extensive client/server updates, then I'll come back and try to loop this in.
One quick question on the PR. Adding poetry support sounds nice, but do you know if there is a way to do that without creating two copies of our dependency lists? (we can also just leave that out for now if it slows things down too much)
It's outside my area of knowledge, but I think so. Though I do think the cpp files probably complicate that. I can remove that from this PR; it just helped with my development.
Awesome!
Merged this with the latest updates. Thanks again!
Thank you for adding this @adamgordonbell and @slundberg. Caching was about 50% of the reason I used Guidance. It's easy to implement myself, but adding it to each project becomes tedious.
Why was this merged?
Hi @maximegmd, I was running into similar challenges with caching, and dropped it as part of a simplification we're making to the codebase (#820). I think this PR was a strong initial concept (thank you @adamgordonbell!!), but as we've started changing the structure of guidance in the last few months, I'd like to pursue a more universal solution (beyond just OpenAI) that also addresses your concerns. If you want to use a no-cache version of the codebase today, just install from source.
Yeah, this caching implementation had some weaknesses. Excited to see what the next iteration is.
It was disabled for any call where the temperature was above 0. I thought that made sense: at temperature 0 you are asking for a deterministic result, which both temperature 0 and caching will give you.
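For readers following the thread, here's a minimal sketch of the gating logic being described. The names (`_cache`, `_cache_key`, `cached_chat`) are hypothetical, not guidance's actual API; the point is only that lookups and writes happen solely at temperature 0:

```python
import hashlib
import json

# Hypothetical in-memory cache; the PR itself used guidance's disk cache.
_cache: dict[str, str] = {}

def _cache_key(prompt: str, **params) -> str:
    # Key on the prompt plus all sampling parameters.
    payload = json.dumps({"prompt": prompt, **params}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def cached_chat(call_api, prompt: str, temperature: float = 0.0, **params) -> str:
    key = _cache_key(prompt, temperature=temperature, **params)
    # Only consult the cache when the call is expected to be deterministic.
    if temperature == 0 and key in _cache:
        return _cache[key]
    result = call_api(prompt, temperature=temperature, **params)
    if temperature == 0:
        _cache[key] = result
    return result
```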
Except it didn't distinguish between models, so with temperature 0 you got the cache entry of whichever model was used first with a given prompt. We were doing model comparisons, so we scratched our heads a little trying to understand why all the OpenAI models had the same score haha
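The fix for the collision described here is just to fold a model identifier into the key. Continuing the hypothetical sketch above:

```python
import hashlib
import json

def _cache_key(model: str, prompt: str, **params) -> str:
    # Including the model name prevents two different models from sharing
    # a cache entry for the same prompt at temperature 0.
    payload = json.dumps({"model": model, "prompt": prompt, **params}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()
```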
Caching was no longer happening in the newest version. Reintroducing it here for OpenAI chat.
Not sure if this is the ideal place to insert caching, or if model.call could somehow cache for all models; see the sketch below.
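One way to read the "cache for all" idea is a decorator applied at a single shared call site on the base model class, so every backend inherits caching rather than each reimplementing it. A hypothetical sketch, not guidance's actual structure:

```python
import functools
import hashlib
import json

def cache_completions(func):
    """Hypothetical decorator: memoize deterministic (temperature == 0) calls."""
    store: dict[str, str] = {}

    @functools.wraps(func)
    def wrapper(self, prompt: str, **params) -> str:
        # A real key would use the backend's actual model name;
        # the class name stands in for it here.
        key = hashlib.sha256(
            json.dumps({"model": type(self).__name__, "prompt": prompt, **params},
                       sort_keys=True).encode()
        ).hexdigest()
        if params.get("temperature", 0) == 0 and key in store:
            return store[key]
        out = func(self, prompt, **params)
        if params.get("temperature", 0) == 0:
            store[key] = out
        return out
    return wrapper

class Model:
    @cache_completions
    def __call__(self, prompt: str, **params) -> str:
        # Shared entry point: caching applies to every backend.
        return self._generate(prompt, **params)

    def _generate(self, prompt: str, **params) -> str:
        raise NotImplementedError  # each backend (OpenAI chat, etc.) overrides this
```

Caching at the shared `__call__` site would sidestep the per-backend placement question raised in this PR, at the cost of the base class needing to know which sampling parameters affect determinism.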