The Feature
Some models price tokens differently based on the length of the prompt. It would be helpful to restructure the model price dictionary, or add fields to it, to account for this.
This could look something like:
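A minimal sketch, using the `input_cost_per_token_up_to_128k` / `input_cost_per_token_above_128k` field names floated in the comments below; the model name and per-token rates are illustrative (loosely derived from Google's posted Gemini 1.5 Pro prices), not authoritative values:

```json
"gemini-1.5-pro": {
    "input_cost_per_token_up_to_128k": 0.0000035,
    "input_cost_per_token_above_128k": 0.0000070,
    "output_cost_per_token_up_to_128k": 0.0000105,
    "output_cost_per_token_above_128k": 0.0000210,
    "litellm_provider": "gemini",
    "mode": "chat"
}
```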
Motivation, pitch
Google models price tokens differently for prompts >128k tokens. According to https://ai.google.dev/pricing:

(Screenshot of the Gemini pricing table: separate input and output rates for prompts up to 128k tokens and for prompts above 128k tokens.)
This was brought up in AgentOps-AI/tokencost#53, which relies on the LiteLLM cost tracker.
Twitter / LinkedIn details
https://www.twitter.com/alexreibman
Comments

We can be more specific here @areibman - since there's no standard yet for what 'long' and 'short' mean.

This seems similar to how tgai pricing works based on token params - what if we do `input_cost_per_token_up_to_128k` and `input_cost_per_token_above_128k`?

That would probably work! The only precaution I can think of is if some providers start offering multi-tier pricing per model, i.e. <128k, 128k-256k, 256k-512k, etc.

This should be as easy as updating the proxy's JSON, no?
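If providers do move to more than two tiers, an ordered tier list generalizes the two flat fields. Below is a hypothetical sketch (names, bounds, and rates are all illustrative, not LiteLLM's actual implementation); note that Google prices the whole prompt at the rate of the tier its total length falls into, rather than charging each token at its marginal tier:

```python
from typing import List, Tuple

# Hypothetical tier table: (upper bound in prompt tokens, cost per token).
# The last tier is unbounded. Rates here are placeholders, not real prices.
PRICE_TIERS: List[Tuple[float, float]] = [
    (128_000, 0.0000035),       # prompts up to 128k tokens
    (256_000, 0.0000070),       # 128k-256k
    (float("inf"), 0.0000140),  # above 256k
]

def input_cost(prompt_tokens: int, tiers: List[Tuple[float, float]]) -> float:
    """Price the entire prompt at the rate of the tier it falls into."""
    for upper_bound, cost_per_token in tiers:
        if prompt_tokens <= upper_bound:
            return prompt_tokens * cost_per_token
    raise ValueError("tier list must end with an unbounded tier")

print(input_cost(100_000, PRICE_TIERS))  # 0.35 -- priced at the <=128k rate
print(input_cost(200_000, PRICE_TIERS))  # 1.40 -- whole prompt at the 128k-256k rate
```

A list keeps the JSON schema stable as providers add breakpoints, at the cost of a slightly more involved lookup than two flat `up_to`/`above` fields.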