Managing Large Tool Call Outputs in LangGraph #2605
developer-hassan asked this question in Q&A
Replies: 1 comment

Any updates?
Context:
I am currently working on an application in Python that uses LangGraph with persistence, where the LLM decides whether to invoke a tool call based on the context of the user's input. If there is no tool call, I stream the output using FastAPI's StreamingResponse. However, when a tool call is made, the result is a large JSON payload (approximately 3000-4000 lines per call), since it comes from third-party APIs such as the latest news articles, large shopping results, or comprehensive food store rankings.

Throughout a single chat session, a user might trigger multiple tool calls (up to 10 or more). If each tool call output were appended to the session memory (the "messages" payload), the memory size would quickly become unmanageable, significantly impacting performance and scalability.
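For reference, here is a rough, simplified sketch of this setup (not my exact code): a LangGraph agent compiled with a checkpointer, whose token stream is forwarded through FastAPI's StreamingResponse when the model answers directly. The tool, the model choice, and the endpoint name are placeholders, not part of the real application.

```python
from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from langchain_core.messages import HumanMessage
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import START, MessagesState, StateGraph
from langgraph.prebuilt import ToolNode, tools_condition


@tool
def fetch_news(topic: str) -> str:
    """Placeholder third-party call that returns a very large JSON payload."""
    return '{"articles": ["...thousands of lines of JSON..."]}'


llm = ChatOpenAI(model="gpt-4o-mini").bind_tools([fetch_news])


def agent(state: MessagesState):
    # The LLM decides here whether to answer directly or emit a tool call.
    return {"messages": [llm.invoke(state["messages"])]}


builder = StateGraph(MessagesState)
builder.add_node("agent", agent)
builder.add_node("tools", ToolNode([fetch_news]))
builder.add_edge(START, "agent")
builder.add_conditional_edges("agent", tools_condition)  # route to tools or end
builder.add_edge("tools", "agent")
graph = builder.compile(checkpointer=MemorySaver())  # per-thread persistence

app = FastAPI()


@app.post("/chat")
async def chat(user_input: str, thread_id: str):
    config = {"configurable": {"thread_id": thread_id}}

    async def token_stream():
        # stream_mode="messages" yields (chunk, metadata) pairs token by token,
        # so a direct (no-tool-call) answer can be streamed as it is generated.
        async for chunk, meta in graph.astream(
            {"messages": [HumanMessage(content=user_input)]},
            config,
            stream_mode="messages",
        ):
            if chunk.content and meta.get("langgraph_node") == "agent":
                yield chunk.content

    return StreamingResponse(token_stream(), media_type="text/plain")
```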
Current Approach:
At the moment, I am not appending tool call outputs to the memory messages. This prevents the memory from ballooning to an unmanageable size, but it makes it hard to maintain context or history related to these tool calls: the LLM is not aware of what the tools have already returned for my queries.
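To make this concrete, here is a rough sketch (again simplified, not my actual code) of one way the "skip the huge payload" approach can be wired up in LangGraph: a custom tool node that executes the tools the model requested but persists only a short stub of each result in the checkpointed "messages" state, while the full JSON would be delivered to the client out-of-band. The tool registry and the stub format are assumptions, and it reuses the placeholder fetch_news tool from the sketch above.

```python
import json

from langchain_core.messages import AIMessage, ToolMessage
from langgraph.graph import MessagesState

TOOLS = {"fetch_news": fetch_news}  # assumed registry of the app's tools


def slim_tool_node(state: MessagesState):
    """Execute requested tools, but persist only a small stub of each result."""
    last: AIMessage = state["messages"][-1]
    tool_messages = []
    for call in last.tool_calls:
        full_result = TOOLS[call["name"]].invoke(call["args"])  # huge JSON
        # `full_result` would be sent to the client out-of-band (e.g. in the
        # HTTP response); only this tiny placeholder enters the checkpoint,
        # which is exactly why the LLM later has no memory of the payload.
        stub = {
            "tool": call["name"],
            "note": "large payload omitted from chat history",
            "size_chars": len(str(full_result)),
        }
        tool_messages.append(
            ToolMessage(content=json.dumps(stub), tool_call_id=call["id"])
        )
    return {"messages": tool_messages}


# Registered in place of the prebuilt ToolNode:
# builder.add_node("tools", slim_tool_node)
```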
Questions for the Community:
Additional Considerations:
I would appreciate insights, recommendations, or examples from the community to help address this challenge. Thank you!