Replies: 4 comments
-
Also, it would be helpful to understand more about …
-
My understanding of the current framework for how we are handling requests: the above-mentioned proposal might not directly work, as the memory functions have a direct dependency on the inner monologue, i.e. the response received from the LLM. What we can do to optimize it is split the update functions and the retrieval functions into different agents. When the agent has to retrieve some information, it needs to send … Here, we can send the new message using …

@cpacker I would love to understand your perspective. I would also like to understand the goal of doing this. What we are saying is that the memory functions are blocked by the first request, since they depend on it. I do not have a holistic understanding of the project and would like to know your thoughts!
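To make the split concrete, here is a minimal sketch of the idea, assuming a queue-based handoff between the two threads. All names here (`dialogue_loop`, `memory_loop`, `start`) are hypothetical, not MemGPT's actual API: retrieval stays on the critical path because the reply depends on it, while updates, which depend on the inner monologue, are queued to a second thread.

```python
import queue
import threading

memory_update_queue = queue.Queue()

def dialogue_loop(llm_call, retrieve_memory):
    """Dialogue thread: retrieval stays synchronous because the reply needs it."""
    while True:
        user_message = input("> ")
        # Retrieval must happen before the LLM call, so it blocks the reply.
        context = retrieve_memory(user_message)
        reply = llm_call(user_message, context)
        print(reply)
        # Memory updates depend on the inner monologue (the LLM response),
        # but the user does not need to wait for them: hand them off.
        memory_update_queue.put((user_message, reply))

def memory_loop(update_memory):
    """Memory thread: applies updates off the critical path."""
    while True:
        user_message, reply = memory_update_queue.get()
        update_memory(user_message, reply)
        memory_update_queue.task_done()

def start(llm_call, retrieve_memory, update_memory):
    # Run memory updates on a background daemon thread.
    threading.Thread(target=memory_loop, args=(update_memory,), daemon=True).start()
    dialogue_loop(llm_call, retrieve_memory)
```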
-
One advantage I see to this is that eventually we might want the memory management agent to (intelligently, with some provided contextual information) modify the retrieved information, such as filtering it down to the most relevant parts or summarizing it when it is too long. It would also make it easier to use third-party tools for those operations.
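A rough sketch of what that post-processing step might look like; this is purely illustrative, where `cheap_llm` is a stand-in for any completion function and the prompts are placeholders, not anything in MemGPT:

```python
def postprocess_retrieval(results, query, cheap_llm, max_chars=2000):
    """Filter retrieved passages for relevance, then summarize if still too long."""
    relevant = [
        r for r in results
        if cheap_llm(f"Does this passage help answer '{query}'? yes/no:\n{r}")
           .strip().lower().startswith("yes")
    ]
    combined = "\n".join(relevant)
    if len(combined) > max_chars:
        # Summarize rather than truncate, so key facts survive.
        combined = cheap_llm(f"Summarize the following for context injection:\n{combined}")
    return combined
```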
-
Also, you can probably use a cheaper model for the memory management thread.
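As a hypothetical illustration (model names and keys are placeholders, not MemGPT's actual configuration):

```python
# Per-agent model choice: the dialogue thread is user-facing and latency- and
# quality-sensitive, while memory housekeeping can tolerate a cheaper model.
AGENT_MODELS = {
    "dialogue": "gpt-4",
    "memory": "gpt-3.5-turbo",
}
```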
-
Add implicit version of MemGPT with two separate threads (dialogue thread + memory management thread)