Multi-agent application documentation #541
Conversation
5 files reviewed, 1 issue found.
The style guide flagged several spelling errors that seemed like false positives. We skipped posting inline suggestions for the following words:
- [Dd]ataclass
The style guide flagged several spelling errors that seemed like false positives. We skipped posting inline suggestions for the following words:
- system_prompt
When doing so, you'll generally want to pass [`ctx.usage`][pydantic_ai.RunContext.usage] to the [`usage`][pydantic_ai.Agent.run] keyword argument of the delegate agent (the agent called from within a tool) run, so that usage within that run counts towards the total usage of the parent agent run.

!!! note "Multiple models"
    Agent delegation doesn't need to use the same model for each agent. If you choose to use different models within a run, calculating the monetary cost from the final [`result.usage()`][pydantic_ai.result.RunResult.usage] of the run will not be possible, but you can still use [`UsageLimits`][pydantic_ai.usage.UsageLimits] to avoid unexpected costs.
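Roughly, that pattern looks like this (a condensed sketch along the lines of the docs' joke-generation example; the model names and prompts are illustrative):

```python
from pydantic_ai import Agent, RunContext
from pydantic_ai.usage import UsageLimits

joke_selection_agent = Agent(
    'openai:gpt-4o',
    system_prompt='Use the `joke_factory` tool to generate jokes, then pick the best one.',
)
joke_generation_agent = Agent('google-gla:gemini-1.5-flash', result_type=list[str])

@joke_selection_agent.tool
async def joke_factory(ctx: RunContext[None], count: int) -> list[str]:
    r = await joke_generation_agent.run(
        f'Please generate {count} jokes.',
        usage=ctx.usage,  # delegate usage counts towards the parent run's total
    )
    return r.data

result = joke_selection_agent.run_sync(
    'Tell me a joke.',
    usage_limits=UsageLimits(request_limit=5),  # guards the whole run
)
```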
Makes me feel like we should have a way to tally usage on a per-model basis. Of course that's well outside the scope of this PR.
I feel like we should also have a dedicated docs section talking about various ways to manage usage with multiple models.
Docs, definitely; possibly also change how usage is calculated.
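One possible shape for per-model tallying (entirely hypothetical, not an existing API): record each delegate run's usage keyed by model name.

```python
from collections import defaultdict

from pydantic_ai.usage import Usage

# Hypothetical helper: tally token usage per model across a multi-model run.
tokens_by_model: dict[str, int] = defaultdict(int)

def record_usage(model_name: str, usage: Usage) -> None:
    tokens_by_model[model_name] += usage.total_tokens or 0

# e.g. inside a tool, after running a delegate agent:
#   r = await joke_generation_agent.run(..., usage=ctx.usage)
#   record_usage('google-gla:gemini-1.5-flash', r.usage())
```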
In these patterns, we assume that each subsequent agent will receive just the right amount of information to complete its task. Can we have a way of passing all of the context so far and letting the agent use whatever it wants from it?

This would also be useful when an agent returns its final response to the main/supervisor/delegator agent, so the main agent can know what went down.
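A rough sketch of what forwarding the full context could look like, using `ctx.messages` (which a later comment in this thread points out already exists); the agent names and prompts here are made up:

```python
from pydantic_ai import Agent, RunContext

supervisor_agent = Agent('openai:gpt-4o', system_prompt='Delegate research tasks as needed.')
research_agent = Agent('openai:gpt-4o', result_type=str)

@supervisor_agent.tool
async def research(ctx: RunContext[None], question: str) -> str:
    # Forward the whole conversation so far, so the delegate can use
    # whatever context it needs rather than just the `question` string.
    r = await research_agent.run(
        question,
        message_history=ctx.messages,
        usage=ctx.usage,
    )
    return r.data
```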
Love it, thank you for this!
Exciting stuff! Left some minor nitpicks on the docs, plus a request to split this into a docs PR and a usage structures refactor PR 👍
docs/multi-agent-applications.md (Outdated)

```
#> Seat preference: row=1 seat='A'
```

1. Define the first agent, which finds a flight. We use an explicit type annotation until PEP 747 lands, see [structured results](results.md#structured-result-validation). We use a union as the result type so the model can communicate that it's unable to find a satisfactory choice; internally, each member of the union will be registered as a separate tool.
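For reference, a minimal sketch of that annotation pattern (the `FlightDetails`/`Failed` model names and prompt are illustrative):

```python
from typing import Union

from pydantic import BaseModel
from pydantic_ai import Agent

class FlightDetails(BaseModel):
    flight_number: str

class Failed(BaseModel):
    """Returned when no satisfactory flight could be found."""

# Explicit annotation needed until PEP 747 lands; each union member
# is registered as a separate result tool.
flight_search_agent: Agent[None, Union[FlightDetails, Failed]] = Agent(
    'openai:gpt-4o',
    result_type=Union[FlightDetails, Failed],  # type: ignore
    system_prompt='Find a flight for the user.',
)
```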
PEP link maybe?
docs/multi-agent-applications.md (Outdated)

```
return FlightDetails(flight_number='AK456')
...
usage_limit = UsageLimits(request_limit=15)  # (3)!
```
```diff
- usage_limit = UsageLimits(request_limit=15)  # (3)!
+ usage_limits = UsageLimits(request_limit=15)  # (3)!
```

And same below.
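For context, that variable is then passed into the run call; a minimal sketch (the prompt string here is invented):

```python
result = await flight_search_agent.run(
    'Find me a flight from SFO to ANC',
    usage_limits=usage_limits,  # caps model requests across the whole run
)
print(result.data)
```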
Ah, my bad, looks like we already have those in `ctx.messages`. Love it. This wasn't in the examples, though; it might be worth adding.
The style guide flagged several spelling errors that seemed like false positives. We skipped posting inline suggestions for the following words:
- [Dd]eps
- [Ii]nterdependencies
Thank you for writing these guides. I don't want to be overly critical, but I'm not convinced the "Programmatic agent hand-off" example is all that helpful, because it's not really a hand-off: it's basically two separate agents run in sequence with one variable being passed over. I know this is meant as a minimal example, but it doesn't provide a programming pattern that's useful for most scenarios. A minimal example I'd like to see would be similar to the Swarm triage flow.

I'm happy to participate on this; I'm just not sure I have an elegant solution for this myself. A proper handoff should:

Here is a simple flow based on the Swarm Triage example:
OK, I built something hacky and ugly that I want to share.

```python
from typing import Literal

from pydantic import BaseModel

class AgentHandoff(BaseModel):
    target_agent: Literal["Triage Agent", "Refunds Agent"]
```

We use a pydantic model as a return type to signal a handoff.

Problem 1: Functions/tools aren't helpful at the moment, because the response is fed back to the LLM without the possibility to intervene.

```python
triage_agent = Agent(
    "openai:gpt-4o",
    name="Triage Agent",
    system_prompt="Determine which agent is best suited to handle the user's request, and transfer the conversation to that agent.",
    result_type=str | AgentHandoff,
)

refunds_agent = Agent(
    "openai:gpt-4o",
    name="Refunds Agent",
    system_prompt="Help the user with a refund. If the reason is that it was too expensive, offer the user a refund code. If they insist, then process the refund.",
    result_type=str | AgentHandoff,
)

@refunds_agent.tool_plain
def process_refund() -> str:
    return "Purchase refunded"

@refunds_agent.tool_plain
def apply_discount() -> str:
    return "11% Discount applied"
```

We define the agents similar to the Swarm example, with the result type being either a normal message or a handoff request.

```python
agents = {
    "Triage Agent": triage_agent,
    "Refunds Agent": refunds_agent,
}

agent = triage_agent
messages = []

while True:
    message = input("Please enter your message to the agent")
    if message.lower() == 'q':
        break
    res = await agent.run(message, message_history=messages)
    messages = res.all_messages()
    if isinstance(res.data, AgentHandoff):
        agent = agents[res.data.target_agent]
        # Problem 2: manually swap the old system prompt for the new
        # agent's one (and reach into a private attribute to do so).
        messages[0].parts[0].content = agent._system_prompts[0]
        res = await agent.run(message, message_history=messages)
        messages = res.all_messages()
```

OK, so basically, as long as we get normal text responses from the agent, we keep the conversation going. In case of an AgentHandoff result, we set the current agent to the target agent.

Problem 2: The system prompt in the message list is still the old one, so we have to swap it out manually.

Problem 3: The "final result" signature of the return type doesn't semantically fit what we're doing, so we might want to add a custom model for this purpose.

Problem 4: After the handoff tool call, we would want to run the model again without a new user message. This is how Swarm does it: the tool call signals the LLM that the handoff has occurred, so it continues the conversation based on the new persona. Since …

Overall, I'm very unhappy with this approach. I still have my hopes up that some smarter people come up with better ideas. For my use case (which requires multi-agent logic to the core), this isn't cutting it, and I don't see a simple way of getting there.
```python
triage_agent = Agent(
    "openai:gpt-4o",
    name="Triage Agent",
    system_prompt="Determine which agent is best suited to handle the user's request, and transfer the conversation to that agent.",
)

refunds_agent = Agent(
    "openai:gpt-4o",
    name="Refunds Agent",
    system_prompt="Help the user with a refund. If the reason is that it was too expensive, offer the user a refund code. If they insist, then process the refund.",
)

@triage_agent.handoff
def handoff_to_refunds():
    return refunds_agent

@refunds_agent.handoff
def handoff_to_triage():
    return triage_agent

@refunds_agent.tool_plain
def process_refund() -> str:
    return "Purchase refunded"

@refunds_agent.tool_plain
def apply_discount() -> str:
    return "11% Discount applied"
```

What about this? The handoff decorator could be treated as a special tool, which switches out the system prompt(s) and runs another chat completion afterwards. It's equivalent to what Swarm does.
Is this available?
No, sorry, I didn't make this clear. It's just an API idea I had that seems quite simple and clean.
fix #120
fix #273
fix #300
Here I've added an example of agent delegation as requested by @Luca-Blight in #120.
There are roughly four levels of complexity when building applications with PydanticAI: