Multi-agent application documentation #541
Conversation
5 files reviewed, 1 issue found.
The style guide flagged several spelling errors that seemed like false positives. We skipped posting inline suggestions for the following words:
- [Dd]ataclass
The style guide flagged several spelling errors that seemed like false positives. We skipped posting inline suggestions for the following words:
- system_prompt
When doing so, you'll generally want to pass [`ctx.usage`][pydantic_ai.RunContext.usage] to the [`usage`][pydantic_ai.Agent.run] keyword argument of the delegate agent (the agent called from within a tool) run, so that usage within that run counts towards the total usage of the parent agent run.

!!! note "Multiple models"
    Agent delegation doesn't need to use the same model for each agent. If you choose to use different models within a run, calculating the monetary cost from the final [`result.usage()`][pydantic_ai.result.RunResult.usage] of the run will not be possible, but you can still use [`UsageLimits`][pydantic_ai.usage.UsageLimits] to avoid unexpected costs.
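Roughly, that pattern looks like this (a condensed sketch along the lines of the docs' joke-generation example; the model names and prompts are illustrative):

```python
from pydantic_ai import Agent, RunContext
from pydantic_ai.usage import UsageLimits

joke_selection_agent = Agent(
    'openai:gpt-4o',
    system_prompt='Use the `joke_factory` tool to generate jokes, then pick the best one.',
)
joke_generation_agent = Agent('google-gla:gemini-1.5-flash', result_type=list[str])

@joke_selection_agent.tool
async def joke_factory(ctx: RunContext[None], count: int) -> list[str]:
    r = await joke_generation_agent.run(
        f'Please generate {count} jokes.',
        usage=ctx.usage,  # delegate usage counts towards the parent run's total
    )
    return r.data

result = joke_selection_agent.run_sync(
    'Tell me a joke.',
    usage_limits=UsageLimits(request_limit=5),  # guards the whole run
)
```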
Makes me feel like we should have a way to tally usage on a per-model basis. Of course that's well outside the scope of this PR.
I feel like we should also have a dedicated docs section talking about various ways to manage usage with multiple models.
Docs, definitely; possibly also change how usage is calculated.
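One possible shape for per-model tallying (entirely hypothetical, not an existing API): record each delegate run's usage keyed by model name.

```python
from collections import defaultdict

from pydantic_ai.usage import Usage

# Hypothetical helper: tally token usage per model across a multi-model run.
tokens_by_model: dict[str, int] = defaultdict(int)

def record_usage(model_name: str, usage: Usage) -> None:
    tokens_by_model[model_name] += usage.total_tokens or 0

# e.g. inside a tool, after running a delegate agent:
#   r = await joke_generation_agent.run(..., usage=ctx.usage)
#   record_usage('google-gla:gemini-1.5-flash', r.usage())
```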
In these patterns, we assume that each subsequent agent will receive just the right amount of information to complete its task. Can we have a way of passing all of the context so far and letting the agent use whatever it wants from it?

This would also be useful when an agent returns its final response to the main/supervisor/delegator agent, so the main agent can know what went down.
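A rough sketch of what forwarding the full context could look like, using `ctx.messages` (which a later comment in this thread points out already exists); the agent names and prompts here are made up:

```python
from pydantic_ai import Agent, RunContext

supervisor_agent = Agent('openai:gpt-4o', system_prompt='Delegate research tasks as needed.')
research_agent = Agent('openai:gpt-4o', result_type=str)

@supervisor_agent.tool
async def research(ctx: RunContext[None], question: str) -> str:
    # Forward the whole conversation so far, so the delegate can use
    # whatever context it needs rather than just the `question` string.
    r = await research_agent.run(
        question,
        message_history=ctx.messages,
        usage=ctx.usage,
    )
    return r.data
```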
Love it, thank you for this!
Exciting stuff! Left some minor nitpicks on the docs, plus a request to split this into a docs PR and a usage structures refactor PR 👍
docs/multi-agent-applications.md (Outdated)

```
#> Seat preference: row=1 seat='A'
```

1. Define the first agent, which finds a flight. We use an explicit type annotation until PEP 747 lands, see [structured results](results.md#structured-result-validation). We use a union as the result type so the model can communicate that it's unable to find a satisfactory choice; internally, each member of the union will be registered as a separate tool.
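For reference, a minimal sketch of that annotation pattern (the `FlightDetails`/`Failed` model names and prompt are illustrative):

```python
from typing import Union

from pydantic import BaseModel
from pydantic_ai import Agent

class FlightDetails(BaseModel):
    flight_number: str

class Failed(BaseModel):
    """Returned when no satisfactory flight could be found."""

# Explicit annotation needed until PEP 747 lands; each union member
# is registered as a separate result tool.
flight_search_agent: Agent[None, Union[FlightDetails, Failed]] = Agent(
    'openai:gpt-4o',
    result_type=Union[FlightDetails, Failed],  # type: ignore
    system_prompt='Find a flight for the user.',
)
```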
PEP link maybe?
docs/multi-agent-applications.md (Outdated)

```
return FlightDetails(flight_number='AK456')
...
usage_limit = UsageLimits(request_limit=15)  # (3)!
```
```diff
- usage_limit = UsageLimits(request_limit=15)  # (3)!
+ usage_limits = UsageLimits(request_limit=15)  # (3)!
```

And same below.
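For context, that variable is then passed into the run call; a minimal sketch (the prompt string here is invented):

```python
result = await flight_search_agent.run(
    'Find me a flight from SFO to ANC',
    usage_limits=usage_limits,  # caps model requests across the whole run
)
print(result.data)
```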
Ah, my bad, looks like we already have those in `ctx.messages`. Love it. This wasn't in the examples, though; it might be worth adding.
The style guide flagged several spelling errors that seemed like false positives. We skipped posting inline suggestions for the following words:
- [Dd]eps
- [Ii]nterdependencies
Thank you for writing these guides. I don't want to be overly critical, but I'm not convinced the "Programmatic agent hand-off" example is all that helpful, because it's not really a hand-off: it's basically two separate agents run in sequence with one variable being passed over. I know this is meant as a minimal example, but it doesn't provide a programming pattern that's useful for most scenarios. A minimal example I'd like to see would be similar to the Swarm triage flow.

I'm happy to participate on this; I'm just not sure I have an elegant solution for this myself. A proper handoff should:

Here is a simple flow based on the Swarm Triage example:
OK, I built something hacky and ugly that I want to share.

```python
from typing import Literal

from pydantic import BaseModel

class AgentHandoff(BaseModel):
    target_agent: Literal["Triage Agent", "Refunds Agent"]
```

We use a pydantic model as a return type to signal a handoff.

Problem 1: Functions/tools aren't helpful at the moment, because the response is fed back to the LLM without the possibility to intervene.

```python
triage_agent = Agent(
    "openai:gpt-4o",
    name="Triage Agent",
    system_prompt="Determine which agent is best suited to handle the user's request, and transfer the conversation to that agent.",
    result_type=str | AgentHandoff,
)

refunds_agent = Agent(
    "openai:gpt-4o",
    name="Refunds Agent",
    system_prompt="Help the user with a refund. If the reason is that it was too expensive, offer the user a refund code. If they insist, then process the refund.",
    result_type=str | AgentHandoff,
)

@refunds_agent.tool_plain
def process_refund() -> str:
    return "Purchase refunded"

@refunds_agent.tool_plain
def apply_discount() -> str:
    return "11% Discount applied"
```

We define the agents similar to the Swarm example, with the result type being either a normal message or a handoff request.

```python
agents = {
    "Triage Agent": triage_agent,
    "Refunds Agent": refunds_agent,
}

agent = triage_agent
messages = []

while True:
    message = input("Please enter your message to the agent")
    if message.lower() == 'q':
        break
    res = await agent.run(message, message_history=messages)
    messages = res.all_messages()
    if isinstance(res.data, AgentHandoff):
        agent = agents[res.data.target_agent]
        # Problem 2: manually swap the old system prompt for the new
        # agent's one (and reach into a private attribute to do so).
        messages[0].parts[0].content = agent._system_prompts[0]
        res = await agent.run(message, message_history=messages)
        messages = res.all_messages()
```

OK, so basically, as long as we get normal text responses from the agent, we keep the conversation going. In case of an AgentHandoff result, we set the current agent to the target agent.

Problem 2: The system prompt in the message list is still the old one, so we have to swap it out manually.

Problem 3: The "final result" signature of the return type doesn't semantically fit what we're doing, so we might want to add a custom model for this purpose.

Problem 4: After the handoff tool call, we would want to run the model again without a new user message. This is how Swarm does it: the tool call signals the LLM that the handoff has occurred, so it continues the conversation based on the new persona. Since …

Overall, I'm very unhappy with this approach. I still have my hopes up that some smarter people come up with better ideas. For my use case (which requires multi-agent logic to the core), this isn't cutting it, and I don't see a simple way of getting there.
```python
triage_agent = Agent(
    "openai:gpt-4o",
    name="Triage Agent",
    system_prompt="Determine which agent is best suited to handle the user's request, and transfer the conversation to that agent.",
)

refunds_agent = Agent(
    "openai:gpt-4o",
    name="Refunds Agent",
    system_prompt="Help the user with a refund. If the reason is that it was too expensive, offer the user a refund code. If they insist, then process the refund.",
)

@triage_agent.handoff
def handoff_to_refunds():
    return refunds_agent

@refunds_agent.handoff
def handoff_to_triage():
    return triage_agent

@refunds_agent.tool_plain
def process_refund() -> str:
    return "Purchase refunded"

@refunds_agent.tool_plain
def apply_discount() -> str:
    return "11% Discount applied"
```

What about this? The handoff decorator could be treated as a special tool, which switches out the system prompt(s) and runs another chat completion afterwards. It's equivalent to what Swarm does.
Is this available?
No, sorry, I didn't make this clear. It's just an API idea I had that seems quite simple and clean.
fix #120
fix #273
fix #300
Here I've added an example of agent delegation as requested by @Luca-Blight in #120.
There are roughly four levels of complexity when building applications with PydanticAI: