
Multi-agent application documentation #541

Merged
merged 9 commits into from
Dec 31, 2024
Conversation

@samuelcolvin (Member) commented Dec 24, 2024

fix #120
fix #273
fix #300

Here I've added an example of agent delegation as requested by @Luca-Blight in #120.

There are roughly four levels of complexity when building applications with PydanticAI:

  1. Single-agent workflows — what most of this documentation covers
  2. Agent delegation — agents using another agent via tools, documented in this PR
  3. Programmatic agent hand-off — one agent runs, then application code calls another agent, documented in this PR
  4. Graph-based control flow — for the most complex cases, a graph and state machine can be used to control the execution of multiple agents. Work to add graph support is ongoing in Graph Support #528 and pydantic-ai-graph - simplify public generics #539
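The control-flow difference between levels 2 and 3 can be sketched in plain Python. These stub functions are purely illustrative stand-ins for real agents (the actual PydanticAI API uses `Agent`, tools, and `run` calls, none of which appear here):

```python
# Illustrative stubs only: plain functions standing in for agents.

def topic_agent(prompt: str) -> str:
    # A "delegate" agent that picks a topic.
    return "penguins"


def joke_agent(prompt: str) -> str:
    # Level 2, agent delegation: this agent calls another agent from
    # inside its own run (in PydanticAI this would happen via a tool).
    topic = topic_agent(prompt)
    return f"A joke about {topic}"


def handoff_flow(prompt: str) -> str:
    # Level 3, programmatic hand-off: *application code* runs one
    # agent, then decides itself which agent to call next.
    topic = topic_agent(prompt)
    return joke_agent(topic)
```

In delegation the parent agent stays in control of the run; in a hand-off, control returns to your application code between agent runs.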

cloudflare-workers-and-pages bot commented Dec 24, 2024

Deploying pydantic-ai with Cloudflare Pages

Latest commit: 10927a3
Status: ✅  Deploy successful!
Preview URL: https://75c4afc1.pydantic-ai.pages.dev
Branch Preview URL: https://agent-delegation.pydantic-ai.pages.dev


@hyperlint-ai bot (Contributor) left a comment

5 files reviewed, 1 issue found.

The style guide flagged several spelling errors that seemed like false positives. We skipped posting inline suggestions for the following words:

  • [Dd]ataclass

Review thread (resolved): docs/multi-agent-applications.md
@hyperlint-ai bot (Contributor) left a comment

The style guide flagged several spelling errors that seemed like false positives. We skipped posting inline suggestions for the following words:

  • system_prompt

When doing so, you'll generally want to pass [`ctx.usage`][pydantic_ai.RunContext.usage] to the [`usage`][pydantic_ai.Agent.run] keyword argument of the delegate agent's run (the agent called from within a tool), so that usage within that run counts towards the total usage of the parent agent run.

!!! note "Multiple models"
    Agent delegation doesn't need to use the same model for each agent. If you choose to use different models within a run, calculating the monetary cost from the final [`result.usage()`][pydantic_ai.result.RunResult.usage] of the run will not be possible, but you can still use [`UsageLimits`][pydantic_ai.usage.UsageLimits] to avoid unexpected costs.
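The accounting described above can be sketched without any LLM calls: the parent run and the delegate run both record into one shared counter. The `Usage` class and token numbers below are invented for illustration and only model the idea of passing `ctx.usage` down to the delegate's run:

```python
from dataclasses import dataclass


@dataclass
class Usage:
    # Toy stand-in for a run's usage record.
    requests: int = 0
    total_tokens: int = 0

    def add(self, other: "Usage") -> None:
        self.requests += other.requests
        self.total_tokens += other.total_tokens


def delegate_run(usage: Usage) -> str:
    # The delegate records its consumption into the *shared* counter
    # passed down from the parent run (like `usage=ctx.usage`).
    usage.add(Usage(requests=1, total_tokens=50))
    return "delegate result"


def parent_run() -> Usage:
    usage = Usage()
    usage.add(Usage(requests=1, total_tokens=120))  # parent's own model call
    delegate_run(usage)  # same counter object, so totals accumulate
    return usage
```

Because both runs mutate the same object, the final counter reflects the whole tree of agent calls.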
Contributor:
Makes me feel like we should have a way to tally usage on a per-model basis. Of course that's well outside the scope of this PR.

Member:

I feel like we should also have a dedicated docs section talking about various ways to manage usage with multiple models.

Member (author):

Docs definitely; possibly change how usage is calculated.

@HamzaFarhan commented:
In these patterns, we assume that each subsequent agent will receive just the right amount of information to complete its task. Can we have a way of passing all of the context so far and letting the agent use whatever it wants from it? I'm guessing there are two possible approaches:

  1. Adding messages/key+values using dependency injection.
  2. Passing the messages throughout the "team" of agents and accumulating them, especially after #496 (Configuration and parameters for all_messages() and new_messages()).

This would also be useful when an agent returns its final response to the main/supervisor/delegator agent and then the main agent can know what went down.
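The second approach above (accumulating messages across a "team") can be sketched with plain Python. The agents here are hypothetical stubs, and messages are plain strings rather than PydanticAI message objects:

```python
# Each stub "agent" receives the full history so far and appends to it,
# so the next agent (and the supervisor) can see everything that went down.

def triage(messages: list[str]) -> list[str]:
    messages.append("triage: routing to refunds")
    return messages


def refunds(messages: list[str]) -> list[str]:
    messages.append("refunds: 11% discount applied")
    return messages


history: list[str] = ["user: my order was too expensive"]
for agent in (triage, refunds):
    history = agent(history)
```

Each agent sees the accumulated context and may use whatever it needs from it, rather than being handed a pre-trimmed slice.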

@jacobweiss2305 commented:

Love it, thank you for this!

@sydney-runkle (Member) left a comment:

Exciting stuff! Left some minor nitpicks on the docs, plus a request to split this into a docs PR and a usage structures refactor PR 👍

Review threads (resolved):

  • docs/agents.md
  • docs/examples/flight-booking.md (two threads)
  • docs/multi-agent-applications.md (two threads)
```
#> Seat preference: row=1 seat='A'
```

1. Define the first agent, which finds a flight. We use an explicit type annotation until PEP 747 lands, see [structured results](results.md#structured-result-validation). We use a union as the result type so the model can communicate that it's unable to find a satisfactory choice; internally, each member of the union will be registered as a separate tool.
Member:

PEP link maybe?
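The union-result pattern from the footnote above can be sketched with plain dataclasses. The class and function names here are hypothetical (the real docs use pydantic models and an `Agent` with a union `result_type`); the sketch only shows why a union lets the model signal failure as a distinct, typed outcome:

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class FlightDetails:
    # Successful outcome: a concrete flight.
    flight_number: str


@dataclass
class NoFlightFound:
    # Failure outcome: the model couldn't find a satisfactory choice.
    reason: str


FlightResult = Union[FlightDetails, NoFlightFound]


def find_flight(destination: str) -> FlightResult:
    # Stand-in for an agent run; each union member corresponds to a
    # separate "final result" the model can choose.
    if destination == "ANC":
        return FlightDetails(flight_number="AK456")
    return NoFlightFound(reason=f"no route to {destination}")
```

Callers then branch on the concrete type with `isinstance` instead of parsing an error string out of a single result shape.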

```python
    return FlightDetails(flight_number='AK456')


usage_limit = UsageLimits(request_limit=15)  # (3)!
```
Member:

Suggested change:

```diff
-usage_limit = UsageLimits(request_limit=15)  # (3)!
+usage_limits = UsageLimits(request_limit=15)  # (3)!
```

And same below

Review threads (resolved):

  • examples/pydantic_ai_examples/flight_booking.py
  • pydantic_ai_slim/pydantic_ai/usage.py
@HamzaFarhan commented Dec 29, 2024:

> In these patterns, we assume that each subsequent agent will receive just the right amount of information to complete its task. Can we have a way of passing all of the context so far and letting the agent use whatever it wants from it? I'm guessing there are two possible approaches:
>
>   1. Adding messages/key+values using dependency injection.
>   2. Passing the messages throughout the "team" of agents and accumulating them, especially after #496.

Ah, my bad: looks like we already have those in ctx.messages. Love it. This wasn't in the examples, though; it might be worth adding.

@hyperlint-ai bot (Contributor) left a comment

The style guide flagged several spelling errors that seemed like false positives. We skipped posting inline suggestions for the following words:

  • [Dd]eps
  • [Ii]nterdependencies

@samuelcolvin samuelcolvin merged commit 4537c07 into main Dec 31, 2024
16 checks passed
@samuelcolvin samuelcolvin deleted the agent-delegation branch December 31, 2024 17:20
@pietz commented Jan 2, 2025

Thank you for writing these guides. I don't want to be overly critical, but I'm not convinced the "Programmatic agent hand-off" example is all that helpful because it's not really a hand-off. It's basically two separate agents that are run in a sequence with one variable being passed over. I know this is meant as a minimal example, but it doesn't provide a programming pattern that's useful for most scenarios.

A minimal example I'd like to see would be similar to Swarm's Triage Example.

I'm happy to participate in this. I'm just not sure I have an elegant solution for this myself. A proper handoff should:

  • Keep the full conversation of messages as context
  • Switch out the system prompt(s)
  • Have the new agent reply after the handoff was executed.

Here is a simple flow based on the Swarm Triage example:

[screenshot: flow diagram based on the Swarm triage example]

@pietz commented Jan 3, 2025

Ok, I built something hacky and ugly that I want to share.

```python
from typing import Literal

from pydantic import BaseModel


class AgentHandoff(BaseModel):
    target_agent: Literal["Triage Agent", "Refunds Agent"]
```

We use a pydantic model as a return type to signal a handoff.

Problem 1: Functions/tools aren't helpful at the moment because the response is fed back to the LLM without the possibility of intervening.

```python
triage_agent = Agent(
    "openai:gpt-4o",
    name="Triage Agent",
    system_prompt="Determine which agent is best suited to handle the user's request, and transfer the conversation to that agent.",
    result_type=str | AgentHandoff,
)

refunds_agent = Agent(
    "openai:gpt-4o",
    name="Refunds Agent",
    system_prompt="Help the user with a refund. If the reason is that it was too expensive, offer the user a refund code. If they insist, then process the refund.",
    result_type=str | AgentHandoff,
)


@refunds_agent.tool_plain
def process_refund() -> str:
    return "Purchase refunded"


@refunds_agent.tool_plain
def apply_discount() -> str:
    return "11% Discount applied"
```

We define the agents similar to the Swarm example with the result type being either a normal message or a handoff request.

```python
agents = {
    "Triage Agent": triage_agent,
    "Refunds Agent": refunds_agent,
}

agent = triage_agent  # start with the triage agent
messages = []

while True:
    message = input("Please enter your message to the agent: ")
    if message.lower() == 'q':
        break
    res = await agent.run(message, message_history=messages)
    messages = res.all_messages()
    if isinstance(res.data, AgentHandoff):
        agent = agents[res.data.target_agent]
        messages[0].parts[0].content = agent._system_prompts[0]
        res = await agent.run(message, message_history=messages)
        messages = res.all_messages()
```

Ok, so basically, as long as we get normal text responses from the agent, we keep the conversation going. In the case of an AgentHandoff result, we set the current agent to the target agent.

Problem 2: The system prompt in the message list is still the old one, so we have to swap it out manually.

Problem 3: The "final-result" signature of the return type doesn't semantically fit what we're doing, so we might want to add a custom model for this purpose.

Problem 4: After the handoff tool call, we would want to run the model again without a new user message. This is how Swarm does it: the tool call signals to the LLM that the handoff has occurred, so it continues the conversation based on the new persona. Since .run() doesn't allow an empty message, I just copied the old message, which now appears twice in the message list.

Overall I'm very unhappy with this approach. I still have my hopes up that some smarter people come up with better ideas. For my use case (which requires multi agent logic to the core), this isn't cutting it and I don't see a simple way of getting there.

@pietz commented Jan 3, 2025

```python
triage_agent = Agent(
    "openai:gpt-4o",
    name="Triage Agent",
    system_prompt="Determine which agent is best suited to handle the user's request, and transfer the conversation to that agent."
)

refunds_agent = Agent(
    "openai:gpt-4o",
    name="Refunds Agent",
    system_prompt="Help the user with a refund. If the reason is that it was too expensive, offer the user a refund code. If they insist, then process the refund."
)


@triage_agent.handoff
def handoff_to_refunds():
    return refunds_agent


@refunds_agent.handoff
def handoff_to_triage():
    return triage_agent


@refunds_agent.tool_plain
def process_refund() -> str:
    return "Purchase refunded"


@refunds_agent.tool_plain
def apply_discount() -> str:
    return "11% Discount applied"
```

What about this? The handoff decorator could be treated as a special tool, which switches out the system prompt(s) and runs another chat completion afterwards. It's equivalent to what Swarm does.
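The mechanics of such a `handoff` decorator could be sketched on a stub class. Everything here is hypothetical: `Agent`, `handoff`, and `run_turn` are toy stand-ins, not the PydanticAI API or the proposal itself, and the "model decided to hand off" signal is reduced to a boolean:

```python
class Agent:
    # Minimal stand-in for the proposed API; not part of PydanticAI.
    def __init__(self, name: str, system_prompt: str):
        self.name = name
        self.system_prompt = system_prompt
        self.handoffs: list = []

    def handoff(self, fn):
        # Register fn as a special "tool" whose return value is the
        # next agent to take over the conversation.
        self.handoffs.append(fn)
        return fn


triage = Agent("Triage Agent", "Route the user's request.")
refunds = Agent("Refunds Agent", "Help the user with a refund.")


@triage.handoff
def to_refunds():
    return refunds


def run_turn(agent: Agent, wants_refund: bool) -> Agent:
    # Toy control loop: if the model "decided" to hand off, the
    # framework would call the handoff function, swap the active agent
    # (and its system prompt), and re-run without a fresh user message.
    if wants_refund and agent.handoffs:
        return agent.handoffs[0]()
    return agent
```

The key design point is that the handoff is just a tool whose result the framework intercepts instead of feeding back to the LLM.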

@rohithbojja commented:

> What about this? The handoff decorator could be treated as a special tool, which switches out the system prompt(s) and runs another chat completion afterwards. It's equivalent to what Swarm does.

is this available?

@pietz commented Jan 7, 2025

> is this available?

No, sorry, I didn't make that clear. It's just an API idea I had that seems quite simple and clean.
