
Scalability Concerns for Multi-Agent Problems in Crew AI #1989

Open
alm0ra opened this issue Jan 28, 2025 · 7 comments
Labels
bug Something isn't working

Comments


alm0ra commented Jan 28, 2025

Description

I have been researching how Crew AI addresses scalability issues for multi-agent problems but couldn’t find any detailed information or documentation on this topic. Additionally, I’ve come across discussions where people suggest that Crew AI may not be scalable in handling multi-agent scenarios effectively.

Steps to Reproduce

1. Searched the documentation and forums for scalability solutions in Crew AI.
2. Reviewed community discussions pointing out potential limitations in scalability.

Expected behavior

Screenshots/Code snippets

Operating System

macOS Sonoma

Python Version

3.12

crewAI Version

0.98.0

crewAI Tools Version

1

Virtual Environment

Venv

Evidence

Possible Solution

Additional context

alm0ra added the bug label on Jan 28, 2025

alm0ra commented Jan 28, 2025

I believe the issue lies in the kickoff() method within the Crew class. Currently, it appears to execute agents sequentially, one after another.

        for agent in self.agents:
            agent.i18n = i18n
            # type: ignore[attr-defined] # Argument 1 to "_interpolate_inputs" of "Crew" has incompatible type "dict[str, Any] | None"; expected "dict[str, Any]"
            agent.crew = self  # type: ignore[attr-defined]
            # TODO: Create an AgentFunctionCalling protocol for future refactoring
            if not agent.function_calling_llm:  # type: ignore # "BaseAgent" has no attribute "function_calling_llm"
                agent.function_calling_llm = self.function_calling_llm  # type: ignore # "BaseAgent" has no attribute "function_calling_llm"

            if not agent.step_callback:  # type: ignore # "BaseAgent" has no attribute "step_callback"
                agent.step_callback = self.step_callback  # type: ignore # "BaseAgent" has no attribute "step_callback"

            agent.create_agent_executor()

To improve scalability and bring it closer to a production-ready level, consider adopting a more asynchronous approach. For example, using a message broker like Apache Kafka would allow agents to operate independently and process messages in real time. This would not only enable high scalability but also improve availability, making the system more robust under larger workloads.
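A minimal sketch of the idea, using asyncio queues as a stand-in for broker topics (all names here are hypothetical, none of this is crewAI API): each agent runs as an independent worker that consumes from an inbox and publishes downstream, so an upstream agent is free to pick up the next request as soon as it hands off the current one.

```python
import asyncio

async def agent_worker(name: str, inbox: asyncio.Queue, outbox: asyncio.Queue) -> None:
    """Consume tasks from an inbox queue and publish results downstream."""
    while True:
        task = await inbox.get()
        if task is None:  # shutdown sentinel, forwarded down the chain
            await outbox.put(None)
            break
        # Stand-in for the agent's actual LLM call.
        await outbox.put(f"{task}->{name}")

async def main() -> list:
    # Two agents chained by queues; each can start on the next
    # request as soon as it hands the previous one downstream.
    q0, q1, q2 = asyncio.Queue(), asyncio.Queue(), asyncio.Queue()
    workers = [
        asyncio.create_task(agent_worker("researcher", q0, q1)),
        asyncio.create_task(agent_worker("writer", q1, q2)),
    ]
    for req in ["req1", "req2", "req3"]:
        await q0.put(req)
    await q0.put(None)

    results = []
    while (item := await q2.get()) is not None:
        results.append(item)
    await asyncio.gather(*workers)
    return results

results = asyncio.run(main())
print(results)
```

With a real broker, the queues become topics and each worker becomes a separately deployable (and horizontally scalable) consumer.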


alm0ra commented Jan 28, 2025

If you decide to implement this approach, I recommend building it on top of FastStream. This library simplifies integration with various message brokers, reducing the complexity typically associated with managing them.

However, keep in mind that some setup and handling will still be required for specific brokers. For instance, with Kafka, you’ll need to implement configurations such as creating topics and partitions to ensure the system operates efficiently and is production-ready.
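If that direction were pursued, a FastStream-based consumer might look roughly like the sketch below. Everything here is illustrative: the topic names, the handler body, and the localhost broker address are assumptions, not crewAI code, and it requires a running Kafka broker, so treat it as a configuration sketch rather than a runnable example.

```python
from faststream import FastStream
from faststream.kafka import KafkaBroker

broker = KafkaBroker("localhost:9092")  # assumed broker address
app = FastStream(broker)

@broker.subscriber("agent-tasks")   # hypothetical input topic
@broker.publisher("agent-results")  # hypothetical output topic
async def run_agent(task: str) -> str:
    # Stand-in for handing the task to a crewAI agent and
    # returning its output, which FastStream then publishes.
    return f"processed: {task}"
```

The app would be started with FastStream's CLI (`faststream run module:app`), and topic/partition creation would still need to be handled separately, as noted above.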

I’d be happy to assist with this issue and contribute to making Crew AI more scalable!

@joaomdmoura
@bhancockio

@Vidit-Ostwal

Hi @alm0ra, in most usage I have seen, the output of one agent is the input for another, so sequential execution is essential.

If you have a more general use case, where there is no single flow tying the entire process together, kickoff() can be run asynchronously.

Refer to this documentation


alm0ra commented Jan 28, 2025

@Vidit-Ostwal
This sentence perfectly captures the bottleneck:

“The output of one agent serves as the input for another, making sequential execution essential.”

Now, imagine having multiple agents—for example, five.

The process starts with the first agent, and in the worst-case scenario, all agents execute sequentially. This means that until the fifth agent completes its task, new requests cannot be processed because the first agent remains unavailable.

Now, extend this scenario to a production workflow involving more than ten agents. The bottleneck becomes even more pronounced.

However, if agents communicate asynchronously via messages through a broker, this limitation could be addressed efficiently.
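The throughput gap can be made concrete with a back-of-the-envelope model: if each of five agents takes one "tick" per request, strictly sequential handling of N requests costs 5 * N ticks, whereas a pipelined setup, where a new request enters the first agent as soon as it is free, costs 5 + (N - 1) ticks. A small sketch of that arithmetic (nothing here is crewAI API):

```python
def sequential_ticks(stages: int, requests: int) -> int:
    # Each request must pass through every stage before the next one starts.
    return stages * requests

def pipelined_ticks(stages: int, requests: int) -> int:
    # Once the pipeline is full, one request completes per tick.
    return stages + (requests - 1)

# Five agents, ten incoming requests:
print(sequential_ticks(5, 10))  # 50
print(pipelined_ticks(5, 10))   # 14
```

The single-request latency is unchanged (5 ticks either way); what the broker-style decoupling buys is throughput under concurrent load.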

@Vidit-Ostwal

@alm0ra, now I understand the issue: at most one agent is working at a time (or two, depending on whether they depend on each other), and until the entire run completes, no other request from the production side can be processed.
I think this is a needed feature for making Crew AI more production-ready.
Currently, scaling beyond 10 agents would also be very tricky, as it could lead to latency issues.

Let me know if you are making a PR for this; I would like to contribute along with you, if you don't mind.
I know Kafka a bit.


alm0ra commented Jan 28, 2025

The implementation of this feature ultimately depends on the maintainers of Crew AI and their decisions and plans for the future of the package.

@Vidit-Ostwal

@alm0ra yep, I understand that.
There is no point in developing something that doesn't align with the maintainers' vision.

I am currently facing the same problem, where latency becomes an issue when multiple requests hit the API.

What you have proposed partially solves my problem.

I therefore believe this kind of latency-related bug / feature request will surely come up at some point, and the maintainers will have to think about it.

The only question is how they choose to approach the problem, with Kafka or something else; that is up to them.
