tpu-client-next: add mechanism to implement custom scheduler #4436

KirillLykov · 2025-01-13T17:04:56Z

Problem

The current scheduler implementation works for SendTransactionService and similar applications, where the client checks whether the transaction has been added to a block and retries when necessary. Other applications, such as the transaction-bench client, need a custom strategy: for example, a combination of try_send and send to some future leaders.

This PR introduces a trait that covers most of the logic except for one method, send_to_workers, which the user must implement.

To keep the backport simple, this should be added after #4454.
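A minimal sketch of the extension point described above, under loud assumptions: it is synchronous and uses only std channels, whereas the PR's send_to_workers is async. The trait shape, TransactionBatch, SchedulerError, MixedBroadcaster and demo are hypothetical names chosen for illustration and are not the crate's API. The strategy shown (blocking send to the first leader, try_send to the later ones) is just one possible reading of "a combination of try_send and send".

```rust
use std::collections::HashMap;
use std::net::SocketAddr;
use std::sync::mpsc::{sync_channel, SyncSender, TrySendError};

/// Hypothetical stand-in for a batch of serialized transactions.
type TransactionBatch = Vec<Vec<u8>>;

/// Hypothetical error type; an `Err` is meant to be critical for the scheduler.
#[derive(Debug, PartialEq)]
enum SchedulerError {
    WorkerGone(SocketAddr),
}

/// Simplified analogue of the extension point: the scheduler keeps the shared
/// logic and the user supplies only `send_to_workers`.
trait Broadcaster {
    fn send_to_workers(
        workers: &HashMap<SocketAddr, SyncSender<TransactionBatch>>,
        leaders: &[SocketAddr],
        batch: TransactionBatch,
    ) -> Result<(), SchedulerError>;
}

/// Custom strategy: blocking `send` to the first leader, best-effort
/// `try_send` to the remaining (future) leaders.
struct MixedBroadcaster;

impl Broadcaster for MixedBroadcaster {
    fn send_to_workers(
        workers: &HashMap<SocketAddr, SyncSender<TransactionBatch>>,
        leaders: &[SocketAddr],
        batch: TransactionBatch,
    ) -> Result<(), SchedulerError> {
        for (i, leader) in leaders.iter().enumerate() {
            // Leaders without a worker are skipped in this sketch.
            let Some(worker) = workers.get(leader) else {
                continue;
            };
            if i == 0 {
                // Wait for queue space; fail only if the worker is gone.
                worker
                    .send(batch.clone())
                    .map_err(|_| SchedulerError::WorkerGone(*leader))?;
            } else {
                match worker.try_send(batch.clone()) {
                    // A full queue is not an error here: the batch is dropped.
                    Ok(()) | Err(TrySendError::Full(_)) => {}
                    Err(TrySendError::Disconnected(_)) => {
                        return Err(SchedulerError::WorkerGone(*leader));
                    }
                }
            }
        }
        Ok(())
    }
}

/// Sends two batches and returns how many each worker queue received.
fn demo() -> (usize, usize) {
    let current: SocketAddr = "127.0.0.1:8001".parse().unwrap();
    let next: SocketAddr = "127.0.0.1:8002".parse().unwrap();
    let (tx_current, rx_current) = sync_channel(4);
    let (tx_next, rx_next) = sync_channel(1);
    let workers = HashMap::from([(current, tx_current), (next, tx_next)]);
    let leaders = [current, next];
    for _ in 0..2 {
        MixedBroadcaster::send_to_workers(&workers, &leaders, vec![vec![1, 2, 3]]).unwrap();
    }
    (rx_current.try_iter().count(), rx_next.try_iter().count())
}
```

In demo, the queue of the future leader has capacity 1, so the second batch is dropped for that leader while the current leader receives both.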

Summary of Changes

tpu-client-next/src/workers_cache.rs

tpu-client-next/src/connection_workers_scheduler.rs

ilya-bobyr · 2025-01-27T22:05:18Z

tpu-client-next/src/connection_workers_scheduler.rs

-                    }
-                }
-            }
+            Broadcaster::send_to_workers(&mut workers, fanout_leaders, transaction_batch).await?;


Handling errors in async code is a bit tricky.

Here you are saying that if send_to_workers() fails, you are just going to return the error to the caller.

I just want to make sure that this is the correct behavior, considering that under the normal flow these calls are executed when the worker scheduler is terminating:

workers.shutdown().await;
endpoint.close(0u32.into(), b"Closing connection");
leader_updater.stop().await;

Are you sure they should not be executed if send_to_workers() fails for any reason?

If not, you may want to store an error, terminate the loop, run the shutdown code and only then return the error to the caller.

Good catch! Thanks

Maybe the best strategy would be to break the loop in this case. Add this to the new trait documentation.

@ilya-bobyr what about this part? I added a last_error, which holds the error if one happens during send_to_workers. I also added a clarification to the trait documentation that send_to_workers errors are critical, meaning that they stop the scheduler.

Looks good.
I would probably call it send_to_workers_err or something like that.
I understand that "last" in the name comes from the fact that you call the send function in a loop, so it is the error observed on the last loop iteration.
But at first it was not clear to me what the name means.
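The pattern discussed in this thread (store the error, leave the loop, run the shutdown code, then return the error) can be sketched as follows. This is a synchronous, std-only illustration and not the PR's code: Resources, run and the fail_at parameter are hypothetical, and the three cleanup methods only record that they ran, standing in for the shutdown calls quoted in the review comment.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum SchedulerError {
    SendFailed(usize),
}

/// Hypothetical stand-in for the resources the scheduler owns.
/// It only records which cleanup steps ran.
#[derive(Default)]
struct Resources {
    log: Vec<&'static str>,
}

impl Resources {
    fn shutdown_workers(&mut self) {
        self.log.push("workers.shutdown");
    }
    fn close_endpoint(&mut self) {
        self.log.push("endpoint.close");
    }
    fn stop_leader_updater(&mut self) {
        self.log.push("leader_updater.stop");
    }
}

/// Stand-in for `send_to_workers`: fails on the batch with index `fail_at`.
fn send_to_workers(batch: usize, fail_at: Option<usize>) -> Result<(), SchedulerError> {
    if Some(batch) == fail_at {
        Err(SchedulerError::SendFailed(batch))
    } else {
        Ok(())
    }
}

/// Runs the send loop; returns the number of batches sent or the stored error.
fn run(
    batches: usize,
    fail_at: Option<usize>,
    resources: &mut Resources,
) -> Result<usize, SchedulerError> {
    let mut sent = 0;
    let mut last_error = None;
    for batch in 0..batches {
        // Using `?` here would return early and skip the cleanup below.
        if let Err(err) = send_to_workers(batch, fail_at) {
            last_error = Some(err);
            break;
        }
        sent += 1;
    }
    // Cleanup runs on both the success path and the error path.
    resources.shutdown_workers();
    resources.close_endpoint();
    resources.stop_leader_updater();
    match last_error {
        Some(err) => Err(err),
        None => Ok(sent),
    }
}
```

Calling run(5, Some(2), &mut resources) returns the stored error only after all three cleanup steps have been recorded.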

tpu-client-next/src/quic_networking.rs

tpu-client-next/src/workers_cache.rs

tpu-client-next/src/connection_workers_scheduler.rs

KirillLykov requested review from ilya-bobyr and alessandrod January 13, 2025 17:05

KirillLykov commented Jan 13, 2025

alessandrod changed the title from "add mechanism to implement custom scheduler" to "tpu-client-next: add mechanism to implement custom scheduler" Jan 14, 2025

ilya-bobyr reviewed Jan 16, 2025

KirillLykov force-pushed the klykov/add-extention-mechanism branch from 2f4a82b to 727996d January 27, 2025 20:56

KirillLykov commented Jan 27, 2025

KirillLykov requested a review from ilya-bobyr January 27, 2025 20:57

ilya-bobyr reviewed Jan 28, 2025

KirillLykov force-pushed the klykov/add-extention-mechanism branch from 8e83062 to 1c17270 January 28, 2025 17:51

KirillLykov commented Jan 28, 2025

KirillLykov commented Jan 28, 2025

KirillLykov force-pushed the klykov/add-extention-mechanism branch from fd0b83b to 7488ce8 January 29, 2025 08:12

KirillLykov requested a review from ilya-bobyr January 29, 2025 08:13

ilya-bobyr previously approved these changes Jan 29, 2025

KirillLykov added 6 commits January 29, 2025 15:46

e72c10f add mechanism to implement custom scheduler
779f4b8 rename Broadcast to Broadcaster
8eea6bd PR review changes
3a9bab6 update comment for run
bce116e moved NonblockingBroadcaster to the end of module
bd4b8fb Enlarge public interface

KirillLykov dismissed ilya-bobyr’s stale review via bd4b8fb January 29, 2025 14:47

KirillLykov force-pushed the klykov/add-extention-mechanism branch from 7488ce8 to bd4b8fb January 29, 2025 14:47

KirillLykov added the automerge label (Merge this Pull Request automatically once CI passes) Jan 29, 2025

KirillLykov mentioned this pull request in "Enhancements for tpu-client-next #2991" (open, 10 tasks) Jan 29, 2025

ilya-bobyr approved these changes Jan 30, 2025

mergify bot merged commit b63e819 into anza-xyz:master Jan 30, 2025
48 checks passed
