Applies multinodessuite in twonodessuite #1031

Merged: 19 commits into master from feature/transfer-reliability-test, Dec 17, 2024

Contributor

@benbierens benbierens commented Dec 11, 2024

During the work on #1019, it was found that the block exchange integration tests do not cover an aspect of reliability that other (marketplace) integration tests rely on. To help focus debugging efforts, this PR adds a transfer test that checks this aspect.

The twonodessuite, whose old process-handling code caused instability, has been updated to use the new multinodesuite fixture.

Additionally, this sets an HTTP timeout for the Codex client (previously infinite, now 1 minute).
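
As a rough illustration of the timeout change, here is a minimal sketch that assumes the test client is built on Nim's std/httpclient; the PR's actual client code is not shown in this conversation, and the constant name is made up for the example.

```nim
import std/httpclient

# Assumption: the Codex test client wraps std/httpclient.
# The timeout is given in milliseconds; the default of -1 means "no timeout".
const oneMinuteMs = 60 * 1000

let client = newHttpClient(timeout = oneMinuteMs)
echo "HTTP timeout (ms): ", client.timeout
client.close()
```

Note that std/httpclient applies the timeout to each blocking socket operation, so a whole request can still take longer than one minute in total.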

@benbierens benbierens self-assigned this Dec 11, 2024
@benbierens benbierens added the Client and Testing labels (see https://miro.com/app/board/uXjVNZ03E-c=/ for details) Dec 11, 2024
Contributor Author

benbierens commented Dec 11, 2024

I've gone through several steps to isolate the issue this PR addresses. Read the story: https://hackmd.io/@ThatBen/S15F3DY4Jl

TLDR:
The issue is in how the integration tests handle the process streams from the Codex nodes. A solution was already implemented in multinodesuite, but not applied everywhere. This PR fixes that and solves the stalling issue.

@benbierens
Contributor Author

This PR addresses point 2 of Mark's issue #746: clean up multinodes and use it in every test.
When/if merged, please update that issue.

Contributor

@emizzle emizzle left a comment

Nice work! The main question is: do all these tests pass reliably under the conditions in https://hackmd.io/@ThatBen/S15F3DY4Jl? If so, awesome, let's get this in, so you can be unblocked on the blockexchange 👍

@@ -311,7 +311,7 @@ template multinodesuite*(name: string, body: untyped) =
     role: Role.Client,
     node: node
   )
-  if clients().len == 1:
+  if running.len == 1:
Contributor

Hmm, I'm sure there was a good reason for this, but could you please explain why? We don't want the running node to be a HardhatNode, which wouldn't give us what we want (bootstrap info)

Contributor Author

This is behind if var clients =? nodeConfigs.clients: and nodeConfigs.clients is of type ?CodexConfigs. So these should always be Codexes. I changed this because using clients() assumes that the config for this test has at least 1 client configured and that the client will always be started before any providers/validators (true because of the ordering of the code). If a user creates a config with only providers or validators, none of them would be used for bootstrap. With this change, the first node regardless of role will be the bootstrap node.

Contributor

@emizzle emizzle Dec 17, 2024

If a user creates a configuration that runs a single hardhat node and at least one client node, running.len == 1 will never be true when processing the first client node. We don't want that.

Perhaps a better option would be to check that clients().len == 1 or providers().len == 1.
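
The difference between the two conditions can be sketched with a simplified, hypothetical model. Role and RunningNode echo names from the diff, but the Hardhat role value, the clients helper, and the start-up order here are assumptions made for illustration, not the suite's real code.

```nim
# Simplified, hypothetical model of the bootstrap check discussed above.
type
  Role = enum
    Client, Provider, Validator, Hardhat
  RunningNode = object
    role: Role

proc clients(running: seq[RunningNode]): seq[RunningNode] =
  ## Returns only the nodes that were started as clients.
  for node in running:
    if node.role == Role.Client:
      result.add node

# A Hardhat node is started first, followed by the first client.
var running: seq[RunningNode]
running.add RunningNode(role: Role.Hardhat)
running.add RunningNode(role: Role.Client)

# Counting every running node misses the first client ...
let byRunning = running.len == 1
# ... while counting only clients still identifies it.
let byClients = clients(running).len == 1

echo "running.len == 1: ", byRunning      # false
echo "clients().len == 1: ", byClients    # true
```

With a Hardhat node already in the list, the first client is the second running node, so only the role-based count recognises it as the bootstrap candidate.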

Contributor

> If a user creates a config with only providers or validators, none of them would be used for bootstrap. With this change, the first node regardless of role will be the bootstrap node.

Without a client config, there will be no bootstrap nodes, since this is nested under the iteration of clients.configs.

I've made some changes that add each started client and validator to a seq of bootstrap nodes, which is used to start the next node. That means each subsequently started node should be connected to more peers than the last. This also fixes the case where no clients are started, so that bootstrap SPRs are still used. See PR #1048.
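
The accumulation idea can be sketched as follows. This is a hypothetical model rather than the code from PR #1048: NodeSpec, startAll, and the SPR strings are placeholders, and for simplicity every started node contributes its SPR regardless of role.

```nim
# Hypothetical sketch of accumulating bootstrap SPRs while starting nodes.
type
  Role = enum
    Client, Provider, Validator
  NodeSpec = object
    role: Role
    spr: string   # placeholder for the signed peer record the node advertises

proc startAll(specs: seq[NodeSpec]): seq[seq[string]] =
  ## Returns, for each node in start order, the bootstrap SPRs it was given.
  var bootstrapSprs: seq[string]
  for spec in specs:
    result.add bootstrapSprs     # seqs are copied: snapshot of peers known so far
    bootstrapSprs.add spec.spr   # later nodes can bootstrap from this one

# A configuration without any clients still yields bootstrap SPRs.
let used = startAll(@[
  NodeSpec(role: Role.Provider, spr: "spr:provider-0"),
  NodeSpec(role: Role.Provider, spr: "spr:provider-1"),
  NodeSpec(role: Role.Validator, spr: "spr:validator-0")
])

for i, sprs in used:
  echo "node ", i, " bootstraps from ", sprs.len, " peer(s)"
```

The first node starts with no bootstrap peers, and each later node receives one more SPR than the node before it.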

Comment on lines 8 to 12
multinodesuite "Marketplace":
  let marketplaceConfig = NodeConfigs(
    clients: CodexConfigs.init(nodes=1).some,
    providers: CodexConfigs.init(nodes=1).some,
  )
Contributor

Why not use marketplacesuite here (and in others below)?

Contributor Author

I changed it. It makes very little difference technically, but makes slightly more sense. :D

Contributor

@emizzle emizzle left a comment

Should have approved the last one. Apart from the running.len == 1 fix, this is mostly good to go, and so I'll approve to get you unblocked

@benbierens
Contributor Author

> Nice work! The main question is: do all these tests pass reliably under the conditions in https://hackmd.io/@ThatBen/S15F3DY4Jl? If so, awesome, let's get this in, so you can be unblocked on the blockexchange 👍

I can't say for certain that the tests are reliable after this PR. I can only say they are more reliable than before.

@benbierens benbierens marked this pull request as ready for review December 17, 2024 08:11
@benbierens benbierens changed the title Reliable transfer integration test. Applies multinodessuite in twonodessuite Dec 17, 2024
@benbierens benbierens added this pull request to the merge queue Dec 17, 2024
Member

@markspanbroek markspanbroek left a comment

Nice, thank you @benbierens!

@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to a conflict with the base branch Dec 17, 2024
# Conflicts:
#	tests/integration/nodeprocess.nim
@benbierens benbierens enabled auto-merge December 17, 2024 11:24
@benbierens benbierens added this pull request to the merge queue Dec 17, 2024
Merged via the queue into master with commit 20bb5e5 Dec 17, 2024
17 checks passed
@benbierens benbierens deleted the feature/transfer-reliability-test branch December 17, 2024 14:11
@benbierens benbierens mentioned this pull request Apr 2, 2024
8 tasks
emizzle added a commit that referenced this pull request Dec 18, 2024
After a change in PR #1031, bootstrap node SPRs may not work when Hardhat nodes are started with the tests. This fixes it by appending the SPRs of all started clients and providers to a sequence, and using that sequence of SPRs to start the next node. This means each subsequently started node will be connected to its previously started peers.

This also fixes the case when bootstrap SPRs would not be present if no clients were started.
github-merge-queue bot pushed a commit that referenced this pull request Dec 18, 2024
After a change in PR #1031, bootstrap node SPRs may not work when Hardhat nodes are started with the tests. This fixes it by appending the SPRs of all started clients and providers to a sequence, and using that sequence of SPRs to start the next node. This means each subsequently started node will be connected to its previously started peers.

This also fixes the case when bootstrap SPRs would not be present if no clients were started.