Test/e2e 2 20241228 #680

Merged: 8 commits from test/e2e-2-20241228 into release/N.2.9.0-early-adopter on Jan 3, 2025

Conversation

JamesPiechota (Collaborator)

No description provided.

@JamesPiechota JamesPiechota force-pushed the test/e2e-2-20241228 branch 4 times, most recently from 2faf810 to 731b3a1 on January 1, 2025 at 17:19
wait_store_entropy_processes(WaitN),
{ok, N};
{ChunkEntropy, Rest} ->
true = ar_replica_2_9:get_entropy_partition(PaddedEndOffset) == Partition,
Member

Could you explain this change? The way I see it, the original assertion should hold because store_entropy only takes the entropies corresponding to a single partition in the input.

Collaborator Author

Yep! The intent of the assertion is the same (unless I've made a mistake).

Previously Partition was calculated from PartitionEnd and then passed into the function and, I believe, only used in the assertion.

Now I pass in PartitionEnd directly so I can use it in a guard to stop the recursion. So the assertion should still be asserting the same logic (one partition of entropy) but it's doing it using PartitionEnd instead of Partition.
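
A stripped-down sketch of the shape being described, with the real store_entropy arguments and the entropy bookkeeping omitted (the module name, arity, and return value below are illustrative, not the actual diff):

-module(store_entropy_sketch).
-export([store_entropy/3]).

%% Walk chunk-sized steps until the padded offset crosses PartitionEnd, then
%% stop; any leftover ("overhang") entropy is simply not applied.
store_entropy(PaddedEndOffset, PartitionEnd, _ChunkSize)
		when PaddedEndOffset > PartitionEnd ->
	{ok, stopped_at_partition_boundary};
store_entropy(PaddedEndOffset, PartitionEnd, ChunkSize) ->
	%% ... write the entropy for the chunk ending at PaddedEndOffset ...
	store_entropy(PaddedEndOffset + ChunkSize, PartitionEnd, ChunkSize).

The guard plays the role of the old Partition assertion: once the offset leaves the partition, recursion stops instead of spilling into the next one.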

Member

Ah, right, thank you. What do you think about keeping the previous assertion too?

Collaborator Author

Like re-add the Partition argument?

Member

Ah, my bad, never mind. I see RangeEnd replaced Partition, so the assertion is changed accordingly.

Collaborator Author

Thanks!

@@ -389,7 +389,8 @@ handle_cast(do_prepare_replica_2_9, State) ->
%% storage modules are commonly set up with.
PaddedEndOffset2 = shift_replica_2_9_entropy_offset(
PaddedEndOffset, -SliceIndex),
store_entropy(Entropies, PaddedEndOffset2, SubChunkStart, Partition,
PartitionEnd = (Partition + 1) * ?PARTITION_SIZE,
Member

We should pad PartitionEnd because we compare it with the padded offset. We should do the same for the CheckRangeEnd check above, considering the updates.

Collaborator Author

Updated! 6811cea
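
For context, the padding being suggested might look roughly like the following. The explicit round-up to the 256 KiB chunk grid is an illustration only; the codebase has helpers for this (something along the lines of ar_block:get_chunk_padded_offset), and what commit 6811cea actually does is not shown here:

%% Hypothetical sketch: align the partition boundary to the same chunk grid
%% as PaddedEndOffset before comparing the two.
ChunkSize = 262144,  %% 256 KiB
PartitionEnd = (Partition + 1) * ?PARTITION_SIZE,
PaddedPartitionEnd = ((PartitionEnd + ChunkSize - 1) div ChunkSize) * ChunkSize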

@@ -389,7 +389,8 @@ handle_cast(do_prepare_replica_2_9, State) ->
%% storage modules are commonly set up with.
Member

Suggesting a change: "since the 2.9 partition is slightly bigger than the recall partition..." -> "since the storage module may be configured to be smaller than the partition."

Collaborator Author

Updated! 6811cea

There is an existing bug with repacking in place to/from unpacked,
so those tests are currently commented out. Will uncomment them
when the bug is fixed.
We always have to generate more entropy than there is data in a
partition. This is partly because a 3.6 TB partition is not evenly
divided into 256 KiB chunks, and partly because of how entropy is
sliced up and allocated to chunks.
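
A quick check of the "not evenly divided" point, assuming the 3.6 TB partition size and 256 KiB chunk size mentioned above:

%% 3.6 TB is not an integer multiple of 256 KiB, so the final chunk's
%% entropy always extends past the partition boundary.
1> 3600000000000 div (256 * 1024).
13732910
2> 3600000000000 rem (256 * 1024).
40960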

This extra entropy is called the "overhang". Previously the overhang
would stack - i.e. the overhang on partition 0 would actually be
applied to the first bit of partition 1, and so on. By the time
we got to a higher partition, a significant fraction of the chunks
in the partition would rely on entropy that belongs to the previous
partition.

This made accounting for entropy partitions vs. recall partitions
a little confusing, and it also had a negative impact on performance
when packing non-consecutive partitions.

This change localizes all entropy to a single partition - basically
discarding the overhang rather than applying it to the next partition.

At 256 MiB, the amount of entropy discarded per partition is relatively
small. The benefit of this change is that there should no longer be a
negative performance impact to packing non-consecutive partitions. This
impact was particularly noticeable when packing non-consecutive higher
partitions (e.g. packing just partition 30 would suffer from
a multi-hour period during which entropy generation was very slow due
to the early chunks in the partition relying on entropy that was
predominantly used to pack chunks for partition 29).

In addition to the overhang change, this commit addresses a couple
of bugs in the preparation and enciphering phases.
Previously the enciphering mode was inconsistent in tests based
on whether a whole chunk was enciphered or just a sub-chunk.

This fix ensures all enciphering is handled the same way, and since
production uses XOR, which is fast, test and production
enciphering are now identical.
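
To illustrate why whole-chunk and sub-chunk enciphering can share a single code path when the cipher is XOR, a minimal sketch (encipher/2 is an illustrative name, not the production function; crypto:exor/2 is the standard OTP call):

%% The ciphertext is just Data xor Entropy, byte for byte, so a whole chunk
%% and a sub-chunk go through exactly the same operation as long as the two
%% binaries are the same size. Applying the same operation again with the
%% same entropy recovers the plaintext.
encipher(Data, Entropy) when byte_size(Data) =:= byte_size(Entropy) ->
	crypto:exor(Data, Entropy).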
Refactor the case statement that decides whether or not to log
an error. Also include more fine-grained information about which
sub-process requested the chunk.
@JamesPiechota JamesPiechota marked this pull request as ready for review January 2, 2025 22:04
This PR adds 2 checks: one on the server side and one on the client
side.
@JamesPiechota JamesPiechota merged commit 0da18e1 into release/N.2.9.0-early-adopter on Jan 3, 2025
57 of 68 checks passed