Improve load time (mpsc channel, post page-lock, directory buckets) #968
8aa66a2 to fe62b16
iris-mpc-gpu/src/helpers/mod.rs
Outdated
chunk_offset,
chunk_offset + chunk_length
);
let size = chunk_length / device_manager.device_count();
out of interest - why do we divide by the count?
Because we actually don't allocate one blob of memory, but device_count many. Elements are allocated to devices by serial_id % device_index. This is mostly a leftover from when the memory was loaded onto the devices, we could also do it differently now.
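To make the division by the device count concrete, here is a minimal sketch of round-robin sharding. It assumes the modulus in the reply above is the number of devices, and `device_for` and `shard_chunk` are hypothetical names used only for illustration, not functions from the repository.

```rust
/// Hypothetical helper: the device a record lands on, assuming
/// round-robin assignment by serial id over `device_count` devices.
fn device_for(serial_id: usize, device_count: usize) -> usize {
    serial_id % device_count
}

/// Hypothetical helper: split the serial ids of one chunk into one
/// bucket per device instead of a single contiguous allocation.
fn shard_chunk(
    chunk_offset: usize,
    chunk_length: usize,
    device_count: usize,
) -> Vec<Vec<usize>> {
    // Each device holds roughly chunk_length / device_count records,
    // which is why the per-device size is divided by the count.
    let size = chunk_length / device_count;
    let mut buckets: Vec<Vec<usize>> = (0..device_count)
        .map(|_| Vec::with_capacity(size + 1))
        .collect();
    for serial_id in chunk_offset..chunk_offset + chunk_length {
        buckets[device_for(serial_id, device_count)].push(serial_id);
    }
    buckets
}
```

With 8 records and 4 devices, `shard_chunk(0, 8, 4)` gives each device 2 records, matching `chunk_length / device_count`.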
iris-mpc-store/src/s3_importer.rs
Outdated
}));
}

drop(tx);
@eaypek-tfh what's the reason for the manual drop?
Yeah, I don't need to drop it explicitly as it'd be implicitly dropped after all clones are dropped, right? I think I just wanted to make it obvious but let me remove it 👍
Yeah should be all fine, if anything this is a bit awkward since you drop it before the handles are awaited.
lgtm!
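For readers following the `drop(tx)` thread, the sketch below shows the general fan-in pattern and when a receiver sees the channel close. It is not the PR's code: it uses `std::thread` and `std::sync::mpsc` as a dependency-free stand-in for tokio tasks and a tokio mpsc channel, and `fetch_object` and `load_all` are hypothetical names.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for one `get_object` call: yields a few items.
fn fetch_object(key: usize) -> Vec<String> {
    (0..3).map(|i| format!("object-{key}-item-{i}")).collect()
}

/// Spawn one worker per object, fan all items into a single channel,
/// and drain that channel until every sender is gone.
fn load_all(object_count: usize) -> Vec<String> {
    let (tx, rx) = mpsc::channel::<String>();
    let mut handles = Vec::new();

    for key in 0..object_count {
        // Each worker owns its own clone of the sender.
        let tx = tx.clone();
        handles.push(thread::spawn(move || {
            for item in fetch_object(key) {
                tx.send(item).expect("receiver dropped");
            }
            // The clone is dropped here, when the worker finishes.
        }));
    }

    // In this sketch the original sender is still in scope while we
    // drain `rx`, so it has to be dropped or the loop never ends.
    drop(tx);

    // The iterator ends once all senders (original and clones) are dropped.
    let mut items: Vec<String> = rx.iter().collect();

    for handle in handles {
        handle.join().expect("worker panicked");
    }

    // Arrival order is nondeterministic, so sort for a stable result.
    items.sort();
    items
}
```

Whether an explicit drop is needed depends on where the original sender goes out of scope relative to the receive loop. Here it outlives the loop, so the drop is required; if the sender is moved into the workers or leaves scope earlier, it is not.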
51dbcaf to c4ce09b
c4ce09b to 7d93887
Change
This PR improves load time via:
select_all and try_next) to (spawning a separate tokio task for each get_object and sending items to a tokio mpsc channel). Instead of polling a dynamically changing list of streams, we now wait on a single channel.
Background & More Details
CUDA_ERROR_INVALID_VALUE errors. So, we abandoned them.