Core dump with flux archive create #6461
Comments
Thanks! I created some files with dd from
and (eek) I was able to reproduce with simply
where FILE is Edit: so don't bother with the core file - I can make them all day long! |
|
That's great! I'm glad it wasn't something about my environment. If you make a bunch of cores, you can use them for pie. 🥧 Happy Thanksgiving @garlick ! |
You too @vsoch ! |
Oh wow, this was a lame error on my part! This appears to fix the problem:
diff --git a/src/common/libfilemap/fileref.c b/src/common/libfilemap/fileref.c
index 02756a9c3..dab7a1dc5 100644
--- a/src/common/libfilemap/fileref.c
+++ b/src/common/libfilemap/fileref.c
@@ -98,7 +98,7 @@ static json_t *blobvec_create (int fd,
#endif
if (offset < size) {
off_t notdata;
- int blobsize;
+ size_t blobsize;
#ifdef SEEK_HOLE
// N.B. returns size if there are no more holes |
Nice! Should I hot patch this on a custom branch or are you planning to do a PR soon? I'm good either way, but please let me know so I can run more experiments over break! I'm testing different topology setups in the Flux Operator and it would be really nice to go above 2GB. I really want to know if the different kary designs (at larger sizes) separate more than this (right now it's mostly binomial vs. kary that seems to make a difference, and in some cases kary:1 is just really bad). Don't mind the crappy graphs - I'll make them better when I'm beyond prototype mode. |
Problem: 'flux archive create' segfaults when it tries to archive files with size >2G. Change a local variable from int to size_t. Fixes flux-framework#6461
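To illustrate why the one-line change matters, here is a minimal sketch, not the actual fileref.c code. It assumes the blob size is derived from two file offsets (the start of a data region and the next hole), and the helper names `fits_in_int` and `region_size` are hypothetical. With a 32-bit int, any region larger than INT_MAX bytes (just under 2 GiB) cannot be represented, whereas size_t can hold it on a 64-bit platform.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if a byte count can be stored in an int
 * without overflow. */
static bool fits_in_int (uint64_t nbytes)
{
    return nbytes <= (uint64_t)INT_MAX;
}

/* Hypothetical helper: length of the data region [offset, notdata),
 * returned as size_t the way the patched variable is declared.
 * Assumes a platform where size_t is at least as wide as the region. */
static size_t region_size (uint64_t offset, uint64_t notdata)
{
    assert (notdata >= offset);
    return (size_t)(notdata - offset);
}
```

For a 3 GiB file with no holes, `region_size (0, 3ULL << 30)` is the full file size, and `fits_in_int` reports that it would not fit in the original int variable.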
Just posted a PR. Right - the performance will be sensitive to the tree fanout because each level of the tree will fetch data once from its parent, then provide it once to each child that is requesting it. Well, that would assume perfect caching but the LRU cache tries to maintain itself below 16MB so for large amounts of data the cache may thrash a bit. If you want to play with that limit, you could do something like
Not sure what effect that would have since it kind of depends on how the timing works out. You can peek at the cache size with
|
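As a rough way to reason about the fanout sensitivity described above, the sketch below computes how many parent-to-child hops the deepest rank is from rank 0 in a kary:k tree. It assumes the common numbering where the parent of rank r is (r - 1) / k; that numbering and the function names are assumptions for illustration, not taken from the Flux source.

```c
#include <assert.h>

/* Assumed numbering: rank 0 is the root and the parent of rank r is
 * (r - 1) / k. */
static int kary_parent (int rank, int k)
{
    assert (rank > 0 && k > 0);
    return (rank - 1) / k;
}

/* Number of hops from rank up to the root. */
static int kary_depth (int rank, int k)
{
    int hops = 0;
    while (rank > 0) {
        rank = kary_parent (rank, k);
        hops++;
    }
    return hops;
}

/* Depth of the deepest rank in a tree of size ranks, i.e. the longest
 * chain of sequential parent-to-child transfers. */
static int kary_height (int size, int k)
{
    int max = 0;
    for (int r = 0; r < size; r++) {
        int d = kary_depth (r, k);
        if (d > max)
            max = d;
    }
    return max;
}
```

Under this model a 6-rank kary:1 tree is a chain with height 5, while kary:2 has height 2, which is one plausible reason a kary:1 layout can look much worse for large transfers.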
Oh nice! That's really helpful - I'll add that to my notes (and experiments). We definitely want to crank that size up. And we are testing exactly that - how different topologies handle the distribution. We are wanting to use flux as a distribution mechanism, and I'm new to this generally and want to experimentally see/verify things. There are a few uses:
Anyway - thank you! |
I should have mentioned that if you do the module reload, use flux exec to do it instance wide. |
Works great! 🥳 We weren't able to go above 2GB before, and here we are successfully creating and distributing 3: And we go up to 10! The times are really shooting up there - the cleans / deletes aren't that important, more so the create and distribute. I don't have any plots yet - doing a smaller cluster first (6 nodes up to 10GB). And then I'm thinking the base case would be a wget download of an archive of the same size, because that's what we are trying to improve upon. Question for you @garlick - for the stats, what would this show me / how would it be useful? Should I run it at the beginning / end / between operations to get stats, and what do they tell me? Here is a shot of a run at the start (before I've done anything, but after I reload the content module across brokers): |
I'm looking at some of the result data, and the purple line is erroneous: Specifically, I think I'm hitting some limit or other error with kary-3 (and I'm not sure what at the moment, just finished the runs and would need to manually inspect): Here is what that topology looks like:
I'll try bringing the cluster up tomorrow, targeting just that size, and running that stats command between the calls to see if anything looks weird. I could also try bringing up the larger cluster just to do spot check runs to see if the issue reproduces (I bet it will)!
Also - I know that binomial == kary:2, I'm mostly doing both to sanity check that flux knows that too - seems like something is funny! :) In the plot, the green and aqua should be the same (but they are not), and that makes sense because the topologies look different. This of course could be a bug on my part, but here is the tree I get when I ask the broker to make me a binomial topology:
nodes-6-topo-binomial
0 faa621ae1899: full
├─ 1 faa621ae1899: full
├─ 2 faa621ae1899: full
│ └─ 3 faa621ae1899: full
└─ 4 faa621ae1899: full
  └─ 5 faa621ae1899: full
And the kary:2 (this is my understanding of what it should look like):
nodes-6-topo-kary:2
0 faa621ae1899: full
├─ 1 faa621ae1899: full
│ ├─ 3 faa621ae1899: full
│ └─ 4 faa621ae1899: full
└─ 2 faa621ae1899: full
  └─ 5 faa621ae1899: full
If I'm reading that right, in the binomial tree broker 0 has three children, and I think there should be just two? |
I'm sure @garlick will answer here shortly with a more thoughtful answer, but I just wanted to quickly point out that binomial != kary:2. kary:2 is a binary tree, while a binomial tree has a more complex definition (explained better than I can in the binomial heap Wikipedia article.) |
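The difference can be checked against the two 6-rank trees printed earlier in the thread. The sketch below uses parent formulas that reproduce those printed trees: (r - 1) / 2 for the binary tree and "clear the lowest set bit" for the binomial tree. These formulas and names are assumptions chosen to match the output shown, not a description of how the Flux broker computes its topology.

```c
#include <assert.h>

/* Binary (kary:2) tree, assumed numbering: parent of rank r is
 * (r - 1) / 2. Only meaningful for r > 0. */
static int binary_parent (int rank)
{
    return (rank - 1) / 2;
}

/* Binomial tree, assumed numbering: the parent of rank r is r with its
 * lowest set bit cleared. Only meaningful for r > 0. */
static int binomial_parent (int rank)
{
    return rank & (rank - 1);
}

/* Count the direct children of node in a tree of size ranks, given a
 * parent function. Rank 0 is the root and has no parent. */
static int count_children (int node, int size, int (*parent) (int))
{
    int n = 0;
    for (int r = 1; r < size; r++) {
        if (parent (r) == node)
            n++;
    }
    return n;
}
```

With 6 ranks, rank 0 has three children (1, 2, 4) in the binomial tree and two children (1, 2) in the binary tree, which matches the printed topologies and explains why the two runs would not be expected to behave identically.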
You're right! I need to read more about this - this totally goes against my idea of what a heap is too. I think I was probably thinking of binary tree? https://en.wikipedia.org/wiki/M-ary_tree. What about the bug? It's not letting me comment anymore so I'm re-opening. |
Also - I was perusing old issues and found one that seems to encompass both of the findings here - a segfault and no such file or directory: The difference is that it's directed at the kvs, but I suspect flux archive is using the kvs, so maybe there is some overlap there? I think we fixed the segfault, and maybe there is still some issue with keys/indexes. I'm bringing up a new cluster now to see if I can reproduce and get you more data. |
Is the bug you are referring to the ENOENT from Sorry I'm away from a computer atm |
The |
Continued in #6463. |
When I try to do a flux archive create, specifically 3GB or larger, there is a Segmentation fault:
Here is the run with valgrind:
I saw the message above, and added --leak-check=full:
I'm attempting to upload the core dump to Microsoft OneDrive, but it's a 🔥 🗑️ so it's failing every time - I'll keep trying and post a link here if/when it works. I hope (think maybe?) the above gives enough of a hint as to what might be going on?
Ping @garlick and @grondo !