Cannot limit memory usage; OOM when creating many small files. #1576
Comments
Can you try once with the config below and see:
OR
Still the same problem:

cat > bf2.yaml # create either of the files that you indicated
docker run --memory=1G --privileged \
  -e AZURE_STORAGE_ACCOUNT="$AZURE_STORAGE_ACCOUNT" \
  -e AZURE_STORAGE_ACCOUNT_CONTAINER="$AZURE_STORAGE_ACCOUNT_CONTAINER" \
  -e AZURE_STORAGE_ACCESS_KEY="$AZURE_STORAGE_ACCESS_KEY" -v $(pwd)/bf2.yaml:/etc/bf2.yaml \
  -it ghcr.io/alignmentresearch/public/blobfuse2:2.3.2-ubuntu-22.04 \
  /bin/bash -c 'blobfuse2 mount --tmp-path=/bf --config-file=/etc/bf2.yaml /mnt; for i in {1..10000}; do echo $i > /mnt/$i & done; wait'

Sometimes I don't get any errors printed, but the program ends very quickly and I can check the Azure portal and see that only a few files (~20 or so) have been created, which is also incorrect behavior.
How much memory and how many CPU cores does your pod get? I suspect low resource availability is causing the crash here.
As stated in my reproducing snippet, the container gets 1G of memory and unlimited cores. In my actual deployment there is no reserved amount of memory, but the host has 1TiB of RAM, so it is unlikely that the final amount is as low as 1GiB. Is there some way I can guarantee there will be enough memory for blobfuse2?
I guess 1GB might be too low a limit, as the blobfuse2 config that we shared last requests 800MB just for block-cache. On top of this, other components such as attr-cache also need memory, and there are other processes running on the pod as well. Creating a pod with more memory, such as 4GB, might help. If that is not possible, then file-cache might be the only way. You need to analyze how much free memory your pod has after blobfuse starts. Do not run any test; just create the pod and log in to it to see what the memory utilization is without any load on the system.
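
For the measurement suggested above, here is a minimal sketch, under assumptions. Inside a container started with `--memory`, `/proc/meminfo` usually reports the host's memory rather than the container limit, so the sketch reads the cgroup files instead. It assumes the cgroup v2 layout (`memory.max` and `memory.current` under `/sys/fs/cgroup`); on cgroup v1 the files have different names. The function name is hypothetical and not part of blobfuse2.

```shell
# Print how much memory is still free under the cgroup limit, in MiB.
# $1: cgroup directory (assumption: cgroup v2 layout, default /sys/fs/cgroup).
cgroup_headroom_mib() {
    dir="${1:-/sys/fs/cgroup}"
    limit="$(cat "$dir/memory.max")"        # bytes, or the literal string "max"
    current="$(cat "$dir/memory.current")"  # bytes currently charged to the cgroup
    if [ "$limit" = "max" ]; then
        # No limit is set on this cgroup, so there is nothing to subtract from.
        echo "unlimited"
        return 0
    fi
    echo $(( (limit - current) / 1024 / 1024 ))
}
```

Running `cgroup_headroom_mib` with no arguments inside the idle container, once blobfuse2 is mounted, gives a number to compare against the 800MB that the block-cache config requests.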
I run a Kubernetes cluster with Blobfuse CSI and we're having lots of problems from Blobfuse2 dying when too many commands are sent to it. I've managed to reproduce this with a few-liner. With any account and an existing (perhaps empty) container, run:
That is, with a 1GB memory limit, create 10000 tiny files simultaneously. The container will crash after a bit with a lot of errors like:
which indicates blobfuse2 has died, in this case due to OOMKill.
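
To check whether the crash depends on how many writers run at once, the same workload can be run with a cap on concurrency. This is only a diagnostic sketch, not a fix: the function name and the cap of 16 in the usage line are hypothetical, and it relies on `xargs -P`, which GNU and BSD `xargs` provide.

```shell
# Create files named 1..count inside dir, each containing its own number,
# with at most `jobs` writer processes running at the same time.
# $1: target directory, $2: number of files, $3: maximum concurrent writers.
write_small_files() {
    dir="$1"
    count="$2"
    jobs="$3"
    # xargs starts one small shell per number; -P bounds how many run in parallel.
    seq 1 "$count" | xargs -P "$jobs" -I{} sh -c 'echo "$1" > "$2/$1"' _ {} "$dir"
}
```

For example, `write_small_files /mnt 10000 16` writes the same 10000 files as the reproduction, but never with more than 16 writes in flight. If the mount survives at a low cap and dies at a high one, that points at per-request memory rather than the total number of files.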
It's true that the amount of memory here is pretty small (1 GB), but the same crash happens at larger amounts of memory with many concurrent files (perhaps larger files as well). I have also reproduced the crash with the following config file:
An alternative config file which also crashes, and does no caching.