Docker storage driver possible? #233

Closed
reneleonhardt opened this issue Aug 25, 2024 · 2 comments

Comments

reneleonhardt commented Aug 25, 2024

Hello, thank you for your amazing work, the results are already very impressive! 🚀

Theoretically, would it be possible to create a Docker storage driver optimized for DwarFS?
Or could the DwarFS file system already be used within Docker images?
Could this result in smaller downloads or faster boot times for cloud workloads? 😍
https://docs.docker.com/engine/storage/drivers/overlayfs-driver/

mhx (Owner) commented Aug 25, 2024

Hi && welcome!

> Hello, thank you for your amazing work, the results are already very impressive! 🚀

Thanks! <3

> Theoretically, would it be possible to create a Docker storage driver optimized for DwarFS?

I just had a very brief look at your link and, as far as I understand it (please correct me if I'm wrong), the overlayfs driver is what Docker uses by default. The data in the various overlaid directories is stored pretty much as-is, with no compression or anything involved.

I assume:

  • Each Dockerfile command creates a new layer.
  • All of these layers are read-only when you run a container.
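To make that concrete, here's a rough sketch of how such an overlay is composed at the filesystem level; the paths are made up, and this is not the exact invocation Docker itself uses:

```sh
# Two read-only lower layers plus one writable upper layer, merged at /tmp/merged.
# In lowerdir, the leftmost directory is the topmost read-only layer.
mkdir -p /tmp/upper /tmp/work /tmp/merged
mount -t overlay overlay \
  -o lowerdir=/layers/app:/layers/base,upperdir=/tmp/upper,workdir=/tmp/work \
  /tmp/merged
```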

In the very distant future (once #18 is resolved), I could imagine that all layers live in a single DwarFS image that can be rolled back / appended as needed. But we're likely talking years here.

> Or could the DwarFS file system already be used within Docker images?

I very much think so, although I'm completely unfamiliar with how Docker works internally.

However, it should be totally possible to build a separate DwarFS image for each layer after the layer has been finalized, and then mount that image before using the mountpoint as a layer for overlayfs. I've documented how something like this can be done.
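As a rough sketch, assuming each finalized layer lives in its own directory (all paths here are made up), it could look something like this:

```sh
# Pack each finalized layer into its own DwarFS image.
mkdwarfs -i /layers/src/base -o /layers/img/base.dwarfs
mkdwarfs -i /layers/src/app  -o /layers/img/app.dwarfs

# Mount the images read-only via the FUSE driver.
mkdir -p /layers/mnt/base /layers/mnt/app
dwarfs /layers/img/base.dwarfs /layers/mnt/base
dwarfs /layers/img/app.dwarfs  /layers/mnt/app

# The mountpoints then act as lowerdir entries for overlayfs,
# just like the plain directories in the sketch above.
mkdir -p /ctr/upper /ctr/work /ctr/merged
mount -t overlay overlay \
  -o lowerdir=/layers/mnt/app:/layers/mnt/base,upperdir=/ctr/upper,workdir=/ctr/work \
  /ctr/merged
```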

In fact, I've used DwarFS in a similar way long before Docker became a big thing. I had a "base layer" of about 1000 different vanilla versions of Perl. Then layers on top that added certain sets of modules to the base layer. And finally a writeable layer on top that I could use for testing.

> Could this result in smaller downloads or faster boot times for cloud workloads? 😍

Smaller downloads? Maybe. It depends a bit on how Docker layers are compressed, but in general a DwarFS image can end up about the same size as a .tar.xz of the same data, though it depends a lot on the parameters.

Faster boot times? Also maybe. It's definitely faster to mount a DwarFS image than to extract a tarball, as data is only decompressed on demand.

For me, the biggest advantage of using DwarFS layers in Docker would be storage. I just took a random 5 GiB layer from an Ubuntu Docker image, and DwarFS stores it in an 850 MiB image.
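If you want to get a feel for the numbers on your own layers, a quick back-of-the-envelope comparison could look like this (layer/ is a placeholder for an unpacked layer directory; both tools are run with their defaults here):

```sh
# Compare a DwarFS image against a .tar.xz of the same data.
mkdwarfs -i layer/ -o layer.dwarfs
tar -cJf layer.tar.xz layer/
du -h layer.dwarfs layer.tar.xz
```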

One downside of using DwarFS to compose many multi-layer overlays is that currently, each instance of the FUSE driver uses its own block cache. #219 tracks an effort to efficiently mount a large number of DwarFS images using a shared block cache.
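Until then, the cache can at least be capped per mount so that many concurrent mounts don't quietly eat all your RAM; a sketch, with arbitrary sizes:

```sh
# Cap the block cache of each FUSE mount (the values here are just examples).
dwarfs /layers/img/base.dwarfs /layers/mnt/base -o cachesize=256m
dwarfs /layers/img/app.dwarfs  /layers/mnt/app  -o cachesize=256m
```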

In any case, let me know what you think. From my perspective, DwarFS should already offer everything that's needed to use it for (experimental) Docker storage.

mhx (Owner) commented Aug 25, 2024

I'll move this to discussions as it's not really an issue.

Repository owner locked and limited conversation to collaborators Aug 25, 2024
mhx converted this issue into discussion #234 Aug 25, 2024
