Storage requirement estimates for S3 FSCK #2

Open
ghost opened this issue Sep 23, 2021 · 0 comments
Labels
documentation Improvements or additions to documentation


Not a bug or issue; posting this for review and discussion in case anyone spots an incorrect estimate.

The numbers below assume:

  • the RING showed 10 billion chunks/objects
  • S3 showed 8 billion objects in UTAPI

Without knowing how many objects are splits, or how many stripes each split contains, the per-line storage estimates look something like:

  • listkeys.py: output contains 91 - 93 bytes per line (ARC/COS key).
    • The variance comes from disk groups and/or zero padding.
    • Without disk groups or zero padding, disk1 - disk9 consume 90 bytes per line and disk10+ consume 91 bytes.
  • sproxyd-basic: the keys.txt output file is at least 91 bytes per line (S3 object).
    • The variance comes from bucket name length: 91 bytes for a 1-character bucket name, +1 byte for every additional character.
  • P0: 49-109 bytes per line
    • NOK: 49 bytes during a dig lookup failure
    • SINGLE: 75 bytes if there is a single arc stripe object found in the dig
    • SPLIT: 109 bytes + 109 bytes for each additional split arc stripe
  • P1: 71 bytes per line (should be consistent)
  • P2: 41 bytes per line
  • P3: 0 bytes (no stored output)
  • P4: 126 bytes per line with a 200 success and using 127.0.0.1 loopback IP for srebuildd communication.
    • +5-10 bytes for failures depending on what response status gets stored in csv.

P0 is hard to estimate. A split with 10 ARC stripes grows storage from 109 bytes per S3 object to 1090 bytes; a split with 30 stripes would take 3270 bytes. The minimum storage appears to be 600GB for 8B objects.
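As a rough check of the P0 arithmetic above, here is a minimal Python sketch. The function and constant names are made up for illustration and are not part of S3 FSCK. It assumes a split costs 109 bytes per stripe, and that the minimum case is every object producing a 75-byte SINGLE line, which is what matches the 600GB figure.

```python
# Per-line sizes in bytes, taken from the P0 bullets in this issue.
P0_SINGLE_BYTES = 75
P0_SPLIT_BYTES_PER_STRIPE = 109


def p0_split_bytes(stripes: int) -> int:
    """Bytes of P0 output for one split object with `stripes` ARC stripes."""
    if stripes < 1:
        raise ValueError("a split needs at least one stripe")
    return P0_SPLIT_BYTES_PER_STRIPE * stripes


def p0_minimum_bytes(objects: int) -> int:
    """Lower bound, assuming every object yields one SINGLE line."""
    return objects * P0_SINGLE_BYTES


if __name__ == "__main__":
    print(p0_split_bytes(10))                      # 1090
    print(p0_split_bytes(30))                      # 3270
    print(p0_minimum_bytes(8_000_000_000) / 1e9)   # 600.0 (decimal GB)
```

The real P0 size depends on the share of split objects and their stripe counts, neither of which is known here, so this only bounds the estimate from below.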

Otherwise a rough estimate of required storage for the above quantity of objects would be:

  • listkeys: ~900GB
  • sproxyd-basic: ~750GB
  • P0: ~2.4TB (wild guesstimate)

Total: roughly 4TB of available space.
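The totals can be reproduced with a short sketch like the one below. It is only illustrative: the names are hypothetical, it uses the 91-byte lower bound for both listkeys and sproxyd-basic lines, decimal gigabytes (1 GB = 10^9 bytes), and it simply plugs in the 2.4TB guess for P0 rather than deriving it.

```python
# Object counts assumed in this issue.
RING_KEYS = 10_000_000_000      # chunks/objects reported by the RING
S3_OBJECTS = 8_000_000_000      # objects reported by UTAPI

# Lower-bound line sizes in bytes (longer bucket names or disk groups add more).
LISTKEYS_BYTES_PER_LINE = 91
SPROXYD_BYTES_PER_LINE = 91

# P0 is not derived here; this is the guess from the issue, in GB.
P0_GUESS_GB = 2400.0


def to_gb(num_bytes: int) -> float:
    """Convert bytes to decimal gigabytes."""
    return num_bytes / 1e9


def estimate_gb() -> dict:
    """Return per-stage and total storage estimates in GB."""
    stages = {
        "listkeys": to_gb(RING_KEYS * LISTKEYS_BYTES_PER_LINE),
        "sproxyd-basic": to_gb(S3_OBJECTS * SPROXYD_BYTES_PER_LINE),
        "P0": P0_GUESS_GB,
    }
    stages["total"] = sum(stages.values())
    return stages


if __name__ == "__main__":
    for stage, gb in estimate_gb().items():
        print(f"{stage}: {gb:.0f} GB")
```

With these inputs the sketch gives 910 GB for listkeys, 728 GB for sproxyd-basic, and about 4038 GB in total, in line with the rounded figures above. P1, P2, and P4 output is not included.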
