Request: use clustered directory structure for cache #562

Open
hak0 opened this issue Jan 18, 2025 · 1 comment

hak0 commented Jan 18, 2025

Thank you once again for creating such an excellent piece of software! I truly appreciate the effort and thought that has gone into its development. I do, however, have a question regarding the indexing performance of cached transcoded files.

Currently, all transcoded files are placed in the transcoding folder:

  • album covers are saved in transcoding/covers
  • songs and podcasts are saved in transcoding/audio

But within each of those folders, all files are stored in a single flat directory, and as the number of files increases, read performance will degrade because of filesystem limits on large directories. I think we can add a folder structure to mitigate this problem.

For album art such as al-1234-512.png, we can use the prefix al-12 (the first five characters of the ID) as the folder name, so the cached file is saved as al-12/al-1234-512.png:

(In handlers_raw.go)

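	idStr := id.String()
	prefix := idStr
	if len(prefix) > 5 {
		prefix = prefix[:5] // use the first 5 characters as the subfolder name; shorter IDs (e.g. al-1) are used whole
	}
	cachePath := filepath.Join(
		c.cacheCoverPath,
		prefix,
		fmt.Sprintf("%s-%d.%s", idStr, size, coverCacheFormat),
	)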

Similarly, for transcoded songs like 000e4dc149737ce71867ad753bd2bb79, we can use the prefix 00 as the first-level folder name and 0e as the second-level folder name. This gives a two-level tree with 256*256=65536 leaf folders, which can hold 256*256*256=16777216 transcoded songs without performance degradation (each leaf folder then contains about 256 entries on average).

(In transcoder_caching.go)

path := filepath.Join(
    t.cachePath,          // Base path
    key[:2],              // First-level directory (first 2 characters of `key`)
    key[2:4],             // Second-level directory (next 2 characters of `key`)
    key,                  // Actual file name
)

I would greatly appreciate it if you could consider my suggestion. Thank you for your time and attention!

Owner

sentriz commented Jan 18, 2025

thank you for the nice words! this makes a lot of sense to me

should be an easy enough change to implement

though doing it in a backwards compatible way might be tricky (to not invalidate people's large caches). we could

  • check the old and new path format when finding a cache hit
  • or, write a migration to move all the files to the new format on startup
