How to slim down .scalarCache? #620
Comments
Unfortunately, there is no faster way (that I am aware of) to accomplish this with Git. A hacky way around that would be to perform a fresh partial clone from the current one (using
This concern about the missing objects has merit, by the way. Imagine that you performed an interactive rebase in the worktree, and then relied on the reflog to refer to the otherwise-unreachable objects. With the "hacky way" I described, those would become missing objects and, since they were never pushed, irretrievably lost ones at that.
Why are the objects that were never pushed not stored in .git/objects?
Oh, right, you want to trim the shared alternate object database, not
Typically, determining which objects in a shared alternate object database can be deleted is an unsolvable problem: you never know which repositories use it as an alternate object database. To safely remove objects from such a shared alternate object database, it would have to be known which repositories use that alternate, and then you would have to determine all objects which correspond to checked-out files from all the worktrees of those repositories. However, in this instance,
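To make the alternates relationship concrete, here is a minimal sketch of how one could check which alternate object directories a given repository points at. Git records them, one path per line, in objects/info/alternates inside the Git directory. The helper name list_alternates and its optional argument are hypothetical, introduced only for illustration. Note that this only answers the question in one direction (repository to alternate), so it would have to be run in every candidate repository.

```shell
#!/bin/sh
# Hypothetical helper: print the alternate object directories of a repository.
# Assumption: $1, if given, is the path of a Git directory; otherwise the
# repository containing the current directory is used.
list_alternates() {
  if [ -n "$1" ]; then
    alternates_file=$1/objects/info/alternates
  else
    # --git-path resolves the file location, also inside linked worktrees
    alternates_file=$(git rev-parse --git-path objects/info/alternates) || return 1
  fi
  # A repository without alternates simply has no such file
  [ -f "$alternates_file" ] || return 0
  cat "$alternates_file"
}
```

Running list_alternates without arguments inside a clone prints nothing when the clone does not use any alternate.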
Yes, I am using
The idea is that, for the most part, I only need the objects for my current checkout version, so I want to remove the unneeded objects after updating the code.
@miku1958 do you have a single clone with a single worktree? If so, you may still be able to get what you need by enumerating the object names (SHAs) corresponding to the checked-out files. This is not enough, though: you will need the object name of the current
Here is an attempt to do that:
(
# the blobs corresponding to the checked-out files
git ls-files --sparse --stage |
grep -ve '^160000' -e '/$' |
cut -c 8-47 &&
# the commit and root tree
git rev-parse HEAD HEAD^{tree} &&
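# the trees of the directories that directly contain the checked-out files
# (each directory is turned into a HEAD:<dir> name so rev-parse can resolve it)
git ls-files --sparse |
sed -n 's/^\(.*\)\/[^/][^/]*$/HEAD:\1/p' |
uniq |
sort |
uniq |
xargs -d '\n' git rev-parse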
) |
git pack-objects trimmed-pack
Note: This will not be enough, as even something as simple as
git diff --no-abbrev --raw HEAD^! |
sed -ne '/^:160000/d' -e 's/^[^ ]* [^ ]* \([^ ]*\).*$/\1/p'
But even that would not be enough: it would miss the parent commit's object name as well as all of the involved trees. And since trees can be deep, a shell scriptlet to enumerate those would have to look something like this:
git diff --no-abbrev --raw HEAD^! |
sed -ne '/^:160000/d' -e '/\//{
s/^[^ ]* [^ ]* [^ ]* [^ ]* .\t\(.*\/\)\?.*$/HEAD^:\1/
:1
s/\/[^/]*$//
p
/\//b1
}' |
uniq |
sort |
uniq
At this stage, we're already on a journey to a ridiculously involved shell script, and I am sure that I forgot something crucial that also needs to go into that trimmed packfile...
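As a rough illustration of the "trees can be deep" point, the ancestor expansion that the sed loop performs can also be sketched with awk, which some may find easier to read. This is only a sketch under stated assumptions: the function name ancestor_dirs is hypothetical, and it expects plain slash-separated paths, one per line, on standard input (unusual file names that Git would quote are not handled).

```shell
#!/bin/sh
# Hypothetical helper: for every path read from stdin, print each ancestor
# directory exactly once (for a/b/c.txt: first "a", then "a/b").
ancestor_dirs() {
  awk -F/ '{
    prefix = ""
    # All fields except the last one are directory components
    for (i = 1; i < NF; i++) {
      prefix = (i == 1) ? $i : (prefix "/" $i)
      if (!(prefix in seen)) {
        seen[prefix] = 1
        print prefix
      }
    }
  }'
}
```

Fed with the output of git ls-files, this yields the directories whose trees are involved; prefixing each line with HEAD: gives names that git rev-parse can resolve to tree object names.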
I probably get it, let me try tomorrow to see if this will work. Thanks!
To be honest, if it works, you will most likely want to use a better programming language to implement this ;-)
I tried running this, but the .scalarCache size just went from 24GB to 23.3GB
And if I delete all the objects and spend an hour running git reset --hard, it only generates about 8GB of objects.
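For comparing such before/after numbers, a small sketch like the following could be used to total the size of all packfiles below a directory. It rests on assumptions: the function name pack_bytes is hypothetical, and it presumes that the cache keeps its objects in *.pack files somewhere beneath the given directory; loose objects are not counted.

```shell
#!/bin/sh
# Hypothetical helper: print the combined size, in bytes, of all *.pack files
# found below the directory given as $1.
pack_bytes() {
  find "$1" -type f -name '*.pack' -exec wc -c {} + |
  # wc adds a "total" line when given several files; skip it and sum ourselves
  awk '$2 != "total" { sum += $1 } END { print sum + 0 }'
}
```

Calling pack_bytes with the cache directory before and after a trimming attempt shows how much the packfiles themselves shrank.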