Periodically clean up cached bundles directory #5976
Didn't dive too deeply into the overall logic, but it looks good based on an initial look! I left a couple of minor comments on the code.
I think we will have to either 1. stop copying bundles configured via the legacy path or 2. block at init time for the cleanup. With the current design, even if you remove the bundle from the config (bundle path), it was previously copied as part of Unless we do cleanup before that (we don't, as we would block further with cleanup?), you cannot know after that ( Update: this actually has a further implication: I have confirmed right now the Update of update
I can confirm, though, that if I delete bundles too early (e.g. when I receive a new runtime descriptor), as was the case here, the runtime is actually suspended and the test fails. This is good. Update:
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5976       +/-   ##
===========================================
+ Coverage        0   65.16%   +65.16%
===========================================
  Files           0      631      +631
  Lines           0    64508    +64508
===========================================
+ Hits            0    42036    +42036
- Misses          0    17548    +17548
- Partials        0     4924     +4924
go/runtime/bundle/registry.go
Outdated
	return
default:
	if v.Less(active) {
		r.logger.Info("Removing bundle with version lower then active",
I would first log that we are removing all bundles with a version lower than the active one, and then, in the for loop, log every bundle that is removed. This way, you can see that we tried to remove bundles even when nothing needed to be done.
I don't think this is desirable, since this function is called every time an epoch changes. I would prefer logging only if we do an actual removal. Anyway, let's see how this changes once this is rebased on top of the manager.
Epochs occur every 1h. You can log almost whatever you want as Info. This way you can be sure that background tasks are triggered, even though they do nothing when nothing is upgraded.
If you fetch the current epoch, read the active version for that epoch, and delete all previous versions, the bundles should not be deleted too early.
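The selection step described here can be sketched as below. This is only an illustration under stated assumptions: the Version type, its Less method, and staleVersions are hypothetical stand-ins written for this example, and the active version is assumed to have already been read for the current epoch.

```go
package main

import "fmt"

// Version is a hypothetical stand-in for a runtime bundle version.
type Version struct {
	Major, Minor, Patch uint16
}

// Less reports whether v is strictly lower than other.
func (v Version) Less(other Version) bool {
	if v.Major != other.Major {
		return v.Major < other.Major
	}
	if v.Minor != other.Minor {
		return v.Minor < other.Minor
	}
	return v.Patch < other.Patch
}

// staleVersions returns the cached versions that are strictly lower than the
// version active in the current epoch. The active version itself and any
// newer (not yet active) versions are kept.
func staleVersions(active Version, cached []Version) []Version {
	var stale []Version
	for _, v := range cached {
		if v.Less(active) {
			stale = append(stale, v)
		}
	}
	return stale
}

func main() {
	cached := []Version{{0, 0, 0}, {0, 1, 0}, {0, 2, 0}}
	active := Version{0, 1, 0} // Assumed to be read for the current epoch.
	fmt.Println(staleVersions(active, cached))
}
```

Because the comparison is against the version active now, rather than the newest version in the registry descriptor, a bundle that is still in use is never selected.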
Correct. I was just confirming that my e2e test fails when the clean-up is not implemented the way you describe above. This was happening here: #5976 (comment), when I initially misunderstood the registry updates and thus deleted things too early. :)
Freshly rebased on top of #6003. Should be ready for a second round of reviews. :)
	return sc.RunTestClientAndCheckLogs(ctx, childEnv)
}

func ensureCorrectBundlesDir(logger *logging.Logger, workerName, workerDir string) error {
	logger.Info("ensuring cached exploded bundle for version 0.0.0 was removed",
- logger.Info("ensuring cached exploded bundle for version 0.0.0 was removed",
+ logger.Info("verifying exploded bundle directories")
You don't need to be so specific.
It still fits within a line? Don't some extra details help, especially when the test actually fails?
if up := r.updateActiveDescriptor(ctx); up && !activeInitialized {
	close(r.activeDescriptorCh)
	activeInitialized = true
}

// Trigger clean-up for bundles less than active version.
We will need to move this code and the code below that triggers downloads to the manager once that is possible (i.e. when there are no more dependencies). Maybe this is already possible 🤔 However, in the future the manager should register to active and registry descriptor events (maybe only to the latter), and trigger cleanup and download when needed.
Hm, so you mean that discovery is no longer started via runtime.registry but instead directly as a background service where we start common workers?
"However, in the future the manager should register to active and registry descriptor events (maybe only to the latter)"
And new epochs, I assume? I think ActiveVersion(), together with the registry descriptor, would suffice.
Would prefer to do the refactor in a follow-up unless we find quick consensus here :)
Agree, will move in another PR, just mentioning here.
go/runtime/bundle/manager.go
Outdated
// CleanStaleBundles removes outdated manifest hashes and deletes corresponding
// exploded bundles for runtimes in the clean-up queue.
func (m *Manager) CleanStaleBundles() {
	m.logger.Info("removing regular bundles with version less than active")
- m.logger.Info("removing regular bundles with version less than active")
+ m.logger.Info("cleaning bundles")
If you are mentioning the active version, one would like to know what the active version is. The better way is to just make a simple log, so that we know that the cleanup is triggered. Could also add a similar message, "downloading bundles", to Download.
// bundle removed from its bundles dir.
for _, worker := range sc.Net.ComputeWorkers() {
	if err := sc.verifyBundleDir(ctx, worker); err != nil {
		sc.Logger.Error("compute worker bundle dir clean-up error",
This log could be omitted as you already log in verifyBundleDir, and the error will be logged anyway.
if err != nil {
	return err
}
// if n := len(entries); n != 1 {
Uncomment.
	return fmt.Errorf("%s is not a dir", entry.Name())
}

// Ensure exploded cached bundle is for the latest version (0.1.0).
This contradicts the logged info comment above, which should be improved.
"ensuring cached exploded bundle for version 0.0.0 was removed"?
Above it says that 0.0.0 was removed; here it says that what is left is 0.1.0?
Happy to simplify this; I agree it may look weird.
// Fetch registry descriptor.
rt, err := sc.Net.Controller().Registry.GetRuntime(ctx, &registry.GetRuntimeQuery{
	Height: consensus.HeightLatest,
	ID:     sc.Net.Runtimes()[sc.upgradedRuntimeIndex].ID(),
You could just use key value runtime id.
	return fmt.Errorf("failed to unmarshal dir name to hash")
}
if !want.Equal(&got) {
	return fmt.Errorf("unexpected exploded bundle hash: want %v, got %v", want, got)
This error message could be improved to say that the folder name is not correct or that the folder content failed to verify, as this is what we are testing.
go/runtime/bundle/registry.go
Outdated
@@ -19,14 +19,20 @@ import (
	rtConfig "github.com/oasisprotocol/oasis-core/go/runtime/config"
)

// explodedManifest is manifest with corresponding exploded bundle dir.
type explodedManifest struct {
I thought that this would be public, like ExplodedComponent.
go/runtime/bundle/registry.go
Outdated
regularManifests  map[common.Namespace]map[version.Version]*Manifest
components        map[common.Namespace]map[component.ID]map[version.Version]*ExplodedComponent
notifiers         map[common.Namespace]*pubsub.Broker
explodedManifests map[hash.Hash]explodedManifest
All manifests are exploded, so this renaming is not needed.
go/runtime/bundle/registry.go
Outdated
r.mu.RLock()
defer r.mu.RUnlock()
var manifests []*Manifest
for _, m := range r.explodedManifests {
See the maps (and slices) libs to optimize this.
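One way to read this suggestion, assuming Go 1.23 or newer where the standard library maps and slices packages work with iterators, is to replace the manual loop with slices.Collect(maps.Values(...)). The sketch below uses a simplified, hypothetical Manifest type and string keys instead of the real hash type; whether the PR ended up using these exact calls is not stated in this conversation.

```go
package main

import (
	"fmt"
	"maps"
	"slices"
)

// Manifest is a hypothetical, simplified stand-in for a bundle manifest.
type Manifest struct {
	Name string
}

// manifests returns all map values as a slice. The order is unspecified,
// because Go map iteration order is not deterministic.
func manifests(byHash map[string]*Manifest) []*Manifest {
	return slices.Collect(maps.Values(byHash))
}

func main() {
	byHash := map[string]*Manifest{
		"aa": {Name: "runtime-a"},
		"bb": {Name: "runtime-b"},
	}
	fmt.Println(len(manifests(byHash)))
}
```

In the registry, the read lock would still need to be held while the values are collected.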
go/runtime/bundle/registry.go
Outdated
defer r.mu.Unlock()
explManifest, ok := r.explodedManifests[hash]
if !ok {
	return "", fmt.Errorf("missing manifest with hash %s", hash.Hex())
This should return false, this is not an error.
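A rough sketch of what that could look like, following the usual Go "comma ok" convention for lookups where a missing key is an expected outcome. The registry type, its field, and the remove method below are hypothetical simplifications made for this example (string keys, no locking), not the actual code in registry.go.

```go
package main

import "fmt"

// registry is a hypothetical, simplified stand-in for the bundle registry.
type registry struct {
	explodedDirs map[string]string // manifest hash -> exploded bundle dir
}

// remove deletes the entry for the given hash. It returns the exploded
// bundle dir and true if the entry existed, or "" and false otherwise.
// A missing entry is treated as a normal outcome, not as an error.
func (r *registry) remove(hash string) (string, bool) {
	dir, ok := r.explodedDirs[hash]
	if !ok {
		return "", false
	}
	delete(r.explodedDirs, hash)
	return dir, true
}

func main() {
	r := &registry{explodedDirs: map[string]string{"aa": "/bundles/aa"}}
	if dir, ok := r.remove("aa"); ok {
		fmt.Println("removed", dir)
	}
	if _, ok := r.remove("aa"); !ok {
		fmt.Println("already removed")
	}
}
```

This lets the caller decide whether a missing manifest is worth logging, instead of forcing it to handle an error value.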
go/runtime/bundle/registry.go
Outdated
for _, c := range explManifest.manifest.Components {
	delete(r.components[runtimeID][c.ID()], c.Version)
}
return explManifest.explodedDir, nil
If Manifests() returns []*ExplodedManifest, you don't need this.
What
Furthermore, we fix the current bug (-> done as part of go/runtime/bundle: Cleanup bundles on startup #6003 master) of not being able to remove a runtime from the configuration (see Periodically clean up cached bundles directory #5976 (comment)).
Why
Save on disk usage and ease maintenance.
How
Regular and detached exploded bundles that are no longer present in the config are removed during discovery startup. This way we are not blocking initialization. -> Done here: go/runtime/bundle: Cleanup bundles on startup #6003
How to test
e2e
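To illustrate the "How" section, a startup clean-up of this kind could be sketched as follows. This is a minimal sketch under assumptions, not the implementation from #6003: it assumes each exploded bundle lives in its own subdirectory of a bundles directory, and that the set of directory names to keep can be derived from the config. The names unconfiguredDirs and cleanBundlesDir are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// unconfiguredDirs returns the cached directory names that are not part of
// the configured set, preserving the input order.
func unconfiguredDirs(cached []string, configured map[string]bool) []string {
	var stale []string
	for _, name := range cached {
		if !configured[name] {
			stale = append(stale, name)
		}
	}
	return stale
}

// cleanBundlesDir removes every subdirectory of bundlesDir whose name is not
// in the configured set. Regular files are left untouched.
func cleanBundlesDir(bundlesDir string, configured map[string]bool) error {
	entries, err := os.ReadDir(bundlesDir)
	if err != nil {
		return fmt.Errorf("failed to read bundles dir: %w", err)
	}
	var names []string
	for _, e := range entries {
		if e.IsDir() {
			names = append(names, e.Name())
		}
	}
	for _, name := range unconfiguredDirs(names, configured) {
		if err := os.RemoveAll(filepath.Join(bundlesDir, name)); err != nil {
			return fmt.Errorf("failed to remove %s: %w", name, err)
		}
	}
	return nil
}

func main() {
	dir, err := os.MkdirTemp("", "bundles")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	// Two cached exploded bundles, only one of which is still configured.
	for _, name := range []string{"aaa", "bbb"} {
		if err := os.Mkdir(filepath.Join(dir, name), 0o700); err != nil {
			panic(err)
		}
	}
	if err := cleanBundlesDir(dir, map[string]bool{"bbb": true}); err != nil {
		panic(err)
	}

	entries, err := os.ReadDir(dir)
	if err != nil {
		panic(err)
	}
	for _, e := range entries {
		fmt.Println("kept:", e.Name())
	}
}
```

Splitting the selection (unconfiguredDirs) from the file-system work keeps the decision logic easy to unit test, while the e2e test covers the behaviour on a running node.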