-
Notifications
You must be signed in to change notification settings - Fork 137
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Updated failure estimator #3427
Conversation
type Equaler[T any] interface { | ||
// Returns true if both objects are equal. | ||
Equal(T) bool | ||
} | ||
|
||
// Number represents any integer or floating-point number. | ||
type Number interface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curiously this interface isn't in the default collection.
@@ -0,0 +1,352 @@ | |||
using Random |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've included my simulator here for anyone working on this in the future.
@@ -61,8 +61,14 @@ func TestExecutorRepository_LoadAndSave(t *testing.T) { | |||
} | |||
retrievedExecutors, err := repo.GetExecutors(ctx) | |||
require.NoError(t, err) | |||
executorSort := func(a *schedulerobjects.Executor, b *schedulerobjects.Executor) bool { | |||
return a.Id > b.Id | |||
executorSort := func(a *schedulerobjects.Executor, b *schedulerobjects.Executor) int { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have several uses of SortFunc
where the sorting is opposite that of what I would think the intuitive one.
import "gonum.org/v1/gonum/mat" | ||
|
||
type Optimiser interface { | ||
Update(out, p *mat.VecDense, g mat.Vector) *mat.VecDense |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better names for parameters? It's not clear what these do from the signature.
Comment at the top to make it clear how this interface works?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. I left p and g as-is in the implementations as I think the long names make the code a bit less readable there.
// - {node, queue}SuccessProbabilityCordonThreshold: Success probability below which nodes (queues) are considered unhealthy. | ||
// - {node, queue}CordonTimeout: Amount of time for which nodes (queues) remain unhealthy in the absence of any job successes or failures for that node (queue). | ||
// - {node, queue}EquilibriumFailureRate: Job failures per second necessary for a node (queue) to remain unhealthy in the absence of any successes for that node (queue). | ||
type Update struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
better names/comment?
This PR introduces a new learning algorithm for node and queue failure probability estimation. Simulations indicate this new method is much more accurate than the one we use now.
Adding the gonum package as a dependency resulted in a version increase of the slices package we use. As a result, I had to update all calls to
slices.SortFunc
as the API had changed.The snyk error reported seemingly only affects image decoding. It shouldn't matter to us.