Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make the backfiller aware of revs and fill gaps #454

Merged
merged 12 commits into from
Dec 4, 2023

Conversation

whyrusleeping
Copy link
Collaborator

No description provided.

@whyrusleeping
Copy link
Collaborator Author

Hrn... jobs don't recover from error states. Need to figure that one out

return false, nil
case StateInProgress, StateEnqueued:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we buffering ops on enqueued backfill jobs here? They won't have been checked out yet if they're still enqueued so any events they get will be included in the checkout, right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, thats true

func (j *Gormjob) FlushBufferedOps(ctx context.Context, fn func(kind, path string, rec *typegen.CBORMarshaler, cid *cid.Cid) error) error {
var ErrEventGap = fmt.Errorf("buffered event revs did not line up")

func (j *Gormjob) FlushBufferedOps(ctx context.Context, fn func(kind, path string, rec typegen.CBORMarshaler, cid *cid.Cid) error) error {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this work with deletes? IIRC the rec being a pointer worked with deletes because it'd be nil when deleting a record, does it just end up being an empty typegen.CBORMarshaler in this case?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typegen.CBORMarshaler is an interface, it would just be nil

@ericvolp12
Copy link
Collaborator

Hrn... jobs don't recover from error states. Need to figure that one out

Yeah, the way I typically handle this is by running a query to reset the job state on jobs that failed for a retryable reason but it's not always clear which jobs are safe to retry etc.

@whyrusleeping
Copy link
Collaborator Author

need to figure something out there, maybe just keep a counter of how many times its failed and do automatic retries up to a point?

@ericvolp12
Copy link
Collaborator

need to figure something out there, maybe just keep a counter of how many times its failed and do automatic retries up to a point?

would we do retries when we see more events from the repo or somehow enqueue it again for a retry when it fails? would probably also want some kind of backoff in there.

@whyrusleeping
Copy link
Collaborator Author

i think we should automatically re-enqueue it, after some duration

@ericvolp12
Copy link
Collaborator

i think we should automatically re-enqueue it, after some duration

okay, we probably want two new columns then for the jobs, something like a retry_count and a retry_after timestamptz and then we can bump the retry_after based on the retry count when we get a failure. Update the query to get the next job to include failed jobs that have a retry_after that's passed, and then when we hit a max number of retries we set that column to nil so it doesn't get picked up.

@whyrusleeping
Copy link
Collaborator Author

oh the other thing i havent implemented et is the gap fill handling logic, it spits out an error when we detect a gap, and resets the job state, but when we actually go to process that we arent filtering to just the gap records

@ericvolp12 ericvolp12 marked this pull request as ready for review December 2, 2023 00:07
@whyrusleeping whyrusleeping merged commit a857fdc into main Dec 4, 2023
7 checks passed
@whyrusleeping whyrusleeping deleted the feat/backfill-revs branch December 4, 2023 17:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants