Handle Runs with no Stop Documents. #10

Open
gwbischof opened this issue Jul 24, 2020 · 0 comments
gwbischof commented Jul 24, 2020

MongoConsumer gets documents from a Kafka partition and then inserts them into the correct Mongo database with suitcase-mongo.Serializer.

Each time a stop document is received, the Serializer is closed and a new Serializer is created for the next set of documents. If a run has no stop document, the following run ends up using the previous run's Serializer, so as written, one Serializer may serialize more than one run. For suitcase-mongo.Serializer this shouldn't matter. But for a suitcase that writes an archival format, where we only want one run per file, this will be a problem.

So I think we should add some code, hopefully to the base class, that handles runs with missing stop documents. This could work by checking whether a second start document is received before the stop document for the current run.
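A rough sketch of that check, not tied to the actual base class: `RunBoundaryTracker`, `serializer_factory`, and `RecordingSerializer` are hypothetical names made up for illustration. The only assumptions about a serializer are that it can be called with `(name, doc)` and has a `close()` method.

```python
class RunBoundaryTracker:
    """Hypothetical helper: keeps one serializer per run, even if a stop document is missing."""

    def __init__(self, serializer_factory):
        # serializer_factory is an assumed zero-argument callable that
        # returns a fresh serializer for each run.
        self._serializer_factory = serializer_factory
        self._serializer = None

    def __call__(self, name, doc):
        if name == "start":
            if self._serializer is not None:
                # A second start arrived before a stop: the previous run
                # never got its stop document, so close its serializer now.
                self._close_current()
            self._serializer = self._serializer_factory()
        if self._serializer is None:
            # Document received outside of any run; nothing to serialize it with.
            return
        self._serializer(name, doc)
        if name == "stop":
            self._close_current()

    def _close_current(self):
        self._serializer.close()
        self._serializer = None


class RecordingSerializer:
    """Stand-in serializer for the demo; it only records what it receives."""

    def __init__(self, registry):
        self.docs = []
        self.closed = False
        registry.append(self)

    def __call__(self, name, doc):
        self.docs.append((name, doc))

    def close(self):
        self.closed = True


created = []
tracker = RunBoundaryTracker(lambda: RecordingSerializer(created))

# run-1 has no stop document; run-2 is complete.
stream = [
    ("start", {"uid": "run-1"}),
    ("event", {"seq_num": 1}),
    ("start", {"uid": "run-2"}),
    ("event", {"seq_num": 1}),
    ("stop", {"run_start": "run-2"}),
]
for name, doc in stream:
    tracker(name, doc)
```

Here the second start document closes the serializer for run-1 before a new one is created for run-2, so each serializer only ever sees one run. This assumes runs within a partition are not interleaved; if they can be, documents would need to be routed by run uid instead.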
