Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor pipelines to include contentSources #83

Merged
merged 50 commits into from
Aug 19, 2024

Conversation

faberf
Copy link
Contributor

@faberf faberf commented Jul 8, 2024

This PR addresses the problem of multiple branches in a pipeline getting in each others way, by changing the use of retrievables to be append only. It is still the job of pipeline configurator to avoid race conditions by making sure that only parts of the retrievable are processed that are not being further updated. This is made easier via configurable contentSources which allow the configurator to specify which content should be used for an operator using a white list of other operators. This description is not very complete and the code needs to be cleaned up. Therefore this PR is a draft.

  • Add documentation in wiki
  • Clean up code
  • Remove Passthrough in favor of allowing stages to be configured without an operator

@faberf faberf changed the title refactor pipeliens refactor pipelines to include contentSources Jul 8, 2024
@v0idness v0idness marked this pull request as draft July 12, 2024 06:57
@faberf faberf marked this pull request as ready for review August 9, 2024 06:37
@faberf
Copy link
Contributor Author

faberf commented Aug 9, 2024

Hey, this PR is ready to review. We will add the documentation to the wiki once its merged.

@faberf
Copy link
Contributor Author

faberf commented Aug 14, 2024

It seems like the two concerns left unchanged in the latest commits are:

  • The need for a level of indirection via attributes at all
  • The unordered implementation of authors associated with a content (and likewise the unordered implementation of contents associated with an author)

I have made the appropriate comments addressing these issues above.

@v0idness v0idness linked an issue Aug 16, 2024 that may be closed by this pull request
@ppanopticon
Copy link
Member

What is the state of this PR? Have all issues been addressed?

Currently, there are two merge-conflicts which ought to be resolved. Also, please make sure that all tests run successfully.

@faberf
Copy link
Contributor Author

faberf commented Aug 19, 2024

What is the state of this PR? Have all issues been addressed?

Currently, there are two merge-conflicts which ought to be resolved. Also, please make sure that all tests run successfully.

Me and raphi will merge this today

@faberf faberf dismissed stale reviews from ppanopticon and lucaro August 19, 2024 12:11

We discussed this in person.

@net-cscience-raphael net-cscience-raphael merged commit 43965cb into dev Aug 19, 2024
1 check passed
@Spiess Spiess deleted the feature/contentpipelines branch October 18, 2024 06:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add support for dense retrieval with instructions
6 participants