@rahelarnold98 and I just ran into an issue in the XReco context. Due to the recent changes to the `AbstractAggregator`, the "aggregated" `Retrievable` no longer holds only aggregated content. Instead, it holds the original content plus the aggregated content in a single `Retrievable`.
The question now is: By what mechanism can we restrict downstream operators to only work with the aggregated content? It's one thing if an operator is built from scratch. In our case, however, we mostly rely on existing operators that are being configured.
Quite frankly, this change broke our complete video extraction pipeline.
ppanopticon changed the title from "AbstractAggregator: Aggregation Operations no longer transparent downstream" to "AbstractAggregator: Aggregation operations are no longer transparent downstream" on Aug 26, 2024
That is the downside of the append-only approach. We do have a mechanism to check the author of a content element via the ContentAuthorAttribute, so at least they are distinguishable.
While this may very well be, I currently don't see a way to leverage this in a configurable fashion (i.e., without changing all the operators). Or have I overlooked something?
Hey @ppanopticon, sorry for only getting to this now. Yes, with the approach we are going with, all downstream consumers must filter the content via the `ContentAuthorAttribute`, using a configurable value in the extraction configuration. You can check the FES Extractor class for an example.
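To make the filtering idea concrete, here is a minimal, language-agnostic sketch in Python. It is not the vitrivr-engine API: `Content`, `Retrievable`, `content_by_author`, and the `content_author` configuration key are hypothetical stand-ins, and the `authors` mapping only mimics the role that `ContentAuthorAttribute` plays. The point is that a downstream operator reads an author name from its configuration and considers only content written by that author.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Content:
    """Hypothetical stand-in for a content element."""
    id: str
    payload: str


@dataclass
class Retrievable:
    """Hypothetical stand-in for a Retrievable under the append-only approach."""
    content: List[Content] = field(default_factory=list)
    # Maps content id -> name of the operator that produced it.
    # This only mimics the role of ContentAuthorAttribute.
    authors: Dict[str, str] = field(default_factory=dict)

    def add_content(self, content: Content, author: str) -> None:
        self.content.append(content)
        self.authors[content.id] = author


def content_by_author(retrievable: Retrievable, author: Optional[str] = None) -> List[Content]:
    """Return only the content written by `author`; None keeps all content."""
    if author is None:
        return list(retrievable.content)
    return [c for c in retrievable.content if retrievable.authors.get(c.id) == author]


# Hypothetical per-operator configuration; the key name is an assumption.
config = {"content_author": "aggregator"}

retrievable = Retrievable()
retrievable.add_content(Content("c1", "frame 1"), author="decoder")
retrievable.add_content(Content("c2", "frame 2"), author="decoder")
retrievable.add_content(Content("c3", "representative frame"), author="aggregator")

# The downstream operator only sees what the configured author wrote.
selected = content_by_author(retrievable, config.get("content_author"))
print([c.id for c in selected])  # prints ['c3']
```

If the configuration names no author, the sketch falls back to all content, which matches how an operator behaves when it ignores authorship entirely. That fallback is also the case that breaks once the original and the aggregated content share one `Retrievable`.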
However, I would expect the author(s) of such a breaking change to adjust the existing operators before opening a PR, so that they keep working as they did before. In its current state, the change breaks the pipeline.