Thank you for all your hard work. I am trying to use a custom reranking dataset with multi-level relevance scores (e.g., One workaround might be to convert multi-level relevance scores into binary ones (e.g.,
Replies: 1 comment 2 replies
Not as far as I know. I am unsure whether the code supports continuous 0-1 scores (I think it might). Otherwise, one solution is, as you say, to create a binary mapping and potentially two versions of the dataset (one with only relevant and one with both). However, it might be better to simply create a PR that updates the reranking task to allow multi-level relevance scores. @orionw knows more about this area than I do, so maybe he has a better idea.
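For anyone trying the binary-mapping workaround, here is a minimal sketch of what it could look like. It is not code from the library: the `binarize` helper, the 0/1/2 grading scale, and the `positive`/`negative` keys are all assumptions for illustration, and the "two versions" are read here as a strict and a lenient threshold, which may not be exactly what was meant above.

```python
def binarize(candidates, min_positive):
    """Split (doc, graded_score) pairs into positive and negative buckets.

    Hypothetical helper: any score >= min_positive counts as positive.
    """
    positive = [doc for doc, score in candidates if score >= min_positive]
    negative = [doc for doc, score in candidates if score < min_positive]
    return {"positive": positive, "negative": negative}


# Made-up sample with an assumed scale:
# 0 = irrelevant, 1 = partially relevant, 2 = highly relevant.
sample = {
    "query": "what is reranking?",
    "candidates": [("doc_a", 2), ("doc_b", 1), ("doc_c", 0), ("doc_d", 1)],
}

# Version 1: only the highest grade is treated as relevant.
strict = binarize(sample["candidates"], min_positive=2)

# Version 2: every non-zero grade is treated as relevant.
lenient = binarize(sample["candidates"], min_positive=1)
```

Either version discards the ordering among the relevant documents, which is the information a multi-level metric would use.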
Short answer: we don't currently support that for reranking but do support it for retrieval due to the trec_eval evaluation library.
Long answer: Our current reranking setting is fairly distinct from the retrieval evaluation, even though there is a lot of overlap.
Right now it has two buckets (positives and negatives) and computes metrics based on those. But we really could compute it the same way as retrieval by having qrels and using trec_eval (basically a bunch of small retrieval tasks). We would need to provide a wrapper for backwards compatibility that converts the previous tasks into this style, but I think it would be a lot more intuitive to use in the future (as …
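
To illustrate the "bunch of small retrieval tasks" idea, here is a rough sketch that scores each query's candidate list against graded qrels with nDCG. It is plain Python rather than a call into trec_eval or the library's evaluators; the `dcg` and `ndcg_at_k` helpers, the linear gain, and the example data are assumptions made for the sketch.

```python
import math


def dcg(gains):
    """Discounted cumulative gain with linear gain (an assumption here)."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains))


def ndcg_at_k(qrels, scores, k=10):
    """nDCG@k for one query.

    qrels:  doc_id -> graded relevance (e.g. 0, 1, 2)
    scores: doc_id -> reranker score (higher means ranked earlier)
    """
    ranked = sorted(scores, key=scores.get, reverse=True)[:k]
    gains = [qrels.get(doc, 0) for doc in ranked]
    ideal_dcg = dcg(sorted(qrels.values(), reverse=True)[:k])
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


# Made-up data: each query is its own small retrieval task.
qrels = {"q1": {"doc_a": 2, "doc_b": 1, "doc_c": 0}}
run = {"q1": {"doc_a": 0.9, "doc_b": 0.2, "doc_c": 0.5}}

per_query = {qid: ndcg_at_k(qrels[qid], run[qid]) for qid in qrels}
mean_ndcg = sum(per_query.values()) / len(per_query)
```

With qrels in this shape, an existing positives/negatives dataset could be expressed by giving positives a relevance of 1 and negatives 0, which is roughly what a backwards-compatibility wrapper would need to do.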