This repository has been archived by the owner on Oct 31, 2023. It is now read-only.

Any scripts or guidance on training #7

Open
mprorock opened this issue Jul 15, 2022 · 2 comments

Comments

@mprorock

Hey, this is awesome work!
Are there any tips, scripts, or other resources I should look at for training on a separate corpus?
Similarly, are there any documented methods for adding additional material to the model?

@mprorock
Author

mprorock commented Jul 15, 2022

Scratch that. It looks like the way to go is to use Pyserini, with or without dense indexes depending on the desired behavior. Is that a correct read?
And then use DPR as appropriate for bidirectional encodings?

@ola13
Contributor

ola13 commented Jul 27, 2022

Hi @mprorock! Yes, our current repo is focused on:

  • providing access to pre-built indices
  • serving them using existing infrastructure: Pyserini for the sparse index and distributed-faiss for the dense index
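
For a separate corpus, the sparse side would need its own index rather than a pre-built one. The following is a minimal sketch, not this repo's pipeline: it assumes Pyserini's `JsonCollection` input format (one JSON object per line with `id` and `contents` fields) and only assembles the indexing command without running it. The helper names `write_jsonl_corpus` and `lucene_index_command` are hypothetical.

```python
import json
import sys
from pathlib import Path


def write_jsonl_corpus(docs, out_dir, filename="docs.jsonl"):
    """Write (doc_id, text) pairs as one JSON object per line.

    Assumption: the indexer reads records with "id" and "contents" keys,
    as Pyserini's JsonCollection does.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / filename
    with out_path.open("w", encoding="utf-8") as f:
        for doc_id, text in docs:
            record = {"id": str(doc_id), "contents": text}
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return out_path


def lucene_index_command(input_dir, index_dir, threads=1):
    """Build (but do not run) a Pyserini Lucene indexing command."""
    return [
        sys.executable, "-m", "pyserini.index.lucene",
        "--collection", "JsonCollection",
        "--input", str(input_dir),
        "--index", str(index_dir),
        "--generator", "DefaultLuceneDocumentGenerator",
        "--threads", str(threads),
        "--storeRaw",  # keep the raw document text in the index
    ]
```

Passing the returned list to `subprocess.run` would start indexing, provided Pyserini and a compatible Java runtime are installed. The resulting index directory could then be opened with Pyserini's `LuceneSearcher`.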

As far as bi-encoder training is concerned, https://github.com/facebookresearch/DPR is a good place to start.
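
To give a rough idea of what bi-encoder training optimizes, here is a sketch of the in-batch negatives objective described in the DPR paper: each question is scored against every passage in the batch by dot product, and the loss is the negative log-likelihood of its own positive passage. This is a plain-Python illustration with toy vectors, not code from the DPR repo; the function names are hypothetical, and a real setup would compute the embeddings with trainable encoders.

```python
import math


def dot(u, v):
    """Dot-product similarity between two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def in_batch_nll(query_vecs, passage_vecs):
    """Mean negative log-likelihood with in-batch negatives.

    passage_vecs[i] is treated as the positive passage for query_vecs[i];
    every other passage in the batch serves as a negative for that query.
    """
    if len(query_vecs) != len(passage_vecs):
        raise ValueError("expected one positive passage per query")
    total = 0.0
    for i, q in enumerate(query_vecs):
        scores = [dot(q, p) for p in passage_vecs]
        # Numerically stable log-sum-exp over all passages in the batch.
        m = max(scores)
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        total += log_z - scores[i]
    return total / len(query_vecs)
```

The loss shrinks as each question embedding moves closer to its own passage and further from the other passages in the batch, which is what makes the resulting vectors usable in a dense index.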

