Translate preference pairs #10

Merged
merged 9 commits into from
Jul 20, 2024

RishabhMaheshwary
Collaborator

Major Changes:

  • Added translations using facebook/nllb-200-3.3B in translate_preference_pairs_nllb.py. At the moment, it supports only English as the source language and works on any Hugging Face preference dataset that has prompt, chosen, and rejected keys.
  • The run_nllb.sh script lists the 22 languages with the codes used by facebook/nllb-200-3.3B. The script parallelizes the translation of languages across the number of available GPUs.
  • The instructions to run are in the README.
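
For readers skimming the thread, here is a minimal sketch of the two ideas above: translating the prompt, chosen, and rejected fields of one record, and spreading target languages over the available GPUs. It is not the code in translate_preference_pairs_nllb.py or run_nllb.sh. The helper names (translate_record, assign_languages, build_nllb_translator), the round-robin policy, and the max_length value are assumptions made for illustration.

```python
from typing import Callable, Dict, List

# Keys the PR description says the dataset must provide.
TEXT_KEYS = ("prompt", "chosen", "rejected")


def translate_record(record: Dict[str, str], translate: Callable[[str], str]) -> Dict[str, str]:
    """Return a copy of `record` with the three text fields translated."""
    translated = dict(record)  # any extra columns are carried over unchanged
    for key in TEXT_KEYS:
        translated[key] = translate(record[key])
    return translated


def assign_languages(lang_codes: List[str], num_gpus: int) -> Dict[int, List[str]]:
    """Round-robin language codes over GPU indices (assumed policy, not from the PR)."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    plan: Dict[int, List[str]] = {gpu: [] for gpu in range(num_gpus)}
    for i, code in enumerate(lang_codes):
        plan[i % num_gpus].append(code)
    return plan


def build_nllb_translator(model_name: str, tgt_lang: str, device: int = 0) -> Callable[[str], str]:
    """Wrap a Hugging Face translation pipeline with English as the fixed source."""
    from transformers import pipeline  # imported lazily: only needed for real runs

    pipe = pipeline(
        "translation",
        model=model_name,
        src_lang="eng_Latn",  # NLLB code for English
        tgt_lang=tgt_lang,
        device=device,
    )
    # max_length=512 is an assumed generation limit, not a value taken from the PR.
    return lambda text: pipe(text, max_length=512)[0]["translation_text"]
```

A real run would look roughly like `translate_record(row, build_nllb_translator("facebook/nllb-200-3.3B", "deu_Latn"))`, with one such translator per language started on the GPU index chosen by `assign_languages`.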

Collaborator

@ljvmiranda921 ljvmiranda921 left a comment

I parametrized model_name in case I can run a bigger NLLB model. Aside from that, I made some minor formatting changes. LGTM
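
One way such a parameter could be exposed is sketched below. The --model_name flag spelling, the default value, and the parse_args helper are assumptions for illustration and may not match the merged script.

```python
import argparse
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse CLI options; passing `argv` explicitly makes the parser easy to test."""
    parser = argparse.ArgumentParser(
        description="Translate preference pairs with an NLLB checkpoint (sketch)."
    )
    parser.add_argument(
        "--model_name",
        default="facebook/nllb-200-3.3B",  # the checkpoint named in the PR description
        help="Hugging Face model ID of the NLLB checkpoint to load.",
    )
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    print(f"Using model: {args.model_name}")
```

With a default in place, existing invocations keep working, and a larger checkpoint such as facebook/nllb-moe-54b can be selected by passing the flag.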

@ljvmiranda921
Collaborator

ljvmiranda921 commented Jul 20, 2024

Will merge this now! Let's just update the ctranslate_ script in the next PR

@ljvmiranda921 ljvmiranda921 merged commit 13a1d02 into main Jul 20, 2024
1 check passed
@ljvmiranda921 ljvmiranda921 deleted the translate_preference_pairs branch July 20, 2024 03:00
3 participants