Skip to content

Latest commit

 

History

History
43 lines (34 loc) · 2.15 KB

README.md

File metadata and controls

43 lines (34 loc) · 2.15 KB

opus1m+bt-2021-04-10.zip

  • dataset: opus1m+bt
  • model: transformer-align
  • source language(s): eng
  • target language(s): hye
  • model: transformer-align
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence initial language token is required in the form of >>id<< (id = valid target language ID)
  • valid language labels: >>axm<< >>hye<< >>hye_Latn<< >>hyw<< >>xcl<<
  • download: opus1m+bt-2021-04-10.zip
  • test set translations: opus1m+bt-2021-04-10.test.txt
  • test set scores: opus1m+bt-2021-04-10.eval.txt

Benchmarks

testset BLEU chr-F #sent #words BP
Tatoeba-test.eng-hye 16.4 0.400 1121 5114 1.000
Tatoeba-test.eng-multi 16.6 0.402 1121 5115 1.000

opus4m+btTCv20210807-2021-09-30.zip

Benchmarks

testset BLEU chr-F #sent #words BP
Tatoeba-test-v2021-08-07.eng-multi 18.8 0.404 1121 5115 1.000
Tatoeba-test-v2021-08-07.multi-multi 18.8 0.404 1121 5115 1.000