eng-nic

opus-2020-07-06.zip

  • dataset: opus
  • model: transformer
  • source language(s): eng
  • target language(s): bam_Latn ewe fuc fuv ibo kin lin lug nya run sag sna swh toi_Latn tso umb wol xho yor zul
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence initial language token is required in the form of >>id<< (id = valid target language ID)
  • download: opus-2020-07-06.zip
  • test set translations: opus-2020-07-06.test.txt
  • test set scores: opus-2020-07-06.eval.txt
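
The language-token requirement above can be sketched as a small helper that prepends `>>id<<` to each source sentence before it goes to the model. This is a minimal illustration, not part of the released package: the function name `add_target_token` and the single space between token and sentence are assumptions, and the set of valid IDs is copied from the target language list above.

```python
# Target language IDs as listed in this README.
VALID_TARGETS = {
    "bam_Latn", "ewe", "fuc", "fuv", "ibo", "kin", "lin", "lug", "nya", "run",
    "sag", "sna", "swh", "toi_Latn", "tso", "umb", "wol", "xho", "yor", "zul",
}


def add_target_token(sentence: str, target: str) -> str:
    """Prefix an English sentence with the >>id<< token for the target language.

    Hypothetical helper: the separator (one space) is an assumption.
    """
    if target not in VALID_TARGETS:
        raise ValueError(f"unsupported target language ID: {target!r}")
    return f">>{target}<< {sentence.strip()}"


if __name__ == "__main__":
    print(add_target_token("How are you?", "zul"))
    print(add_target_token("Good morning.", "swh"))
```

Validating the ID up front makes a mistyped or unsupported language code fail loudly instead of being sent to the model as an unknown token.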

Benchmarks

testset BLEU chr-F
Tatoeba-test.eng-bam.eng.bam 5.0 0.034
Tatoeba-test.eng-ewe.eng.ewe 5.2 0.295
Tatoeba-test.eng-ful.eng.ful 0.6 0.072
Tatoeba-test.eng-ibo.eng.ibo 3.6 0.261
Tatoeba-test.eng-kin.eng.kin 12.0 0.557
Tatoeba-test.eng-lin.eng.lin 1.2 0.301
Tatoeba-test.eng-lug.eng.lug 12.7 0.603
Tatoeba-test.eng.multi 13.5 0.484
Tatoeba-test.eng-nya.eng.nya 14.4 0.588
Tatoeba-test.eng-run.eng.run 13.7 0.486
Tatoeba-test.eng-sag.eng.sag 5.4 0.173
Tatoeba-test.eng-sna.eng.sna 22.5 0.584
Tatoeba-test.eng-swa.eng.swa 1.3 0.146
Tatoeba-test.eng-toi.eng.toi 7.0 0.199
Tatoeba-test.eng-tso.eng.tso 31.2 0.654
Tatoeba-test.eng-umb.eng.umb 4.7 0.336
Tatoeba-test.eng-wol.eng.wol 6.7 0.191
Tatoeba-test.eng-xho.eng.xho 24.4 0.608
Tatoeba-test.eng-yor.eng.yor 13.3 0.365
Tatoeba-test.eng-zul.eng.zul 34.3 0.731

opus-2020-07-14.zip

  • dataset: opus
  • model: transformer
  • source language(s): eng
  • target language(s): bam_Latn ewe fuc fuv ibo kin lin lug nya run sag sna swh toi_Latn tso umb wol xho yor zul
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence initial language token is required in the form of >>id<< (id = valid target language ID)
  • download: opus-2020-07-14.zip
  • test set translations: opus-2020-07-14.test.txt
  • test set scores: opus-2020-07-14.eval.txt

Benchmarks

testset BLEU chr-F
Tatoeba-test.eng-bam.eng.bam 6.3 0.021
Tatoeba-test.eng-ewe.eng.ewe 4.9 0.244
Tatoeba-test.eng-ful.eng.ful 0.6 0.092
Tatoeba-test.eng-ibo.eng.ibo 4.1 0.273
Tatoeba-test.eng-kin.eng.kin 5.8 0.435
Tatoeba-test.eng-lin.eng.lin 1.3 0.311
Tatoeba-test.eng-lug.eng.lug 5.5 0.384
Tatoeba-test.eng.multi 10.9 0.423
Tatoeba-test.eng-nya.eng.nya 17.9 0.609
Tatoeba-test.eng-run.eng.run 13.1 0.477
Tatoeba-test.eng-sag.eng.sag 5.4 0.176
Tatoeba-test.eng-sna.eng.sna 17.8 0.560
Tatoeba-test.eng-toi.eng.toi 8.3 0.196
Tatoeba-test.eng-tso.eng.tso 41.3 0.698
Tatoeba-test.eng-umb.eng.umb 3.4 0.323
Tatoeba-test.eng-wol.eng.wol 4.3 0.191
Tatoeba-test.eng-xho.eng.xho 25.8 0.612
Tatoeba-test.eng-yor.eng.yor 15.7 0.351
Tatoeba-test.eng-zul.eng.zul 41.0 0.762

opus-2020-07-20.zip

  • dataset: opus
  • model: transformer
  • source language(s): eng
  • target language(s): bam_Latn ewe fuc fuv ibo kin lin lug nya run sag sna swh toi_Latn tso umb wol xho yor zul
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence initial language token is required in the form of >>id<< (id = valid target language ID)
  • download: opus-2020-07-20.zip
  • test set translations: opus-2020-07-20.test.txt
  • test set scores: opus-2020-07-20.eval.txt

Benchmarks

testset BLEU chr-F
Tatoeba-test.eng-bam.eng.bam 5.4 0.028
Tatoeba-test.eng-ewe.eng.ewe 4.7 0.237
Tatoeba-test.eng-ful.eng.ful 0.5 0.071
Tatoeba-test.eng-ibo.eng.ibo 3.9 0.268
Tatoeba-test.eng-kin.eng.kin 5.7 0.437
Tatoeba-test.eng-lin.eng.lin 1.2 0.309
Tatoeba-test.eng-lug.eng.lug 5.5 0.384
Tatoeba-test.eng.multi 10.3 0.422
Tatoeba-test.eng-nya.eng.nya 22.3 0.629
Tatoeba-test.eng-run.eng.run 12.8 0.473
Tatoeba-test.eng-sag.eng.sag 5.7 0.180
Tatoeba-test.eng-sna.eng.sna 18.5 0.554
Tatoeba-test.eng-swa.eng.swa 1.3 0.155
Tatoeba-test.eng-toi.eng.toi 8.3 0.231
Tatoeba-test.eng-tso.eng.tso 31.2 0.671
Tatoeba-test.eng-umb.eng.umb 4.3 0.292
Tatoeba-test.eng-wol.eng.wol 5.1 0.163
Tatoeba-test.eng-xho.eng.xho 27.8 0.630
Tatoeba-test.eng-yor.eng.yor 17.8 0.357
Tatoeba-test.eng-zul.eng.zul 34.9 0.748

opus-2020-07-27.zip

  • dataset: opus
  • model: transformer
  • source language(s): eng
  • target language(s): bam_Latn ewe fuc fuv ibo kin lin lug nya run sag sna swh toi_Latn tso umb wol xho yor zul
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence initial language token is required in the form of >>id<< (id = valid target language ID)
  • download: opus-2020-07-27.zip
  • test set translations: opus-2020-07-27.test.txt
  • test set scores: opus-2020-07-27.eval.txt

Benchmarks

testset BLEU chr-F
Tatoeba-test.eng-bam.eng.bam 6.2 0.029
Tatoeba-test.eng-ewe.eng.ewe 4.5 0.258
Tatoeba-test.eng-ful.eng.ful 0.5 0.073
Tatoeba-test.eng-ibo.eng.ibo 3.9 0.267
Tatoeba-test.eng-kin.eng.kin 6.4 0.475
Tatoeba-test.eng-lin.eng.lin 1.2 0.308
Tatoeba-test.eng-lug.eng.lug 3.9 0.405
Tatoeba-test.eng.multi 11.1 0.427
Tatoeba-test.eng-nya.eng.nya 14.0 0.622
Tatoeba-test.eng-run.eng.run 13.6 0.477
Tatoeba-test.eng-sag.eng.sag 5.5 0.199
Tatoeba-test.eng-sna.eng.sna 19.6 0.557
Tatoeba-test.eng-swa.eng.swa 1.8 0.163
Tatoeba-test.eng-toi.eng.toi 8.3 0.231
Tatoeba-test.eng-tso.eng.tso 50.0 0.789
Tatoeba-test.eng-umb.eng.umb 7.8 0.342
Tatoeba-test.eng-wol.eng.wol 6.7 0.143
Tatoeba-test.eng-xho.eng.xho 26.4 0.620
Tatoeba-test.eng-yor.eng.yor 15.5 0.342
Tatoeba-test.eng-zul.eng.zul 35.9 0.750