eng-gem

opus-2020-07-06.zip

  • dataset: opus
  • model: transformer
  • source language(s): eng
  • target language(s): afr ang_Latn dan deu enm_Latn fao frr fry gos got_Goth gsw isl ksh ltz nds nld nno nob nob_Hebr non_Latn pdc sco stq swe swg yid
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token is required, in the form >>id<< (id = a valid target language ID)
  • download: opus-2020-07-06.zip
  • test set translations: opus-2020-07-06.test.txt
  • test set scores: opus-2020-07-06.eval.txt
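
The language-token requirement in the list above can be sketched as a small helper. This is only an illustration: the function name `add_language_token` is hypothetical, and the single space between the token and the sentence is an assumption rather than something this README specifies. The set of IDs is copied from the "target language(s)" entry.

```python
# Target-language IDs copied from the "target language(s)" entry above.
TARGET_IDS = frozenset(
    "afr ang_Latn dan deu enm_Latn fao frr fry gos got_Goth gsw isl ksh ltz "
    "nds nld nno nob nob_Hebr non_Latn pdc sco stq swe swg yid".split()
)


def add_language_token(sentence: str, target_id: str) -> str:
    """Prefix a source sentence with the >>id<< token for the target language.

    Hypothetical helper; the space after the token is an assumption.
    """
    if target_id not in TARGET_IDS:
        raise ValueError(f"unknown target language ID: {target_id!r}")
    return f">>{target_id}<< {sentence.strip()}"


print(add_language_token("How are you?", "deu"))  # >>deu<< How are you?
```

Rejecting unknown IDs up front is a design choice of this sketch, not documented behaviour of the model itself.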

Benchmarks

testset BLEU chr-F
Tatoeba-test.eng-afr.eng.afr 56.3 0.740
Tatoeba-test.eng-ang.eng.ang 5.2 0.141
Tatoeba-test.eng-dan.eng.dan 56.9 0.719
Tatoeba-test.eng-deu.eng.deu 39.3 0.607
Tatoeba-test.eng-enm.eng.enm 1.9 0.214
Tatoeba-test.eng-fao.eng.fao 7.6 0.303
Tatoeba-test.eng-frr.eng.frr 7.6 0.179
Tatoeba-test.eng-fry.eng.fry 18.5 0.419
Tatoeba-test.eng-gos.eng.gos 1.4 0.218
Tatoeba-test.eng-got.eng.got 0.3 0.011
Tatoeba-test.eng-gsw.eng.gsw 1.4 0.182
Tatoeba-test.eng-isl.eng.isl 22.0 0.496
Tatoeba-test.eng-ksh.eng.ksh 1.3 0.164
Tatoeba-test.eng-ltz.eng.ltz 17.8 0.354
Tatoeba-test.eng.multi 44.8 0.627
Tatoeba-test.eng-nds.eng.nds 18.5 0.434
Tatoeba-test.eng-nld.eng.nld 52.3 0.695
Tatoeba-test.eng-non.eng.non 0.7 0.154
Tatoeba-test.eng-nor.eng.nor 4.1 0.254
Tatoeba-test.eng-pdc.eng.pdc 6.6 0.219
Tatoeba-test.eng-sco.eng.sco 32.2 0.536
Tatoeba-test.eng-stq.eng.stq 5.7 0.365
Tatoeba-test.eng-swe.eng.swe 57.0 0.712
Tatoeba-test.eng-swg.eng.swg 1.2 0.178
Tatoeba-test.eng-yid.eng.yid 7.2 0.297
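
To compare scores across releases, rows in the layout shown above (test set name, BLEU, chr-F, separated by whitespace) can be read with a few lines of Python. This is a sketch under that assumption only; `Score` and `parse_benchmarks` are hypothetical names, and the format of the linked eval.txt files may differ from the table printed here.

```python
from typing import NamedTuple


class Score(NamedTuple):
    testset: str
    bleu: float
    chrf: float


def parse_benchmarks(text: str) -> list[Score]:
    """Parse whitespace-separated 'testset BLEU chr-F' rows, skipping the header."""
    scores = []
    for line in text.splitlines():
        parts = line.split()
        # Ignore blank lines, the header row, and anything not in 3-column form.
        if len(parts) != 3 or parts[0] == "testset":
            continue
        scores.append(Score(parts[0], float(parts[1]), float(parts[2])))
    return scores


# Three rows copied from the opus-2020-07-06 table above.
SAMPLE = """\
testset BLEU chr-F
Tatoeba-test.eng-afr.eng.afr 56.3 0.740
Tatoeba-test.eng-got.eng.got 0.3 0.011
Tatoeba-test.eng.multi 44.8 0.627
"""

rows = parse_benchmarks(SAMPLE)
best = max(rows, key=lambda s: s.bleu)
print(best.testset, best.bleu)  # Tatoeba-test.eng-afr.eng.afr 56.3
```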

opus-2020-07-14.zip

  • dataset: opus
  • model: transformer
  • source language(s): eng
  • target language(s): afr ang_Latn dan deu enm_Latn fao frr fry gos got_Goth gsw isl ksh ltz nds nld nno nob nob_Hebr non_Latn pdc sco stq swe swg yid
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token is required, in the form >>id<< (id = a valid target language ID)
  • download: opus-2020-07-14.zip
  • test set translations: opus-2020-07-14.test.txt
  • test set scores: opus-2020-07-14.eval.txt

Benchmarks

testset BLEU chr-F
Tatoeba-test.eng-afr.eng.afr 56.3 0.742
Tatoeba-test.eng-ang.eng.ang 5.8 0.148
Tatoeba-test.eng-dan.eng.dan 56.7 0.718
Tatoeba-test.eng-deu.eng.deu 39.2 0.606
Tatoeba-test.eng-enm.eng.enm 1.4 0.211
Tatoeba-test.eng-fao.eng.fao 8.1 0.310
Tatoeba-test.eng-frr.eng.frr 6.4 0.128
Tatoeba-test.eng-fry.eng.fry 16.5 0.416
Tatoeba-test.eng-gos.eng.gos 2.5 0.195
Tatoeba-test.eng-got.eng.got 0.3 0.012
Tatoeba-test.eng-gsw.eng.gsw 0.9 0.135
Tatoeba-test.eng-isl.eng.isl 23.0 0.499
Tatoeba-test.eng-ksh.eng.ksh 0.9 0.141
Tatoeba-test.eng-ltz.eng.ltz 19.3 0.379
Tatoeba-test.eng.multi 45.6 0.633
Tatoeba-test.eng-nds.eng.nds 19.1 0.440
Tatoeba-test.eng-nld.eng.nld 52.5 0.696
Tatoeba-test.eng-non.eng.non 0.7 0.176
Tatoeba-test.eng-pdc.eng.pdc 5.9 0.177
Tatoeba-test.eng-sco.eng.sco 31.0 0.527
Tatoeba-test.eng-stq.eng.stq 5.5 0.337
Tatoeba-test.eng-swe.eng.swe 57.2 0.713
Tatoeba-test.eng-swg.eng.swg 1.1 0.159
Tatoeba-test.eng-yid.eng.yid 6.4 0.294

opus-2020-07-19.zip

  • dataset: opus
  • model: transformer
  • source language(s): eng
  • target language(s): afr ang_Latn dan deu enm_Latn fao frr fry gos got_Goth gsw isl ksh ltz nds nld nno nob nob_Hebr non_Latn pdc sco stq swe swg yid
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token is required, in the form >>id<< (id = a valid target language ID)
  • download: opus-2020-07-19.zip
  • test set translations: opus-2020-07-19.test.txt
  • test set scores: opus-2020-07-19.eval.txt

Benchmarks

testset BLEU chr-F
Tatoeba-test.eng-afr.eng.afr 55.9 0.740
Tatoeba-test.eng-ang.eng.ang 5.7 0.151
Tatoeba-test.eng-dan.eng.dan 57.1 0.720
Tatoeba-test.eng-deu.eng.deu 39.6 0.609
Tatoeba-test.eng-enm.eng.enm 1.4 0.211
Tatoeba-test.eng-fao.eng.fao 8.4 0.309
Tatoeba-test.eng-frr.eng.frr 6.4 0.204
Tatoeba-test.eng-fry.eng.fry 17.3 0.416
Tatoeba-test.eng-gos.eng.gos 2.9 0.197
Tatoeba-test.eng-got.eng.got 0.4 0.012
Tatoeba-test.eng-gsw.eng.gsw 1.0 0.143
Tatoeba-test.eng-isl.eng.isl 23.1 0.501
Tatoeba-test.eng-ksh.eng.ksh 1.2 0.150
Tatoeba-test.eng-ltz.eng.ltz 20.3 0.395
Tatoeba-test.eng.multi 45.8 0.634
Tatoeba-test.eng-nds.eng.nds 19.7 0.445
Tatoeba-test.eng-nld.eng.nld 52.5 0.696
Tatoeba-test.eng-non.eng.non 0.7 0.171
Tatoeba-test.eng-nor.eng.nor 49.4 0.671
Tatoeba-test.eng-pdc.eng.pdc 4.2 0.173
Tatoeba-test.eng-sco.eng.sco 29.0 0.517
Tatoeba-test.eng-stq.eng.stq 5.4 0.365
Tatoeba-test.eng-swe.eng.swe 57.3 0.714
Tatoeba-test.eng-swg.eng.swg 1.1 0.158
Tatoeba-test.eng-yid.eng.yid 6.5 0.299

opus-2020-07-26.zip

  • dataset: opus
  • model: transformer
  • source language(s): eng
  • target language(s): afr ang_Latn dan deu enm_Latn fao frr fry gos got_Goth gsw isl ksh ltz nds nld nno nob nob_Hebr non_Latn pdc sco stq swe swg yid
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token is required, in the form >>id<< (id = a valid target language ID)
  • download: opus-2020-07-26.zip
  • test set translations: opus-2020-07-26.test.txt
  • test set scores: opus-2020-07-26.eval.txt

Benchmarks

testset BLEU chr-F
newssyscomb2009-engdeu.eng.deu 20.3 0.517
news-test2008-engdeu.eng.deu 20.5 0.506
newstest2009-engdeu.eng.deu 20.1 0.512
newstest2010-engdeu.eng.deu 22.2 0.523
newstest2011-engdeu.eng.deu 20.0 0.504
newstest2012-engdeu.eng.deu 20.4 0.503
newstest2013-engdeu.eng.deu 23.9 0.529
newstest2015-ende-engdeu.eng.deu 27.5 0.565
newstest2016-ende-engdeu.eng.deu 32.3 0.601
newstest2017-ende-engdeu.eng.deu 26.0 0.555
newstest2018-ende-engdeu.eng.deu 38.7 0.642
newstest2019-ende-engdeu.eng.deu 34.4 0.608
Tatoeba-test.eng-afr.eng.afr 56.1 0.739
Tatoeba-test.eng-ang.eng.ang 6.3 0.152
Tatoeba-test.eng-dan.eng.dan 57.2 0.722
Tatoeba-test.eng-deu.eng.deu 39.9 0.612
Tatoeba-test.eng-enm.eng.enm 1.3 0.219
Tatoeba-test.eng-fao.eng.fao 9.4 0.318
Tatoeba-test.eng-frr.eng.frr 3.6 0.124
Tatoeba-test.eng-fry.eng.fry 16.6 0.419
Tatoeba-test.eng-gos.eng.gos 2.2 0.182
Tatoeba-test.eng-got.eng.got 0.3 0.012
Tatoeba-test.eng-gsw.eng.gsw 0.9 0.134
Tatoeba-test.eng-isl.eng.isl 23.0 0.504
Tatoeba-test.eng-ksh.eng.ksh 0.8 0.143
Tatoeba-test.eng-ltz.eng.ltz 20.8 0.392
Tatoeba-test.eng.multi 46.0 0.636
Tatoeba-test.eng-nds.eng.nds 19.1 0.441
Tatoeba-test.eng-nld.eng.nld 52.7 0.697
Tatoeba-test.eng-non.eng.non 0.6 0.171
Tatoeba-test.eng-nor.eng.nor 49.5 0.671
Tatoeba-test.eng-pdc.eng.pdc 4.0 0.165
Tatoeba-test.eng-sco.eng.sco 29.8 0.520
Tatoeba-test.eng-stq.eng.stq 2.8 0.327
Tatoeba-test.eng-swe.eng.swe 57.5 0.715
Tatoeba-test.eng-swg.eng.swg 0.8 0.153
Tatoeba-test.eng-yid.eng.yid 6.3 0.293

opus2m-2020-08-01.zip

  • dataset: opus2m
  • model: transformer
  • source language(s): eng
  • target language(s): afr ang_Latn dan deu enm_Latn fao frr fry gos got_Goth gsw isl ksh ltz nds nld nno nob nob_Hebr non_Latn pdc sco stq swe swg yid
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token is required, in the form >>id<< (id = a valid target language ID)
  • download: opus2m-2020-08-01.zip
  • test set translations: opus2m-2020-08-01.test.txt
  • test set scores: opus2m-2020-08-01.eval.txt

Benchmarks

testset BLEU chr-F
newssyscomb2009-engdeu.eng.deu 20.9 0.521
news-test2008-engdeu.eng.deu 21.1 0.511
newstest2009-engdeu.eng.deu 20.5 0.516
newstest2010-engdeu.eng.deu 22.5 0.526
newstest2011-engdeu.eng.deu 20.5 0.508
newstest2012-engdeu.eng.deu 20.8 0.507
newstest2013-engdeu.eng.deu 24.6 0.534
newstest2015-ende-engdeu.eng.deu 27.9 0.569
newstest2016-ende-engdeu.eng.deu 33.2 0.607
newstest2017-ende-engdeu.eng.deu 26.5 0.560
newstest2018-ende-engdeu.eng.deu 39.4 0.648
newstest2019-ende-engdeu.eng.deu 35.0 0.613
Tatoeba-test.eng-afr.eng.afr 56.5 0.745
Tatoeba-test.eng-ang.eng.ang 6.7 0.154
Tatoeba-test.eng-dan.eng.dan 58.0 0.726
Tatoeba-test.eng-deu.eng.deu 40.3 0.615
Tatoeba-test.eng-enm.eng.enm 1.4 0.215
Tatoeba-test.eng-fao.eng.fao 7.2 0.304
Tatoeba-test.eng-frr.eng.frr 5.5 0.159
Tatoeba-test.eng-fry.eng.fry 19.4 0.433
Tatoeba-test.eng-gos.eng.gos 1.0 0.182
Tatoeba-test.eng-got.eng.got 0.3 0.012
Tatoeba-test.eng-gsw.eng.gsw 0.9 0.130
Tatoeba-test.eng-isl.eng.isl 23.4 0.505
Tatoeba-test.eng-ksh.eng.ksh 1.1 0.141
Tatoeba-test.eng-ltz.eng.ltz 20.3 0.379
Tatoeba-test.eng.multi 46.5 0.641
Tatoeba-test.eng-nds.eng.nds 20.6 0.458
Tatoeba-test.eng-nld.eng.nld 53.4 0.702
Tatoeba-test.eng-non.eng.non 0.6 0.166
Tatoeba-test.eng-nor.eng.nor 50.3 0.679
Tatoeba-test.eng-pdc.eng.pdc 3.9 0.189
Tatoeba-test.eng-sco.eng.sco 33.0 0.542
Tatoeba-test.eng-stq.eng.stq 2.3 0.274
Tatoeba-test.eng-swe.eng.swe 57.9 0.719
Tatoeba-test.eng-swg.eng.swg 1.2 0.171
Tatoeba-test.eng-yid.eng.yid 7.2 0.304