dataset: opus
model: transformer
source language(s): eng
target language(s): akl_Latn ceb cha dtp hil iba ilo ind jav jav_Java mad max_Latn min mlg pag pau sun tmw_Latn war zlm_Latn zsm_Latn
pre-processing: normalization + SentencePiece (spm32k,spm32k)
a sentence-initial language token is required in the form >>id<< (id = valid target language ID)
download: opus-2020-07-27.zip
test set translations: opus-2020-07-27.test.txt
test set scores: opus-2020-07-27.eval.txt
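The language-token requirement above can be sketched as follows. This is a minimal illustration, not part of the released model files: the helper name `add_target_token` is hypothetical, and the single space between the token and the sentence is an assumption about the expected input format.

```python
def add_target_token(sentence: str, target_id: str) -> str:
    """Prefix an English source sentence with a >>id<< target-language token.

    Assumption: the token is separated from the sentence by one space.
    """
    return f">>{target_id}<< {sentence.strip()}"


# Build a small batch of inputs for two of the listed target languages.
sources = ["How are you?", "Thank you very much."]
batch = [add_target_token(s, "ceb") for s in sources]
batch.append(add_target_token("Good morning.", "ilo"))

for line in batch:
    print(line)
```

Because the model is multilingual on the target side, the token is the only signal that selects the output language, so each sentence in a batch can carry a different ID.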
| testset | BLEU | chr-F |
|---------|------|-------|
| Tatoeba-test.eng-akl.eng.akl | 3.4 | 0.135 |
| Tatoeba-test.eng-ceb.eng.ceb | 11.4 | 0.434 |
| Tatoeba-test.eng-cha.eng.cha | 1.6 | 0.187 |
| Tatoeba-test.eng-dtp.eng.dtp | 0.5 | 0.131 |
| Tatoeba-test.eng-hil.eng.hil | 17.3 | 0.518 |
| Tatoeba-test.eng-iba.eng.iba | 13.8 | 0.361 |
| Tatoeba-test.eng-ilo.eng.ilo | 33.3 | 0.588 |
| Tatoeba-test.eng-jav.eng.jav | 6.3 | 0.293 |
| Tatoeba-test.eng-mad.eng.mad | 1.3 | 0.145 |
| Tatoeba-test.eng-mlg.eng.mlg | 33.6 | 0.508 |
| Tatoeba-test.eng-msa.eng.msa | 30.9 | 0.558 |
| Tatoeba-test.eng.multi | 17.2 | 0.418 |
| Tatoeba-test.eng-pag.eng.pag | 16.5 | 0.485 |
| Tatoeba-test.eng-pau.eng.pau | 1.2 | 0.123 |
| Tatoeba-test.eng-sun.eng.sun | 35.1 | 0.447 |
| Tatoeba-test.eng-war.eng.war | 12.7 | 0.438 |
dataset: opus2m
model: transformer
source language(s): eng
target language(s): akl_Latn ceb cha dtp hil iba ilo ind jav jav_Java mad max_Latn min mlg pag pau sun tmw_Latn war zlm_Latn zsm_Latn
pre-processing: normalization + SentencePiece (spm32k,spm32k)
a sentence-initial language token is required in the form >>id<< (id = valid target language ID)
download: opus2m-2020-08-01.zip
test set translations: opus2m-2020-08-01.test.txt
test set scores: opus2m-2020-08-01.eval.txt
| testset | BLEU | chr-F |
|---------|------|-------|
| Tatoeba-test.eng-akl.eng.akl | 3.0 | 0.143 |
| Tatoeba-test.eng-ceb.eng.ceb | 11.4 | 0.432 |
| Tatoeba-test.eng-cha.eng.cha | 1.4 | 0.189 |
| Tatoeba-test.eng-dtp.eng.dtp | 0.6 | 0.139 |
| Tatoeba-test.eng-hil.eng.hil | 17.7 | 0.525 |
| Tatoeba-test.eng-iba.eng.iba | 14.6 | 0.365 |
| Tatoeba-test.eng-ilo.eng.ilo | 34.0 | 0.590 |
| Tatoeba-test.eng-jav.eng.jav | 6.2 | 0.299 |
| Tatoeba-test.eng-mad.eng.mad | 2.6 | 0.154 |
| Tatoeba-test.eng-mlg.eng.mlg | 34.3 | 0.518 |
| Tatoeba-test.eng-msa.eng.msa | 31.1 | 0.561 |
| Tatoeba-test.eng.multi | 17.5 | 0.422 |
| Tatoeba-test.eng-pag.eng.pag | 19.8 | 0.507 |
| Tatoeba-test.eng-pau.eng.pau | 1.2 | 0.129 |
| Tatoeba-test.eng-sun.eng.sun | 30.3 | 0.418 |
| Tatoeba-test.eng-war.eng.war | 12.6 | 0.439 |
dataset: opus1m+bt
model: transformer-align
source language(s): eng
target language(s): akl ceb cha hil iba ilo ind jak jav mad max min mlg msa pag pau plt sun tmw war zlm zsm
pre-processing: normalization + SentencePiece (spm32k,spm32k)
a sentence-initial language token is required in the form >>id<< (id = valid target language ID)
valid language labels: >>abl<< >>abs<< >>abx<< >>ace<< >>agk<< >>agz<< >>akb<< >>akl<< >>akl_Latn<< >>atd<< >>atl<< >>ban<< >>bbc<< >>bcl<< >>bdg<< >>bdl<< >>bdr<< >>beg<< >>bew<< >>bgs<< >>bjn<< >>bkd<< >>bkz<< >>blf<< >>bln<< >>bno<< >>bnq<< >>bsu<< >>btd<< >>bth<< >>btm<< >>bto<< >>bts<< >>btx<< >>btz<< >>buc<< >>bug<< >>ceb<< >>cgc<< >>cha<< >>cia<< >>cja<< >>cje<< >>cjm<< >>cps<< >>dbj<< >>drg<< >>dtr<< >>dun<< >>dup<< >>duq<< >>duw<< >>eno<< >>fbl<< >>fil<< >>gay<< >>goq<< >>gor<< >>hil<< >>hro<< >>huq<< >>iba<< >>ibg<< >>ibh<< >>ibl<< >>ify<< >>ilk<< >>ilo<< >>ind<< >>jak_Latn<< >>jav<< >>jav_Java<< >>jra<< >>kak<< >>kaw<< >>kge<< >>kjc<< >>kjk<< >>kqr<< >>krj<< >>ktq<< >>kvr<< >>kxd<< >>kyi<< >>kyj<< >>kyk<< >>kys<< >>lbl<< >>lbw<< >>lbx<< >>lce<< >>lcf<< >>ley<< >>liw<< >>ljp<< >>llk<< >>loc<< >>lra<< >>mad<< >>mak<< >>max_Latn<< >>mba<< >>mbb<< >>mbd<< >>mbi<< >>mbs<< >>mbt<< >>mdh<< >>mdr<< >>mfa<< >>mfb<< >>mhy<< >>min<< >>mkm<< >>mkx<< >>mlg<< >>mog<< >>mqk<< >>mqn<< >>mrw<< >>msa<< >>msa_Latn<< >>msb<< >>msm<< >>mta<< >>mtd<< >>mui<< >>mwv<< >>mxr<< >>myl<< >>mzq<< >>nia<< >>nij<< >>nrm<< >>obo<< >>otd<< >>pag<< >>pam<< >>pau<< >>pdo<< >>pel<< >>pku<< >>plt<< >>pse<< >>rad<< >>raz<< >>rbl<< >>ree<< >>rej<< >>rgs<< >>roc<< >>rog<< >>rth<< >>sas<< >>sda<< >>sdo<< >>sgd<< >>sjm<< >>skh<< >>slm<< >>sml<< >>smr<< >>smw<< >>sne<< >>snl<< >>snv<< >>ssb<< >>sse<< >>sun<< >>sxn<< >>sya<< >>tbl<< >>tdi<< >>tdn<< >>tdu<< >>tdx<< >>tjg<< >>tkg<< >>tlk<< >>tmw_Latn<< >>tne<< >>tnt<< >>tnw<< >>tom<< >>twy<< >>txs<< >>txy<< >>ubl<< >>ulu<< >>vkl<< >>vko<< >>war<< >>wow<< >>wru<< >>xkq<< >>xmv<< >>xmw<< >>xmz<< >>yka<< >>zbc<< >>zbe<< >>zbw<< >>zlm<< >>zlm_Latn<< >>zsm_Latn<<
download: opus1m+bt-2021-04-10.zip
test set translations: opus1m+bt-2021-04-10.test.txt
test set scores: opus1m+bt-2021-04-10.eval.txt
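Since this release publishes an explicit list of valid language labels, a target ID can be checked before it is prepended to a sentence. The sketch below is illustrative only: `parse_labels` and `check_label` are hypothetical helpers, and `LABEL_LINE` holds just a short excerpt of the labels listed above rather than the full set.

```python
import re

# Excerpt of the "valid language labels" line; the full list is much longer.
LABEL_LINE = ">>akl<< >>akl_Latn<< >>ceb<< >>ilo<< >>ind<< >>jav<< >>jav_Java<< >>zsm_Latn<<"


def parse_labels(line: str) -> set[str]:
    """Extract the bare IDs from a string of >>id<< tokens."""
    return set(re.findall(r">>([A-Za-z_]+)<<", line))


VALID_LABELS = parse_labels(LABEL_LINE)


def check_label(target_id: str) -> str:
    """Return the >>id<< token, or raise if the ID is not in the parsed list."""
    if target_id not in VALID_LABELS:
        raise ValueError(f"unknown target language ID: {target_id!r}")
    return f">>{target_id}<<"


print(check_label("jav_Java"))
```

Note that script-tagged IDs such as `jav_Java` and `zsm_Latn` are distinct labels from their bare counterparts, so an exact match is safer than a prefix match.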
| testset | BLEU | chr-F | #sent | #words | BP |
|---------|------|-------|-------|--------|----|
| Tatoeba-test.eng-akl | 2.1 | 0.122 | 27 | 96 | 1.000 |
| Tatoeba-test.eng-ceb | 10.2 | 0.422 | 378 | 2086 | 1.000 |
| Tatoeba-test.eng-cha | 2.2 | 0.212 | 237 | 1080 | 1.000 |
| Tatoeba-test.eng-hil | 14.3 | 0.476 | 22 | 125 | 1.000 |
| Tatoeba-test.eng-iba | 14.5 | 0.395 | 30 | 284 | 0.853 |
| Tatoeba-test.eng-ilo | 32.3 | 0.580 | 1093 | 7241 | 1.000 |
| Tatoeba-test.eng-ind | 35.9 | 0.618 | 4289 | 28294 | 0.962 |
| Tatoeba-test.eng-jav | 5.7 | 0.287 | 259 | 1615 | 1.000 |
| Tatoeba-test.eng-jav_Java | 5.9 | 0.000 | 3 | 3 | 1.000 |
| Tatoeba-test.eng-mad | 2.0 | 0.158 | 7 | 39 | 1.000 |
| Tatoeba-test.eng-max_Latn | 4.8 | 0.262 | 127 | 917 | 1.000 |
| Tatoeba-test.eng-min | 6.3 | 0.263 | 19 | 147 | 1.000 |
| Tatoeba-test.eng-mlg | 34.5 | 0.505 | 51 | 242 | 1.000 |
| Tatoeba-test.eng-msa | 32.0 | 0.579 | 5000 | 33629 | 0.989 |
| Tatoeba-test.eng-multi | 25.8 | 0.530 | 8725 | 58062 | 1.000 |
| Tatoeba-test.eng-pag | 15.8 | 0.504 | 49 | 320 | 1.000 |
| Tatoeba-test.eng-pau | 1.4 | 0.130 | 34 | 148 | 1.000 |
| Tatoeba-test.eng-sun | 36.9 | 0.438 | 26 | 122 | 1.000 |
| Tatoeba-test.eng-tmw_Latn | 2.9 | 0.171 | 5 | 23 | 1.000 |
| Tatoeba-test.eng-war | 12.1 | 0.420 | 1512 | 11024 | 1.000 |
| Tatoeba-test.eng-zlm_Latn | 3.7 | 0.280 | 24 | 163 | 1.000 |
| Tatoeba-test.eng-zsm_Latn | 12.7 | 0.392 | 536 | 4085 | 1.000 |
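The BP column most likely reports the BLEU brevity penalty; the README does not define it, so this reading is an assumption. Under the standard BLEU definition, the penalty is 1 when the system output is at least as long as the reference and exp(1 - r/c) otherwise, where r is the reference length and c the output length. A minimal sketch with made-up lengths (not taken from the test sets above):

```python
import math


def brevity_penalty(sys_len: int, ref_len: int) -> float:
    """Standard BLEU brevity penalty for corpus-level token counts."""
    if sys_len == 0:
        return 0.0
    if sys_len >= ref_len:
        return 1.0
    return math.exp(1.0 - ref_len / sys_len)


# Hypothetical lengths: output as long as the reference, then 10% shorter.
print(round(brevity_penalty(100, 100), 3))
print(round(brevity_penalty(90, 100), 3))
```

A value below 1.000, as in the eng-iba, eng-ind, and eng-msa rows, would then indicate that the system translations were shorter overall than the references.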