opus-2020-06-28.zip

dataset: opus
model: transformer
source language(s): eng
target language(s): akl_Latn ceb hil ilo pag pmn war
pre-processing: normalization + SentencePiece (spm32k,spm32k)
a sentence-initial language token is required in the form >>id<< (where id is a valid target language ID)
download: opus-2020-06-28.zip
test set translations: opus-2020-06-28.test.txt
test set scores: opus-2020-06-28.eval.txt
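The language-token requirement above can be illustrated with a minimal sketch. It assumes the token is separated from the sentence by a single space; the helper name `add_language_token` and the example sentence are hypothetical and not part of the release.

```python
def add_language_token(sentence: str, target_lang: str) -> str:
    """Prefix a source sentence with the >>id<< target-language token."""
    lang = target_lang.strip()
    if not lang:
        raise ValueError("target_lang must be a non-empty language ID")
    # Assumption: one space between the token and the source text.
    return f">>{lang}<< {sentence.strip()}"


# Target language IDs listed for this release.
TARGETS = ["akl_Latn", "ceb", "hil", "ilo", "pag", "pmn", "war"]

# One prepared input line per target language (hypothetical sentence).
prepared = [add_language_token("How are you?", lang) for lang in TARGETS]
for line in prepared:
    print(line)
```

Each prepared line would then go through the normalization and SentencePiece steps before translation; those steps are not shown here.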

Benchmarks

testset	BLEU	chr-F
Tatoeba-test.eng-akl.eng.akl	6.2	0.245
Tatoeba-test.eng-ceb.eng.ceb	10.6	0.436
Tatoeba-test.eng-hil.eng.hil	17.1	0.490
Tatoeba-test.eng-ilo.eng.ilo	33.9	0.587
Tatoeba-test.eng.multi	13.6	0.392
Tatoeba-test.eng-pag.eng.pag	16.8	0.484
Tatoeba-test.eng-pmn.eng.pmn	0.5	0.163
Tatoeba-test.eng-war.eng.war	12.8	0.437

opus-2020-07-27.zip

dataset: opus
model: transformer
source language(s): eng
target language(s): akl_Latn ceb hil ilo pag war
pre-processing: normalization + SentencePiece (spm32k,spm32k)
a sentence-initial language token is required in the form >>id<< (where id is a valid target language ID)
download: opus-2020-07-27.zip
test set translations: opus-2020-07-27.test.txt
test set scores: opus-2020-07-27.eval.txt

Benchmarks

testset	BLEU	chr-F
Tatoeba-test.eng-akl.eng.akl	3.0	0.190
Tatoeba-test.eng-ceb.eng.ceb	11.1	0.434
Tatoeba-test.eng-hil.eng.hil	18.5	0.511
Tatoeba-test.eng-ilo.eng.ilo	32.9	0.590
Tatoeba-test.eng.multi	12.8	0.391
Tatoeba-test.eng-pag.eng.pag	18.5	0.505
Tatoeba-test.eng-war.eng.war	12.5	0.437

opus2m-2020-08-01.zip

dataset: opus2m
model: transformer
source language(s): eng
target language(s): akl_Latn ceb hil ilo pag war
pre-processing: normalization + SentencePiece (spm32k,spm32k)
a sentence-initial language token is required in the form >>id<< (where id is a valid target language ID)
download: opus2m-2020-08-01.zip
test set translations: opus2m-2020-08-01.test.txt
test set scores: opus2m-2020-08-01.eval.txt

Benchmarks

testset	BLEU	chr-F
Tatoeba-test.eng-akl.eng.akl	7.1	0.245
Tatoeba-test.eng-ceb.eng.ceb	10.5	0.435
Tatoeba-test.eng-hil.eng.hil	18.0	0.506
Tatoeba-test.eng-ilo.eng.ilo	33.4	0.590
Tatoeba-test.eng.multi	13.1	0.392
Tatoeba-test.eng-pag.eng.pag	19.4	0.481
Tatoeba-test.eng-war.eng.war	12.8	0.441

opus1m+bt-2021-04-10.zip

dataset: opus1m+bt
model: transformer-align
source language(s): eng
target language(s): akl ceb hil ilo pag war
pre-processing: normalization + SentencePiece (spm32k,spm32k)
a sentence-initial language token is required in the form >>id<< (where id is a valid target language ID)
valid language labels: >>agk<< >>agz<< >>akl<< >>akl_Latn<< >>atd<< >>atl<< >>bcl<< >>bgs<< >>bkd<< >>blf<< >>bln<< >>bno<< >>bnq<< >>bto<< >>ceb<< >>cgc<< >>cps<< >>fbl<< >>fil<< >>gor<< >>hil<< >>ibg<< >>ibl<< >>ify<< >>ilk<< >>ilo<< >>kak<< >>krj<< >>kyj<< >>kyk<< >>lbl<< >>loc<< >>mba<< >>mbb<< >>mbd<< >>mbi<< >>mbs<< >>mbt<< >>mdh<< >>mkx<< >>mog<< >>mqk<< >>mrw<< >>msb<< >>msm<< >>mta<< >>obo<< >>pag<< >>pam<< >>rbl<< >>rth<< >>sgd<< >>snl<< >>sxn<< >>tbl<< >>tdn<< >>tne<< >>tnt<< >>tnw<< >>tom<< >>txs<< >>ubl<< >>war<<
download: opus1m+bt-2021-04-10.zip
test set translations: opus1m+bt-2021-04-10.test.txt
test set scores: opus1m+bt-2021-04-10.eval.txt
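Since this release publishes its valid language labels, a requested target ID can be checked against that list before translating. The sketch below is one possible way to do so; `parse_labels`, `require_valid_label`, and `LABEL_EXCERPT` are hypothetical names, and the excerpt holds only a few of the labels listed above. Note that the label list is longer than the target language list, so a label passing this check says nothing about translation quality for that language.

```python
import re

# Matches one >>id<< label and captures the ID.
LABEL_PATTERN = re.compile(r">>([^<>\s]+)<<")

# Excerpt of the "valid language labels" line (not the full list).
LABEL_EXCERPT = (
    "valid language labels: >>akl<< >>akl_Latn<< >>ceb<< "
    ">>hil<< >>ilo<< >>pag<< >>war<<"
)


def parse_labels(line: str) -> set:
    """Collect the language IDs found in a 'valid language labels' line."""
    return set(LABEL_PATTERN.findall(line))


def require_valid_label(lang: str, valid: set) -> str:
    """Return the >>id<< token for lang, or raise if lang is not listed."""
    if lang not in valid:
        raise ValueError(f"unknown target language ID: {lang!r}")
    return f">>{lang}<<"


valid_labels = parse_labels(LABEL_EXCERPT)
print(require_valid_label("ilo", valid_labels))
```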

Benchmarks

testset	BLEU	chr-F	#sent	#words	BP
Tatoeba-test.eng-akl	2.2	0.199	27	96	1.000
Tatoeba-test.eng-ceb	10.8	0.429	378	2086	1.000
Tatoeba-test.eng-hil	18.3	0.513	22	125	1.000
Tatoeba-test.eng-ilo	33.4	0.586	1093	7241	1.000
Tatoeba-test.eng-multi	19.4	0.490	3081	20897	1.000
Tatoeba-test.eng-pag	16.3	0.504	49	320	1.000
Tatoeba-test.eng-war	12.9	0.431	1512	11024	1.000
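The BP column is not defined in this README. Assuming it is the standard BLEU brevity penalty, it could be computed as in the sketch below, where `hyp_len` and `ref_len` are hypothetical names for the total system-output and reference lengths in tokens. Under that reading, a value of 1.000 means the output was at least as long as the reference.

```python
import math


def brevity_penalty(hyp_len: int, ref_len: int) -> float:
    """Standard BLEU brevity penalty for corpus-level lengths."""
    if hyp_len <= 0:
        # No output at all: the penalty is maximal.
        return 0.0
    if hyp_len > ref_len:
        # Output longer than the reference is not penalized.
        return 1.0
    # Output no longer than the reference: exp(1 - r/c),
    # which equals 1.0 when the lengths match.
    return math.exp(1.0 - ref_len / hyp_len)


# Hypothetical lengths, not taken from the table above.
print(brevity_penalty(100, 96))
print(brevity_penalty(90, 100))
```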

opus4m+btTCv20210807-2021-09-30.zip

dataset: opus4m+btTCv20210807
model: transformer
source language(s): eng
target language(s): akl bcl ceb fil gor hil ibg ilo pag pam sxn war
pre-processing: normalization + SentencePiece (spm32k,spm32k)
a sentence-initial language token is required in the form >>id<< (where id is a valid target language ID)
valid language labels: >>agk<< >>agz<< >>akl<< >>akl_Latn<< >>atd<< >>atl<< >>bcl<< >>bgs<< >>bkd<< >>blf<< >>bln<< >>bno<< >>bnq<< >>bto<< >>ceb<< >>cgc<< >>cps<< >>fbl<< >>fil<< >>gor<< >>hil<< >>ibg<< >>ibl<< >>ify<< >>ilk<< >>ilo<< >>kak<< >>krj<< >>kyj<< >>kyk<< >>lbl<< >>loc<< >>mba<< >>mbb<< >>mbd<< >>mbi<< >>mbs<< >>mbt<< >>mdh<< >>mkx<< >>mog<< >>mqk<< >>mrw<< >>msb<< >>msm<< >>mta<< >>obo<< >>pag<< >>pam<< >>rbl<< >>rth<< >>sgd<< >>snl<< >>sxn<< >>tbl<< >>tdn<< >>tne<< >>tnt<< >>tnw<< >>tom<< >>txs<< >>ubl<< >>war<<
download: opus4m+btTCv20210807-2021-09-30.zip
test set translations: opus4m+btTCv20210807-2021-09-30.test.txt
test set scores: opus4m+btTCv20210807-2021-09-30.eval.txt

Benchmarks

testset	BLEU	chr-F	#sent	#words	BP
Tatoeba-test-v2021-08-07.eng-multi	12.4	0.355	4081	26965	1.000
Tatoeba-test-v2021-08-07.multi-multi	12.4	0.355	4081	26965	1.000
