This repository has been archived by the owner on Dec 11, 2023. It is now read-only.

Reproducing results: WMT German-English BLEU score is less than half of the expected score #341

Open
molipet opened this issue May 29, 2018 · 3 comments

Comments

@molipet

molipet commented May 29, 2018

Thanks for sharing this great work!

Although I strictly followed the instructions in the README, I am unable to reproduce the WMT German-English benchmark results on newstest2015.

Here are my details:

I got the following inference results for newstest2015:

  • deen_model_1 -- actual BLEU: 11.7, expected BLEU: 27.6
    Command used: python -m nmt.nmt --src=de --tgt=en --ckpt=deen_model_1/translate.ckpt --hparams_path=nmt/standard_hparams/wmt16.json --out_dir=deen_model_1_output --vocab_prefix=wmt16/vocab.bpe.32000 --inference_input_file=wmt16/newstest2015.tok.bpe.32000.de --inference_output_file=deen_model_1_output/output_infer --inference_ref_file=wmt16/newstest2015.tok.bpe.32000.en
  • deen_model_2 -- actual BLEU: 11.8, expected BLEU: 28.9
    Command used: python -m nmt.nmt --src=de --tgt=en --ckpt=deen_model_2/translate.ckpt --hparams_path=nmt/standard_hparams/wmt16.json --out_dir=deen_model_2_output --vocab_prefix=wmt16/vocab.bpe.32000 --inference_input_file=wmt16/newstest2015.tok.bpe.32000.de --inference_output_file=deen_model_2_output/output_infer --inference_ref_file=wmt16/newstest2015.tok.bpe.32000.en

Could you please give me a hint about what I am doing wrong?

Thank you!

@potato1996

potato1996 commented Jun 1, 2018

Same here. My setup: Python 3.5 + TF 1.8.

@ajithAI

ajithAI commented Jun 5, 2018

I am also having a similar issue. I am translating English to German (newstest2015.tok.bpe.32000.en) using the "Ours — NMT + GNMT attention (8 layers)" model. The cited BLEU score is 27.6, but I got 21.0.
Command used:
python -m nmt.nmt --src=en --tgt=de --ckpt=../ende_gnmt_model_8_layer/translate.ckpt --hparams_path=nmt/standard_hparams/wmt16_gnmt_8_layer.json --out_dir=../ende_model_gnmt_8_output_news15 --vocab_prefix=tmp/wmt16/vocab.bpe.32000 --inference_input_file=tmp/wmt16/newstest2015.tok.bpe.32000.en --inference_output_file=../ende_model_gnmt_8_output_news15/output_infer --inference_ref_file=tmp/wmt16/newstest2015.tok.bpe.32000.de

TF 1.8, Python 2.7.

Could anyone please let me know if I am doing anything wrong?

Thanks in Advance :)

The hyperparameter file wmt16_gnmt_8_layer.json contains:

{
"attention": "normed_bahdanau",
"attention_architecture": "gnmt_v2",
"batch_size": 128,
"colocate_gradients_with_ops": true,
"dropout": 0.2,
"encoder_type": "gnmt",
"eos": "</s>",
"forget_bias": 1.0,
"infer_batch_size": 32,
"init_weight": 0.1,
"learning_rate": 1.0,
"max_gradient_norm": 5.0,
"metrics": ["bleu"],
"num_buckets": 5,
"num_layers": 8,
"num_encoder_layers": 8,
"num_decoder_layers": 8,
"num_train_steps": 340000,
"decay_scheme": "luong10",
"num_units": 1024,
"optimizer": "sgd",
"residual": true,
"share_vocab": false,
"subword_option": "bpe",
"sos": "<s>",
"src_max_len": 50,
"src_max_len_infer": null,
"steps_per_external_eval": null,
"steps_per_stats": 100,
"tgt_max_len": 50,
"tgt_max_len_infer": null,
"time_major": true,
"unit_type": "lstm",
"beam_width": 10,
"length_penalty_weight": 1.0
}
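Not part of the thread, but the stray comment and truncated token values above are exactly the kind of damage that is easy to catch before training: JSON allows no comments, so a plain `json.load` rejects them immediately. A minimal sketch, assuming the file path and the specific spot-checks are illustrative only:

```python
# Sanity-check a hand-edited hparams JSON file before passing it to --hparams_path.
# The path and the checked keys are assumptions for illustration.
import json


def load_hparams(path):
    """Parse an hparams file, failing loudly on malformed or mangled values."""
    with open(path, encoding="utf-8") as f:
        hparams = json.load(f)  # raises json.JSONDecodeError on comments/truncation
    # Spot-check fields that copy-paste or HTML rendering tends to eat.
    for key in ("sos", "eos"):
        if not hparams.get(key):
            raise ValueError(f"suspicious empty value for {key!r}")
    return hparams
```

For example, `load_hparams("nmt/standard_hparams/wmt16_gnmt_8_layer.json")` would raise a `JSONDecodeError` on a file containing a `# IgnoreThis` comment, and a `ValueError` if `eos` came through as an empty string.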

ChinaShrimp added a commit to ChinaShrimp/nmt that referenced this issue Jun 25, 2018
@qwerybot

I know it's been a while since you posted, but it seems that when I run the download script to get the WMT16 data, I get different output from the BPE processing, resulting in a different vocabulary.

I was hoping someone might be able to provide me with their working vocab, inference files, etc. for English-German.
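Not from the thread, but a quick way to confirm the vocabulary mismatch described here is to diff the locally generated vocab file against a known-good one token by token. A minimal sketch, assuming one BPE token per line and hypothetical file paths:

```python
# Compare two BPE vocab files (one token per line) to see where they diverge.
# The file paths in the usage example are assumptions for illustration.

def vocab_diff(path_a, path_b):
    """Return (only_in_a, only_in_b) token sets for two vocab files."""
    def load(path):
        with open(path, encoding="utf-8") as f:
            return {line.rstrip("\n") for line in f if line.strip()}
    a, b = load(path_a), load(path_b)
    return a - b, b - a


# Example (hypothetical paths):
# only_local, only_ref = vocab_diff("wmt16/vocab.bpe.32000",
#                                   "reference/vocab.bpe.32000")
# print(len(only_local), "tokens only in local vocab;",
#       len(only_ref), "only in reference")
```

If the two sets are large, the BPE merges differ and any checkpoint trained against the reference vocabulary will map many tokens to `<unk>`, which would explain a halved BLEU score.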


4 participants