This repository has been archived by the owner on Dec 11, 2023. It is now read-only.

Reproducing results: WMT German-English BLEU score is less than half of the expected score #341

Open
molipet opened this issue May 29, 2018 · 3 comments

Comments

@molipet

molipet commented May 29, 2018

Thanks for sharing this great work!

Although I strictly followed the instructions in the README, I am unable to reproduce the WMT German-English benchmark results on newstest2015.

Here are my details:

I got the following inference results for newstest2015:

  • deen_model_1 -- actual BLEU: 11.7, expected BLEU: 27.6
    Command used: python -m nmt.nmt --src=de --tgt=en --ckpt=deen_model_1/translate.ckpt --hparams_path=nmt/standard_hparams/wmt16.json --out_dir=deen_model_1_output --vocab_prefix=wmt16/vocab.bpe.32000 --inference_input_file=wmt16/newstest2015.tok.bpe.32000.de --inference_output_file=deen_model_1_output/output_infer --inference_ref_file=wmt16/newstest2015.tok.bpe.32000.en
  • deen_model_2 -- actual BLEU: 11.8, expected BLEU: 28.9
    Command used: python -m nmt.nmt --src=de --tgt=en --ckpt=deen_model_2/translate.ckpt --hparams_path=nmt/standard_hparams/wmt16.json --out_dir=deen_model_2_output --vocab_prefix=wmt16/vocab.bpe.32000 --inference_input_file=wmt16/newstest2015.tok.bpe.32000.de --inference_output_file=deen_model_2_output/output_infer --inference_ref_file=wmt16/newstest2015.tok.bpe.32000.en

Could you please give me a hint about what I am doing wrong?

Thank you!

@potato1996

potato1996 commented Jun 1, 2018

Same here. My setup: Python 3.5 + TF 1.8.

@ajithAI

ajithAI commented Jun 5, 2018

I am also having a similar issue. I am translating English to German (newstest2015.tok.bpe.32000.en) using the "Ours — NMT + GNMT attention (8 layers)" model. The cited BLEU score is 27.6, but I got 21.0.
Command used:
python -m nmt.nmt --src=en --tgt=de --ckpt=../ende_gnmt_model_8_layer/translate.ckpt --hparams_path=nmt/standard_hparams/wmt16_gnmt_8_layer.json --out_dir=../ende_model_gnmt_8_output_news15 --vocab_prefix=tmp/wmt16/vocab.bpe.32000 --inference_input_file=tmp/wmt16/newstest2015.tok.bpe.32000.en --inference_output_file=../ende_model_gnmt_8_output_news15/output_infer --inference_ref_file=tmp/wmt16/newstest2015.tok.bpe.32000.de

TF 1.8, Python 2.7.

Could anyone please let me know if I am doing anything wrong?

Thanks in Advance :)

The hyperparameter file wmt16_gnmt_8_layer.json contains:

{
"attention": "normed_bahdanau",
"attention_architecture": "gnmt_v2",
"batch_size": 128,
"colocate_gradients_with_ops": true,
"dropout": 0.2,
"encoder_type": "gnmt",
"eos": "</s>",
"forget_bias": 1.0,
"infer_batch_size": 32,
"init_weight": 0.1,
"learning_rate": 1.0,
"max_gradient_norm": 5.0,
"metrics": ["bleu"],
"num_buckets": 5,
"num_layers": 8,
"num_encoder_layers": 8,
"num_decoder_layers": 8,
"num_train_steps": 340000,
"decay_scheme": "luong10",
"num_units": 1024,
"optimizer": "sgd",
"residual": true,
"share_vocab": false,
"subword_option": "bpe",
"sos": "<s>",
"src_max_len": 50,
"src_max_len_infer": null,
"steps_per_external_eval": null,
"steps_per_stats": 100,
"tgt_max_len": 50,
"tgt_max_len_infer": null,
"time_major": true,
"unit_type": "lstm",
"beam_width": 10,
"length_penalty_weight": 1.0
}
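Not part of the thread, but the stray comment and truncated token values above are exactly the kind of damage that is easy to catch before training: JSON allows no comments, so a plain `json.load` rejects them immediately. A minimal sketch, assuming the file path and the specific spot-checks are illustrative only:

```python
# Sanity-check a hand-edited hparams JSON file before passing it to --hparams_path.
# The path and the checked keys are assumptions for illustration.
import json


def load_hparams(path):
    """Parse an hparams file, failing loudly on malformed or mangled values."""
    with open(path, encoding="utf-8") as f:
        hparams = json.load(f)  # raises json.JSONDecodeError on comments/truncation
    # Spot-check fields that copy-paste or HTML rendering tends to eat.
    for key in ("sos", "eos"):
        if not hparams.get(key):
            raise ValueError(f"suspicious empty value for {key!r}")
    return hparams
```

For example, `load_hparams("nmt/standard_hparams/wmt16_gnmt_8_layer.json")` would raise a `JSONDecodeError` on a file containing a `# IgnoreThis` comment, and a `ValueError` if `eos` came through as an empty string.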

ChinaShrimp added a commit to ChinaShrimp/nmt that referenced this issue Jun 25, 2018
@qwerybot

I know it's been a while since you posted, but it seems that when I run the download script to get the WMT16 data, I get different output from the BPE processing, resulting in a different vocabulary.

I was hoping someone might be able to provide me with their working vocab, inference files, etc. for English-German.
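Not from the thread, but a quick way to confirm the vocabulary mismatch described here is to diff the locally generated vocab file against a known-good one token by token. A minimal sketch, assuming one BPE token per line and hypothetical file paths:

```python
# Compare two BPE vocab files (one token per line) to see where they diverge.
# The file paths in the usage example are assumptions for illustration.

def vocab_diff(path_a, path_b):
    """Return (only_in_a, only_in_b) token sets for two vocab files."""
    def load(path):
        with open(path, encoding="utf-8") as f:
            return {line.rstrip("\n") for line in f if line.strip()}
    a, b = load(path_a), load(path_b)
    return a - b, b - a


# Example (hypothetical paths):
# only_local, only_ref = vocab_diff("wmt16/vocab.bpe.32000",
#                                   "reference/vocab.bpe.32000")
# print(len(only_local), "tokens only in local vocab;",
#       len(only_ref), "only in reference")
```

If the two sets are large, the BPE merges differ and any checkpoint trained against the reference vocabulary will map many tokens to `<unk>`, which would explain a halved BLEU score.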


4 participants