Notes on running this project

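  1. Follow the instructions in README.md, using a Python 3 environment.
  2. Running scripts/prepro_labels.py is slow: it took nearly two hours on the server.
  3. The generated vocabulary contains 4461 entries.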

Issues reported in a similar project: ruotianluo/ImageCaptioning.pytorch#49

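Caveats

20201207: While reviewing the CIDEr training curve for multi-GPU mode, I found that it differs considerably from the single-GPU curve.

Figures: multi-GPU CIDEr training log; single-GPU training.

20201209: After evaluating with best_model, the multi-GPU model turned out to perform much worse than the single-GPU one. The results are recorded in the weekly report. Training was restarted; see the training log .log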

Error log

  • TypeError: Unicode-objects must be encoded before hashing

Change img['image_id'] to img['image_id'].encode("utf-8")
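A minimal sketch of why the encode call is needed: in Python 3, hashlib functions accept only bytes. The helper name and the choice of MD5 are assumptions for illustration; these notes do not say which hash function the script uses.

```python
import hashlib


def hash_image_id(image_id: str) -> str:
    """Return a hex digest for a text image id (hypothetical helper)."""
    # hashlib only accepts bytes in Python 3, so encode the str first.
    # MD5 is an assumption; substitute whatever hash the script really uses.
    return hashlib.md5(image_id.encode("utf-8")).hexdigest()


digest = hash_image_id("example_image.jpg")
```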

  • AttributeError: module 'sys' has no attribute 'maxint'

Change sys.maxint to sys.maxsize

  • NameError: name 'unicode' is not defined

Change unicode to str

  • NameError: name 'xrange' is not defined

Change xrange to range

  • AttributeError: 'collections.defaultdict' object has no attribute 'iteritems'

Change ref.iteritems to ref.items
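The four Python 2 to Python 3 renames above (sys.maxint, unicode, xrange, iteritems) can be seen together in a short sketch. The variable names and sample data are made up for illustration.

```python
import sys
from collections import defaultdict

# sys.maxint is gone; sys.maxsize is the usual stand-in for "a very large int".
best = sys.maxsize

# xrange was renamed to range, which is lazy in Python 3.
total = sum(i for i in range(5))

# The unicode type was merged into str.
is_text = isinstance("图像", str)

# dict.iteritems() was removed; items() returns a view over the pairs.
ref = defaultdict(int)
for word in ["a", "b", "a"]:
    ref[word] += 1
pairs = sorted(ref.items())
```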

  • TypeError: write() argument must be str, not bytes

Change the file open mode from w to wb. Reference: https://stackoverflow.com/questions/5512811/builtins-typeerror-must-be-str-not-bytes
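As a small illustration of the mode change, the sketch below writes bytes to a file opened with "wb" and reads them back. The path and payload are placeholders, not files from this project.

```python
import os
import tempfile

payload = "图像描述".encode("utf-8")  # bytes, e.g. the result of str.encode()

path = os.path.join(tempfile.mkdtemp(), "out.bin")  # placeholder path

# Mode "wb" accepts bytes; mode "w" expects str and raises the TypeError.
with open(path, "wb") as f:
    f.write(payload)

with open(path, "rb") as f:
    restored = f.read().decode("utf-8")
```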

  • ModuleNotFoundError: No module named 'pyciderevalcap'

Download the items linked in the README and place them in the project root, named cider and AI_Challenger respectively.

  • File "AI_Challenger/Evaluation/caption_eval/coco_caption/pycxevalcap/bleu/bleu_scorer.py", line 60 def cook_test(test, (reflen, refmaxcounts), eff=None, n=4): ^ SyntaxError: invalid syntax

Change the parameter (reflen, refmaxcounts) to reflen_refmaxcounts, then unpack it at the top of the function body with reflen, refmaxcounts = reflen_refmaxcounts
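Python 3 removed tuple parameters from function signatures (PEP 3113), which is what triggers this SyntaxError. A sketch of the rewritten signature follows; the function body is a placeholder for illustration and is not the real scorer logic.

```python
def cook_test(test, reflen_refmaxcounts, eff=None, n=4):
    # Tuple parameters are no longer valid syntax, so unpack by hand.
    reflen, refmaxcounts = reflen_refmaxcounts
    # Placeholder body for illustration only; the real BLEU logic is omitted.
    return {"testlen": len(test), "reflen": reflen, "n": n}


# Callers keep passing a single tuple, exactly as before.
result = cook_test(["a", "b"], ([2, 3], {}))
```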

  • FileNotFoundError: [Errno 2] No such file or directory: 'data/chinese_bu_att/74e1bd18c6836e7e0b88e42923f1f7d9a87d9a91.jpg.npz'

Download the corresponding feature files and extract them into the data directory under the project root.

  • FileNotFoundError: [Errno 2] No such file or directory: 'log_dense_box_bn/infos_dense_box_bn.pkl'

In run_train.sh, change "--use_bn 1 --use_box 1" to "--use_bn 0 --use_box 0"? Unconfirmed: this does not appear to be the actual fix.

  • NameError: name 'reduce' is not defined

from functools import reduce
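In Python 3, reduce is no longer a builtin and lives in functools. A short sketch of a typical use, with a made-up shape tuple:

```python
from functools import reduce
from operator import mul

# Multiply the entries of a shape tuple to get the element count.
shape = (43, 2048)
num_elements = reduce(mul, shape, 1)
```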

  • WARNING:tensorflow:From train.py:42: The name tf.summary.FileWriter is deprecated. Please use tf.compat.v1.summary.FileWriter instead

Change tf.summary.FileWriter to tf.compat.v1.summary.FileWriter

  • Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument

Change weight = F.softmax(dot) to weight = F.softmax(dot, dim=1)

  • Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument

Change output = F.log_softmax(self.logit(output)) to output = F.log_softmax(self.logit(output), dim=1)

  • train_loss = loss.data[0] IndexError: invalid index of a 0-dim tensor. Use tensor.item() in Python or tensor.item<T>() in C++ to convert a 0-dim tensor to a number

Change train_loss = loss.data[0] to train_loss = loss.item()
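The three PyTorch fixes above can be checked with a small sketch. The tensor shape and the toy loss are assumptions for illustration; dim=1 is correct only when dimension 1 is the one to normalize over, as in a (batch, classes) layout.

```python
import torch
import torch.nn.functional as F

dot = torch.randn(2, 5)               # made-up scores shaped (batch, positions)
weight = F.softmax(dot, dim=1)        # each row now sums to 1
logprobs = F.log_softmax(dot, dim=1)  # log of the same distribution

loss = -logprobs[:, 0].mean()         # toy loss: a 0-dim tensor
train_loss = loss.item()              # plain Python float, replaces loss.data[0]
```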

  • NameError: name 'reload' is not defined

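Add import importlib and replace reload(sys) with importlib.reload(sys). The argument must be the module object itself, not the string 'sys'.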

  • AttributeError: module 'sys' has no attribute 'setdefaultencoding'

Delete this statement.
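A brief sketch of both points: importlib.reload takes a module object, and Python 3 already defaults to UTF-8, so sys.setdefaultencoding has no replacement to call. The colorsys module is used here only as a harmless module to reload.

```python
import colorsys
import importlib
import sys

# importlib.reload expects a module object, not its name as a string.
reloaded = importlib.reload(colorsys)

# Python 3 strings are Unicode and the default encoding is already UTF-8,
# so sys.setdefaultencoding is neither available nor needed.
default_encoding = sys.getdefaultencoding()
has_setter = hasattr(sys, "setdefaultencoding")
```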

  • FileNotFoundError: [Errno 2] No such file or directory: 'data/eval_reference_new.json'

Change the name to eval_reference.json

  • TypeError: a bytes-like object is required, not 'str'

Change tmp_file.write(sentences) to tmp_file.write(sentences.encode())

Change the file open mode from r to rb
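A sketch of the write fix, assuming tmp_file comes from the standard tempfile module (these notes do not show how it is created). The sample sentence is made up.

```python
import tempfile

sentences = "一个 男人 在 骑 自行车\n"

# NamedTemporaryFile opens in binary mode ("w+b") by default, so write bytes.
with tempfile.NamedTemporaryFile() as tmp_file:
    tmp_file.write(sentences.encode())  # str.encode() defaults to UTF-8
    tmp_file.flush()
    tmp_file.seek(0)
    round_trip = tmp_file.read().decode()
```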

Evaluation errors

  • infos = cPickle.load(f) UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte

Change the file open mode from r to rb
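Pickle data is binary, and pickles written with protocol 2 or later start with the byte 0x80, which is why text mode fails at position 0. A round-trip sketch with made-up contents, using the standard pickle module in place of cPickle:

```python
import os
import pickle
import tempfile

# Made-up stand-in for the real infos dictionary.
infos = {"opt": {"input_att_dir": "data/chinese_bu_att"}, "iter": 0}

path = os.path.join(tempfile.mkdtemp(), "infos_example.pkl")  # placeholder path
with open(path, "wb") as f:
    pickle.dump(infos, f)

# Text mode ("r") would try to decode the bytes as UTF-8 and fail.
with open(path, "rb") as f:
    loaded = pickle.load(f)
```

If the pickle was written by Python 2, passing encoding="latin1" to pickle.load may also be required; whether that applies here is not stated in these notes.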

  • assert vars(opt)[k] == vars(infos['opt'])[k], k + ' option not consistent' AssertionError: input_att_dir option not consistent

In the eval file, change opt.input_att_dir = infos['opt'].input_box_dir to opt.input_box_dir = infos['opt'].input_box_dir

  • NameError: name 'reduce' is not defined

from functools import reduce

Data file descriptions

  • chinese_talk.json

Stores the vocabulary (index to word) and, for each image, its id, size, storage path, and split (training, validation, or test).

  • chinese_talk_label.h5

About 100 MB. It contains four datasets: label_end_ix, label_start_ix, label_length, and labels. The first two have length 240000, the third has length 1199985, and the fourth has shape 1199985*20.
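The sketch below shows how the four datasets might fit together, using plain lists as toy stand-ins for the HDF5 arrays. It assumes the convention of the upstream ImageCaptioning.pytorch code, where label_start_ix and label_end_ix are 1-indexed and inclusive; that this project keeps the same convention is an assumption, and the helper name is hypothetical.

```python
# Toy stand-ins for the HDF5 datasets: 3 images, 5 captions in total.
label_start_ix = [1, 3, 4]      # first caption row of each image (1-indexed)
label_end_ix = [2, 3, 5]        # last caption row of each image (inclusive)
labels = [[5, 8, 0], [5, 9, 2], [7, 0, 0], [3, 4, 0], [3, 6, 1]]  # zero-padded word ids
label_length = [2, 3, 1, 2, 3]  # number of real tokens in each caption


def captions_for_image(i):
    """Return the unpadded captions of image i (hypothetical helper)."""
    start = label_start_ix[i] - 1  # convert to a 0-indexed row
    end = label_end_ix[i]          # inclusive end becomes an exclusive slice bound
    rows = labels[start:end]
    lengths = label_length[start:end]
    return [row[:n] for row, n in zip(rows, lengths)]


caps = captions_for_image(2)
```

In the real file the rows of labels would be 20 wide, matching the 1199985*20 shape.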

  • fc--.npy

Stores the averaged feature vector of one image, with shape (2048).

  • box--.npy

Stores the coordinates of the boxes in the image. The number of boxes is not fixed; an example shape is (43,4).

  • att--.npz

Stores the feature vector of each box, for example with shape (43,2048).
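To make the three per-image shapes concrete, the sketch below writes and reloads dummy arrays with NumPy. The file names and the array key "feat" inside the .npz archive are assumptions for illustration; check the actual files with np.load(path).files.

```python
import os
import tempfile

import numpy as np

num_boxes = 43  # varies from image to image
tmp = tempfile.mkdtemp()

# Dummy arrays with the shapes described above; file names are placeholders.
np.save(os.path.join(tmp, "fc.npy"), np.zeros(2048, dtype=np.float32))
np.save(os.path.join(tmp, "box.npy"), np.zeros((num_boxes, 4), dtype=np.float32))
# The key "feat" is an assumption about how the archive was written.
np.savez_compressed(
    os.path.join(tmp, "att.npz"),
    feat=np.zeros((num_boxes, 2048), dtype=np.float32),
)

fc = np.load(os.path.join(tmp, "fc.npy"))
box = np.load(os.path.join(tmp, "box.npy"))
att = np.load(os.path.join(tmp, "att.npz"))["feat"]
```

The box and att arrays share their first dimension, since each box has one feature row.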