Hi, thanks for open-sourcing this great work!
While reading the code, I was confused by the next-token prediction used to compute loss_t2t, and I don't understand why the first prompt_length (four) labels are ignored (set to -100) during training. I started reading the inference code hoping to figure this out, but I found that neither inference_ram.py nor inference_ram_openset.py uses the tag_encoder or text_decoder during inference, which confused me even more. So I would kindly like to ask:
1. Can you explain the next-token prediction used to compute loss_t2t, and why some labels are set to -100?
2. Why are tag_encoder and text_decoder not used during inference?
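For context on the first question, here is a minimal sketch (not the actual RAM training code) of why labels set to -100 are skipped: PyTorch's cross-entropy loss ignores targets equal to its `ignore_index` (default -100), so positions masked this way contribute nothing to the loss. The `prompt_length`, sequence length, and vocabulary size below are made-up illustration values.

```python
import torch
import torch.nn.functional as F

vocab_size = 10     # hypothetical illustration values,
prompt_length = 4   # not taken from the RAM repository
seq_len = 6

logits = torch.randn(seq_len, vocab_size)            # per-token predictions
labels = torch.randint(0, vocab_size, (seq_len,))    # per-token targets
labels[:prompt_length] = -100                        # mask the prompt tokens

# cross_entropy skips targets equal to ignore_index (-100 by default),
# so the loss is averaged over the unmasked tokens only.
loss = F.cross_entropy(logits, labels)
```

This matches the Hugging Face convention as well, where language-model heads compute the loss only over positions whose label is not -100.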
Thanks in advance!