Hi,
I have variable-length sequences for my task, and they are padded to a prespecified maximum length.
How can I ensure that the padded part of a sequence does not contribute to the attention computation? There is an argument called key_padding_mask in https://github.com/yaohungt/Multimodal-Transformer/blob/master/modules/multihead_attention.py. Any leads on how to use this argument?
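For anyone looking for a concrete starting point, here is a minimal sketch of the usual fairseq-style convention that module appears to follow: key_padding_mask is a (batch, src_len) boolean tensor with True at the padded positions, and those key positions receive (approximately) zero attention weight. The batch sizes, lengths, and tensor names below are hypothetical; the demonstration uses torch.nn.MultiheadAttention, which shares the same mask convention.

```python
import torch
import torch.nn as nn

# Hypothetical batch: 3 sequences with true lengths 5, 3, and 7,
# all padded to max_len = 7.
lengths = torch.tensor([5, 3, 7])
batch_size, max_len, embed_dim = 3, 7, 16

# key_padding_mask: shape (batch, src_len), True at PADDED positions.
# Position index >= true length  ->  padding  ->  masked out.
key_padding_mask = torch.arange(max_len)[None, :] >= lengths[:, None]
# e.g. row 0 is [False, False, False, False, False, True, True]

# Demonstration with torch.nn.MultiheadAttention, which uses the same
# convention: masked key positions are excluded from the softmax.
attn = nn.MultiheadAttention(embed_dim, num_heads=4)
x = torch.randn(max_len, batch_size, embed_dim)  # (seq_len, batch, embed_dim)
out, weights = attn(x, x, x, key_padding_mask=key_padding_mask)

# weights has shape (batch, tgt_len, src_len); the columns corresponding
# to padded key positions carry ~0 attention weight.
```

One caveat: older fairseq-derived code sometimes expects a ByteTensor rather than a BoolTensor for this mask, in which case casting with key_padding_mask.byte() before passing it in should work. Note also that masking keys prevents real positions from attending *to* padding; the outputs *at* padded positions are still computed and should be ignored (or zeroed) downstream.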