OpenNMT uses the order `(seq, batch, dim)`, while x-transformers uses `(batch, seq, dim)`. Currently, we transpose tensors so that the x-transformers order is used only where strictly necessary, keeping the OpenNMT order everywhere else.
To simplify the code and avoid unnecessary transposes, we should use `(batch, seq, dim)` everywhere, from the data loader all the way through to translation decoding.
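For illustration, a minimal sketch (variable names are hypothetical, not taken from the codebase) of the round-trip transposes that a batch-first convention would make unnecessary:

```python
import torch

batch, seq, dim = 4, 7, 512

# OpenNMT-style layout: (seq, batch, dim)
x_onmt = torch.randn(seq, batch, dim)

# x-transformers expects batch-first: (batch, seq, dim)
x_batch_first = x_onmt.transpose(0, 1)  # -> (batch, seq, dim)

# ... pass x_batch_first through an x-transformers module here ...

# Transpose back to the OpenNMT order for the surrounding code
x_back = x_batch_first.transpose(0, 1)  # -> (seq, batch, dim)

assert x_back.shape == (seq, batch, dim)
```

With `(batch, seq, dim)` used end to end, both `transpose(0, 1)` calls (and the mental bookkeeping around them) go away.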