Token indices sequence length is longer than the specified maximum sequence length for this model (587 > 512). Running this sequence through the model will result in indexing errors
#60
Open
LinaDongXMU opened this issue
Jan 10, 2025
· 4 comments
Hello authors,
Thanks for building rxnmapper for atom-atom mapping assignments. When testing it on a large number of data items, I found the following errors:
Some weights of the model checkpoint at /miniconda3/envs/rxnmapper/lib/python3.6/site-packages/rxnmapper/models/transformers/albert_heads_8_uspto_all_1310k were not used when initializing AlbertModel: ['predictions.decoder.weight', 'predictions.dense.bias', 'predictions.bias', 'predictions.dense.weight', 'predictions.decoder.bias', 'predictions.LayerNorm.weight', 'predictions.LayerNorm.bias']
This IS expected if you are initializing AlbertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
This IS NOT expected if you are initializing AlbertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Token indices sequence length is longer than the specified maximum sequence length for this model (587 > 512). Running this sequence through the model will result in indexing errors
Here is an example of an item that triggered this error:
CC(C)C[C@H]C(=O)N[C@@H]C(=O)N[C@@H]C(=O)N[C@H]C(C)C>>C[C@H]C(=O)O
I really need your help to perform atom-atom mapping on such items. I'm looking forward to your answer, thanks!
The warning about some model weights not being used is fine; you can ignore it.
The first SMILES you posted, CC(C)C[C@H]C(=O)N[C@@H]C(=O)N[C@@H]C(=O)N[C@H]C(C)C>>C[C@H]C(=O)O, works for me. Can you post the code you used?
The second one (the long one) has more tokens than the model can handle (587, the model works up to 512). A new model with longer context would need to be trained for that.
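As a practical workaround, reactions that exceed the model's 512-token limit can be filtered out before calling the mapper. The sketch below is an approximation under stated assumptions: it estimates the token count with the SMILES tokenization regex commonly used in the Molecular Transformer family of models, which may differ slightly from rxnmapper's own tokenizer (for example, special tokens the model adds are not counted). The function names `smiles_token_count` and `split_by_length` are illustrative, not part of the rxnmapper API.

```python
import re

# Common regex for tokenizing (reaction) SMILES, as used in the
# Molecular Transformer family of models. This only approximates
# rxnmapper's own tokenizer, so counts near the limit should be
# treated with some margin.
SMILES_TOKEN_RE = re.compile(
    r"(\[[^\]]+]|Br?|Cl?|N|O|S|P|F|I|b|c|n|o|s|p"
    r"|\(|\)|\.|=|#|-|\+|\\|\/|:|~|@|\?|>|\*|\$|%[0-9]{2}|[0-9])"
)

MAX_LEN = 512  # context length of the released rxnmapper model


def smiles_token_count(rxn_smiles: str) -> int:
    """Estimate the number of model tokens in a reaction SMILES."""
    return len(SMILES_TOKEN_RE.findall(rxn_smiles))


def split_by_length(reactions):
    """Separate reactions that fit the model from those that do not."""
    mappable, too_long = [], []
    for rxn in reactions:
        target = mappable if smiles_token_count(rxn) <= MAX_LEN else too_long
        target.append(rxn)
    return mappable, too_long
```

Only the `mappable` list would then be passed to the mapper; the `too_long` items need a different strategy, such as the retrained longer-context model mentioned above.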
Thanks for answering. The first SMILES is actually the same as the second one; I don't know why it was not fully displayed, so I sent it again. Regarding the new model you mentioned: do you mean I would need to retrain a new model on longer training data and then test with that? In fact, I have many data items longer than 512 tokens (they all produce the error shown above), and they are left unmapped. Can you give me some suggestions on how to get them mapped?