Fix ValueError for `LMTokenMask.log_prob()` when the mask has no support #21

There are various scenarios in which token masking results in an `LMTokenMask` distribution that has a null support. This can occur when a mask rules out all tokens in the vocab, or when multiple sequential observations of `mask_dist()` are mutually incompatible (i.e., their set intersection is empty).

Currently, this corner case is not well accounted for and results in a fairly cryptic error:

`ValueError: zero-size array to reduction operation maximum which has no identity`

This PR proposes to address this issue by defining
`LMTokenMask.log_prob(v) := -inf` when the mask has no support under `v` (i.e., there are no `good_tokens` for `v`). From a practical standpoint, this fix is useful for addressing the `zero-size array` error above. However, it is still possible to instantiate "degenerate" `LMTokenMask` distributions (we simply define their density to be 0 everywhere).

From a more theoretical standpoint, this issue is a bit tricky because
`LMTokenMask` objects are closely intertwined with `LMContext.model_mask`. Other partial/complementary solutions include:

- Handling the case where `LMTokenMask` is instantiated with null support
- Constraining `LMContext.model_mask` to ensure that it never becomes the empty set

@alex-lew let me know what you think, happy to discuss.
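
For reference, here is a minimal, standard-library-only sketch of the failure mode and the proposed `-inf` convention. It is not the library's actual implementation: `logsumexp` and `mask_log_prob` are hypothetical helpers written for illustration, and the built-in `max()` stands in for the NumPy reduction (both raise a `ValueError` on empty input, though with different messages).

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(x))) over a sequence of floats."""
    m = max(values)  # raises ValueError when `values` is empty
    if m == float("-inf"):
        return float("-inf")
    return m + math.log(sum(math.exp(x - m) for x in values))


def mask_log_prob(next_token_logprobs, model_mask, mask, v):
    """Hypothetical stand-in for the log-probability of a mask observation.

    `model_mask` is the set of token ids still allowed by earlier
    observations, `mask` is the newly observed set, and `v` says whether
    the next token is observed to be inside (True) or outside (False) `mask`.
    """
    good_tokens = (model_mask & mask) if v else (model_mask - mask)
    if not good_tokens:
        # Proposed convention: no support under `v` means density 0.
        return float("-inf")
    return logsumexp([next_token_logprobs[t] for t in good_tokens])


# Toy three-token vocabulary with probabilities 0.5, 0.3, 0.2.
logprobs = [math.log(0.5), math.log(0.3), math.log(0.2)]

# Ordinary case: tokens 0 and 1 are allowed, so the mass is about 0.8.
lp_ok = mask_log_prob(logprobs, {0, 1, 2}, {0, 1}, True)

# Degenerate case 1: the mask rules out every token in the vocab.
lp_empty = mask_log_prob(logprobs, {0, 1, 2}, set(), True)

# Degenerate case 2: sequential observations with an empty intersection.
lp_conflict = mask_log_prob(logprobs, {0, 1}, {2}, True)
```

Without the `if not good_tokens` guard, both degenerate cases would reach `max()` with an empty list and raise, which mirrors the error reported above.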