import re

if re.search(r'\w[¡!]$', node.form):
    # Separate the punctuation and attach it to the rest.
    punct = node.create_child()
    punct.shift_after_node(node)
    punct.form = node.form[-1:]
    node.form = node.form[:-1]
    punct.lemma = punct.form
    punct.upos = 'PUNCT'
    punct.xpos = 'faa' if punct.form == '¡' else 'fat'
    punct.feats['PunctType'] = 'Excl'
    punct.feats['PunctSide'] = 'Ini' if punct.form == '¡' else 'Fin'
    # The new punctuation inherits the spacing of the original token.
    punct.misc['SpaceAfter'] = node.misc['SpaceAfter']
    node.misc['SpaceAfter'] = 'No'
    punct.deprel = 'punct'
The method shift_after_node() correctly updates the IDs and basic heads of the nodes that come after the new position of the shifted node. Unfortunately, it does not update the enhanced heads when an enhanced representation is present. Hence the following source:
-19 Yahoo! Yahoo! PROPN np0000o _ 16 appos 16:appos ClusterId=CESS-CAST-A-20000503-1687-s5.sn.51|ClusterType=Spec.organization|MentionSpan=19
-20 con con ADP sps00 _ 21 case 21:case _
-21 intenciones intención NOUN ncfp000 Gender=Fem|Number=Plur 8 obl 8:obl ClusterId=CESS-CAST-A-20000503-1687-s5.sn.57|ClusterType=Gen|MentionSpan=21-22
results in the following (note the mismatch for the preposition con: its basic head is updated to 22, but its enhanced head is still 21):
+19 Yahoo Yahoo! PROPN np0000o _ 16 appos 16:appos ClusterId=CESS-CAST-A-20000503-1687-s5.sn.51|ClusterType=Spec.organization|MentionSpan=19|SpaceAfter=No
+20 ! ! PUNCT fat PunctSide=Fin|PunctType=Excl 19 punct _ _
+21 con con ADP sps00 _ 22 case 21:case _
+22 intenciones intención NOUN ncfp000 Gender=Fem|Number=Plur 8 obl 8:obl ClusterId=CESS-CAST-A-20000503-1687-s5.sn.57|ClusterType=Gen|MentionSpan=21-22
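A mismatch like this can be detected mechanically. The following is a minimal sketch and not part of Udapi: it assumes the rows are given as (id, head, deprel, deps) tuples taken from the CoNLL-U columns, and it relies on the heuristic that the basic relation should also appear among the enhanced relations. That holds for the rows above but is not guaranteed for every enhanced UD treebank. The function name find_head_mismatches is hypothetical.

```python
def find_head_mismatches(rows):
    """Return ids of tokens whose basic head:deprel is missing from DEPS.

    rows: iterable of (id, head, deprel, deps) tuples, where deps is the
    raw CoNLL-U DEPS string such as '21:case', or '_' if there is none.
    """
    mismatched = []
    for tok_id, head, deprel, deps in rows:
        if deps == '_':
            continue  # no enhanced representation for this token
        enhanced = set(deps.split('|'))
        if f'{head}:{deprel}' not in enhanced:
            mismatched.append(tok_id)
    return mismatched


# Rows 19-22 from the output above (token 20 has no enhanced deps).
rows = [
    (19, 16, 'appos', '16:appos'),
    (20, 19, 'punct', '_'),
    (21, 22, 'case', '21:case'),
    (22, 8, 'obl', '8:obl'),
]
print(find_head_mismatches(rows))  # [21]
```

Only token 21 (con) is flagged, which matches the mismatch pointed out above.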
It just occurred to me that the MentionSpan would also need updating but for that one would probably need to activate the CorefUD sub-API first?
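For illustration, the renumbering that MentionSpan would need can be sketched on the raw string, without the CorefUD API. This is only a sketch under assumptions: the value is taken to be a comma-separated list of ids or id ranges (as in 19 and 21-22 above), only regular integer ids are handled (no empty nodes), and the helper name shift_mention_span is hypothetical. Whether the new punctuation token should itself become part of a mention is a separate annotation question that this does not address.

```python
def shift_mention_span(span, inserted_ord):
    """Shift token ids in a MentionSpan value after one token is inserted.

    span: string such as '19', '21-22' or '1-3,5'.
    inserted_ord: the id that the newly inserted token receives; every
    old id greater than or equal to it moves up by one.
    """
    def shift(tok_id):
        return tok_id + 1 if tok_id >= inserted_ord else tok_id

    parts = []
    for part in span.split(','):
        ends = [shift(int(x)) for x in part.split('-')]
        parts.append('-'.join(str(x) for x in ends))
    return ','.join(parts)


print(shift_mention_span('19', 20))     # 19 (mention before the new token)
print(shift_mention_span('21-22', 20))  # 22-23 (mention after the new token)
```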
After two years, I ran into this issue again. The following ugly workaround seems to help with the enhanced relations but not with the position of the empty nodes. It might be useful until the bug is fixed properly.
# Bug in Udapi: shift_before_node() does not update enhanced relations.
# Before using the method, deserialize the whole graph, i.e., convert
# parent node ids to parent node object references.
egraph = []
if len(node.deps) > 0:
    for n in node.root.descendants_and_empty:
        edeps = []
        for ed in n.deps:
            edeps.append({'parent': ed['parent'], 'deprel': ed['deprel']})
        egraph.append((n, edeps))
# Now shift the node, which will update numeric IDs (ords) of all
# subsequent node objects.
numbernode.shift_before_node(node)
# Not sure if this is needed: Re-set the edeps to make sure that
# Udapi will have to serialize the egraph with the updated numbers.
for eg in egraph:
    n = eg[0]
    edeps = eg[1]
    n.deps = edeps
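Why converting ids to object references before the shift can help is perhaps easiest to see in a toy model. The sketch below assumes that the problem comes down to stale numeric ids; the ToyNode class and the insert_before function are hypothetical and do not reflect Udapi's actual internals.

```python
class ToyNode:
    """Hypothetical stand-in for a tree node; not Udapi's Node class."""

    def __init__(self, ord_, form):
        self.ord = ord_
        self.form = form


def insert_before(nodes, new_node, target):
    """Insert new_node before target and renumber all nodes from 1."""
    nodes.insert(nodes.index(target), new_node)
    for i, n in enumerate(nodes, start=1):
        n.ord = i


yahoo = ToyNode(1, 'Yahoo')
con = ToyNode(2, 'con')
intenciones = ToyNode(3, 'intenciones')
nodes = [yahoo, con, intenciones]

# Enhanced parent of 'con' stored two ways: as a number and as a reference.
parent_as_id = intenciones.ord
parent_as_ref = intenciones

insert_before(nodes, ToyNode(0, '!'), con)

print(parent_as_id)       # 3: stale, still the old number
print(parent_as_ref.ord)  # 4: follows the renumbering
```

A parent kept as a plain number goes stale after the renumbering, while a parent kept as an object reference reports the new id when it is serialized again.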
For context: the code at the top of this issue is meant to fix tokenization issues in Spanish AnCora (UniversalDependencies/UD_Spanish-AnCora#6).