Releases: bill-baumgartner/reference-coreference-scorers

v9.0

26 Sep 20:37
b27beb6

This release adds support for discontinuous mentions as well as partial mention matching.

There are two major changes in this commit.
A) Added partial mention matching functionality. The partial
mention matching scheme currently used is very simple. A mention in the
response is allowed to match a key mention if it includes/overlaps the
first token of the key mention and if the key mention hasn't already
been exactly matched by a response mention. See the PartialMatch
subroutine near line 450. Partial matching can be enabled using the
$allow_partial input argument.
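
The rule described in A) can be sketched roughly as follows. This is an illustrative Python sketch, not the scorer's PartialMatch subroutine: the function names are hypothetical, mentions are represented as lists of token indices, and "includes/overlaps the first token" is assumed to mean that the key mention's lowest token index appears in the response mention.

```python
def exactly_matched_keys(key_mentions, response_mentions):
    """Indices of key mentions that some response mention reproduces exactly."""
    responses = {frozenset(m) for m in response_mentions}
    return {i for i, m in enumerate(key_mentions) if frozenset(m) in responses}


def partial_match_allowed(response, key_index, key_mentions, exact):
    """True if `response` may partially match the key mention at `key_index`.

    Assumption: the response must contain the key mention's first token, and
    the key mention must not already have an exact match.
    """
    key = key_mentions[key_index]
    return key_index not in exact and min(key) in set(response)


key = [[3, 4, 5], [8, 9]]
responses = [[3, 4], [8, 9], [4, 5]]
exact = exactly_matched_keys(key, responses)

print(partial_match_allowed([3, 4], 0, key, exact))  # True: covers token 3
print(partial_match_allowed([4, 5], 0, key, exact))  # False: misses token 3
print(partial_match_allowed([8], 1, key, exact))     # False: key 1 matched exactly
```

The sketch only answers whether a partial match is permitted. How the scorer chooses among several candidate response mentions is not covered here.
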
B) Added handling for discontinuous mentions, i.e. mentions
composed of non-contiguous tokens. A discontinuous mention is indicated
in the coreference column of a key or response input file by appending
a sequence of characters to the chain identifier. For example, in the
test document shown below, the parts labeled 0a form one mention and
the parts labeled 0b form another, so the entity chain with ID 0 is
composed of 5 mentions (token numbers count from the start of the
document, across both sentences):
1) a mention spanning tokens 0-1
2) a discontinuous mention composed of token 3 and tokens 5-7
3) another discontinuous mention composed of tokens 9 and 11
4) a mention at token 14
5) a mention spanning tokens 16-18

      # begin document;
      test1	0	0	a	(0
      test1	0	1	b	0)
      test1	0	2	c	-
      test1	0	3	d	(0a)
      test1	0	4	e	-
      test1	0	5	f	(0a
      test1	0	6	g	-
      test1	0	7	h	0a)
      test1	.	8	.	-

      test2	0	0	i	(0b)
      test2	0	1	j	-
      test2	0	2	k	(0b)
      test2	0	3	l	-
      test2	0	4	m	-
      test2	0	5	n	(0)
      test2	0	6	o	-
      test2	0	7	p	(0
      test2	0	8	q	-
      test2	0	9	r	0)
      test2	0	10	.	-
      #end document
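
To make the notation concrete, here is a minimal Python sketch of how the coreference column in the example above could be read. It is not the scorer's parsing code. The name `read_mentions` and the regular expression are hypothetical, and the sketch assumes that chain IDs are digits, that the discontinuity suffix consists of letters, that multiple annotations in one cell are separated by `|`, and that token offsets run across the whole document.

```python
import re
from collections import defaultdict

# One annotation such as "(0", "0a)", "(0b)" or "(0)".
# Assumption: numeric chain ID followed by an optional alphabetic suffix.
ANNOTATION = re.compile(r"^(\()?(\d+)([A-Za-z]*)(\))?$")


def read_mentions(lines):
    """Return {chain_id: sorted list of mentions}, each a tuple of token offsets."""
    open_starts = defaultdict(list)  # (chain, suffix) -> stack of start offsets
    pieces = defaultdict(list)       # (chain, suffix) -> tokens of one discontinuous mention
    chains = defaultdict(list)
    offset = 0
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and begin/end document markers
        cell = line.split()[-1]  # coreference column is the last field
        if cell != "-":
            for part in cell.split("|"):
                m = ANNOTATION.match(part)
                if m is None:
                    raise ValueError(f"unrecognized annotation: {part!r}")
                opens, chain, suffix, closes = m.groups()
                label = (chain, suffix)
                if opens:
                    open_starts[label].append(offset)
                if closes:
                    span = range(open_starts[label].pop(), offset + 1)
                    if suffix:
                        pieces[label].extend(span)  # merge parts sharing a suffix
                    else:
                        chains[chain].append(tuple(span))
        offset += 1
    for (chain, _suffix), tokens in pieces.items():
        chains[chain].append(tuple(sorted(tokens)))
    return {chain: sorted(mentions) for chain, mentions in chains.items()}


EXAMPLE = """
# begin document;
test1 0 0 a (0
test1 0 1 b 0)
test1 0 2 c -
test1 0 3 d (0a)
test1 0 4 e -
test1 0 5 f (0a
test1 0 6 g -
test1 0 7 h 0a)
test1 . 8 . -

test2 0 0 i (0b)
test2 0 1 j -
test2 0 2 k (0b)
test2 0 3 l -
test2 0 4 m -
test2 0 5 n (0)
test2 0 6 o -
test2 0 7 p (0
test2 0 8 q -
test2 0 9 r 0)
test2 0 10 . -
#end document
"""

print(read_mentions(EXAMPLE.splitlines()))
# {'0': [(0, 1), (3, 5, 6, 7), (9, 11), (14,), (16, 17, 18)]}
```

The output reproduces the five mentions listed above. Fields are split on any whitespace, so the same function accepts tab-separated input.
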