
Sample usage commands needed #3

Open
Nikoschenk opened this issue May 12, 2020 · 1 comment

Comments

@Nikoschenk

I'd like to "merge" OntoNotes (coref) and PropBank (SRL) annotations.
Could someone provide me with detailed instructions for doing this?

@chiarcos
Contributor

The merging itself is straightforward, but this case requires some additional preprocessing, because neither OntoNotes coref nor PropBank is originally distributed in a CoNLL format. So an additional conversion step is needed (see the preprocessing notes at the bottom). We will put a single script into a separate repository.

merging:

For merging two (or more) TSV files (FILE1.conll, FILE2.conll) via their first column, use
$> cmd/merge.sh FILE1.conll FILE2.conll

For merging (exactly) two TSV files over columns other than the first (here, 1 from FILE1 and 2 from FILE2), use

$> java -cp $CLASSPATH org/acoli/conll/merge/CoNLLAlign FILE1.conll FILE2.conll 1 2

Don't forget to set $CLASSPATH so that it includes bin/ and the jars in lib/. For other parameters, see the CoNLLAlign log.
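As a minimal sketch of the $CLASSPATH setup, assuming a Unix-like shell, compiled classes in bin/ and jars in lib/ under the checkout directory: `build_classpath` is a hypothetical helper name, not something shipped with the repository. It relies on Java's standard classpath wildcard (`lib/*`), which the JVM itself expands to all jars in that directory, so the string must stay quoted to keep the shell from expanding it.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): build a classpath string
# from a checkout directory. Assumes classes in <root>/bin and jars in <root>/lib.
build_classpath() {
    root="${1:-.}"
    # "lib/*" is the JVM's classpath wildcard; ":" is the Unix path separator
    # (on Windows the separator would be ";").
    printf '%s/bin:%s/lib/*\n' "$root" "$root"
}

# Run from the repository root.
CLASSPATH="$(build_classpath .)"
export CLASSPATH
echo "CLASSPATH=$CLASSPATH"

# The alignment call from above would then be:
# java -cp "$CLASSPATH" org/acoli/conll/merge/CoNLLAlign FILE1.conll FILE2.conll 1 2
```

If the classes have not been compiled into bin/ yet, the java call will fail with a class-not-found error regardless of how the classpath is set.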

preprocessing:

For converting an OntoNotes coref file to CoNLL, use

$> cmd/ontonotes.coref2conll.sh FILE1.COREF > FILE1.conll

For creating a CoNLL-style PropBank file from PropBank + Penn annotations, see cmd/propbank2conll/readme.txt.
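Putting the steps together, the order of operations could be sketched as a dry run that only prints the commands rather than executing them. `plan_merge` and the file names are hypothetical; the sketch assumes the PropBank file has already been brought into CoNLL style following cmd/propbank2conll/readme.txt, and that both files can be merged via their first column (otherwise the CoNLLAlign call with explicit column numbers would take the place of cmd/merge.sh).

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the two commands in order, runs nothing.
plan_merge() {
    coref="$1"      # OntoNotes coref file, e.g. FILE1.COREF
    propbank="$2"   # CoNLL-style PropBank file, prepared beforehand
    out="${coref%.*}.conll"   # strip the last extension, append .conll

    # Step 1: convert the coref file to CoNLL.
    echo "cmd/ontonotes.coref2conll.sh $coref > $out"
    # Step 2: merge the converted file with the PropBank file.
    echo "cmd/merge.sh $out $propbank"
}

plan_merge FILE1.COREF FILE2.conll
```

Printing the commands first makes it easy to check the intermediate file names before running anything against real corpus data.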
