documentation about semantic representations #15

Open
arademaker opened this issue Jul 1, 2021 · 5 comments

@arademaker
Member

One missing feature of the GitHub wiki is that it does not send emails about changes to the wiki, right? It would be nice to let people know when we make modifications.

I just added https://github.com/delph-in/docs/wiki/RmrsDmrs from the information I got from delph-in/pydelphin#329 (comment).

I believe we need, for all DELPH-IN semantic representations, uniform documentation covering:

  1. the semantics
  2. the abstract syntax
  3. the concrete syntax (XML, RDF, JSON, etc.)

Anything else? Why is the page named RmrsDmrs? Can we rename it to just Dmrs? Why use the Rmrs prefix?

@arademaker
Member Author

Probably for historical reasons, the DTDs (XML schemas) for MRS, DMRS, etc. are in http://svn.emmtee.net/trunk/, mixed in with the LKB source code. I would like to propose moving those DTDs to this repository, under a folder called schemas.

@goodmami
Member

goodmami commented Jul 1, 2021

One missing feature of the GitHub wiki is that it does not send emails about changes to the wiki, right?

I'm quite happy not to receive emails for wiki edits and to rely on the "All activity" notices at github.com. Unfortunately for those who wish to receive more email, it's not really possible to configure GitHub to send emails for wiki edits. You can, in theory, get an Atom feed of the wiki (source); however, it timed out when I tried it. Maybe our wiki is too big for that functionality.

Why is the page named RmrsDmrs? Can we rename it to just Dmrs? Why use the Rmrs prefix?

It's just historical. I believe there was a time when all MRS-related wikis were created under the "Rmrs" namespace.

I would like to propose moving those DTDs to this repository, under a folder called schemas.

I think that's a great idea. I also created RelaxNG versions of the schemas before, and they were a bit less ERG-centric, too.

@arademaker
Member Author

In 1981f0e, I created the schemas folder and copied into it the DTD files I found in lkb/src.

@arademaker
Member Author

Hi @goodmami, do you still have your RelaxNG schemas? I found something for MRX in https://github.com/delph-in/docs/wiki/MrsRFC.

@goodmami
Member

goodmami commented Sep 3, 2021

The MRX one is as on the wiki. I have added it and the DMRX one to this repo. I could only find a partially-finished dmrs.rnc file, so I just now finished it up. The notes below regard dmrs.rnc.

There are two main differences from the DTD:

  • I added the top and index optional attributes on <dmrs> elements, as these are now commonly used
  • I restrict cfrom and cto values to the xsd:int type, following the MRS schema

You might find that this does not validate DMRSs from either PyDelphin or the LKB (I tested LKB-FOS), for several reasons:

  1. Missing <dmrs-list> top-element (both LKB and PyDelphin; depends on how encoding functions are called)
  2. Property names are printed in upper-case (PyDelphin; see pydelphin#333, "Upper/lower case not normalized when encoding/decoding DMRX")
  3. Underspecified property values are not u, but bool/pers/etc. (both LKB and PyDelphin)
  4. Specified boolean values are + and -, not plus and minus (PyDelphin)
  5. prontype property name is spelled pt (both LKB and PyDelphin)

(4) is funny: the DTD only has plus and minus because DTDs cannot specify + as an attribute value, so it appears to be just a hackish workaround. The LKB seems to anticipate this and outputs plus and minus, but PyDelphin does not. (3) and (5) are mismatches between the grammar definitions and the DTD.
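
For (2) and (4), one possible workaround on the document side is to normalize the XML before validating it. The following is only a minimal sketch, not something from PyDelphin or the LKB: the function name normalize_dmrx and the sample document are hypothetical, and it assumes that properties are serialized as attributes of <sortinfo> elements. It does nothing about (1), (3), or (5).

```python
import xml.etree.ElementTree as ET

# Assumption: the strict schema spells boolean values as words,
# while some serializers emit "+" and "-".
BOOL_WORDS = {"+": "plus", "-": "minus"}


def normalize_dmrx(xml_text: str) -> str:
    """Lower-case <sortinfo> attribute names and spell out +/- values."""
    root = ET.fromstring(xml_text)
    # iter() searches the whole tree, so <dmrs> and <dmrs-list> roots both work.
    for sortinfo in root.iter("sortinfo"):
        fixed = {
            name.lower(): BOOL_WORDS.get(value, value)
            for name, value in sortinfo.attrib.items()
        }
        sortinfo.attrib.clear()
        sortinfo.attrib.update(fixed)
    return ET.tostring(root, encoding="unicode")


# Hypothetical one-node document with upper-case property names and +/- values
sample = (
    '<dmrs cfrom="-1" cto="-1">'
    '<node nodeid="10000" cfrom="0" cto="5">'
    '<realpred lemma="bark" pos="v" sense="1"/>'
    '<sortinfo cvarsort="e" SF="prop" TENSE="past" PROG="-" PERF="-"/>'
    "</node>"
    "</dmrs>"
)
normalized = normalize_dmrx(sample)
print(normalized)
```

Only attributes on <sortinfo> are touched, so values such as cfrom="-1" on other elements are left alone.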

I therefore made the RelaxNG schema easy to customize for a grammar. One could either edit the file directly or create a new RelaxNG file that includes dmrs.rnc and replaces some definitions. Here's an example of the latter:

# File: dmrs-erg-2020.rnc
# Note: assumes dmrs.rnc is in the same directory

include "dmrs.rnc" {

  # Allow either <dmrs> or <dmrs-list> as root
  start = Dmrs | DmrsList

  # Redefine property attributes for ERG-2020
  Properties = attribute num { "sg"|"pl"|"number" }?,
               attribute pers { "1"|"2"|"3"|"pers" }?,
               attribute gend { "m"|"f"|"n"|"m-or-f"|"gender" }?,
               attribute sf { "prop"|"ques"|"comm"|"prop-or-ques"|"sf" }?,
               attribute tense { "past"|"pres"|"fut"|"tensed"|"untensed"|"tense" }?,
               attribute mood { "indicative"|"subjunctive"|"mood" }?,
               attribute pt { "std"|"zero"|"refl"|"notpro"|"pt" }?,
               # Allow all of plus, minus, +, and - to accommodate both the LKB and PyDelphin
               attribute prog { "plus"|"minus"|"+"|"-"|"bool" }?,
               attribute perf { "plus"|"minus"|"+"|"-"|"bool" }?,
               attribute ind { "plus"|"minus"|"+"|"-"|"bool" }?

}

You can then use it with Jing as follows:

$ jing -c dmrs-erg-2020.rnc dmrs.xml
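
To check several files at once from a script, the same command could be wrapped in Python. This is a rough sketch under a few assumptions: jing is on the PATH, it accepts more than one XML file after the schema, and it reports invalid input through a non-zero exit status. The helper names jing_command and validate are made up for illustration.

```python
import subprocess
from typing import List, Sequence


def jing_command(schema: str, documents: Sequence[str]) -> List[str]:
    """Build the jing invocation; -c selects the RelaxNG compact syntax."""
    return ["jing", "-c", schema, *documents]


def validate(schema: str, documents: Sequence[str]) -> bool:
    """Return True if jing exits with status 0 for the given documents.

    Assumption: a non-zero exit status means at least one document is invalid.
    """
    result = subprocess.run(
        jing_command(schema, documents), capture_output=True, text=True
    )
    if result.returncode != 0:
        # Show whatever diagnostics jing produced
        print(result.stdout or result.stderr)
    return result.returncode == 0


# Show the command that would be run; call validate(...) once jing is installed.
print(" ".join(jing_command("dmrs-erg-2020.rnc", ["dmrs.xml"])))
```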
