Is ⲁⲣⲭⲏ not in TUFTS? #179
Comments
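This is a bug in the XML data - this entry is missing the <ref type="greek_lemma::grl_lemma"> as well as the DDGLC grl_ID - @simondschweitzer @KaJohn-DDGLC can this be fixed in the data? I don't want to just assign it an arbitrary ID and clash with the DDGLC database, but without these attributes the interface won't behave properly with this entry.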
Perhaps it fell off in the latest version? The first one contains it, it would seem:
<etym><ref type="greek_lemma::grl_ID">289</ref><ref type="greek_lemma::grl_lemma">ἀρχή</ref><ref type="greek_lemma::grl_meaning">beginning, origin, first principle, authority</ref><ref type="greek_lemma::grl_pos">noun</ref><ref type="greek_lemma::grl_ref">LSJ 252a; Preisigke 1:219</ref></etym>
On mobile currently. I'll do a sanity check tomorrow; I really need these entries to be in perfect shape, even though I supply my own hyperlinks into Perseus.
Tnx Amir!
Cheers
Hi Amir,
it is a bit more confusing
BBAW_Lexicon_of_Coptic_Egyptian-v4-2020.xml contains one entry on ἀρχή:
<entry type="foreign" xml:id="C199">
That one doesn't point to the Greek lexicon Perseus TUFTS via anything
like grl_.
Then there is DDGLC_Lexicon_of_Greek_Loanwords_in_Coptic-v2-2020.xml,
and there is another entry on ἀρχή: <entry xml:id="C8473">, and that
does contain "all the TUFTS goodies".
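The difference between the two entries can be checked mechanically. Below is a minimal sketch, assuming each entry carries its Greek references as <ref> elements somewhere under <entry>, as in the <etym> snippet quoted earlier in this thread. The inline sample, the empty <etym/> on C199, and the function name are made up for illustration, and the sketch ignores the TEI namespace that the real files may declare.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the predefined XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical, heavily reduced sample; not copied from the release files.
SAMPLE = """<entries>
  <entry type="foreign" xml:id="C199"><etym/></entry>
  <entry xml:id="C8473"><etym>
    <ref type="greek_lemma::grl_ID">289</ref>
    <ref type="greek_lemma::grl_lemma">ἀρχή</ref>
  </etym></entry>
</entries>"""


def entries_missing_greek_refs(root):
    """Return the xml:id of every entry lacking a grl_ID or grl_lemma ref."""
    required = {"greek_lemma::grl_ID", "greek_lemma::grl_lemma"}
    missing = []
    for entry in root.iter("entry"):
        present = {ref.get("type") for ref in entry.iter("ref")}
        if not required <= present:
            missing.append(entry.get(XML_ID))
    return missing


root = ET.fromstring(SAMPLE)
print(entries_missing_greek_refs(root))
```

On the full dictionary this would also list every native Coptic entry, so a real run would need some filter for Greek loanwords first; which attribute marks those reliably is not something this sketch assumes.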
All that pertains to version 1.2 of KELLIA, release date July 22nd 2020
If I look at the very first version, I see that C8473 was already present
in the DDGLC xml. Similarly for the BBAW xml: there doesn't seem (to me) to
be a significant difference regarding this matter, and C199 just appears to
be "a premature Greek word", given that a few thousand Greek words appear
in the dictionary after C8042. I think it is safe to assume that C199
never had anything different from what it has now, nor do I think that it
needs fixing by adding a pointer. I think it must be cleansed from the
dictionary, harmonised with its sibling entry, so that only one remains.
There are a few "early Greek words" in the dictionary, and I count 3
when I use WbGWKDT as selection criterion: C148, C175, C199. Each of
those occurs twice, once in the early range (below C200) and once in the
Greek addition range, C8000+
It's not a bug, this is what is called data contamination, and it is
"stuff that happens" when you start out on a, let's call it "Agile" path.
Attached (from v1.0!) are 561 non-unique data entries when considering
Coptic word and grammar classification. Most of these are alright
because some Coptic words are "extensive homonyms", but some of these
are nomads, orphans, stray cats that should be put away. Just go through
them all, and when the C-numbers are more than 5-10 apart, I'd check it:
9197 ⳥ Noun
9840 ⳥ Noun
5837 ϣⲁⲗⲁⲛⲟⲥ Noun
10922 ϣⲁⲗⲁⲛⲟⲥ Noun
2649 ϣⲉ ⲛⲛⲟϩ Noun masculine
5700 ϣⲉ ⲛⲛⲟϩ Noun masculine
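The check described above could be sketched roughly as follows: group entries by Coptic word and grammar class, then flag any group whose C-numbers lie far apart. The threshold of 10 follows the "5-10 apart" rule of thumb. The tuple layout and the function name are assumptions, and the rows are just the six sample rows listed above.

```python
from collections import defaultdict

# (C-number, Coptic word, grammar class), taken from the six rows above.
ROWS = [
    (9197, "⳥", "Noun"),
    (9840, "⳥", "Noun"),
    (5837, "ϣⲁⲗⲁⲛⲟⲥ", "Noun"),
    (10922, "ϣⲁⲗⲁⲛⲟⲥ", "Noun"),
    (2649, "ϣⲉ ⲛⲛⲟϩ", "Noun masculine"),
    (5700, "ϣⲉ ⲛⲛⲟϩ", "Noun masculine"),
]


def suspicious_duplicates(rows, max_gap=10):
    """Map (word, grammar) to its sorted C-numbers when they span more than max_gap."""
    groups = defaultdict(list)
    for number, word, grammar in rows:
        groups[(word, grammar)].append(number)
    flagged = {}
    for key, numbers in groups.items():
        numbers.sort()
        if len(numbers) > 1 and numbers[-1] - numbers[0] > max_gap:
            flagged[key] = numbers
    return flagged


for (word, grammar), numbers in suspicious_duplicates(ROWS).items():
    print(word, grammar, numbers)
```

Genuine homonyms entered side by side would have adjacent numbers and stay below the threshold, which is the point of the gap test. Anything flagged still needs a human look.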
Also consider C5389 for example, and C4645: the former obviously was
added superfluously - question just is: why has it gone unnoticed for so
long?
I have taken this from my own incorporation of the v1.0 KELLIA CDO that
serves as a base for my Thomas Translation, but the idea should be clear,
I think: dictionary entries must be unique across the combination of word,
grammar type and definition (I have purposely excluded the latter, as
the issue at hand seems to be that additional definitions got attached to
new entries instead of existing ones).
The bad news? C199 is marked as a feminine noun, and C8473 as genderless -
so these two don't even pop up, and the data contamination is likely
much larger.
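One way to catch pairs like this might be to compare only the coarse part of speech and drop the gender. The sketch below works under that assumption. The labels "Noun feminine" and "Noun" for C199 and C8473 are guessed from the description here, not read from the data, and the function names are hypothetical.

```python
from collections import defaultdict


def coarse_pos(grammar):
    """Keep only the first word of a grammar label, e.g. 'Noun feminine' -> 'Noun'."""
    return grammar.split()[0] if grammar.strip() else ""


def duplicates_ignoring_gender(rows):
    """Map (word, coarse part of speech) to entry ids when more than one entry shares it."""
    groups = defaultdict(list)
    for entry_id, word, grammar in rows:
        groups[(word, coarse_pos(grammar))].append(entry_id)
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Assumed labels for the two entries discussed in this thread.
ROWS = [
    ("C199", "ⲁⲣⲭⲏ", "Noun feminine"),
    ("C8473", "ⲁⲣⲭⲏ", "Noun"),
]

print(duplicates_ignoring_gender(ROWS))
```

Loosening the key this way will also pull in more true homonyms, so the list to review by hand grows along with the number of real duplicates found.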
Martijn
On 2021-09-28 21:42, Amir Zeldes wrote:
This is a bug in the XML data - this entry is missing the
<ref type="greek_lemma::grl_lemma"> as well as the DDGLC grl_ID -
@simondschweitzer <https://github.com/simondschweitzer> @KaJohn-DDGLC
<https://github.com/KaJohn-DDGLC> can this be fixed in the data? I
don't want to just assign it an arbitrary ID and clash with the DDGLC
database, but without these attributes the interface won't behave
properly with this entry.
Thanks for this information - unfortunately I do not have access to the BBAW or DDGLC source databases that would need to be adjusted or the scripts that generate the lexicon XML itself, so I'm adding @dwerning to ask if this can be fixed in the source files. Is there someone else we need to contact about this?
Describe the bug
I would expect a Greek pointer to ⲁⲣⲭⲏ, https://coptic-dictionary.org/entry.cgi?tla=C199, just like e.g. ⲁⲛⲁⲣⲭⲟⲥ
To Reproduce
Steps to reproduce the behavior:
Expected behavior
I'd expect the pointer to http://www.perseus.tufts.edu/hopper/text?doc=Perseus%3Atext%3A1999.04.0058%3Aentry%3Da)rxh%2F
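The expected link can be rebuilt from the Beta Code form of the lemma (a)rxh/ for ἀρχή). The sketch below only reproduces the percent-encoding seen in the URL above. The base URL and document identifier are copied from that URL, the function name is hypothetical, and whether Perseus accepts the same encoding for every Beta Code character is an assumption.

```python
from urllib.parse import quote

# Both constants are copied from the URL quoted in this issue.
PERSEUS_HOPPER = "http://www.perseus.tufts.edu/hopper/text?doc="
LEXICON_DOC = "Perseus:text:1999.04.0058"


def perseus_entry_url(beta_code_lemma):
    """Build a Perseus lexicon link for a lemma given in Beta Code."""
    doc = f"{LEXICON_DOC}:entry={beta_code_lemma}"
    # Leave parentheses (breathing marks) unescaped, as in the URL above;
    # ':', '=' and '/' are percent-encoded.
    return PERSEUS_HOPPER + quote(doc, safe="()")


print(perseus_entry_url("a)rxh/"))
```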
Screenshots
N/A
Desktop (please complete the following information):
Smartphone (please complete the following information):
N/A
Additional context
Did you miss it because the primary meaning you assigned to it, 'authority', is so very Christian, and unlike its original? I'd swap 1 and 2 anyway, unless you want to argue that this Greek loanword first came to mean 'authority' in Coptic, and only afterwards 'beginning'.
And yes, I'm back at the Commentary