syn-invalid-codepoint-escaped-bad-01.rq should pass #167
Comments
Thank you for reporting this! RDF literal values must be valid UTF-8 strings and
Jena gets this right (negative syntax) in strict SPARQL 1.1/1.2 parsing; Jena gets it wrong in normal mode and that's a bug. To pass the test, a system has to check the string for surrogates and reject them because they are not legal in an XSD string (which is what that literal is). In RDF 1.2 there are "RDF Strings" (no surrogates) for lexical forms, which will make it clearer. JavaCC has built-in support for bytes-to-Java conversion, and Jena's two parsers (SPARQL strict and normal-with-extensions) configure JavaCC differently. (And Java's character handling of UTF-8 is not strict, as @Tpt says.)
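The surrogate check described above could be sketched in Python roughly as follows. This is only an illustration, not Jena's implementation: the helper names (`is_surrogate`, `decode_codepoint_escapes`) are hypothetical, and the sample query assumes the test literal contains an escaped surrogate such as `\uD800`, which is not quoted in this thread.

```python
import re

# SPARQL codepoint escapes: \uXXXX (4 hex digits) or \UXXXXXXXX (8 hex digits).
_CODEPOINT_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")


def is_surrogate(codepoint: int) -> bool:
    """True for code points in the UTF-16 surrogate range U+D800..U+DFFF."""
    return 0xD800 <= codepoint <= 0xDFFF


def decode_codepoint_escapes(query: str) -> str:
    """Replace codepoint escapes, rejecting surrogates and out-of-range values.

    Hypothetical helper: a real parser would also need to handle the rest of
    the query syntax; this only looks at the escape sequences themselves.
    """
    def replace(match: re.Match) -> str:
        codepoint = int(match.group(1) or match.group(2), 16)
        if is_surrogate(codepoint) or codepoint > 0x10FFFF:
            raise ValueError(f"illegal codepoint escape: {match.group(0)}")
        return chr(codepoint)

    return _CODEPOINT_ESCAPE.sub(replace, query)


if __name__ == "__main__":
    print(decode_codepoint_escapes(r'ASK { ?s ?p "\u0041" }'))
    try:
        decode_codepoint_escapes(r'ASK { ?s ?p "\uD800" }')  # assumed test input
    except ValueError as err:
        print("rejected:", err)
```

Rejecting at escape-decoding time is one possible design choice; the same range test could equally be applied to the literal's lexical form after parsing.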
Very interesting, thank you for the explanation! So I am wondering: should a parser reject the above query syntactically, or is the test actually targeting functionality that should be implemented in a pre- or post-processing step, separate from parsing? Grammatically, I think the above should be fine; see String in the grammar.
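To make the distinction in this question concrete, here is a rough Python sketch of the two stages. It assumes codepoint escapes have already been decoded, so the literal contains a raw U+D800 character. The regex is a hand-written rendering of the `STRING_LITERAL2` and `ECHAR` productions from the SPARQL 1.1 grammar, and `lexical_form_is_valid` is a hypothetical post-processing helper, not part of any parser mentioned here.

```python
import re

# Hand-written rendering of STRING_LITERAL2 / ECHAR from the SPARQL 1.1 grammar:
#   '"' ( [^#x22#x5C#xA#xD] | ECHAR )* '"',  ECHAR ::= '\' [tbnrf\"']
STRING_LITERAL2 = re.compile(r'"(?:[^"\\\n\r]|\\[tbnrf\\"\'])*"')


def lexical_form_is_valid(lexical: str) -> bool:
    """Post-parse check (hypothetical): no surrogate code points allowed."""
    return not any(0xD800 <= ord(ch) <= 0xDFFF for ch in lexical)


# A double-quoted literal whose content is a lone surrogate (assumed input).
token = '"' + chr(0xD800) + '"'

# Stage 1: the token pattern alone accepts the literal.
grammar_ok = STRING_LITERAL2.fullmatch(token) is not None

# Stage 2: a separate value check rejects it.
value_ok = lexical_form_is_valid(token[1:-1])

print(grammar_ok, value_ok)  # True False
```

Under these assumptions the token pattern itself does not exclude surrogates, so the rejection has to come from an additional check somewhere around the grammar.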
I checked the spec and ... it's not completely clear. Issue raised: w3c/sparql-query#189
rdf-test is useful in reflecting community interpretation and agreement.
A test for a simple Lark SPARQL 1.1 parser fails for
sparql/sparql11/syntax-query/syn-invalid-codepoint-escaped-bad-01.rq
because the parser is able to parse the query and doesn't fail. I feel like this should actually pass, since the object is just a string literal.
I tried this query with the GraphDB and Wikidata/Blazegraph SPARQL interfaces and both accept the query.