feat: Implement std.parseXmlJsonml #704
base: master
jsonml.go
b.currDepth++
case xml.CharData:
	t := token.(xml.CharData)
	s := strings.TrimSpace(string(t))
Is there a reason for trimming it? I think whitespace is actually meaningful in XML because of things like <b>foo</b>bar vs <b>foo</b> bar. If we're only supporting the (admittedly common) case of the XML file being whitespace agnostic (i.e. whoever consumes the XML doesn't care about the whitespace) then we can document it as not supporting that. However, I think just not trimming the strings here will fix it, will it not?
The reason for trimming whitespace is to simplify the JsonML. Currently, with formatted XML, the JsonML becomes a bit complex to work with because of empty content. We want to ignore empty lines, but I agree the current implementation would also trim spaces around the content. I could take care of it.
I am thinking of allowing users to control this behaviour. We can provide an argument trimWhitespaces = true to the function, which would control whether the user wants to remove formatting whitespace.
WDYT?
I found this which looks interesting. http://xml.coverpages.org/rfc-wshp19990416.html I wonder if the current behaviour of the manifester is wrong?
What I understand from these links is that it is safe to remove boundary whitespace as the default behaviour. We would provide an argument preserveWhitespaces = false that can be set to preserve it if anyone wants to do so.
There seems to be some diversity in what other parsers are doing so I think anything we do is fine as long as it's documented.
Updated
Hi @sparkprime
Implement std.parseXmlJsonml as a standard function. cpp-jsonnet PR: google/jsonnet#1092