Expected behaviour of zipping non-XML text (other than using base64). #299

Open
LeifW opened this issue Feb 22, 2020 · 1 comment

LeifW commented Feb 22, 2020

When data comes in for pxp:zip as:

<c:data xml:base="mimetype" content-type="application/octet-stream" encoding="base64">YXBwbGljYXRpb24vZXB1Yit6aXA=</c:data>

it gets saved in the .zip file with the contents: application/epub+zip
However, if it comes in as:

<c:data xml:base="mimetype" content-type="text/plain">application/epub+zip</c:data>

it gets saved into the zip file as <?xml version="1.0" encoding="UTF-8"?><c:data xmlns:c="http://www.w3.org/ns/xproc-step" content-type="text/plain">application/epub+zip</c:data>

I was hoping for just the plain text contents in the file, no XML.
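For reference, here is a quick check of what I expect to end up in the zip entry in both cases. It is a minimal Python sketch that only decodes the base64 payload from the first example; it does not involve pxp:zip itself.

```python
import base64

# Payload copied from the base64-encoded <c:data> example above.
payload = "YXBwbGljYXRpb24vZXB1Yit6aXA="

# Decoding gives the bytes I would like stored for the text/plain case too.
decoded = base64.b64decode(payload)
print(decoded.decode("ascii"))  # application/epub+zip
```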

Ah, looking at the source code introduced in #133, I got it to work: I just had to change the element from <c:data/> to <c:result/>:
https://github.com/ndw/xmlcalabash1/blob/saxon99/src/main/java/com/xmlcalabash/extensions/Zip.java#L595
If I'm reading that if-statement right, there are two special cases. If the element is in the xproc-step namespace (or has no namespace) and has an encoding="base64" attribute, the text content gets base64-decoded and saved. If it's a <c:result/> element with a content-type that starts with "text/", the text content gets saved as-is. Anything else is serialized as XML.
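To make my reading of that branch concrete, here is a rough Python sketch of the decision as I understand it. This is not the actual Zip.java code: the function name `zip_entry_bytes` is made up for illustration, and the text branch is keyed on <c:result/> because that is the element that worked for me above.

```python
import base64
import xml.etree.ElementTree as ET

C_NS = "http://www.w3.org/ns/xproc-step"


def zip_entry_bytes(xml_text: str) -> bytes:
    """Hypothetical model of what pxp:zip stores for one input document."""
    root = ET.fromstring(xml_text)

    # ElementTree tags look like "{namespace}local"; un-namespaced tags have no "{".
    if root.tag.startswith("{"):
        ns, local = root.tag[1:].split("}", 1)
    else:
        ns, local = "", root.tag

    text = "".join(root.itertext())
    content_type = root.get("content-type") or ""

    # Case 1: xproc-step (or no) namespace plus encoding="base64": decode and store.
    if ns in (C_NS, "") and root.get("encoding") == "base64":
        return base64.b64decode(text)

    # Case 2: c:result with a text/* content type: store the text content only.
    if ns == C_NS and local == "result" and content_type.startswith("text/"):
        return text.encode("utf-8")

    # Otherwise: store the whole document as serialized XML.
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)
```

Under this model, a <c:data content-type="text/plain"> element falls through to the last branch, which matches the XML-wrapped output I saw.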

That seems a little non-obvious and undocumented. The XProc spec doesn't seem to document any attributes on <c:result/>, while <c:data/> is documented and is what p:data returns. The spec also mentions treating the contents of c:data as text, e.g. for the p:validate-with-relax-ng and p:xquery steps.

I'm not clear on the difference between those two elements: some steps return <c:data/>, and some return <c:result/>?
Anyway, as a consequence of this, a file sourced with <p:data content-type="text/plain" href="mimetype"/> and run through pxp:zip gets saved as its text inside an XML wrapper, while one sourced with <p:data href="mimetype"/> gets saved as plain text (no wrapper), because p:data defaults to emitting the element with content-type="application/octet-stream" and base64-encoded content.


LeifW commented Feb 22, 2020

Reading through the spec, I'm not seeing <c:result content-type="text/plain"/> elements being generated by any of the steps I've looked at; they all come back with no attributes. I'm curious when that would come up (besides writing one literally inside <p:inline>).
