Assertion error with pdfsizeopt #110
Comments
Thank you for reporting this problem! Object 523 in the input file nnGm.pdf looks like this (after some transformations):
The line
It would be awesome if pdfsizeopt were able to repair broken PDF files such as nnGm.pdf. However, adding and maintaining such repair code is not feasible until it gets funding. As of now, to get nnGm.pdf processed by pdfsizeopt successfully, you need to regenerate nnGm.pdf with non-broken software first. Alternatively, you may want to preprocess it with pdftk or qpdf (and feed the output of those tools to pdfsizeopt), which may be more lenient on these kinds of syntax errors.
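For anyone following the qpdf route suggested above, here is a minimal Python sketch of the two-step pipeline. It assumes `qpdf` and `pdfsizeopt` are both on `PATH`; the function names and the intermediate file names are hypothetical and are not part of pdfsizeopt. qpdf exits with status 3 when it finishes with warnings, which is likely for a damaged input, so the sketch treats both 0 and 3 as success for that step.

```python
import subprocess
from typing import Callable, List, Set, Tuple

Step = Tuple[List[str], Set[int]]


def build_steps(src: str, repaired: str, optimized: str) -> List[Step]:
    """Return (argv, acceptable exit codes) for each stage of the pipeline."""
    return [
        # qpdf rewrites the file; exit status 3 means "succeeded with warnings".
        (["qpdf", src, repaired], {0, 3}),
        # pdfsizeopt then optimizes the rewritten file.
        (["pdfsizeopt", repaired, optimized], {0}),
    ]


def preprocess_and_optimize(
    src: str,
    repaired: str,
    optimized: str,
    run: Callable = subprocess.run,
) -> None:
    """Run qpdf first, then pdfsizeopt, stopping at the first failing step."""
    for argv, ok_codes in build_steps(src, repaired, optimized):
        result = run(argv)
        if result.returncode not in ok_codes:
            raise RuntimeError(
                f"{argv[0]} exited with status {result.returncode}"
            )


# Example call (file names other than nnGm.pdf are placeholders):
# preprocess_and_optimize("nnGm.pdf", "nnGm.qpdf.pdf", "nnGm.opt.pdf")
```

Whether qpdf can actually recover a given broken object depends on the file, so a failure in the first step simply means this route does not help for that input.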
I'm not sure whether the raw XML is wrong. It seems well structured. The
2863 0 obj
<</Type/Metadata/Subtype/XML/Length 4228>>
stream
<?xpacket begin="<U+FEFF>" id="W5M0MpCehiHzreSzNTczkc9d"?>
<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="Adobe XMP Core 5.6-c015 84.159810, 2016/09/10-02:41:30 ">
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description rdf:about=""
xmlns:pdf="http://ns.adobe.com/pdf/1.3/"
xmlns:pdfx="http://ns.adobe.com/pdfx/1.3/"
xmlns:xmp="http://ns.adobe.com/xap/1.0/"
xmlns:xmpMM="http://ns.adobe.com/xap/1.0/mm/"
xmlns:stRef="http://ns.adobe.com/xap/1.0/sType/ResourceRef#"
xmlns:dc="http://purl.org/dc/elements/1.1/">
<pdf:Producer>Acrobat Distiller 7.0 (Windows)</pdf:Producer>
<pdfx:codeMantraↂ002Cↂ0020LLC>http://www.codemantra.com</pdfx:codeMantraↂ002Cↂ0020LLC>
<pdfx:Universalↂ0020PDF>The process that creates this PDF constitutes a trade secret of codeMantra, LLC and is protected by the copyright laws of the United States</pdfx:Universalↂ0020PDF>
<xmp:CreateDate>2011-08-13T07:43:57+05:30</xmp:CreateDate>
<xmp:ModifyDate>2019-03-18T16:05:59+08:00</xmp:ModifyDate>
<xmp:MetadataDate>2019-03-18T16:05:59+08:00</xmp:MetadataDate>
<xmp:CreatorTool>PScript5.dll Version 5.2</xmp:CreatorTool>
<xmpMM:DocumentID>uuid:DF57C9D151C5E0119B6BD1C4AAD1A2F9</xmpMM:DocumentID>
<xmpMM:InstanceID>uuid:66b746de-c760-f544-a64a-127d172cc809</xmpMM:InstanceID>
<xmpMM:DerivedFrom rdf:parseType="Resource">
<stRef:documentName>uuid:131bfa5e-206c-4a25-aa69-1a9c002a577a</stRef:documentName>
<stRef:documentID>uuid:ff0ad5d3-c572-4519-8102-3197dccd28d4</stRef:documentID>
</xmpMM:DerivedFrom>
<dc:format>application/pdf</dc:format>
<dc:title>
<rdf:Alt>
<rdf:li xml:lang="x-default">Population Genetics of Bacteria : A Tribute to Thomas S. Whittam</rdf:li>
</rdf:Alt>
</dc:title>
<dc:creator>
<rdf:Seq>
<rdf:li>Walk, Seth T.(Editor)</rdf:li>
</rdf:Seq>
</dc:creator>
</rdf:Description>
</rdf:RDF>
</x:xmpmeta>
[DOZENS OF SPACE CHARS HERE]
<?xpacket end="w"?>
endstream
endobj
pdfsizeopt doesn't have a problem with object 2863 (in fact, pdfsizeopt keeps such XML objects intact); it is complaining about the syntax error in object 523. Unfortunately, I can't advise you on how to fix the input PDF beyond the advice I've already given (i.e. try pdftk or qpdf). If you manage to fix it, please update this issue!
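To confirm independently that a metadata stream like object 2863 is well-formed XML, a check along the following lines can help. This is a minimal sketch using Python's standard library; the function name is hypothetical, and it assumes the stream body (everything between `stream` and `endstream`) has already been extracted as text. It says nothing about the PDF-level syntax of other objects such as 523.

```python
import xml.etree.ElementTree as ET


def xmp_is_well_formed(packet: str) -> bool:
    """Return True if the XMP packet parses as well-formed XML."""
    try:
        # The <?xpacket ...?> wrappers are processing instructions, which
        # the parser accepts before and after the root element.
        ET.fromstring(packet.strip())
    except ET.ParseError:
        return False
    return True


# Example (the variable is a placeholder for the extracted stream text):
# print(xmp_is_well_formed(stream_text))
```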
I managed to fix it with
Then it works. I even found that qdf leads to a smaller file.
I am facing #111 now.
Using pdfsizeopt_libexec_darwin-v1.tar.gz.
nnGm.pdf