double quotes " can break decoding #159
Hi @markusramsak -- A quoted part takes precedence. Specifically, "An 'encoded-word' MUST NOT appear within a 'quoted-string'." (see https://tools.ietf.org/html/rfc2047#section-5). I believe what you're saying is that the MIME-encoded part isn't decoded, but that's correct behaviour as far as I'm aware. It would be hard to build an exception for what you want without breaking input that should be considered valid, because the quotes are supposed to take precedence, at least as far as I can tell. Feel free to correct me with relevant examples, such as how popular mail parsers or clients handle this, or RFCs or other libraries that handle your situation differently, to facilitate a discussion about it.
I know that it shouldn't happen, but I am the programmer of a mail client with more than 100,000 mails to parse and display, and the only thing I can say is: it happens. Other mail clients like Gmail or Apple Mail can decode this mail subject correctly, and I would like to as well. Maybe it is just a matter of replacing
Unfortunately, the way the parser works, the 'part looking for quotes' is separate from the 'part looking for mime encoded parts'. It's semantically okay for a mime-encoded part to have a quote in it, it just won't be handled as a 'control character' terminating (or starting) a quoted-part.
If it can't be done on your side, then I will replace these wrong characters in the "From: " line on my side before it is parsed by your parser.
By the way, you did an excellent job with this library! On average, about 9995 out of 10000 emails from my webmail client (backed by your library) are parsed without any issues.
I'm not sure that it can't, but it would be an effort -- I'd have to change the precedence of how things are parsed, which would make some valid but extremely unlikely cases invalid, like
If you're able to sanitize for exceptions you know of like that, I think that would be the way to go, at least for now... we can leave this open and look when there's time or if it's affecting more people. You could also try emailing the folks at Amway to tell them there's an issue with their emails :) Maybe they're using a house-built system that needs to be patched, or maybe it's a huge commercial system, which would mean handling this scenario should be prioritized.
Excellent, very happy to hear that!
The following simplified version of the original CANNOT be parsed correctly because of the closing quote in the "From: " line.
If I move the closing quote after ?=, it works.
Please fix that so the parser can handle this.
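The client-side workaround discussed in this thread (moving the misplaced closing quote after `?=` before the header reaches the parser) could be sketched roughly as follows. This is only an illustration in Python, not code from the library. Since the original sample header is not shown here, it assumes the malformed shape is a double quote sitting directly in front of the `?=` that terminates an encoded-word. The function name `move_misplaced_quote` and the sample address are hypothetical.

```python
import re

# Assumption: the broken headers look like  "=?charset?Q?text"?=
# i.e. the closing double quote sits directly before the terminating "?=".
# Group 1 captures everything up to (but not including) the stray quote.
_MISPLACED_QUOTE = re.compile(r'(=\?[^?\s]+\?[bBqQ]\?[^?\s"]*)"\?=')


def move_misplaced_quote(header_line: str) -> str:
    """Move a quote that precedes '?=' to just after it.

    Well-formed encoded-words are left untouched because the pattern
    only matches when a double quote directly precedes '?='.
    """
    return _MISPLACED_QUOTE.sub(r'\1?="', header_line)


if __name__ == "__main__":
    # Hypothetical malformed header, for illustration only.
    broken = 'From: "=?UTF-8?Q?Some_Name"?= <sender@example.com>'
    print(move_misplaced_quote(broken))
```

The rewrite is deliberately narrow: it only touches the one pattern assumed above, so headers that are already valid pass through unchanged. Any other malformed variants would need their own rule.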