-
Notifications
You must be signed in to change notification settings - Fork 96
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix Encoding::CompatibilityError when nested message/rfc822 parts contain non-ASCII header bytes #569
Conversation
This patch changes DummySource to follow a similar pattern to the Maildir source, which works by opening the files directly for parsing. Previously DummySource was parsing strings fed through StringIO, which was introducing extra (or at least, different) complications with encoding. See for example the comments in issue sup-heliotrope#205.
This was fixed in rmail 1.1.2: terceiro/rmail@0479219
In the code for handling message/rfc822 MIME parts, message.rb line 498, we were calling the #normalize_whitespace method on the body string before it was decoded. I'm not too sure if messing with whitespace is the right thing to do there, but that aside, that method was then also calling #fix_encoding! which would forcibly transcode the raw body to UTF-8. Instead, we want to keep the body as ASCII-8BIT at that point, and let it be decoded using all the normal message decoding mechanisms. The only other calls to #normalize_whitespace are in the UI, and in the code path which handles body text of messages, message.rb line 592, where the body text has already been decoded. So it seems like we can safely make #normalize_whitespace just mess with whitespace and leave the string encoding alone. Fixes sup-heliotrope#205.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @danc86. I was able to reproduce the crash, and the fix resolves. Code looks good to me. Nicely done!
While reproducing, I spotted two preexisting cosmetic-only issues related to enclosed messages, but these aren't blockers because they aren't introduced by this PR - indeed, they are even present in Sup 0.22.1. I'll raise new issues for these.
Indeed, I noticed the mess with EnclosedMessage as well, that was going to be next on my list to fix, alongside the many other string encoding problems I keep finding :-) |
In class EnclosedMessage (within lib/sup/message_chunks.rb): - Ternary operator logic was backwards, e.g., setting @from to "unknown sender" if and only if "from" was truthy(!) - @lines was constructed using local variables instead of instance variables, which (happily?) masked the effects of the above backwards ternary logic. But this also created issue sup-heliotrope#571: Redwood::Person object representations were shown directly to the user, rather than the human-friendly parsed strings intended to be shown by Redwood::Person#full_address. - @lines was constructed as a single string (with literal line breaks), instead of following the normal convention of an array of lines. This hindered testability. In this commit, we: (1) Fix the above three issues in EnclosedMessage. (2) Uncomment and expand upon the "TODO" tests for EnclosedMessage headers. These placeholders were added in PR sup-heliotrope#569, credit @danc86. (3) Modify test/fixtures/non-ascii-header-in-nested-message.eml to follow best practices for a testing email address, i.e., use reserved ".invalid" domain instead of "b.c" - to ensure that the testing email address remains invalid forever.
In class EnclosedMessage (within lib/sup/message_chunks.rb): - Ternary operator logic was backwards, e.g., setting @from to "unknown sender" if and only if "from" was truthy(!) - @lines was constructed using local variables instead of instance variables, which (happily?) masked the effects of the above backwards ternary logic. But this also created issue #571: Redwood::Person object representations were shown directly to the user, rather than the human-friendly parsed strings intended to be shown by Redwood::Person#full_address. - @lines was constructed as a single string (with literal line breaks), instead of following the normal convention of an array of lines. This hindered testability. In this commit, we: (1) Fix the above three issues in EnclosedMessage. (2) Uncomment and expand upon the "TODO" tests for EnclosedMessage headers. These placeholders were added in PR #569, credit @danc86. (3) Modify test/fixtures/non-ascii-header-in-nested-message.eml to follow best practices for a testing email address, i.e., use reserved ".invalid" domain instead of "b.c" - to ensure that the testing email address remains invalid forever.
Fixes the remaining crash in #205.