GNOME Bugzilla – Bug 552121
Drop UUENCODE inline filter
Last modified: 2010-10-08 12:54:07 UTC
Hi all, Sending a message with "begin 2011" creates an attachment of an unknown type. This doesn't always happen, since it depends on the size of the paragraph and the amounts of text after and before the "begin 2011" phrase. Note that "begin 2011" is Dutch for "at the beginning of 2011", so it is quite normal for me and other Dutch folks to use it. This happened when I wrote someone a letter with the 2.6.3 version that comes with Debian stable. I've tested version 2.23.91 with the Live CD of Ubuntu 8.10 (Alpha 5) and the same thing happens. So this bug has been around for some time and is still around. Please see the attachments Evolution_and_begin_2011.txt and Evolution_and_begin_2011.png for more information. With best regards, Mr. N.L.M. de Jonge
Created attachment 118661 [details] Example text that creates the "begin 2011" bug. Look at Evolution_and_begin_2011.png to see what happens when this text is being sent.
Created attachment 118662 [details] Screenshot of the result of the "begin 2011" bug. This is what happens when the example text in Evolution_and_begin_2011.txt is being sent.
I'm not sure, but it could be related to the emif_scan function in em-inline-filter.c. That function has a case EMIF_PLAIN where a strncmp is done for "begin ".
Can reproduce in 2.22.3. An example plain text body that triggered this: Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum. begin 2011 Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum. Note it is a display issue. No real attachments are created if you compose the above message.. Not sure what is going on but the pointer to emif_scan() looks indeed relevant. Thanks! Dear lazyweb: what sort of inline stuff starts with "begin $SOMEDIGITS"?
I guess I found it: http://en.wikipedia.org/wiki/Uuencode
A lot of strange layout can be generated by feeding evolution invalid uuencoded inline attachments. Interesting stuff. Problem seems to be basically that inline text starting with "begin $SOMEDIGITS" will already trigger (simply said) uuencoded inline attachmanents handling routines. Maybe evolution should test more thoroughly whether some inline text really is a valid uuencoded attachment and therefore only render it as an (inline) attachment after the "end" line is reached and all lines between "begin [...]" and "end" are actually valid. Probably needs some major rewrite of the inline stuff. Worth the effort?
This text with an (totally invalid) uuencoded inline attachment crashes 2.22.3: "Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum. begin 20000 Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum. ` end" It looks like "20000" triggers it (Note: uuencoded files should have 3 or 4 digits after "begin" and not 5 digits. Also note that the uuencoding stuff doesn't crash on lines with not encoded text, which are invalid, as far as I can tell). backtrace:
+ Trace 207182
Haven't pinpointed the troubling location yet. To be investigated.
Minimal test case ("commented out" to not crash evolution while readering bugmail again): # begin 2000 foo, # bar # ` # end
it doesn't crash in trunk any more, because of bug #460204 and possibly bug #541045. Thus only the issue with wrongly recognized part of the plain text part lefts here.
I guess one thing we could do, if we encounter what looks like a uuencoding "begin" header, is try to scan ahead for the closing "end" lines and only treat the text as uuencoded if we find _both_ a valid header and footer. Haven't looked at the code at all to see how feasible this would be.
(In reply to comment #10) > Haven't looked at the code at all to see how feasible this would be. I have done that only a little. See comment #6. You might also have to scan the lines in between to see if those are valid too. Easy way out is of course to not try to do this at all: how common are uuencoded inline attachments anyway? As far as I can tell those are rather ancient ... (That feeling is just based on what passes through my Inbox.)
I agree with Matt, we can get rid of that part recognition in the evolution/mail, it's quite obsolete these days, I believe, but the proper way is that the check should be made properly, with looking for the end tag too, and adding any attachment only when found the end tag. Note: the filter is working in chunks (in general), thus you cannot just pre-read the buffer. Paul, would you mind to create such patch? I can review/test for you.
(In reply to comment #12) > Paul, would you mind to create such patch? I can review/test for you. I wouldn't mind at all. It might take me a while (since it is not very urgent, as it is a seldom encountered problem, and my minimal test case is rumored not to crash evolution 2.24 anymore.) It is not entirely clear what you propose though: 0) rip out the code for handling uuencoded inline attachments entirely; or 1) handle uuencoded inline attachments properly (i.e. only add them as attachments after an "end" footer is found [*]? Milan, is it 0) or 1)? [*] This will probably mean that: - all lines of a possible uuencoded inline attachment need to set aside in a temporary buffer; - this buffer can be added as an inline attachment only after the "end" footer is found (and it and all its preceding lines are valid too); - this buffer needs to be rescanned (for other inline stuff) once it is found not to be a valid uuencoded inline attachment. In short: quite some code will be needed.
(In reply to comment #9) > it doesn't crash in trunk any more, I seem to remember mcrha saying (on irc) that he wondered whether it also doesn't crash for other people. So, for what its worth, I can confirm it doesn't crash (anymore) on 2.23.92 (evolution-2.23.92-1.fc10.i386, which is current Fedora rawhide).
(In reply to comment #13) > Milan, is it 0) or 1)? I'm sorry for confusion. Even I prefer 1) to be done for all those sub-filters there, then I do not think the uuencode inline attachments are used at all these days, and we could drop the code. But, I'm not a mail expert here, these are just my thoughts. (For example for inline PGP signature, it should break in a similar way, when you have there only half of the signature (like only the beginning line), right?)
I agree on both issues of comment #15 (but I'm certainly not a mail expert): - uuencoded inline attachments look ancient to me (as I said in comment #11). Its parser is probably more often an annoyance than a feature; - I haven't looked at the code recently, but I remember all inline parsers to be triggered by their respective header lines. This would indeed break in similar ways if the inline stuff is found to be (unsolvable) invalid further on. Ripping out uuencoded inline stuff should be easy (and boring). I wouldn't mind doing that one of these days. Redoing all the inline parsers is not simple. Probably requires a mail expert.
bug 597483 dupe ?
Can anyone of you try with actual master, especially after changes in bug #531912, please? Thanks in advance.
Paul, N. de Jonge, any chance you could try git master? If the issue is gone, I guess we can close as dup of bug 531912.
(In reply to comment #19) > If the issue is gone, I guess we can close as dup of bug 531912. 0) I can't make Fedora rawhide's version of evolution (ie, 2.31.5, so not git master, but close enough I guess) crash on any of the dozens of test mails I created two years ago to research this issue. I'm pretty certain that a number of them used to crash evolution. 1) I'm going to close this RESOLVED FIXED, as I'm not sure whether it actually was a dup, and at the end of the day closing is what really matters. 2) Invalid inline uuencoded attachments are still displayed rather strangely. If somebody really cares they could open a new bug for that issue. I don't really care. But, for what it's worth, I again suggest to simply remove all inline uuencoded attachment support (see mcrha's comment #15 for similar thoughts).
I originally submitted this bug report and I didn't write it made Evolution crash. I'm not a coding expert, but think it's strange you write that "closing is what really matters" while you're not sure it is a duplicate. Dutch folks write "begin 2011" (which is "at the beginning of 2011" in English) a lot, and if an Evolution user receives an e-mail with these words, not only is an attachment created out of nowhere but words disappear, so important information might get lost/becomes unreadable. The current Debian stable has version 2.22.3.1 which has this problem. Maybe a developer who has a version where this... https://bugzilla.gnome.org/show_bug.cgi?id=531912 ...bug is fixed can send to him- or herself in the body of an e-mail: https://bugzilla.gnome.org/attachment.cgi?id=118661 If the e-mail looks fine, the bug is fixed; if it does not, in my opinion my bug report should not yet be closed.
(In reply to comment #21) > I originally submitted this bug report and I didn't write it made Evolution > crash. Yes that's correct. I didn't realize that when closing. (The crashes were the issue I cared most about.) I'll just change the title (again, but subtly) to not imply the crashing problem I ran into. > I'm not a coding expert, but think it's strange you write that "closing > is what really matters" while you're not sure it is a duplicate. This is off topic, but it just means that since the crashes were fixed this bug could be closed. Whether the actual cause of those crashes was also reported in another report isn't that interesting: it would just have meant it would be closed RESOLVED DUPLICATE. The difference between FIXED and DUPLICATE isn't that interesting here.
Created attachment 171945 [details] [review] evo patch for evolution; This is dropping the UUENCODE inline filter from Evolution. After a bit reading of the code it seems to me as it might not ever work properly anyway, thus I'm just dropping it. If there will be any need to include it in the future, then we can rethink the way it might work. N. de Jonge, feel free to ask your Debian maintainer to backport this patch. It's just about line removing, thus shouldn't be too hard even when it'll not apply cleanly to sources.
Created commit 8f413c5 in evo master (2.91.1+) Created commit b0230b1 in evo gnome-2-32 (2.32.1+)