Bug 101510 – multipart/mixed / begin-end Bug




Summary:	multipart/mixed / begin-end Bug


Status:	RESOLVED FIXED

Product:	Pan
Classification:	Other
Component:	general
Version:	pre-0.13.3 betas
Hardware:	Other Linux

Importance:	Normal normal
Target Milestone:	0.14.3
Assigned To:	Charles Kerr
QA Contact:	Pan QA Team

URL:
Whiteboard:

Duplicates:	111852 130522
Depends on:
Blocks:

Reported:	2002-12-18 10:09 UTC by kettensaege
Modified:	2006-06-18 05:25 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
message showing this behaviour. notice that this message intentionally tries to fool the reader into thinking it's a uuencoded message. how cute ... (2.05 KB, text/plain) 2002-12-18 19:34 UTC, Christophe Lambin
a newer post that tries to break OE (1.09 KB, text/plain) 2003-02-03 19:42 UTC, Charles Kerr

Description kettensaege 2002-12-18 10:09:54 UTC

Pan is subject to the annoying begin-end bug invented by Outlook Express: 

A text/plain message containing the keywords "begin" and "end" is
interpreted as multipart/mixed (or uuencoded). Pan even seems to add a
multipart/mixed header which is not there in the original message.

The problem can be seen e.g. in <slrnavtto8.9u1.news@news.jors.net>.

Comment 1 Christophe Lambin 2002-12-18 19:34:04 UTC

Created attachment 13096
message showing this behaviour. notice that this message intentionally tries to fool the reader into thinking it's a uuencoded message. how cute ...

Comment 2 Charles Kerr 2003-01-07 16:07:18 UTC

*sigh* I thought this was fixed already? :)

Comment 3 Charles Kerr 2003-02-03 19:41:31 UTC

The problem with this attachment is that it's a syntactically
correct uuencoded message, so clipping out the contents is a
valid response.

The poster who's been putting these in the de.* hierarchy seems
to have realized that too, as he's now making the "begin" line
incorrect by removing the file permission number, so that compliant
newsreaders will bomb out of uu mode and show his messages as
text.

Pan does show the contents of his newer messages.
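
To make "syntactically correct uuencoded message" concrete, here is a minimal Python sketch that builds a well-formed block: a "begin" line with an octal mode and a file name, body lines that each start with a length character, a zero-length line, and "end". The helper name uuencode_block and the file name cat.txt are made up for illustration; binascii.b2a_uu is the standard library call doing the actual encoding. Dropping the mode from the first line (leaving "begin cat.txt") gives the malformed variant described above.

```python
import binascii


def uuencode_block(data: bytes, name: str, mode: int = 0o644) -> str:
    """Return a complete uuencoded block for `data` (hypothetical helper)."""
    lines = [f"begin {mode:o} {name}"]
    # b2a_uu encodes at most 45 bytes per call, i.e. one body line.
    for i in range(0, len(data), 45):
        chunk = data[i:i + 45]
        lines.append(binascii.b2a_uu(chunk).decode("ascii").rstrip("\n"))
    lines.append("`")   # zero-length line that closes the body
    lines.append("end")
    return "\n".join(lines)


if __name__ == "__main__":
    print(uuencode_block(b"Cat", "cat.txt"))
```

A reader that sees such a block inside a text/plain body cannot tell from syntax alone whether it was meant as an attachment or as a joke.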

Comment 4 Charles Kerr 2003-02-03 19:42:14 UTC

Created attachment 14070
a newer post that tries to break OE

Comment 5 Christophe Lambin 2003-04-30 23:41:10 UTC

*** Bug 111852 has been marked as a duplicate of this bug. ***

Comment 6 Charles Kerr 2003-10-31 18:30:04 UTC

Date: Fri, 31 Oct 2003 18:14:06 +0100
From: "Juergen P. Meier" <bugzilla-pan@jors.net>
To: charles@rebelbase.com
Cc: pan-qa-maint@bugzilla.gnome.org
Subject: Pan - 0.13.2.93 Bug 101510
User-Agent: Mutt/1.4.1i

Hello,

some PAN users informed me that the pan newsreader shows the same
stupid bug as Microsoft Outlook Express and older versions of
IBM Lotus Notes.

I read your comment about me producing uuencoded lookalikes. Well, I
strongly reject this and have to inform you that I use the English
word "begin" (see
http://www.m-w.com/cgi-bin/dictionary?book=Dictionary&va=begin)
on a quite regular basis. And neither the Microsoft Corporation
nor free software authors can force me to abandon this word in favor
of alternatives like "start", "commence" or similar words.

I do not know how much experience you have with the Usenet medium,
but in more than a decade of experience, I have only found two
software vendors who actually believe they can ignore common sense
and invent new definitions for "text/plain" content type postings
by making the pretty naive assumption that all uu-code
solely relies on the occurrence of a common English word followed by
an optional number and another even more common English word occurring
somewhere later in the text. In my opinion this is a pretty silly idea.

(Google finds more than 35 *million* websites containing "begin", and
more than *one hundred million* hits on "end".)

Even the /usr/bin/uudecode Unix tool from the 1980s is more intelligent
than both Microsoft Outlook Express and your software, by correctly
identifying my postings as not being uuencoded content.

Now I do not ask you to change your product, nor do I speak for the
users of your product. I just feel inclined to explain why your
comment about my posting style is quite silly.

Side note: In postings that comply with the MIME standard, uuencoded
content is MIME encoded and the content is declared as
content-transfer-encoding: uuencode
Now, my postings are pretty obviously declared as text/plain, with
the transfer-encoding being 8bit (meaning unencoded).

regards,
Juergen Meier

Comment 7 claus 2003-11-01 13:01:45 UTC

''content-transfer-encoding: uuencode'' is not a valid MIME encoding; 
there's only "binary", "8bit", "7bit", "base64" and 
"quoted-printable".
It's quite stupid to use UUENCODE when a superior solution (MIME with 
base64) is available but it might happen if the user decides to be 
stupid (or cater for old software of the intended recipient).

Detecting UUENCODE in plain text (and yes, it might be labelled as 
"text/plain" with "8bit" or even "quoted-printable" encoding -- the 
whole point of UUENCODE is embedding binary data in plain text) is 
always based on heuristics. These heuristics have to be more 
sophisticated, however, than just looking for "begin"/"end" because 
plain text can include anything. There are a lot of ways to make them 
more reliable:
. Don't use /^begin [0-9]* / but /^begin [0-7]{3,} / (at least three 
octal digits for the file permissions).
. Check the byte count of each line. Display lines with a wrong byte 
count after the "attachment" (and don't assume an attachment if too 
many lines have a wrong byte count).
. Do a frequency analysis on the encoding alphabet used. UUENCODEers 
usually don't mix different alphabets (actually, there are only two 
known alphabets -- one with " " as 0 and one with "`"), so more than 65 
different characters (one more "just in case") indicate non-UUENCODEd 
data.
. Check the line lengths. If there are more than 3 ("standard" length, 
one shorter line at the end, one just in case) different line lengths 
don't assume UUENCODE.
. Check for the last zero-length line (usually just a `) at the end.
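
A rough Python sketch of how some of these checks could be combined. It covers the stricter "begin" pattern, the per-line byte count, and a tolerance for a few bad lines; it does not attempt the alphabet frequency analysis or the line-length census. All names (BEGIN_RE, line_length_ok, looks_uuencoded) and the 10% threshold are assumptions for illustration, not anything taken from Pan. It also assumes body lines are not padded and have not had trailing spaces stripped, which real encoders and transports do not always guarantee.

```python
import re

# At least three octal digits for the mode, then a file name.
BEGIN_RE = re.compile(r"^begin [0-7]{3,} \S")


def uu_value(ch: str) -> int:
    """Map one uuencode character to its 6-bit value (' ' and '`' are both 0)."""
    return (ord(ch) - 32) & 0x3F


def line_length_ok(line: str) -> bool:
    """Check that the first character's byte count matches the line length."""
    if not line or not all(32 <= ord(c) <= 96 for c in line):
        return False
    n = uu_value(line[0])            # declared number of decoded bytes
    if n > 45:                       # standard encoders never exceed 45
        return False
    expected = 4 * ((n + 2) // 3)    # 3 bytes become 4 characters
    return len(line) - 1 == expected


def looks_uuencoded(text: str, max_bad_fraction: float = 0.1) -> bool:
    """Heuristic: True if text contains a plausible begin/body/end block."""
    lines = text.splitlines()
    for start, line in enumerate(lines):
        if not BEGIN_RE.match(line):
            continue
        try:
            stop = lines.index("end", start + 1)
        except ValueError:
            continue                 # no closing "end" after this "begin"
        body = lines[start + 1:stop]
        if not body:
            continue
        bad = sum(1 for body_line in body if not line_length_ok(body_line))
        if bad / len(body) <= max_bad_fraction:
            return True
    return False
```

With these rules a prose paragraph that merely starts with "begin 644 something" is still rejected, because its lines fail the byte-count check.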

Comment 8 Erik Tews 2003-11-11 10:00:29 UTC

The following message breaks pan too:

begin 1 followup Ignatios Souvatzis <ignatios@newton.cs.uni-bonn.de>:
> Andreas Bogk wrote:
> 
>> As long as you don't pull any dirty tricks, and I mean *really*
>> dirty tricks, like XORing both pointers of a doubly linked list
>> and storing them in a single word,
> 
> How is that supposed to work?

Doubly linked list. You always know one of the two pointers
(as long as you remembered where you came from).

That is disgusting.

Juergen
-- 
Juergen P. Meier - "This World is about to be Destroyed!"
end
If you think technology can solve your problems you don't understand
technology and you don't understand your problems.  (Bruce Schneier)


So if Pan checked the "begin" line a little more carefully, it would work.

Comment 9 Charles Kerr 2003-11-13 22:08:05 UTC

The test cases here pass if Pan is more stringent about
checking for three octal digits for the mode as suggested
by claus.

http://cvs.gnome.org/bonsai/cvsview2.cgi?diff_mode=context&whitespace_mode=show&subdir=pan/pan/base&command=DIFF_FRAMESET&file=util-mime.c&rev1=1.58&rev2=1.59&root=/cvs/gnome
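
For illustration only, the effect of the stricter mode check can be reproduced with the two patterns from claus's comment. This Python sketch is not the util-mime.c change linked above; the names LOOSE_BEGIN, STRICT_BEGIN and classify, and the sample lines, are assumptions made for the example.

```python
import re

# The loose pattern claus warns against and the stricter one he proposes.
LOOSE_BEGIN = re.compile(r"^begin [0-9]* ")
STRICT_BEGIN = re.compile(r"^begin [0-7]{3,} ")


def classify(line: str) -> dict:
    """Report whether each pattern treats `line` as a uuencode header."""
    return {
        "loose": bool(LOOSE_BEGIN.match(line)),
        "strict": bool(STRICT_BEGIN.match(line)),
    }


if __name__ == "__main__":
    samples = [
        "begin 1 followup Ignatios Souvatzis:",  # prose, as in comment 8
        "begin 644 cat.txt",                     # genuine header
        "begin cat.txt",                         # mode removed
    ]
    for sample in samples:
        print(sample, classify(sample))
```

Only the genuine header passes the strict pattern; the prose line from comment 8 passes the loose one, which is the misdetection reported here.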

Comment 10 Christophe Lambin 2004-01-05 19:18:59 UTC

*** Bug 130522 has been marked as a duplicate of this bug. ***