Bug 104479 - Replying to an article that contains an 8-bit character in the headers
Status: RESOLVED FIXED
Product: Pan
Classification: Other
Component: general
Version: pre-0.13.3 betas
Hardware: Other Linux
Importance: Normal major
Target Milestone: 0.14.0
Assigned To: Charles Kerr
QA Contact: Pan QA Team
Duplicates: 104459
Depends on:
Blocks:
Reported: 2003-01-26 18:49 UTC by Mahaleo
Modified: 2006-06-18 05:25 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
This is the full message Pan creates to the bugreport 104479 (2.30 KB, text/plain)
2003-03-10 01:00 UTC, Mahaleo

Description Mahaleo 2003-01-26 18:49:26 UTC
When replying to an article that contains an 8-bit character in its
headers (for example, "From: Mahaléo"), Pan posts in UTF-8 while
declaring ISO-8859-1.
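
A minimal illustration of the symptom, assuming a reader that trusts
the declared charset. The bytes Pan actually sent are not shown in this
report; only the name and the two charsets are taken from it.

```python
# Illustration only: the name and charsets come from the description,
# the byte values are derived from them, not captured from Pan.
name = "Mahaléo"

# What the description says was sent: UTF-8 bytes.
utf8_bytes = name.encode("utf-8")

# What a reader sees when it decodes them as the declared ISO-8859-1.
as_declared = utf8_bytes.decode("iso-8859-1")

print(utf8_bytes)   # the single character "é" became two bytes
print(as_declared)  # each of those bytes is shown as its own character
```

Every accented character is displayed as two unrelated characters,
which is how the mismatch shows up on the reading side.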
Comment 1 Jean-Marc Desperrier 2003-01-26 22:58:44 UTC
I did a logical analysis of what can cause this bug, which leads to
the following conclusions.

- Pan does not expect that some people will send raw 8-bit data in
headers.

- Pan copies the content of the "From" header into an internal UTF-8
buffer without testing whether it contains raw 8-bit data. This happens
when Pan creates the "%s has written in message:" line in front of the
reply.

- As a result, this UTF-8 buffer contains raw ISO-8859-1 data, which
is invalid as UTF-8.

- Later, a conversion function is called to convert the data to
ISO-8859-1; it stops at the first invalid character (this is the
behaviour of iconv, for example).

- This error is not detected, and Pan goes on to send most of the
buffer content as UTF-8 instead of the expected ISO-8859-1.

This analysis is confirmed by the fact that raw UTF-8 data in headers
does not cause the bug: everything in the output is correct, and the
name in the "%s wrote" line is correct and encoded in ISO-8859-1.
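
The chain of events above can be modelled in a few lines. This is a
sketch of the described behaviour, not Pan's code: Pan is not written
in Python, and `convert_unchecked` is a hypothetical stand-in for a
conversion whose failure is never reported to the caller.

```python
def convert_unchecked(buf: bytes, target: str) -> bytes:
    """Hypothetical stand-in for a conversion whose failure goes unnoticed."""
    try:
        return buf.decode("utf-8").encode(target)
    except UnicodeError:
        # The conversion stopped at an invalid byte; the caller never
        # finds out and keeps using the original buffer.
        return buf

# Raw 8-bit name as it might arrive in a "From" header (ISO-8859-1 bytes).
raw_name = "Mahaléo".encode("iso-8859-1")
# The rest of the attribution line is genuine UTF-8.
line_buf = raw_name + " a écrit:".encode("utf-8")

try:
    line_buf.decode("utf-8")
    first_bad = None
except UnicodeDecodeError as err:
    first_bad = err.start  # offset of the first invalid byte

# The buffer goes out unchanged, so its UTF-8 parts stay UTF-8.
sent = convert_unchecked(line_buf, "iso-8859-1")
print(first_bad, sent == line_buf)

# A header that is valid UTF-8 converts cleanly, as the analysis observes.
clean = convert_unchecked("Mahaléo a écrit:".encode("utf-8"), "iso-8859-1")
print(clean == "Mahaléo a écrit:".encode("iso-8859-1"))
```

In this model the mixed buffer is sent as-is, so the UTF-8 text after
the name reaches readers under an ISO-8859-1 label.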

Recommended fix:
- test whether the header is valid UTF-8 (iconv from/to UTF-8 is a
stupid but simple way to do it).
- if not, convert it from the encoding used for the body to UTF-8.
- if this fails too, maybe remove all non-7-bit data?
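
The three steps above could look roughly like this. It is a sketch
under assumptions, not the fix that went into Pan: the function name
`header_to_text` and its `body_charset` parameter are hypothetical, and
Python's codecs stand in for iconv.

```python
def header_to_text(raw: bytes, body_charset: str) -> str:
    """Decode a raw header value using the three-step fallback above."""
    # Step 1: accept the header if it is already valid UTF-8.
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        pass
    # Step 2: otherwise assume it uses the same charset as the body.
    try:
        return raw.decode(body_charset)
    except (UnicodeDecodeError, LookupError):
        # LookupError covers a charset name the codec registry does not know.
        pass
    # Step 3: last resort, drop every byte that is not 7-bit.
    return bytes(b for b in raw if b < 0x80).decode("ascii")


# Raw ISO-8859-1 header, body declared as ISO-8859-1: step 2 applies.
print(header_to_text(b"Mahal\xe9o", "iso-8859-1"))
# Header already in UTF-8: step 1 applies.
print(header_to_text("Mahaléo".encode("utf-8"), "iso-8859-1"))
# Body charset cannot represent the byte: step 3 strips it.
print(header_to_text(b"Mahal\xe9o", "ascii"))
```

Note that decoding as ISO-8859-1 never fails, since every byte value
maps to a character, so step 3 is only reached with other body
charsets or an unknown charset name.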
Comment 2 Christophe Lambin 2003-01-27 20:30:52 UTC
Jean-Marc:  thanks for that excellent analysis! 

Note that this only happens for %a (which doesn't convert to UTF-8),
not for %n (which does).
Comment 4 Mahaleo 2003-02-01 00:59:09 UTC
The bug seems to persist in the latest beta version 0.13.3.91

For example, an original post with some headers:

=================================================
[...]
Reply-To: "Mahaléo" <mahaleo@wanadoo.fr>
From: "Mahaléo" <quidam@pour_le_bot.org.antispam>
Newsgroups: fr.test
Subject: zzz test ignore
Date: Sat, 1 Feb 2003 01:18:01 +0100
X-Newsreader: Microsoft Outlook Express 6.00.2800.1106
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106
Message-ID: <3e3b11e9$0$247$626a54ce@news.free.fr>
NNTP-Posting-Date: 01 Feb 2003 01:16:41 MET
MIME-Version: 1.0
Content-Type: text/plain

Etre ou ne pas être, c'est ça la question.
=================================================

And the follow-up message:

=================================================
[...]
From: "Mahaleo" <quidam@pour_le_bot.org.antispam>
Subject: Re: zzz test ignore
Date: Sat, 01 Feb 2003 01:26:27 +0100
User-Agent: Pan/0.13.3.91 (How did the starling get into the bar?)
Message-ID: <pan.2003.02.01.00.26.24.214810@mahaleowanadoofr>
Newsgroups: fr.test
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-15
Content-Transfer-Encoding: 8bit

Sat, 01 Feb 2003 01:18:01 +0100, dans
<3e3b11e9$0$247$626a54ce@news.free.fr>, Mahaléo a écrit:

[...]
> Reply-To: "Mahaléo" <mahaleo@wanadoo.fr>
> From: "Mahaléo" <quidam@pour_le_bot.org.antispam>
> Newsgroups: fr.test
> Subject: zzz test ignore
> Date: Sat, 1 Feb 2003 01:18:01 +0100
> X-Newsreader: Microsoft Outlook Express 6.00.2800.1106
> X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106
> Message-ID: <3e3b11e9$0$247$626a54ce@news.free.fr>
> [...]
> MIME-Version: 1.0
> Content-Type: text/plain
> 
> Etre ou ne pas être, c'est ça la question.

Bouh!
=================================================

Mahaleo
Comment 5 Charles Kerr 2003-02-13 19:47:46 UTC
Punting for Christophe's return
Comment 6 Christophe Lambin 2003-03-09 21:42:53 UTC
/me scratches his head ...

Mahaleo : could you let me know the charset of the group you're
posting to (group properties) and the locale you're using ('locale'
from the command line) ?  

Could you also attach the full message Pan creates to this bugreport
(http://bugzilla.gnome.org/createattachment.cgi?id=104479) ?


Comment 7 Mahaleo 2003-03-10 01:00:14 UTC
Created attachment 14880
This is the full message Pan creates to the bugreport 104479
Comment 8 Mahaleo 2003-03-10 01:10:21 UTC
The charset of the group is: ISO-8859-1 (but the same bug appears 
with ISO-8859-15). 
 
My 'locale': 
 
[mahaleo@localhost mahaleo]$ locale 
LANG=fr_FR.UTF-8 
LC_CTYPE="fr_FR.UTF-8" 
LC_NUMERIC="fr_FR.UTF-8" 
LC_TIME="fr_FR.UTF-8" 
LC_COLLATE="fr_FR.UTF-8" 
LC_MONETARY="fr_FR.UTF-8" 
LC_MESSAGES="fr_FR.UTF-8" 
LC_PAPER="fr_FR.UTF-8" 
LC_NAME="fr_FR.UTF-8" 
LC_ADDRESS="fr_FR.UTF-8" 
LC_TELEPHONE="fr_FR.UTF-8" 
LC_MEASUREMENT="fr_FR.UTF-8" 
LC_IDENTIFICATION="fr_FR.UTF-8" 
LC_ALL= 
 
Comment 9 Christophe Lambin 2003-03-11 22:54:38 UTC
*** Bug 104459 has been marked as a duplicate of this bug. ***
Comment 10 Christophe Lambin 2003-03-11 22:57:50 UTC
Hmmm, I can only explain this if either your group's charset is UTF-8,
or if you're switching between profiles that have '%n' in the
attribution (definite bug there, brought out if your locale is in
UTF-8, which is the case for you).

Looking further ...

Comment 11 Mahaleo 2003-03-14 03:29:37 UTC
The bug does not appear if I change my 'locale' by editing
'/etc/sysconfig/i18n' and replacing the line
LANG="fr_FR.UTF-8"
with
LANG="fr_FR@euro"

My 'locale' becomes:
[mahaleo@localhost mahaleo]$ locale
LANG=fr_FR@euro
LC_CTYPE="fr_FR@euro"
LC_NUMERIC="fr_FR@euro"
LC_TIME="fr_FR@euro"
LC_COLLATE="fr_FR@euro"
LC_MONETARY="fr_FR@euro"
LC_MESSAGES="fr_FR@euro"
LC_PAPER="fr_FR@euro"
LC_NAME="fr_FR@euro"
LC_ADDRESS="fr_FR@euro"
LC_TELEPHONE="fr_FR@euro"
LC_MEASUREMENT="fr_FR@euro"
LC_IDENTIFICATION="fr_FR@euro"
LC_ALL=

But I am not sure that moving away from UTF-8 is a good thing...
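
The difference between the two LANG values can be checked with a small
parser. This is a sketch that assumes the usual POSIX locale name
layout, language[_territory][.codeset][@modifier]; the helper name
`locale_codeset` is hypothetical.

```python
from typing import Optional


def locale_codeset(name: str) -> Optional[str]:
    """Return the explicit codeset of a POSIX locale name, or None if absent."""
    base = name.split("@", 1)[0]  # drop a modifier such as "@euro"
    if "." not in base:
        return None               # no explicit codeset in the name
    return base.split(".", 1)[1]


print(locale_codeset("fr_FR.UTF-8"))  # explicit codeset: UTF-8
print(locale_codeset("fr_FR@euro"))   # none given in the name
```

When the name carries no codeset, the system's locale definitions
decide which charset is used, so the second setting no longer requests
UTF-8 explicitly.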
Comment 12 Mahaleo 2003-03-15 18:51:06 UTC
I have just installed 0.13.91 (on Red Hat 8, with 'locale':
LANG=fr_FR.UTF-8), using the same attribution that triggered the bug
(%n a écrit), and the bug no longer appears.

Thanks for that good job.
Comment 13 Christophe Lambin 2003-03-15 22:15:31 UTC
Marking as 'fixed' based on user feedback.

Thanks, Mahaleo!