Bug 300115 – RFC2047 encoding always separates the last non-ASCII UTF-8 character.


Summary:	RFC2047 encoding always separates the last non-ASCII UTF-8 character.


Status:	RESOLVED FIXED

Product:	evolution-data-server
Classification:	Platform
Component:	Mailer
Version:	1.2.x (obsolete)
Hardware:	Other All

Importance:	Normal normal
Target Milestone:	---
Assigned To:	evolution-mail-maintainers
QA Contact:	Evolution QA team

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2005-04-10 18:49 UTC by Changwoo Ryu
Modified:	2005-08-09 05:44 UTC

See Also:
GNOME target:	---
GNOME version:	2.9/2.10

Attachments
Ignores separation which does not occur between lines. (943 bytes, patch) 2005-04-16 19:49 UTC, Changwoo Ryu	accepted-commit_now
updated (1.69 KB, patch) 2005-08-09 05:08 UTC, Not Zed	committed

Description Changwoo Ryu 2005-04-10 18:49:44 UTC

Please describe the problem:
In camel/camel-mime-utils.c:rfc2047_encode_word(), if the last Unicode character
in the input string is non-ASCII, it is always encoded as a separate
encoded-word.

Splitting text across several encoded-words is valid under RFC 2047, but the
split in this case is not intended.
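
To illustrate why the split is valid but unnecessary: RFC 2047 tells decoders to ignore whitespace between adjacent encoded-words, so the split and unsplit forms decode to the same bytes. A minimal sketch, assuming only UTF-8/Q encoded-words with an uppercase prefix; `hex_val` and `q_decode_words` are hypothetical helpers, not Camel functions:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: value of one hex digit, or -1 if c is not hex. */
static int
hex_val (char c)
{
	if (c >= '0' && c <= '9') return c - '0';
	if (c >= 'A' && c <= 'F') return c - 'A' + 10;
	if (c >= 'a' && c <= 'f') return c - 'a' + 10;
	return -1;
}

/* Hypothetical sketch, not Camel code: decode a run of "=?UTF-8?Q?...?="
 * encoded-words separated by whitespace. The whitespace between words is
 * dropped, as RFC 2047 requires for adjacent encoded-words.
 * Returns the number of bytes written (excluding the NUL), or -1 if the
 * input is malformed or `out` is too small. */
static int
q_decode_words (const char *in, char *out, size_t outlen)
{
	size_t o = 0;

	while (*in != '\0') {
		if (*in == ' ' || *in == '\t' || *in == '\r' || *in == '\n') {
			in++;                      /* separator between words */
			continue;
		}
		if (strncmp (in, "=?UTF-8?Q?", 10) != 0)
			return -1;                 /* only UTF-8/Q is handled here */
		in += 10;

		while (!(in[0] == '?' && in[1] == '=')) {
			if (*in == '\0' || o + 1 >= outlen)
				return -1;             /* unterminated word or no room */
			if (*in == '=') {
				int hi = hex_val (in[1]);
				int lo = hi < 0 ? -1 : hex_val (in[2]);

				if (lo < 0)
					return -1;         /* bad or truncated =XX escape */
				out[o++] = (char) (hi * 16 + lo);
				in += 3;
			} else {
				out[o++] = (*in == '_') ? ' ' : *in;  /* '_' means space in Q */
				in++;
			}
		}
		in += 2;                       /* skip the closing "?=" */
	}

	if (o >= outlen)
		return -1;
	out[o] = '\0';
	return (int) o;
}
```

Feeding it either of the two forms of the same header (one encoded-word, or two separated by a space) should yield the same decoded bytes, which is why the bug is cosmetic rather than a correctness problem.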


Steps to reproduce:
A simple test program:
===============================================
#include <camel/camel-mime-utils.h>
#include <glib.h>
#include <stdio.h>

int
main (int argc, char *argv[])
{
	char *encoded;
	const char *header;

	/* Three copies of U+D558, each 3 bytes in UTF-8. */
	header = "\xed\x95\x98\xed\x95\x98\xed\x95\x98";
	encoded = camel_header_encode_phrase (header);
	printf ("%s\n", encoded);
	g_free (encoded);

	/* Two copies of the same character. */
	header = "\xed\x95\x98\xed\x95\x98";
	encoded = camel_header_encode_phrase (header);
	printf ("%s\n", encoded);
	g_free (encoded);

	return 0;
}
===============================================


Actual results:
$ ./test-rfc2047
=?UTF-8?Q?=ED=95=98=ED=95=98?= =?UTF-8?Q?=ED=95=98?=
=?UTF-8?Q?=ED=95=98?= =?UTF-8?Q?=ED=95=98?=
$

Expected results:
=?UTF-8?Q?=ED=95=98=ED=95=98=ED=95=98?=
=?UTF-8?Q?=ED=95=98=ED=95=98?=
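
The expected behaviour can be sketched with a small stand-alone encoder that starts a new encoded-word only when the next whole UTF-8 character would push the current word past the length limit (RFC 2047 caps an encoded-word at 75 characters). This is an illustration under stated assumptions, not the Camel implementation or the attached patch: `utf8_char_len` and `q_encode_phrase` are hypothetical names, every byte is written as `=XX` for simplicity, and words are joined with a single space rather than folded onto a new line.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: byte length of the UTF-8 character whose lead byte
 * is c (1 for ASCII and for bytes that are not a valid lead byte). */
static size_t
utf8_char_len (unsigned char c)
{
	if (c >= 0xf0) return 4;
	if (c >= 0xe0) return 3;
	if (c >= 0xc0) return 2;
	return 1;
}

/* Hypothetical sketch, not Camel code: Q-encode `in` into `out` as one or
 * more "=?UTF-8?Q?...?=" words. A new word is started only when adding the
 * next whole character would exceed `max_word` bytes for the current word.
 * Returns 0 on success, -1 if `out` is too small. */
static int
q_encode_phrase (const char *in, char *out, size_t outlen, size_t max_word)
{
	static const char hex[] = "0123456789ABCDEF";
	const size_t prefix_len = 10;      /* length of "=?UTF-8?Q?" */
	size_t inlen = strlen (in), i = 0, o = 0, word_len = 0;

	while (i < inlen) {
		size_t clen = utf8_char_len ((unsigned char) in[i]);
		size_t k;

		if (clen > inlen - i)
			clen = inlen - i;          /* truncated sequence: take what is left */

		/* Reserve the worst case: "?= " + prefix + =XX bytes + "?=" + NUL. */
		if (o + 3 + prefix_len + 3 * clen + 2 + 1 > outlen)
			return -1;

		if (word_len == 0) {
			memcpy (out + o, "=?UTF-8?Q?", prefix_len);
			o += prefix_len;
			word_len = prefix_len;
		} else if (word_len + 3 * clen + 2 > max_word) {
			/* Close the full word and open a new one. */
			memcpy (out + o, "?= =?UTF-8?Q?", 3 + prefix_len);
			o += 3 + prefix_len;
			word_len = prefix_len;
		}

		/* Emit the whole character so it is never split across words. */
		for (k = 0; k < clen; k++, i++) {
			unsigned char b = (unsigned char) in[i];

			out[o++] = '=';
			out[o++] = hex[b >> 4];
			out[o++] = hex[b & 0x0f];
		}
		word_len += 3 * clen;
	}

	if (word_len > 0) {                /* room was reserved in the loop */
		out[o++] = '?';
		out[o++] = '=';
	} else if (outlen == 0) {
		return -1;
	}
	out[o] = '\0';
	return 0;
}
```

With `max_word` set to 75, both test strings from the report fit in a single encoded-word, matching the expected results. A split appears only when the limit is actually reached, and then it falls on a character boundary.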


Does this happen every time?
Always

Other information:

Comment 1 Changwoo Ryu 2005-04-16 19:49:34 UTC

Created attachment 45334
Ignores separation which does not occur between lines.

This patch makes the RFC2047 encoder ignore any RFC2047 separation that does
not occur between lines.

Someone may worry about invalid UTF-8 sequences, but the original code could
never detect a truncated UTF-8 sequence anyway.

Comment 2 André Klapper 2005-05-24 19:19:16 UTC

adding patch keyword

Comment 3 Not Zed 2005-08-04 07:10:06 UTC

does this pass the make check tests in camel/tests?

Comment 4 Not Zed 2005-08-09 05:07:03 UTC

oh well, the tests pass.  in future ensure you include changelog entries.

i will commit this tomorrow after i've sent the patch to the list

Comment 5 Not Zed 2005-08-09 05:08:06 UTC

Created attachment 50436
updated

Comment 6 Not Zed 2005-08-09 05:44:43 UTC

naah i'll commit it now.
thanks again.