Bug 596819 - High-bit ISO-8859-1 characters are not always converted to UTF-8 in the output
Status: RESOLVED FIXED
Product: doxygen
Classification: Other
Component: general
Version: 1.6.1
Hardware: Other Windows
Importance: Normal minor
Target Milestone: ---
Assigned To: Dimitri van Heesch
Depends on:
Blocks:
Reported: 2009-09-30 04:48 UTC by Gisbert
Modified: 2009-12-30 13:38 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Config and input file for reproducing the problem. (3.30 KB, application/octet-stream)
2009-09-30 04:48 UTC, Gisbert

Description Gisbert 2009-09-30 04:48:00 UTC
I have an input file in ISO-8859-1 and have set INPUT_ENCODING accordingly. The output is, in general, in UTF-8, as expected. Occasionally, however, an ISO-8859-1 character with its high bit set is copied unchanged to the output. This occurs in HTML (where it just shows up as a funny glyph), in LaTeX (which makes LaTeX choke because it expects clean UTF-8) and in Perlmod.

The attached file exhibits the problem: input line 33 of test.h, output in section "Macro definitions" of test_8h.html.
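
For anyone trying to confirm the symptom on their own output, a small check like the following may help. It is only a sketch, not part of doxygen: the function name find_invalid_utf8 is made up for illustration, and it simply reports bytes that do not form valid UTF-8, which is what an unconverted ISO-8859-1 character looks like in a UTF-8 file.

```python
def find_invalid_utf8(data):
    """Return (offset, byte) pairs for bytes that break UTF-8 decoding.

    Hypothetical helper for illustration; not a doxygen function.
    """
    bad = []
    pos = 0
    while pos < len(data):
        try:
            # Decode the remainder; success means no further bad bytes.
            data[pos:].decode("utf-8")
            break
        except UnicodeDecodeError as err:
            # err.start is relative to the slice, so add pos back.
            offset = pos + err.start
            bad.append((offset, data[offset]))
            pos = offset + 1
    return bad


if __name__ == "__main__":
    # One correctly converted character, then one raw ISO-8859-1 byte.
    sample = "\u00e4".encode("utf-8") + b" ok " + "\u00e4".encode("iso-8859-1") + b" end"
    for offset, byte in find_invalid_utf8(sample):
        print("offset %d: 0x%02X" % (offset, byte))
```

To run it against a real file, the bytes could be read with open(path, "rb").read(), for example on test_8h.html.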
Comment 1 Gisbert 2009-09-30 04:48:46 UTC
Created attachment 144356
Config and input file for reproducing the problem.
Comment 2 Dimitri van Heesch 2009-10-03 15:17:52 UTC
The main problem here is that in the LaTeX output, source code is reformatted (to prevent page overflows) and the reformatter is not UTF-8 aware. As a result, it could insert characters in the middle of a multibyte character.
I'll correct this.
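
To illustrate the failure mode described in this comment, here is a minimal sketch. It is not doxygen's actual code; the function names and the fixed-width splitting rule are assumptions made for the example. It contrasts a byte-oriented line splitter with one that only cuts where the next byte is not a UTF-8 continuation byte (bit pattern 10xxxxxx), assuming the input is already valid UTF-8.

```python
def is_continuation(byte):
    """True for UTF-8 continuation bytes, which look like 10xxxxxx."""
    return (byte & 0xC0) == 0x80


def split_naive(data, width):
    """Cut every `width` bytes, ignoring character boundaries."""
    return [data[i:i + width] for i in range(0, len(data), width)]


def split_utf8_aware(data, width):
    """Cut at most `width` bytes per chunk without splitting a character.

    Hypothetical illustration; assumes `data` is valid UTF-8.
    """
    chunks = []
    start = 0
    while start < len(data):
        end = min(start + width, len(data))
        # Back up while the next chunk would begin with a continuation byte.
        while start < end < len(data) and is_continuation(data[end]):
            end -= 1
        if end == start:
            # The character is wider than `width`: keep it whole anyway.
            end = start + 1
            while end < len(data) and is_continuation(data[end]):
                end += 1
        chunks.append(data[start:end])
        start = end
    return chunks


if __name__ == "__main__":
    data = "a\u00e4b\u00f6".encode("utf-8")
    print(split_naive(data, 2))       # first chunk ends inside a character
    print(split_utf8_aware(data, 2))  # every chunk decodes on its own
```

The aware version may produce chunks shorter than the requested width, which is the price of keeping each multibyte character intact.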
Comment 3 Dimitri van Heesch 2009-10-03 15:50:15 UTC
Actually Comment 2 above is more in line with what you reported in bug #596807.
I no longer see any invalid characters in test_8h.html (they were present in the official 1.6.1 release, though).
Comment 4 Gisbert 2009-10-05 11:59:03 UTC
OK, I'll re-test when I get the next release. Thnx, Dimitri!
(Btw: great response on all accounts! This is a marvellous job. And I mean all of doxygen.)
Comment 5 Dimitri van Heesch 2009-12-30 13:38:59 UTC
This bug was previously marked ASSIGNED, which means it should be fixed in
doxygen version 1.6.2. Please verify if this is indeed the case and reopen the
bug if you think it is not fixed (include any additional information that you
think can be relevant).