Bug 702491 – UTF-16LE BOM not handled by source browser and \snippet




Summary:	UTF-16LE BOM not handled by source browser and \snippet


Status:	VERIFIED FIXED

Product:	doxygen
Classification:	Other
Component:	general
Version:	1.8.4
Hardware:	Other Windows

Importance:	Normal normal
Target Milestone:	---
Assigned To:	Dimitri van Heesch
QA Contact:	Dimitri van Heesch

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2013-06-17 16:59 UTC by Kevin Puetz
Modified:	2013-09-16 15:27 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
example showing that parsing handles BOM and snippets/source browser do not (23.17 KB, application/x-zip-compressed) 2013-06-17 16:59 UTC, Kevin Puetz

Description Kevin Puetz 2013-06-17 16:59:23 UTC

Created attachment 247050
example showing that parsing handles BOM and snippets/source browser do not

Although the documentation extraction/parsing functionality of doxygen seems to support UTF-16 with a BOM (which seems to have been part of fixing bug 593928 and bug 576950), the source browser and \snippet functionality still don't seem to.

In the attached test project, I have created two essentially identical documented classes, A and B. The source code for A is ASCII (being parsed as UTF8), the source code for B is in a file encoded in UTF16LE, with a BOM. Both classes' documentation extracts properly, but the source listing is garbled for test-utf16.h, and the snippet included in B::getB() is missing.

At build time, I get the warning "C:/Desktop/UTF16 BOM test/test-utf16.h:13: Warning: block marked with [snippet_getB] for \snippet should appear twice in file test-utf16.h, found it 0 times", which fits with the snippet being missing; it's searching for the marker sans BOM.
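To illustrate why a marker can be "found 0 times", here is a minimal Python sketch. It is not doxygen's actual code; the function names and the sample text are hypothetical, and it assumes the lookup amounts to a plain byte search on the undecoded file. In UTF-16LE every ASCII character is followed by a zero byte, so the contiguous bytes of the marker never occur until the file is decoded according to its BOM.

```python
import codecs


def count_marker_raw(data: bytes, marker: str) -> int:
    """Count the marker with a naive byte search, ignoring any BOM."""
    return data.count(marker.encode("utf-8"))


def decode_with_bom(data: bytes) -> str:
    """Decode bytes, honouring a UTF-16LE or UTF-8 BOM if one is present.

    Assumption: anything without a recognised BOM is treated as UTF-8.
    UTF-32 and UTF-16BE are deliberately not handled in this sketch.
    """
    if data.startswith(codecs.BOM_UTF16_LE):
        return data[len(codecs.BOM_UTF16_LE):].decode("utf-16-le")
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):].decode("utf-8")
    return data.decode("utf-8")


def count_marker_decoded(data: bytes, marker: str) -> int:
    """Count the marker after decoding the file according to its BOM."""
    return decode_with_bom(data).count(marker)


# Hypothetical stand-in for test-utf16.h: one snippet block, two markers.
SAMPLE = "//! [snippet_getB]\nint getB();\n//! [snippet_getB]\n"
UTF8_BYTES = SAMPLE.encode("utf-8")
UTF16_BYTES = codecs.BOM_UTF16_LE + SAMPLE.encode("utf-16-le")
```

With this sample, the raw search finds both markers in the UTF-8 bytes and none in the UTF-16LE bytes, while the BOM-aware count finds both in either encoding.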

I hit this when trying to document examples including a Microsoft-style .rc file - rc.exe supports Unicode only as CP_ACP (legacy ANSI code page, locale dependent) or UTF-16LE with a BOM. It does not support UTF-8.
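Since the .rc file itself has to stay in UTF-16LE, one possible workaround is to transcode it on the fly before doxygen reads it. The following is a sketch only: the script and its function names are hypothetical, and whether a filter is applied to \snippet lookups in 1.8.4 has not been verified here. It reads the file named on the command line and writes UTF-8 to standard output, which is the calling convention doxygen's INPUT_FILTER option uses.

```python
import codecs
import sys


def to_utf8(data: bytes) -> bytes:
    """Return UTF-8 bytes, transcoding only when a UTF-16LE BOM is present."""
    if data.startswith(codecs.BOM_UTF16_LE):
        text = data[len(codecs.BOM_UTF16_LE):].decode("utf-16-le")
        return text.encode("utf-8")
    # Assumption: files without the BOM are already ASCII/UTF-8.
    return data


def main(path: str) -> int:
    """Filter entry point: read the file in binary mode, emit UTF-8 on stdout."""
    with open(path, "rb") as handle:
        sys.stdout.buffer.write(to_utf8(handle.read()))
    return 0


# Runs only when invoked as "<script> <input-file>".
if __name__ == "__main__" and len(sys.argv) == 2:
    sys.exit(main(sys.argv[1]))
```

If saved as, say, utf16_filter.py (a made-up name), it could be referenced from the Doxyfile with INPUT_FILTER = "python utf16_filter.py", together with FILTER_SOURCE_FILES = YES so that the source browser also sees the filtered text.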

P.S. The little //~ \~snippet [marker] in the example is using a language filter to keep the snippet markers from showing up as part of the documentation (by making them part of an OUTPUT_LANGUAGE I will never build). It's a little weird :-). If I'm overlooking some cleaner way for a file to snippet bits of itself, suggestions are welcome :-)

Comment 1 Dimitri van Heesch 2013-07-02 18:59:02 UTC

Confirmed. Should be fixed in the next GIT update.

Comment 2 Dimitri van Heesch 2013-08-23 15:04:34 UTC

This bug was previously marked ASSIGNED, which means it should be fixed in
doxygen version 1.8.5. Please verify if this is indeed the case. Reopen the
bug if you think it is not fixed and please include any additional information
that you think can be relevant.

Comment 3 Kevin Puetz 2013-09-16 15:27:05 UTC

Verified fixed, thank you very much.