GNOME Bugzilla – Bug 343275
accented characters incorrectly displayed in man pages
Last modified: 2006-06-11 21:28:53 UTC
Please describe the problem:
Whether I use LC_ALL=fr_FR@euro (ISO-8859-15) or LC_ALL=fr_FR.UTF-8, yelp displays a small question mark in place of each accented character in man pages.

Steps to reproduce:
1. apt-get install manpages-fr
2. export LC_ALL=fr_FR@euro (or fr_FR.UTF-8)
3. yelp man:ls

Actual results:
A question mark is displayed instead of each accented character.

Expected results:
Accented characters are shown properly.

Does this happen every time?
Yes.

Other information:
I am not using the GNOME environment, only yelp as a help browser with IceWM on a Debian testing system.
The problem is that yelp currently doesn't understand any encoding other than ASCII. I'm working on a patch for HEAD so that yelp understands UTF-8. However, it will not respect the character set specified by the LC_* variables. Instead, it has a hardcoded language -> encoding mapping, so that, for example, any files in /usr/share/man/fr/ are converted to UTF-8 from ISO-8859-1. The crux of the problem is that there is no standard marker that tells you what encoding a man page is in, so we just guess based on the path.
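The path-based guess described above can be illustrated with a short Python sketch. This is not the code from the yelp patch (which is written in C); the function names, the regular expression, and the fallback encoding are assumptions made for illustration. Only the fr -> ISO-8859-1 pairing and the ja -> EUC-JP pairing (from a later comment in this report) come from the thread itself.

```python
import re

# Hypothetical language -> encoding table. The "fr" and "ja" entries are the
# pairings mentioned in this bug report; a real table would need one entry
# per supported language.
LANG_ENCODINGS = {
    "fr": "iso-8859-1",
    "ja": "euc-jp",
}
DEFAULT_ENCODING = "iso-8859-1"  # assumption: fallback for unlisted languages

# Matches .../man/<lang>/man<section>/..., e.g. /usr/share/man/fr/man1/ls.1
_LANG_DIR = re.compile(
    r"/man/([A-Za-z]{2,3}(?:_[A-Za-z]{2})?)(?:\.[^/]+)?/man[^/]*/"
)


def guess_language(path):
    """Return the language directory found in a man page path, or None."""
    match = _LANG_DIR.search(path)
    return match.group(1) if match else None


def guess_encoding(path):
    """Guess a man page's encoding from its path alone."""
    lang = guess_language(path)
    if lang is None:
        return DEFAULT_ENCODING
    # Try the full tag first (e.g. "pt_BR"), then the bare language ("pt").
    bare = lang.split("_")[0]
    return LANG_ENCODINGS.get(lang, LANG_ENCODINGS.get(bare, DEFAULT_ENCODING))


def to_utf8(raw, path):
    """Re-encode raw page bytes as UTF-8 using the guessed source encoding."""
    return raw.decode(guess_encoding(path), errors="replace").encode("utf-8")
```

Because the encoding is inferred from the directory name only, a page stored under the wrong directory, or in an encoding other than the one listed for its language, would still be converted incorrectly. That is the limitation the comment above points out.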
Created attachment 66414
rather large and intrusive patch to support translated man pages

This patch does a couple of things:
1) Changes most of the major bits in yelp-man-parser.c to support UTF-8
2) Fixes the TOC code in yelp-toc-pager.c so that the language of a particular man page is stored
3) Records the main language the user was running when the cache file ~/.gnome2/yelp.d/manindex.xml was created, as an attribute of the root element, and records the language for each <dir> element
4) Performs the necessary conversion to UTF-8 from the encoding of the manual page
5) A few miscellaneous bug fixes

For people who want to test a different language, simply invoke yelp with
  LANGUAGE="<locale>" $prefix/bin/yelp
The cache file will be recreated, and any man pages present for that locale should appear in their translated form when clicked on through the TOC.
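To make item 3 more concrete, here is a rough Python sketch of how a language attribute on the cache file could be used to decide whether the cache needs rebuilding. The thread does not give the actual schema of manindex.xml, so the element name `manindex`, the attribute names `lang` and `path`, and both helper functions are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical cache layout: the root records the language in effect when
# the cache was written, and every <dir> records its own language.
SAMPLE_INDEX = """\
<manindex lang="fr">
  <dir lang="fr" path="/usr/share/man/fr/man1"/>
  <dir lang="C" path="/usr/share/man/man1"/>
</manindex>
"""


def cache_is_current(index_xml, current_lang):
    """True if the cache was written under the language now in use."""
    root = ET.fromstring(index_xml)
    return root.get("lang") == current_lang


def dirs_by_language(index_xml):
    """Group the cached man directories by their recorded language."""
    root = ET.fromstring(index_xml)
    grouped = {}
    for entry in root.iter("dir"):
        grouped.setdefault(entry.get("lang"), []).append(entry.get("path"))
    return grouped
```

Under these assumptions, starting yelp with a different LANGUAGE would make `cache_is_current` return False, which matches the described behavior of the cache file being recreated.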
Additionally, if you specify an absolute path to a man page, it is recommended to set the MAN_ENCODING environment variable as follows:
  MAN_ENCODING="EUC-JP" /opt/gnome2/bin/yelp man:/usr/share/man/ja/man1/su.1.gz
I am considering adding code that parses the path to determine the language (ja) and encoding (EUC-JP) of the man page.
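The effect of such an override can be sketched in Python as follows. This only illustrates the idea of an environment variable naming the source encoding. It is not yelp's implementation, and the function name `read_man_page` and the UTF-8 fallback used when MAN_ENCODING is unset are assumptions.

```python
import gzip
import os


def read_man_page(path, environ=os.environ):
    """Return the text of a (possibly gzipped) man page as a str.

    MAN_ENCODING, when set, names the encoding of the file on disk.
    Falling back to UTF-8 when it is unset is an assumption of this sketch.
    """
    encoding = environ.get("MAN_ENCODING") or "utf-8"
    # Man pages are commonly stored compressed, as in su.1.gz above.
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rb") as handle:
        raw = handle.read()
    # Undecodable bytes become U+FFFD instead of raising an error.
    return raw.decode(encoding, errors="replace")
```

Passing the environment as a parameter keeps the function easy to exercise with different encodings, for example `read_man_page(path, {"MAN_ENCODING": "EUC-JP"})`.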
Thanks, Brent, for caring about non-English-speaking users and reacting so quickly! I won't be able to test the patch you propose, but I will try it as soon as a new yelp binary release is available. Regards,
* src/yelp-man-pager.c: (man_pager_parse):
* src/yelp-man-parser.c: (yelp_man_parser_parse_file), (yelp_man_parser_parse_doc), (parser_parse_line), (macro_ignore_handler), (macro_section_header_handler), (parser_handle_linetag), (parser_read_until), (parser_append_text):
* src/yelp-man-parser.h:
* src/yelp-toc-pager.c: (add_man_page_to_toc), (create_toc_from_index), (process_mandir_pending), (process_cleanup):
* stylesheets/man2html.xsl:

Add support for translated man pages, fixes #343275

Applied to HEAD. Will be available in the yelp 2.15.3 release.