GNOME Bugzilla – Bug 616606
All CJK or non-ASCII characters are not shown correctly in HTML report display page.
Last modified: 2018-06-29 22:38:17 UTC
I found that the cause is the encoding. All reports are written in UTF-8 (or the current locale's encoding), so the encoding should be declared in the generated HTML header. If I export the report and open it in an external browser, I have to select the encoding manually to make it display correctly. I got the same problem in 2.3.11, which might be using GtkHtml, and I think the stable version 2.2.9 should have the same problem. There are 2 ways to fix the problem. 1. When displaying a report using WebKit/GtkHtml, pass the current encoding as a parameter to the loading function. 2. During HTML report generation, add the following line inside <head>...</head>: <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />, replacing "UTF-8" with the correct charset.
Based on the replies on the mailing list, the problem does not only affect CJK characters. It seems to affect all non-ASCII characters.
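A minimal sketch of option 2 in plain C, assuming the charset name is already known to the caller. The function name `format_charset_meta` is hypothetical and is not part of GnuCash; it only shows how the `<meta>` line from the comment above could be built for an arbitrary charset.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not GnuCash code): writes a Content-Type <meta>
 * line for the given charset into buf.
 * Returns the number of characters written, or -1 if buf is too small. */
int format_charset_meta(char *buf, size_t buflen, const char *charset)
{
    int n;

    assert(buf != NULL && charset != NULL);
    n = snprintf(buf, buflen,
                 "<meta http-equiv=\"Content-Type\" "
                 "content=\"text/html; charset=%s\" />",
                 charset);
    if (n < 0 || (size_t)n >= buflen)
        return -1;              /* output was truncated or failed */
    return n;
}
```

The caller would place the resulting line inside `<head>...</head>` of the generated report, passing the charset the report text was actually written in.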
*** Bug 611579 has been marked as a duplicate of this bug. ***
Before GnuCash 2.3.11 and webkit-1.1.5, we could avoid this bug by creating a pango.aliases file. (See http://git.gnome.org/browse/pango/tree/README.win32?h=1.28) But from webkit-1.1.90 on, pango.aliases does not seem to be read.
Please create a report which shows the problem, export to HTML and attach the HTML to this bug report.
Created attachment 159529 [details] The sample of generated html file This is the generated HTML file.
Created attachment 159530 [details] Display in GnuCash which encoding is wrong
Created attachment 159531 [details] Exported report shown in Firefox After selecting the encoding, the view in Firefox is correct.
*** Bug 616602 has been marked as a duplicate of this bug. ***
Adding <meta> line. r19072
Created attachment 159655 [details] Bug 616606 CJK characters are shown as square The problem is still there, but it is different. I tried r19072, and CJK still cannot be shown correctly. This time, though, it's not an encoding problem: the CJK characters are no longer shown as junk characters but as "squares", which is the same result as in 2.3.11. I uploaded the attachment to show the current display.
Created attachment 159656 [details] After choosing Windows default font - "Microsoft YaHei" I tried using "Default CSS" this time, since it can specify different fonts. After I changed the font from "Arial" to "Microsoft YaHei", which is the Windows 7 default font for zh_CN, I found the characters were displayed correctly. So the font used to display the report window matters. Could you use the system default font for report display, rather than the hardcoded "Arial"? And/or add an option in the preferences to let the end user choose the default font for reports?
I will look into a) replacing "Default" by "Default CSS" as the default stylesheet. I'm not sure I can completely remove "Default" because users might have their own stylesheets built on it, but I should be able to make "Default CSS" the default. I will also see about changing the default font. At least you have a workaround.
Created attachment 159747 [details] Not everything is correct after setting font to "MS YaHei" Although I set all fonts to "Microsoft YaHei" in "Default CSS", some words are still not shown correctly. So not every string's font can be set through the "Default CSS" settings. And even if "Default CSS" works, that means it is the only style I can use, since I cannot set the font in the other styles.
When I created Default CSS, I created string categories (title, amount, ...) for all of the strings in one of the reports. I went through the reports and changed strings to use these categories, but I do know there are one or two places where strings don't use these categories. I may need 1 more category.
r19083 changes the default stylesheet to "Default CSS". I'll have to see what is required to add font support for the other stylesheets. Don't know if it will make it into 2.4.0, probably a 2.4.X maintenance release.
I want to help on this issue; I think I can modify the .scm files to make it work. Before I do anything wrong, could you give me some hints on this? And should we use CSS only in "Default CSS"? That is, can we include some CSS lines in other styles, such as fancy or easy? I am currently aware that the files under src/report/stylesheets/ are involved. Do I need to know anything else? Thanks
To create the new stylesheet, I created a stylesheet-css.scm under report/stylesheets. It creates the notebook tab for the "Edit Stylesheets" dialog to allow the fonts to be set for various text classes. It also updates the report infrastructure on how to access the text classes and apply them to different types of text (amounts, title, other text, and so on). A couple of things are needed: 1) Ensure all reports specify the text class for each piece of text (including amounts) 2) Ensure that the notebook tab allows all text classes to have their fonts set 3) Add the notebook tab to other stylesheets. Item #1 has been mainly done. I started with a basic Income Statement (I think) for the text classes, then used those classes in other reports. I do know that there are some reports where some text is not assigned a class. I don't know about invoices. There might be new text classes there (which leads to #2 - the notebook tab might need to have new classes added from invoices). Once we have a full set of text classes and a separable piece which handles the Edit Stylesheets notebook page, the other stylesheets can use that piece to allow them to specify fonts as well. One aspect of this that I don't know too much about yet is that users can create their own stylesheets using the standard ones as a base. I don't know what the effect will be if we add all of the new font options. Will users need to recreate their customized stylesheets?
Created attachment 160488 [details] [review] The patch for Bug 616606 I have attached this patch, which should fix the font issue. 1) I modified "src/html/gnc-html-webkit.c" to get the current Windows font and set it as every default font of WebKit. 2) Fixed a bug in "src/report/stylesheets/stylesheet-css.scm", which did not handle the "Oblique" style, so the font-family could include " Oblique" and the text could not be displayed using that font. 3) Removed the 'nowrap' attribute, which is obsolete, and used CSS "white-space: nowrap" instead. 4) Added 'date-cell' in 'stylesheet-css.scm' to make the style available to "Date" cells, which should be 'white-space: nowrap'. However, I don't know how to assign 'date-cell' to each date cell. I hope someone can go through all the reports and set 'date-cell' for each date cell. 5) Removed the "*-neg" styles and used composite styles instead. 6) Added a default style for all text in 'body', 'p', 'table', 'tr', 'td'. The font style is the same as 'text-cell', so there will be no text without a font setting. There are 2 minor issues that need to be fixed, and I don't know how: 1) 'date-cell' is not assigned by reports. Someone needs to go through the reports and set the 'date-cell' class for each date cell. 2) In the Account Summary report, the headers of "code" and "account name" are not given any class; after rendering, each is a <td>. They should be the same as "number-header", and should be "<th>".
I could not find this bug because it is marked as closed, but it is not fixed. I submitted a patch that might fix the bug.
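To illustrate item 2 of the patch description above (the stray " Oblique" in font-family), here is a rough sketch in C. The actual fix lives in Scheme in stylesheet-css.scm; the `FontSpec` type, the `parse_font_name` function and the "Family [Bold] [Italic|Oblique] [size]" input format are assumptions made for this example, not the real implementation.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type for this sketch. */
typedef struct {
    char family[64];     /* family name with style words removed */
    const char *style;   /* "normal", "italic" or "oblique" */
    const char *weight;  /* "normal" or "bold" */
    int size_pt;         /* 0 when the name carries no size */
} FontSpec;

/* Splits a name such as "DejaVu Sans Bold Oblique 10" into its parts.
 * Assumes style words and the size appear at the end of the name. */
void parse_font_name(const char *name, FontSpec *out)
{
    char buf[64];
    char *last;

    assert(name != NULL && out != NULL);
    snprintf(buf, sizeof buf, "%s", name);
    out->style = "normal";
    out->weight = "normal";
    out->size_pt = 0;

    /* A trailing all-digit word is the point size. */
    last = strrchr(buf, ' ');
    if (last != NULL) {
        char *end;
        long size = strtol(last + 1, &end, 10);
        if (end != last + 1 && *end == '\0' && size > 0) {
            out->size_pt = (int)size;
            *last = '\0';
        }
    }

    /* Peel style words off the end. Leaving "Oblique" out of this list
     * is the kind of omission that leaves " Oblique" in the family. */
    while ((last = strrchr(buf, ' ')) != NULL) {
        const char *word = last + 1;
        if (strcmp(word, "Bold") == 0)
            out->weight = "bold";
        else if (strcmp(word, "Italic") == 0)
            out->style = "italic";
        else if (strcmp(word, "Oblique") == 0)
            out->style = "oblique";
        else
            break;
        *last = '\0';
    }
    snprintf(out->family, sizeof out->family, "%s", buf);
}
```

The family, style and weight can then be emitted as separate CSS properties (font-family, font-style, font-weight) so that the family name stays clean.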
OK. I've reopened it and assigned it to me. From comment 18: #1 - I understand why you do this for a Chinese font. For Latin fonts, do we want to set the same default font for cursive, monospaced, serif and sans-serif? For Linux as well as Windows? #2 through #5 - OK #6 - probably OK. You're right that there should be no text without a font setting. Reports should be rewritten around the use of CSS and maybe get rid of stylesheets. Your other 2 issues: I think I can do them. I'm going to split your patch into 2 pieces, one for gnc-html-webkit.c and one for stylesheet-css.scm. I'll apply the patch for stylesheet-css.scm and wait on the other one until we sort out the answer to my question. We may want to move it to the devel list.
Created attachment 160658 [details] [review] Patch for gnc-html-webkit.c Section of previous patch which applies to gnc-html-webkit.c
Created attachment 160659 [details] [review] Patch for stylesheet-css.scm
In the reports I've looked at, the date is just made part of the title using sprintf() or some sort of string append. There aren't separate date cells.
About the question: According to Nikos' reply, Greek has the same font problem, not only CJK. http://lists.gnucash.org/pipermail/gnucash-devel/2010-April/028148.html It would be good to set monospace, cursive and serif differently. That is possible with WebKit, which has an interface to set them separately. However, how can we get such settings? I checked Google Chromium; it looks like they set the fonts for each locale manually, rather than detecting the system default fonts. http://src.chromium.org/viewvc/chrome/trunk/src/chrome/app/resources/ They treat the font setting as a translation: for each language, they set the following settings in the translation file. IDS_WEB_FONT_FAMILY IDS_FIXED_FONT_FAMILY IDS_SERIF_FONT_FAMILY IDS_SANS_SERIF_FONT_FAMILY IDS_CURSIVE_FONT_FAMILY IDS_FANTASY_FONT_FAMILY Linux and Windows don't share the same fonts, so Chromium loads different locale resources for different platforms. I don't think that approach is good for us, but we can take inspiration from it. How about we add a translatable string to the .po file: if the value is 'y', we use the system default font for every font; otherwise, we don't set any font for WebKit and keep its original settings. I didn't check it on Linux, so I don't know whether the problem happens there. I will check it later on my Ubuntu 10.04.
(In reply to comment #23) > In the reports I've looked at, the date is just made part of the title using > sprintf() or some sort of string append. There aren't separate date cells. The problem is that date cells should not wrap, but currently they do. Since no class is assigned to date cells, adding the CSS style 'date-cell' will not work. If the reports assign the class 'date-cell' to date cells, the wrapping problem in the date column will be fixed.
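The translatable-flag idea proposed above could look roughly like this in C. This is only a sketch of the proposal, not something that was committed: the function name `choose_report_font` is hypothetical, and the flag is assumed to arrive as the already-translated string from the .po file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not GnuCash code).
 * translated_flag: the translated value of the proposed .po string.
 * system_family:   the system default font family, if known.
 * Returns the family to apply to every WebKit default font, or NULL
 * to mean "leave WebKit's own font settings untouched". */
const char *choose_report_font(const char *translated_flag,
                               const char *system_family)
{
    if (translated_flag == NULL || strcmp(translated_flag, "y") != 0)
        return NULL;            /* locale did not opt in */
    return system_family;       /* locale opted in: use the system font */
}
```

With this shape, a zh_CN translation could set the flag to 'y' while Latin-script locales leave it unset and keep separate serif, monospace and cursive defaults.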
Regarding date cells: I agree. However this is going to have to wait for a more extensive rewrite of the reports system. Regarding fonts: now that all text has a font available, can the stylesheet not get the default font and use it? This would require modifying easy/technicolor/... to allow them to specify fonts, but we want this anyway. I'm going to shift the font discussion to the devel list for more broad feedback, and then summarize (or cut/paste) here.
*** Bug 617759 has been marked as a duplicate of this bug. ***
*** Bug 593821 has been marked as a duplicate of this bug. ***
Comment on attachment 160658 [details] [review] Patch for gnc-html-webkit.c Err... what should happen with attachment #160658 [details] ? I didn't understand the outcome of the above discussion, sorry for that.
(In reply to comment #29) The patch gets the font setting from a GtkWidget, which in turn gets it from the Gtk theme, and applies that setting to WebKit. Otherwise WebKit uses its default font, which is 'Arial', and with that default CJK cannot be displayed correctly. I saw that the modified patch has been applied to trunk, and it works. The only thing left is to use the same approach to replace the hardcoded 'Arial' in the stylesheet font settings.
Is there anything I can do to help replace the hardcoded 'Arial' in the stylesheets? I'm not familiar with Scheme, but Chinese text in reports is garbled by default unless all fonts in the settings are manually set to a suitable font. So maybe I can help a bit.
Sure. I have separated out the font family name in src/report/report-system/html-fonts.scm (line 63). What it should do instead is call into the C code to ask for the family name. The basic problem is getting a suitable GtkWidget (maybe the top level widget?) to extract the font info from. I think what you will want is: 1) C routine which returns the font family name. This should be newly allocated memory which the caller will free. 2) Addition to a .i file (perhaps in the report system, perhaps in the main ui) to form the wrapper function 3) Modification to html-fonts.scm to call this function rather than using hard-coded "Arial". This evening, I'll code this up so that the C routine (#1) returns the string "Arial". You can then modify that routine to get the real info.
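The ownership contract in step 1 (the routine returns newly allocated memory that the caller frees) can be sketched in standard C as follows. The name `dup_report_font_family` and its parameter are assumptions for illustration; real GnuCash code would more likely use GLib allocation (g_strdup/g_free) and take the family from a widget's style instead of an argument.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical routine (not the GnuCash function): returns a newly
 * allocated copy of the font family to use for reports. The caller
 * owns the result and must release it with free().
 * system_family may be NULL or empty when no system font is known;
 * in that case the old hard-coded "Arial" is used as a fallback. */
char *dup_report_font_family(const char *system_family)
{
    const char *family = system_family;
    size_t len;
    char *copy;

    if (family == NULL || family[0] == '\0')
        family = "Arial";

    len = strlen(family) + 1;   /* include the terminating NUL */
    copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, family, len);
    return copy;                /* NULL only if allocation failed */
}
```

Returning a fresh copy keeps the wrapper in step 2 simple: it can hand the string to Scheme and free it afterwards without caring where the name originally came from.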
r19285 adds gnc_get_default_report_font_family() function to src/report/report-system/gnc-report.c. Currently returns hard-coded "Arial" but can be modified to get font from a GtkWindow.
Please see also #622210 which might be similar (but is not a duplicate). Kaplan
Created attachment 165484 [details] [review] Patch for load default font from gtk widget for report I created a patch that loads the default font for the report system from the Gtk toplevel widget's rc_style, instead of the hardcoded 'Arial'. The patch works on both my Ubuntu and Windows systems.
Comment on attachment 165484 [details] [review] Patch for load default font from gtk widget for report r19357
I tried the daily build r19357 from http://code.gnucash.org/builds/win32/trunk/ . The problem has been fixed for me. Can anyone else here confirm it has been fixed? Thanks.
Should this bug be closed? I think the problem has been fixed, but nobody except me confirmed the solution.
According to https://lists.gnucash.org/pipermail/gnucash-user/2013-April/048782.html this bug is still present in 2.4
GnuCash bug tracking has moved to a new Bugzilla host. This bug has been copied to https://bugs.gnucash.org/show_bug.cgi?id=616606. Please update any external references or bookmarks.