Bug 108158 - postscript exporting m17n
Status: RESOLVED OBSOLETE
Product: dia
Classification: Other
Component: win32
Version: 0.91
Hardware: Other Windows
Importance: Normal enhancement
Target Milestone: ---
Assigned To: Dia maintainers
QA Contact: Dia maintainers
Depends on:
Blocks:
Reported: 2003-03-12 05:39 UTC by Ken Tsukahara
Modified: 2006-07-15 18:47 UTC
See Also:
GNOME target: ---
GNOME version: ---

Description Ken Tsukahara 2003-03-12 05:39:09 UTC
Hello.
I'm trying to improve PostScript export for non-Latin-1 fonts.
I first made a Japanese L10N patch. It works fine, but
it does not apply to other languages.
As the next stage, I'm now working on multilingualization (M17N).
This is a patch (-cjk1) tested only on a Japanese Win32 platform.
I'd like to ask other developers to review, fix and improve it
on Korean/Chinese platforms, and I hope to add support for other
languages.

(1) PostScript font family name
    See legacy_fonts[] in lib/font.c
    newname "ms mincho" -> oldname "Ryumin-Light"
    The name is registered in DiaPsRenderer.
(2) Encoding
    See M17Nfont[] in app/diapsrenderer.c
    fontname "Ryumin-Light" -> encoding index DIA_ENCODING_SJIS
    The index is registered in DiaPsRenderer.
(3) PostScript font encoding
    See lookup_ps_encoding() in app/diapsrenderer.c
    fontname "Ryumin-Light" -> "Ryumin-Light-RKSJ-H" (Shift-JIS)
    This is the full font name that gets exported.
(4) Charset conversion
    See lookup_iconv_encoding() in app/diapsrenderer.c
    "UTF-8" string -> "SJIS" string for Japanese.
    iconv is used for the charset conversion, so please link libiconv.
(5) Default encoding
    If the JP_L10N symbol is defined, the default encoding is
    changed to SJIS. This is a convenience for Japanese platforms.
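
Steps (2) and (3) can be tried in isolation with a small standalone
sketch. It is not part of the patch: the enum values are copied from
the lib/font.h hunk, the table is only a subset of M17Nfont[], and the
helper names encoding_for_psfont(), ps_suffix_for() and
full_ps_fontname() are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Encoding indices, copied from the patch's lib/font.h hunk. */
typedef enum {
  DIA_ENCODING_ASCII = 0,
  DIA_ENCODING_SJIS  = 1,
  DIA_ENCODING_KR    = 2,
  DIA_ENCODING_GB    = 3,
  DIA_ENCODING_BIG5  = 4
} DiaM17NEncodingIndex;

/* Step (2): PS font name prefix -> encoding index.
 * Hypothetical helper; the table is a subset of M17Nfont[]. */
static DiaM17NEncodingIndex encoding_for_psfont(const char *psname)
{
  static const struct {
    const char *prefix;
    DiaM17NEncodingIndex eidx;
  } table[] = {
    { "Batang-",    DIA_ENCODING_KR   },
    { "GothicBBB-", DIA_ENCODING_SJIS },
    { "MOESung-",   DIA_ENCODING_BIG5 },
    { "Ryumin-",    DIA_ENCODING_SJIS },
    { "Song-",      DIA_ENCODING_GB   }
  };
  size_t i;

  for (i = 0; i < sizeof table / sizeof table[0]; i++) {
    if (strncmp(psname, table[i].prefix, strlen(table[i].prefix)) == 0)
      return table[i].eidx;
  }
  return DIA_ENCODING_ASCII;  /* no CJK prefix matched */
}

/* Step (3): encoding index -> PostScript encoding suffix.
 * Hypothetical helper; out-of-range indices fall back to "latin1". */
static const char *ps_suffix_for(DiaM17NEncodingIndex eidx)
{
  static const char *suffix[] = {
    "latin1", "RKSJ-H", "KSC-UHC-H", "GBK-EUC-H", "ETen-B5-H"
  };

  if ((unsigned)eidx >= sizeof suffix / sizeof suffix[0])
    return "latin1";
  return suffix[eidx];
}

/* Build the "<family>-<encoding>" name that set_font() writes out. */
static void full_ps_fontname(const char *psname, char *out, size_t outlen)
{
  assert(out != NULL && outlen > 0);
  snprintf(out, outlen, "%s-%s", psname,
           ps_suffix_for(encoding_for_psfont(psname)));
}
```

With these tables, full_ps_fontname("Ryumin-Light", ...) yields
"Ryumin-Light-RKSJ-H", and a font without a CJK prefix such as
"Courier" keeps the old "Courier-latin1" name.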

Should I post it to dia-list?

--- app/diapsrenderer.c.org	2003-01-16 17:23:14.000000000 +0900
+++ app/diapsrenderer.c	2003-03-11 20:23:18.000000000 +0900
@@ -26,12 +26,56 @@
 
 #include <string.h>
 #include <time.h>
+#include <ctype.h>
+#include <iconv.h>
 
 #include "diapsrenderer.h"
 #include "message.h"
 #include "dia_image.h"
 #include "font.h"
 
+#define DIA_ICONV_FROM ( "UTF-8" )
+#define DIA_ICONV_DEFAULT_TO ( "ASCII" )
+#define DIA_ENCODING_DEFAULT ( "latin1" )
+
+#ifdef JP_L10N
+#undef  DIA_ICONV_DEFAULT_TO
+#undef  DIA_ENCODING_DEFAULT
+
+#define DIA_ICONV_DEFAULT_TO ( "SJIS" )
+#define DIA_ENCODING_DEFAULT ( "RKSJ-H" )
+#endif
+
+static char* lookup_ps_encoding(DiaM17NEncodingIndex n)
+{
+  static char* encoding[] = {
+      "latin1",
+      "RKSJ-H",
+      "KSC-UHC-H",
+      "GBK-EUC-H",
+      "ETen-B5-H"
+  };
+  if (n < 0 ||  G_N_ELEMENTS(encoding) <= n) {
+      return DIA_ENCODING_DEFAULT;
+  }
+  return encoding[n];
+}
+
+static char* lookup_iconv_encoding(DiaM17NEncodingIndex n)
+{
+  static char* encoding[] = {
+      "ASCII",
+      "SJIS",
+      "EUC-KR",
+      "GB",
+      "BIG5"
+  };
+  if (n < 0 ||  G_N_ELEMENTS(encoding) <= n) {
+      return DIA_ICONV_DEFAULT_TO;
+  }
+  return encoding[n];
+}
+
 void
 lazy_setcolor(DiaPsRenderer *renderer,
               Color *color)
@@ -238,9 +282,44 @@
 set_font(DiaRenderer *self, DiaFont *font, real height)
 {
   DiaPsRenderer *renderer = DIA_PS_RENDERER(self);
+  int  i = 0;
+  char *fontname;
+  char *p;
+  static struct _M17Nfont_index {
+    char*  name;
+    DiaM17NEncodingIndex eidx;
+  } M17Nfont[] = {
+    { "Batang-",     DIA_ENCODING_KR },
+    { "BousungEG-",  DIA_ENCODING_GB },
+    { "Dotum-",      DIA_ENCODING_KR },
+    { "GBZenKai-",   DIA_ENCODING_GB }, 
+    { "GothicBBB-",  DIA_ENCODING_SJIS },
+    { "Gulim-",      DIA_ENCODING_KR }, 
+    { "MOESung-",    DIA_ENCODING_BIG5 },
+    { "Ryumin-",     DIA_ENCODING_SJIS },
+    { "ShanHeiSun-", DIA_ENCODING_BIG5 },
+    { "Song-",       DIA_ENCODING_GB },
+    { "ZenKai-",     DIA_ENCODING_BIG5 },
+    { NULL,          DIA_ENCODING_ASCII }
+  };
+
+  fontname = dia_font_get_psfontname(font);
+  renderer->fontfamily = fontname;
+  renderer->eidx = DIA_ENCODING_ASCII;
+  while (p = M17Nfont[i].name) {
+    if (strncmp(fontname, p, strlen(p)) == 0) {
+        renderer->eidx = M17Nfont[i].eidx;
+        break;
+    }
+    i++;
+  }
 
-  fprintf(renderer->file, "/%s-latin1 ff %f scf sf\n",
-          dia_font_get_psfontname(font), (double)height);
+  /*
+   * Trying CJK M17N, but tested only on Japanese platform.
+   * Need some test and fix on Korean/Chinese platform.
+   */
+  fprintf(renderer->file, "/%s-%s ff %f scf sf\n",
+          fontname, lookup_ps_encoding(renderer->eidx), (double)height);
 }
 
 static void
@@ -500,6 +579,12 @@
   char *buffer;
   const char *str;
   int len;
+  iconv_t c;
+  char*   psrc;
+  char*   pdst;
+  size_t  to_left;
+  size_t  from_left;
+  size_t  blen;
 
   if (1 > strlen(text))
     return;
@@ -509,9 +594,31 @@
   /* TODO: Use latin-1 encoding */
 
   /* Escape all '(' and ')':  */
-  buffer = g_malloc(2*strlen(text)+1);
+  psrc = text;
+  to_left = from_left = strlen(text) + 1;
+  str = pdst = g_malloc(to_left);
+  *pdst = 0;
+
+  /*
+   * Trying CJK M17N, but tested only on Japanese platform.
+   * Need some test and fix on Korean/Chinese platform.
+   */
+  c = iconv_open(lookup_iconv_encoding(renderer->eidx), DIA_ICONV_FROM);
+  if (c == (iconv_t)-1) {
+    g_free(str);
+    return;
+  }
+  blen = iconv(c, &psrc, &from_left, &pdst, &to_left);
+  if (blen == -1) {
+    g_free(str);
+    iconv_close(c);
+    return;
+  }
+  iconv_close(c);
+  pdst = str;
+
+  buffer = g_malloc(2*strlen(str)+1);
   *buffer = 0;
-  str = text;
   while (*str != 0) {
     len = strcspn(str,"()\\");
     strncat(buffer, str, len);
@@ -522,8 +629,18 @@
       str++;
     }
   }
-  fprintf(renderer->file, "(%s) ", buffer);
+  str = buffer;
+  fprintf(renderer->file, "(");
+  do {
+    if (isascii(*str) && isprint(*str)) {
+      fprintf(renderer->file, "%c", *((unsigned char*)str));
+    } else {
+      fprintf(renderer->file, "\\%03o", *((unsigned char*)str));
+    }
+  } while (*(++str));
+  fprintf(renderer->file, ") ");
   g_free(buffer);
+  g_free(pdst);
   
   switch (alignment) {
   case ALIGN_LEFT:
--- app/diapsrenderer.h.org	2002-12-08 00:38:38.000000000 +0900
+++ app/diapsrenderer.h	2003-03-11 20:23:52.000000000 +0900
@@ -5,6 +5,7 @@
 #include "color.h"
 
 #include "diarenderer.h"
+#include "font.h"
 
 G_BEGIN_DECLS
 
@@ -39,6 +40,8 @@
   double scale;
   Rectangle extent;
 
+  char *fontfamily;
+  DiaM17NEncodingIndex eidx;
 };
 
 struct _DiaPsRendererClass
--- lib/font.c.org	2003-01-22 17:23:30.000000000 +0900
+++ lib/font.c	2003-03-11 10:28:50.000000000 +0900
@@ -36,6 +36,17 @@
 #include "message.h"
 #include "intl.h"
 
+#define DIA_ENCODING_DEFAULT ( DIA_ENCODING_ASCII )
+#define DIA_FONT_DEFAULT ( "Courier" )
+
+#ifdef JP_L10N
+#undef  DIA_ENCODING_DEFAULT
+#undef  DIA_FONT_DEFAULT
+
+#define DIA_ENCODING_DEFAULT ( DIA_ENCODING_SJIS )
+#define DIA_FONT_DEFAULT ( "Ryumin-Light" )
+#endif
+
 static PangoContext* pango_context = NULL;
 
 /* This is the global factor that says what zoom factor is 100%.  It's
@@ -690,6 +701,8 @@
   { "Dotum", "Dotum", DIA_FONT_FAMILY_ANY },
   { "GBZenKai-Medium", "GBZenKai-Medium", DIA_FONT_FAMILY_ANY }, 
   { "GothicBBB-Medium", "GothicBBB-Medium", DIA_FONT_FAMILY_ANY },
+  { "GothicBBB-Medium", "ms gothic", DIA_FONT_FAMILY_ANY },
+  { "GothicBBB-Medium", "ms pgothic", DIA_FONT_FAMILY_ANY },
   { "Gulim", "Gulim", DIA_FONT_FAMILY_ANY }, 
   { "Headline", "Headline", DIA_FONT_FAMILY_ANY },
   { "Helvetica",             "Arial", DIA_FONT_SANS },
@@ -710,9 +723,12 @@
  { "Palatino-Italic",     "Palatino", DIA_FONT_FAMILY_ANY | DIA_FONT_ITALIC },
   { "Palatino-Roman",      "Palatino", DIA_FONT_FAMILY_ANY },
   { "Ryumin-Light", "Ryumin", DIA_FONT_FAMILY_ANY | DIA_FONT_LIGHT },
+  { "Ryumin-Light", "ms mincho", DIA_FONT_FAMILY_ANY },
+  { "Ryumin-Light", "ms pmincho", DIA_FONT_FAMILY_ANY },
  { "ShanHeiSun-Light", "ShanHeiSun", DIA_FONT_FAMILY_ANY | DIA_FONT_LIGHT },
   { "Song-Medium", "Song-Medium", DIA_FONT_FAMILY_ANY | DIA_FONT_MEDIUM },
   { "Symbol", "Symbol", DIA_FONT_SANS | DIA_FONT_MEDIUM },
+  { "Symbol", "Symbol", DIA_FONT_FAMILY_ANY },
  { "Times-Bold",       "Times New Roman", DIA_FONT_SERIF | DIA_FONT_BOLD },
  { "Times-BoldItalic", "Times New Roman", DIA_FONT_SERIF | DIA_FONT_ITALIC | DIA_FONT_BOLD },
  { "Times-Italic",     "Times New Roman", DIA_FONT_SERIF | DIA_FONT_ITALIC },
@@ -785,6 +801,5 @@
       }
     }
   }
-  return matched_name ? matched_name : "Courier";
+  return matched_name ? matched_name : DIA_FONT_DEFAULT;
 }
-
--- lib/font.h.org	2002-11-19 12:08:10.000000000 +0900
+++ lib/font.h	2003-03-11 20:23:38.000000000 +0900
@@ -107,6 +107,15 @@
     /* mutable */ char* legacy_name;    
 };
 
+typedef enum
+{
+  DIA_ENCODING_ASCII = 0, /* ASCII, -latin1 */
+  DIA_ENCODING_SJIS  = 1, /* Japanese Shift-JIS, Codepage 932, -RKSJ-H */
+  DIA_ENCODING_KR    = 2, /* Korean Codepage 949, -KSC-UHC-H */
+  DIA_ENCODING_GB    = 3, /* Simplified Chinese,  -GBK-EUC-H */
+  DIA_ENCODING_BIG5  = 4, /* Traditional Chinese, -ETen-B5-H */
+} DiaM17NEncodingIndex;
+
 /* Set the PangoContext used to render text.
  */
 void dia_font_init(PangoContext* pcontext);
Comment 1 Lars Clausen 2003-03-14 23:22:39 UTC
I'm a bit confused about this bug. 0.91 doesn't use PostScript fonts at all, so there shouldn't be any font encoding issues to fix.  Doesn't the FreeType-based outline drawing render Japanese fonts correctly?
Comment 2 Ken Tsukahara 2003-03-19 05:37:53 UTC
OK. It's not a font rendering problem but an EPS export problem.
The set_font function in app/diapsrenderer.c exports the PostScript
font name. It assumes that the -latin1 encoding is sufficient
for any language.
Dia exports Japanese text strings using the "Courier-latin1" font,
so the text always gets garbled.
"Ryumin-Light-RKSJ-H" or "GothicBBB-Medium-RKSJ-H" would be a good
PS font name for Japanese text. With those fonts, the text must be
encoded in the Shift-JIS charset, not UTF-8.
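
Once the text is in Shift-JIS, its bytes still have to be written as a
PostScript string literal. The sketch below shows one way to do that
step on its own; ps_escape() is a hypothetical helper, not a Dia
function, and it assumes the charset conversion has already happened.
Note that some Shift-JIS characters have 0x5C (backslash) as their
second byte, so the backslash must be escaped like any other.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write n bytes (already converted, e.g. to Shift-JIS) into out as the
 * body of a PostScript string literal:
 *   - '(' ')' and '\' get a leading backslash,
 *   - other printable ASCII is copied as-is,
 *   - everything else becomes a three-digit octal escape.
 * Returns the number of characters written (excluding the NUL),
 * or -1 if out is too small.
 */
static int ps_escape(const unsigned char *bytes, size_t n,
                     char *out, size_t outlen)
{
  size_t i, used = 0;

  assert(bytes != NULL && out != NULL);
  if (outlen == 0)
    return -1;

  for (i = 0; i < n; i++) {
    unsigned char b = bytes[i];
    char tmp[5];  /* longest escape is "\ooo" plus NUL */
    int len;

    if (b == '(' || b == ')' || b == '\\')
      len = snprintf(tmp, sizeof tmp, "\\%c", b);
    else if (b >= 0x20 && b < 0x7f)
      len = snprintf(tmp, sizeof tmp, "%c", b);
    else
      len = snprintf(tmp, sizeof tmp, "\\%03o", (unsigned)b);

    if (used + (size_t)len + 1 > outlen)
      return -1;  /* not enough room for this byte plus the NUL */
    memcpy(out + used, tmp, (size_t)len);
    used += (size_t)len;
  }
  out[used] = '\0';
  return (int)used;
}
```

For the Shift-JIS bytes 0x93 0xFA 0x96 0x7B this produces
\223\372\226{ which can then be wrapped in parentheses after a line
selecting a font such as "Ryumin-Light-RKSJ-H".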
Comment 3 Lars Clausen 2003-03-19 15:42:52 UTC
Is this on a Windows machine?
Comment 4 Ken Tsukahara 2003-03-20 01:33:05 UTC
So far I've been building and testing Dia only on a Windows box.
But I guess the issue is machine-independent and occurs similarly
on a UNIX box.
Comment 5 Lars Clausen 2003-03-20 02:43:36 UTC
No, EPS export of fonts is different between Win32 and Unix.  Unix uses diapsft2renderer, which uses FreeType2 for font rendering, whereas Win32 uses standard fonts (whatever they may be).
Comment 6 Akira TAGOH 2003-03-21 15:30:02 UTC
I don't think "Ryumin-Light-RKSJ-H" or "GothicBBB-Medium-RKSJ-H" would
be good PS font names for Japanese text in general; just say "on Windows" :)
Basically, for non-8-bit PostScript printers, only JIS can be used safely.
Your patch assumes CMap, which is supported only from PostScript Level 3.
If that is acceptable at all, I think just using UTF8-H is best,
because then Dia doesn't need to convert the strings. I wonder whether it
is acceptable, though, since it means Dia has no support for Level 1 and
Level 2, even if people have something like Ghostscript or a printer
driver that supports it.
To avoid depending on such things, using embedded fonts is the better way.
Comment 7 Hans Breuer 2003-07-18 23:07:06 UTC
I'd like to make the EPS (with text) output work again, even
for the Unix build and non-7-bit languages, see:
http://mail.gnome.org/archives/dia-list/2003-June/msg00050.html

Though I don't have much of a clue how to get proper multi-language
support, here are some requirements for the patch:
- don't make things compile-time configurable, there are not
  many people capable of compiling Dia on windoze
- try to avoid direct iconv usage. Use the wrapper provided
  by glib if possible
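
As an illustration of the first requirement only, the default encoding
could be chosen at run time instead of through the JP_L10N symbol. This
is a sketch under assumptions, not a proposal from the thread: the
helper default_encoding_for_locale() is hypothetical, it expects
POSIX-style locale names such as "ja_JP", and Windows locale names
would need their own mapping.

```c
#include <assert.h>
#include <string.h>

/* Encoding indices, copied from the patch's lib/font.h hunk. */
typedef enum {
  DIA_ENCODING_ASCII = 0,
  DIA_ENCODING_SJIS  = 1,
  DIA_ENCODING_KR    = 2,
  DIA_ENCODING_GB    = 3,
  DIA_ENCODING_BIG5  = 4
} DiaM17NEncodingIndex;

/*
 * Pick the default encoding index from a locale name at run time.
 * Hypothetical helper: the prefix table is an assumption and covers
 * only POSIX-style names. Callers pass "" when the locale is unknown.
 */
static DiaM17NEncodingIndex default_encoding_for_locale(const char *locale)
{
  static const struct {
    const char *prefix;
    DiaM17NEncodingIndex eidx;
  } map[] = {
    { "ja",    DIA_ENCODING_SJIS },
    { "ko",    DIA_ENCODING_KR   },
    { "zh_CN", DIA_ENCODING_GB   },
    { "zh_TW", DIA_ENCODING_BIG5 }
  };
  size_t i;

  assert(locale != NULL);
  for (i = 0; i < sizeof map / sizeof map[0]; i++) {
    if (strncmp(locale, map[i].prefix, strlen(map[i].prefix)) == 0)
      return map[i].eidx;
  }
  return DIA_ENCODING_ASCII;  /* same default as without JP_L10N */
}
```

A single binary could then serve all platforms, with the locale string
taken from wherever the application already reads it.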

Comment 8 alexander.winston 2004-01-24 04:22:24 UTC
Adding the PATCH keyword.
Comment 9 Lars Clausen 2005-03-06 09:38:43 UTC
Setting to NEEDINFO awaiting feedback on Akira's comment.  We're not being
explicit about PS level, but I think forcing level 3 would be a bit much.  I
also don't like the compile-time l10n at all.
Comment 10 Hans Breuer 2006-07-15 18:47:31 UTC
One year of NEEDINFO should be enough, marking as obsolete.