GNOME Bugzilla – Bug 313781
Hebrew vowels rendered wrong because shaper font cache gets polluted
Last modified: 2005-08-25 21:34:55 UTC
It is possible to put an application into a permanent state where it renders Hebrew incorrectly, by polluting pango's 'shaper font cache'. See the test case at the bottom. The bug occurs on both pango-1.8.2 and pango-1.9.1.

I've given the output of the test case below. It seems my method of writing out the character number is not quite right, but that doesn't matter for this purpose. Running it with the '1' argument shows what the output SHOULD look like. Running it with the '2' argument shows, at the bottom of the output, that some of the characters are now being shaped by the "Basic" shape engine. This makes them get placed wrong when drawn.

The Hebrew word is 1489 1468 1464 1512 1464 1443 1488 (decimal code points). The first, middle (1512), and last characters are all Hebrew *letters*, while the others are vowels and diacritical marks. Pango considers the vowels and diacritics to be of the "INHERIT" script type.

This test tricks pango into polluting its shaper font cache (as implemented by shaper_font_cache_get() in pango-context.c). It works like this:

The string 32, 1468, 1464, 32, 1464, 1443, 32 contains no characters of the HEBREW script type, but we tell pango to use the Hebrew language. So the vowels resolve to the "Basic" shape engine through the inheritance rules: their script type is considered to be perhaps LATIN, but certainly not Hebrew. Because the language is set to "he", however, this text uses the same cache as the good Hebrew text does. Pango then pollutes the cache by storing a mapping from each individual vowel character to the Basic shape engine instead of the Hebrew one. From then on, Hebrew text containing the polluted vowels and diacritics is rendered wrong.

It is difficult to avoid triggering this bug in a web browser, because a browser throws all sorts of crud at the rendering engine. I found the bug through browser testing.
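To make the mechanism easier to follow, here is a minimal stand-alone model of it in plain C. It is only a sketch and does not use any pango code: the types Engine, Script and ShaperCache and the functions script_of(), cache_lookup(), cache_store() and pick_engines() are all hypothetical names invented for this illustration, and the code point ranges in script_of() are a simplification of what the report describes. The point it shows is that a cache keyed by character alone, and shared by everything tagged with one language, remembers whichever engine a vowel happened to resolve to first.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins, NOT pango types: engine and script identifiers. */
typedef enum { ENGINE_NONE = 0, ENGINE_BASIC, ENGINE_HEBREW } Engine;
typedef enum { SCRIPT_COMMON, SCRIPT_INHERIT, SCRIPT_HEBREW } Script;

/* Hypothetical cache shared by all text tagged with one language ("he"). */
#define CACHE_SLOTS 64
typedef struct {
    unsigned int ch[CACHE_SLOTS];
    Engine engine[CACHE_SLOTS];
    size_t used;
} ShaperCache;

/* Simplified classification (an assumption): letters are HEBREW,
 * points and accents are INHERIT, everything else is COMMON. */
static Script script_of(unsigned int ch)
{
    if (ch >= 1488 && ch <= 1514) return SCRIPT_HEBREW;
    if (ch >= 1425 && ch <= 1479) return SCRIPT_INHERIT;
    return SCRIPT_COMMON;
}

/* Return the cached engine for ch, or ENGINE_NONE if not cached. */
static Engine cache_lookup(const ShaperCache *c, unsigned int ch)
{
    size_t i;
    for (i = 0; i < c->used; i++)
        if (c->ch[i] == ch) return c->engine[i];
    return ENGINE_NONE;
}

static void cache_store(ShaperCache *c, unsigned int ch, Engine e)
{
    assert(c->used < CACHE_SLOTS);
    c->ch[c->used] = ch;
    c->engine[c->used] = e;
    c->used++;
}

/* Choose an engine per character. INHERIT characters take the script of
 * the preceding non-INHERIT character. The flaw being modelled: the cache
 * key is the character alone, so the resolved script is ignored on a hit. */
void pick_engines(ShaperCache *cache, const unsigned int *text,
                  size_t n, Engine *out)
{
    Script current = SCRIPT_COMMON;
    size_t i;
    for (i = 0; i < n; i++) {
        Script s = script_of(text[i]);
        Engine e;
        if (s == SCRIPT_INHERIT) s = current; else current = s;
        e = cache_lookup(cache, text[i]);
        if (e == ENGINE_NONE) {
            e = (s == SCRIPT_HEBREW) ? ENGINE_HEBREW : ENGINE_BASIC;
            cache_store(cache, text[i], e);
        }
        out[i] = e;
    }
}
```

Calling pick_engines() on the Hebrew word with a fresh cache gives the Hebrew engine for all seven characters. Calling it first on 32, 1468, 1464, 32, 1464, 1443, 32 and then on the Hebrew word with the same cache gives the Hebrew engine for the three letters only and the Basic engine for the vowels, which is the same pattern as the real test 2 output.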
Steve

------ OUTPUT OF TEST CASE ------

aotearoa$ ./pango-test 1
Doing test 1 - Hebrew working properly

(pango-test:7215): Pango-WARNING **: Cannot open font file for font Ezra SIL 12
1488 HebrewEngineFc
1443 HebrewEngineFc
1512 HebrewEngineFc
1512 HebrewEngineFc
1489 HebrewEngineFc
1489 HebrewEngineFc
1489 HebrewEngineFc
---
aotearoa$ ./pango-test 2
Doing test 2 - Hebrew getting corrupted through cache poisoning

(pango-test:7217): Pango-WARNING **: Cannot open font file for font Ezra SIL 12
32 BasicEngineFc
32 BasicEngineFc
32 BasicEngineFc
32 BasicEngineFc
32 BasicEngineFc
1443 BasicEngineFc
32 BasicEngineFc
---
1488 HebrewEngineFc
1443 BasicEngineFc
1464 BasicEngineFc
1512 HebrewEngineFc
1468 BasicEngineFc
1468 BasicEngineFc
1489 HebrewEngineFc
---

pango-test.c
------------

#define PANGO_ENABLE_BACKEND
#define PANGO_ENABLE_ENGINE

#include <glib/gunicode.h>
#include <gtk/gtk.h>        /* gtk_init() */
#include <gdk/gdkpango.h>
#include <gdk/gdkrgb.h>
#include <pango/pango.h>
#include <string.h>
#include <stdio.h>
#include <stdlib.h>         /* atoi() */

/* Lay out the UTF-16 text and print, for every glyph, a character number
   and the class name of the shape engine chosen for the glyph's run. */
void dump(PangoContext* pc, gunichar2* text16, int length16)
{
    gchar* text8;
    PangoLayoutLine* line;
    PangoLayout *layout;
    GSList *tmpList;

    text8 = g_utf16_to_utf8(text16, length16, NULL, NULL, NULL);
    layout = pango_layout_new(pc);
    pango_layout_set_text(layout, text8, strlen(text8));
    line = pango_layout_get_line(layout, 0);

    for (tmpList = line->runs; tmpList && tmpList->data; tmpList = tmpList->next) {
        gint i;
        PangoLayoutRun *layoutRun = (PangoLayoutRun *)tmpList->data;

        for (i = 0; i < layoutRun->glyphs->num_glyphs; i++) {
            /* log_clusters[i] is a byte offset into this item's text, so the
               character printed is the one at the start of the glyph's cluster. */
            gint thisOffset = (gint)layoutRun->glyphs->log_clusters[i]
                              + layoutRun->item->offset;
            printf("%d %s\n",
                   g_utf8_get_char(text8 + thisOffset),
                   G_OBJECT_CLASS_NAME(PANGO_ENGINE_SHAPE_GET_CLASS(
                       layoutRun->item->analysis.shape_engine)));
        }
    }

    g_object_unref(layout);
    g_free(text8);
    printf("---\n");
}

int usage(char* argv0)
{
    fprintf(stderr, "Usage:\n");
    fprintf(stderr, " %s 1 Show Hebrew working properly\n", argv0);
    fprintf(stderr, " %s 2 Show Hebrew getting corrupted through cache poisoning\n", argv0);
    return 1;
}

int main(int argc, char* argv[])
{
    PangoContext* pc;
    gint test_no;

    gtk_init(&argc, &argv);

    if (argc < 2)
        return usage(argv[0]);
    test_no = atoi(argv[1]);
    if (test_no < 1 || test_no > 2)
        return usage(argv[0]);

    printf("Doing test %d - %s\n", test_no,
           test_no == 1 ? "Hebrew working properly"
                        : "Hebrew getting corrupted through cache poisoning");

    pc = gdk_pango_context_get();
    pango_context_set_language(pc, pango_language_from_string("he"));

    if (test_no == 2) {
        /* Formatting this string causes pango-1.8.2 to subsequently
           render Hebrew text brokenly. */
        gunichar2 text16[] = {' ', 1468, 1464, ' ', 1464, 1443, ' '};
        dump(pc, text16, 7);
    }

    {
        gunichar2 text16[] = {1489, 1468, 1464, 1512, 1464, 1443, 1488};
        dump(pc, text16, 7);
    }

    return 0;
}

Makefile
--------

CFLAGS = $(shell pkg-config --cflags gtk+-2.0)
LDFLAGS = $(shell pkg-config --libs gtk+-2.0 pango)

all: pango-test
	$(CC) -o pango-test pango-test.c $(CFLAGS) $(LDFLAGS)

clean:
	rm -f pango-test
I have re-tested on pango-1.10.0, and it is fixed.