GNOME Bugzilla – Bug 64433
Unicode tables could be improved
Last modified: 2004-12-22 21:47:04 UTC
The tables for storing Unicode attributes in gunichartables.h, gunicomp.h, gunidecomp.h, and gunibreak.h could be improved. In particular, the tables are implemented as an array of pointers into other arrays. This causes the linker to insert many unnecessary relocations and forces the pointer arrays into the .data section instead of the .rodata section of an ELF executable. Given that glib is shared among many programs, getting as much data as possible into the .rodata section is important. Additionally, the footprint could be reduced by storing indices as shorts rather than pointers. Finally, it should be noted that all the relevant header files are automatically generated from the Unicode spec files by gen-unicode-tables.pl, a Perl script.
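To illustrate the layout being described, here is a minimal sketch of a pointer-based page table. It is not the actual glib code: the names page_latin, page_other, page_ptrs and lookup_via_pointers, the 4-page code space, and the attribute values are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 256
#define N_PAGES   4   /* hypothetical: a tiny code space of 4 pages */

/* Hypothetical page data; the real tables hold Unicode attributes. */
static const unsigned char page_latin[PAGE_SIZE] = { [0x41] = 1 };
static const unsigned char page_other[PAGE_SIZE] = { [0x10] = 2 };

/* Array of pointers into the page arrays. In a shared library each
 * non-NULL entry is an address that is only known at load time, so it
 * needs a relocation and the array cannot stay in plain read-only data. */
static const unsigned char *const page_ptrs[N_PAGES] = {
    page_latin, NULL, page_other, NULL
};

/* Look up the attribute of character c; pages without data yield 0. */
static unsigned char
lookup_via_pointers(unsigned int c)
{
    unsigned int page = c / PAGE_SIZE;
    assert(page < N_PAGES);
    const unsigned char *p = page_ptrs[page];
    return p ? p[c % PAGE_SIZE] : 0;
}
```

In this sketch every non-NULL slot of page_ptrs costs one pointer-sized entry plus one relocation, which is the overhead the report is about.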
Created attachment 6012: Proposed fix.
Here is a patch that fixes this bug. It implements the pages of Unicode tables as a two-dimensional array, and replaces the array of pointers with an array of indices (as shorts). The behavior of all functions that depend on these tables is unchanged. The footprint savings are summarized here:

       | .rodata    .data   relocs
-------+---------------------------
before | 117,415   34,692     4109
after  | 119,975   29,508     3955
-------+---------------------------
       |  +2,560   -5,184     -154

In other words, about 2.5k is saved, another 2.5k is moved from .data to .rodata, and 154 relocations are saved. The .rodata and .data sizes were measured with objdump -h, the relocations with objdump -R.
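The index-based layout can be sketched roughly as follows. This is an illustration of the technique only, not the contents of the attached patch: the names page_data, page_index, NO_PAGE and lookup_via_indices, the 4-page code space, and the attribute values are hypothetical.

```c
#include <assert.h>

#define PAGE_SIZE 256
#define N_PAGES   4      /* hypothetical: a tiny code space of 4 pages */
#define NO_PAGE   (-1)   /* hypothetical sentinel: page has only default values */

/* All page data lives in one two-dimensional array of constants. */
static const unsigned char page_data[2][PAGE_SIZE] = {
    { [0x41] = 1 },
    { [0x10] = 2 },
};

/* Row numbers stored as shorts instead of pointers. The table holds no
 * addresses, so it needs no relocations and can be placed in .rodata. */
static const short page_index[N_PAGES] = { 0, NO_PAGE, 1, NO_PAGE };

/* Look up the attribute of character c; pages without data yield 0. */
static unsigned char
lookup_via_indices(unsigned int c)
{
    unsigned int page = c / PAGE_SIZE;
    assert(page < N_PAGES);
    short row = page_index[page];
    return row == NO_PAGE ? 0 : page_data[row][c % PAGE_SIZE];
}
```

A lookup costs one extra array index compared with following a pointer, but each table entry shrinks from pointer size to the size of a short, and nothing in either array has to be patched at load time.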
Tue Nov 13 21:25:35 2001  Owen Taylor  <otaylor@redhat.com>

	* glib/{gen-unicode-tables.pl,gunibreak.c,gunibreak.h,
	gunichartables.h, gunicomp.h, gunidecomp.[ch], guniprop.c}:
	Patch from Andrew Taylor to improve tables and reduce
	relocations by using indices rather than pointers. (#64433)

	* tests/unicode-normalize.c (main): Fix for changes to
	g_strsplit().