Bug 313496 - Add iconic script indicators
Status: RESOLVED OBSOLETE
Product: pango
Classification: Platform
Component: general
Version: 1.4.x
Hardware: Other All
Importance: Normal enhancement
Target Milestone: ---
Assigned To: pango-maint
pango-maint
Depends on:
Blocks:
 
 
Reported: 2005-08-15 00:05 UTC by Behdad Esfahbod
Modified: 2015-04-30 16:46 UTC
See Also:
GNOME target: ---
GNOME version: Unversioned Enhancement


Attachments
iconic script indicators (144.17 KB, application/x-bzip2)
2005-08-19 13:18 UTC, Francesco Sepic
Images for Coptic, Han, Braille, Tifinagh, Kharoshthi and the special value Inherited (11.59 KB, application/x-bzip)
2005-08-19 13:51 UTC, Chris Scobell
First five scripts (18.44 KB, application/zip)
2005-08-19 13:51 UTC, Roger Svensson
Normalized images (164.53 KB, application/x-bzip2)
2006-02-13 02:29 UTC, Behdad Esfahbod
cropped glyphs (49.61 KB, application/x-bzip2)
2006-02-13 02:31 UTC, Behdad Esfahbod
Vectorized glyphs (53.79 KB, application/x-bzip2)
2006-02-13 02:35 UTC, Behdad Esfahbod
EPS (53.48 KB, application/x-bzip2)
2006-02-13 02:38 UTC, Behdad Esfahbod

Description Behdad Esfahbod 2005-08-15 00:05:23 UTC
As specified here:

  http://www.unicode.org/reports/tr24/tr24-8.html#Iconic_Indicators
Comment 1 Behdad Esfahbod 2005-08-19 06:24:01 UTC
It seems the iconic indicators were actually rejected from becoming part of the
standard, but they would still be useful.  Let me know if you need the graphics.
There are some 60 images involved.  Better yet, somebody (you GNOME lovers!)
could grab the Unicode Standard PDF files and make big screenshots of the
characters involved, such that we can produce high-quality icons at a couple
of sizes (24x24, 48x48).

Anybody?
Comment 2 Behdad Esfahbod 2005-08-19 06:43:38 UTC
Lemme explain.  What a real GNOME Lover can do is:

  - Open the page linked above, for every Script mentioned in Table 3 (including
the "Special Values" items), repeat the following:

  - Download the code charts of the script involved from here:

      http://www.unicode.org/charts/

  - Open the downloaded chart in Evince.

  - This is the fun part: Find the exact same character pictured in the icon in
the chart.

  - Set your Evince zoom on 400% and wait for the screen to re-render in full
quality.

  - Take a screenshot of the screen, including the guilty character.  You can
do this using the Print Screen button.  The "Save Screenshot" window will open.

  - Have gimp around, drag and drop the screenshot into gimp.  It would open it.

  - Choose the glyph we need with the selection tool.  Double check that it is
the same glyph.  Make sure you select it completely with its bounding box. 
Don't try to choose the exact box.  Let it be loose, I will clean up later.

  - Choose "Crop image".

  - Save as  "xx_scriptname.png" where scriptname is "latin", "greek", etc. and
"xx" is an index number starting from 01 going up to around 60, in the "latin",
"greek", "cyrillic", ... order of Table 3.
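
To make the naming scheme concrete, here is a minimal Python sketch.  The
helper name icon_filename and the three-entry script list are made up for
illustration; the real order and the full list are whatever Table 3 gives.

```python
# Hypothetical helper illustrating the "xx_scriptname.png" naming scheme.
# SCRIPTS holds only the first three names mentioned above, not all of Table 3.
SCRIPTS = ["latin", "greek", "cyrillic"]


def icon_filename(index, script):
    """Return e.g. '01_latin.png' for a 1-based index and a script name."""
    return "%02d_%s.png" % (index, script.lower())


# Indices start at 01 and follow the order of the list.
filenames = [icon_filename(i, name) for i, name in enumerate(SCRIPTS, start=1)]
print(filenames)
```

The zero-padded index keeps the files sorted in Table 3 order in any file
manager or archive listing.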



When all done, zip them all and attach here :).
After doing this, you will know about a heck of a lot of writing scripts that none
of your friends have even seen.  You will never look at written text the same
way you used to. And you will make our charmap beat every other system's charmap :).
Comment 3 Francesco Sepic 2005-08-19 13:18:10 UTC
Created attachment 50979 [details]
iconic script indicators

I'm not sure if the Arabic and the Ethiopic are correct. Also, I cannot find the
images for Coptic, Han, Braille, Tifinagh, Kharoshthi and the special value
Inherited. I'll try to look for them more carefully and send the
missing images later.
Comment 4 Chris Scobell 2005-08-19 13:51:21 UTC
Created attachment 50980 [details]
Images for Coptic, Han, Braille, Tifinagh, Kharoshthi and the special value
Inherited
Comment 5 Roger Svensson 2005-08-19 13:51:32 UTC
Created attachment 50981 [details]
First five scripts

This zip archive contains the first five script letters
Comment 6 Chris Scobell 2005-08-19 13:53:15 UTC
I can't believe that you went and did it all the same as I did and then beat me
to the punch ;)
Comment 7 Behdad Esfahbod 2005-08-19 18:55:05 UTC
Thanks all.  No more submissions please.  I've already received two complete
sets of the icons :).  What a community!  Thanks again.
Comment 8 Ehsan Akhgari 2005-09-20 12:50:31 UTC
Hmm, based upon Behdad's latest comment, shouldn't the status of this bug be
changed from NEW?
Comment 9 Behdad Esfahbod 2005-09-20 14:23:39 UTC
Hi Ehsan,

Welcome aboard!

Hum, we've got the icons now, waiting for a good soul to actually hack up a
patch to use them.  I'm accepting the bug.  Thanks.
Comment 10 Behdad Esfahbod 2006-02-13 02:29:52 UTC
Created attachment 59224 [details]
Normalized images

These are the same images in the attachments above, but normalized to the same size and such that the glyph is almost centered in the image.  Created using:

convert -size 128x160 xc:white -gravity South -draw 'image over 0,0,0,0 -' $2 < $1

and then gimping a bit to center them.
Comment 11 Behdad Esfahbod 2006-02-13 02:31:14 UTC
Created attachment 59225 [details]
cropped glyphs

These are the actual glyph bitmaps as 88x88 PNG images.  Created from the normalized ones using:

convert - -gravity Center -crop 88x88+0-4 png8:$2 < $1
Comment 12 Behdad Esfahbod 2006-02-13 02:35:51 UTC
Created attachment 59226 [details]
Vectorized glyphs

And finally, these are the SVGs.  I created them by enlarging the glyph, cutting at a threshold to get a bilevel image, and using autotrace to trace the outline.  The command used is:

convert -geometry '1000%' -threshold '80%' -type BiLevel - - < $1 |
autotrace -background-color ffffff -dpi 300 -line-reversion-threshold .1 \
        -input-format png /dev/stdin -output-file $2


The quality is acceptable (except for the dots in the dotted circle maybe), but these have too many nodes and curves.  Converting each of these to a cairo path structure takes about 1kb.  Which is not much, but we can do better I believe.

So, I'm putting these here for some love.  Load them in inkscape and simplify the paths without them deforming too much.  Do not resize the images.  And please send a comment if you are starting this job, such that we don't get duplicated work.  Thanks.
Comment 13 Behdad Esfahbod 2006-02-13 02:38:40 UTC
Created attachment 59227 [details]
EPS

EPS too, for those who prefer EPS.

The outlines in the EPS files are very easy to convert to cairo paths.
Comment 14 Behdad Esfahbod 2006-02-13 03:17:26 UTC
Moving to Pango.  Matthias pointed out that OS X draws script-icons on missing glyphs and so we should do that too :-P.

The idea is to be able to set it on a PangoContext to use icon-boxes instead of hex-boxes.  Only the cairo backend though.  Suggestions about an extensible API for setting this are welcome.

The immediate problem with having such an option in PangoContext is that we don't have a context around when getting glyph extents.  I don't know how that can be solved generally, but in the case of icon-boxes, we can change the cairo hexbox such that it's always a square, then when rendering, we decide whether to draw a hexbox or iconbox.  Even that doesn't quite work, since we don't have a PangoContext with calls like draw_glyphs...

Ok, second try: use an attribute for turning iconic on/off.  Then, in pango_layout, we can walk over the glyphs returned by pango_shape and replace hexbox glyphs with new iconbox glyphs.  That solves all the other problems, at the cost of gtk/whatever having to set an attribute to turn it on.  This also means that we have to decide on an extensible scheme for allocating special glyphs.  Currently 0x0FFFFFFF is the empty glyph, and 0x10000000 is used as the hexbox flag.  We can:

  * Limit the hexbox glyphs to 0x10000000..0x100FFFFF only, which is all valid Unicode scalars, and use the rest for other allocations, like icon glyphs.  Every backend should only render those sections that it knows about, and fall back to whatever unknown glyph it has for the rest (a simple box.)

  * While this works, it also means that if a backend doesn't support iconboxes, it will draw an empty box instead of a hexbox.  So another expansion is to use say 0x30000000 as an iconbox flag.  The icon itself can be found by doing pango_script_for_unichar().  In other words, we declare the top four bits of a PangoGlyph as modifiers.  But, these modifiers are not flags anymore, they are indices.  So we have 16 modifiers, not 4.  The 0x1 modifier means "draw whatever unknown glyph you know of", the 2 modifier means "draw a hexbox", the 0x3 modifier means "draw an iconbox", ...  The good point about this is that backends can always fall back to whatever unknown glyph they know of.  But this also assumes that the rest of the glyph code is a unichar.  That doesn't limit us that much though.  Or I don't think so.  Later we can add a modifier for "draw zero-width glyphs" for example.
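
To illustrate the second option, here is a rough Python sketch of a 32-bit glyph value whose top four bits are a modifier index and whose low 28 bits carry the unichar.  None of these names are Pango API; they are invented for the example.  Treating modifier 0 as "ordinary font glyph" is an assumption, while the values 0x1, 0x2 and 0x3 follow the description above.

```python
# Sketch only: hypothetical names, not Pango API.
MODIFIER_SHIFT = 28
PAYLOAD_MASK = 0x0FFFFFFF

MOD_NORMAL = 0x0   # assumption: ordinary font glyph index
MOD_UNKNOWN = 0x1  # "draw whatever unknown glyph you know of"
MOD_HEXBOX = 0x2   # "draw a hexbox"
MOD_ICONBOX = 0x3  # "draw an iconbox"


def make_special_glyph(modifier, unichar):
    """Pack a 4-bit modifier index and a Unicode code point into one value."""
    if not 0 <= modifier <= 0xF:
        raise ValueError("modifier must fit in 4 bits")
    if not 0 <= unichar <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    return (modifier << MODIFIER_SHIFT) | unichar


def split_glyph(glyph):
    """Return (modifier, payload) for a 32-bit glyph value."""
    return glyph >> MODIFIER_SHIFT, glyph & PAYLOAD_MASK


def to_iconbox(glyph):
    """Replace a hexbox glyph with the matching iconbox glyph, else keep it."""
    modifier, payload = split_glyph(glyph)
    if modifier == MOD_HEXBOX:
        return make_special_glyph(MOD_ICONBOX, payload)
    return glyph


def render_kind(glyph, supported):
    """Say how a backend supporting the given modifiers would draw a glyph."""
    modifier, _ = split_glyph(glyph)
    if modifier == MOD_NORMAL:
        return "font glyph"
    names = {MOD_HEXBOX: "hexbox", MOD_ICONBOX: "iconbox"}
    if modifier in supported and modifier in names:
        return names[modifier]
    # Unknown modifier: fall back to the backend's own unknown-glyph box.
    return "unknown-glyph box"


glyph = to_iconbox(make_special_glyph(MOD_HEXBOX, 0x0627))
print(hex(glyph), render_kind(glyph, {MOD_HEXBOX}))
```

Because the modifier is an index rather than a set of flags, a backend that has never heard of iconboxes still sees a nonzero modifier and can draw its generic unknown-glyph box.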

I will probably make this change before 1.12 such that we don't ship with the current PANGO_GLYPH_UNKNOWN_FLAG in the API.

Thoughts?
Comment 15 Behnam Esfahbod 2006-02-13 12:38:12 UTC
Behdad, you've mistaken it!

What Mac OS X has is just a font, Last Resort [1], designed by Michael Everson.  The text rendering engine uses it whenever no other font has a glyph for the character.  Of course it's useless for Gucharmap, and Pango MUST have a good API to disable this feature.

This bug IS about an icon for items in the list of Unicode Blocks and Scripts (and the others coming soon).

[1]: http://developer.apple.com/fonts/LastResortFont/

Please move it back to Gucharmap. ;)
Comment 16 Behdad Esfahbod 2006-02-13 18:01:36 UTC
Humm, that makes me want all those LastResort glyphs!  Anyway, we have it for all scripts and we can add a couple more (unassigned, private-use).  That covers the script view.  If somebody volunteers and makes more for each Unicode block, all the better...

Behnam, I don't understand.  What's wrong with what I'm proposing?  I want to add this feature to Pango, and make it such that gucharmap can use it in the Scripts list.  If you read my comments, it's clear that it is not even turned on by default.  If you check the reporter, it's myself, so I cannot have forgotten what this bug is about.  Cool down and do some inkscape work instead!

About having icons per Unicode block, I'm not sure that's worth the extra effort.  So, I'm ok to match PangoScript only.  And right now we have exactly the set of icons for all PangoScript values.

And we cannot do it as a font, since we are not Mac OS X and assuming that a last resort font is installed just doesn't make much sense.  And these glyphs are simple enough to keep around as Cairo paths, or so I believe.
Comment 17 Behnam Esfahbod 2006-02-13 22:49:48 UTC
The problem is that you guys are going to put (embed?) a lot of data into the engines, just because you don't know if font X is installed, blah blah.

Also, why should you put icons (graphic data) for Unicode Blocks (standard data) in Pango (a text rendering engine)?!
Comment 18 Behdad Esfahbod 2006-02-14 03:36:25 UTC
First, there's no "you guys".  You are talking to an individual.

Second, yes, I want to embed some glyphs into Pango.  Either Pango or Cairo will embed the Hershey glyphs for ASCII too, sooner or later.  That's to not look like dorks when something goes wrong and no font is found at all (misconfiguration, etc.)

What do you mean by "you don't know if font X is installed"?  We can figure that out.  But that doesn't solve any real problem.  Next, using such a font needs special handling.  Can you write an OpenType font that renders these icons for all their range?  I'm afraid not.

As for "a lot of data", it doesn't matter at all.  First, we can put them in a separate .so; second, Linux loads on demand.  That's a nonissue really.

As for the actual size:

[behdad@home eps]$ grep ' [ml] *$' * | wc -l
476
[behdad@home eps]$ grep ' [c] *$' * | wc -l
3194

Each moveto or lineto takes two 'items', and a curveto takes four.  Storing points as pairs of doubles, in cairo_path_data_t directly, it takes 200kb, but using gint16 instead, makes it 50kb.  That's without simplifying the glyph data.  With simplifying, that can go well below 20kb I believe.
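
The estimate can be reproduced with a few lines of Python.  This is only a back-of-the-envelope sketch: the counts come from the grep output above, and the per-item sizes (16 bytes for a pair of doubles, 4 bytes for a pair of 16-bit integers) are assumptions about the storage layout.

```python
# Counts taken from the grep output above.
MOVE_OR_LINE_COUNT = 476
CURVE_COUNT = 3194

# A moveto or lineto takes two path items, a curveto takes four.
items = MOVE_OR_LINE_COUNT * 2 + CURVE_COUNT * 4

# Assumed item sizes: a pair of doubles vs. a pair of 16-bit integers.
BYTES_PER_ITEM_DOUBLE = 2 * 8
BYTES_PER_ITEM_INT16 = 2 * 2

size_double = items * BYTES_PER_ITEM_DOUBLE
size_int16 = items * BYTES_PER_ITEM_INT16

print(items, size_double, size_int16)
```

That gives 13728 items, about 215 KiB with doubles and about 54 KiB with 16-bit integers, in line with the rough 200kb and 50kb figures.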

If that doesn't belong in Pango, where does it?  What is your suggestion?
Comment 19 Behdad Esfahbod 2015-04-30 16:46:32 UTC
Closing this.  If someone wants to update the images to add new Unicode scripts and create a font from them (with cmap format 12), that would be great.
Comment 20 Behdad Esfahbod 2015-04-30 16:46:58 UTC
So basically this is to say, I changed my mind about wanting to implement this in pangocairo.