Bug 323249 – Additional Numberings in gnome-doc-utils


Summary:	Additional Numberings in gnome-doc-utils


Status:	RESOLVED FIXED

Product:	yelp-xsl
Classification:	Core
Component:	DocBook
Version:	2.31.x
Hardware:	Other Linux

Importance:	Normal enhancement
Target Milestone:	---
Assigned To:	Yelp maintainers
QA Contact:	Yelp maintainers

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2005-12-05 08:52 UTC by Theppitak Karoonboonyanan
Modified:	2012-01-19 14:29 UTC

See Also:
GNOME target:	---
GNOME version:	2.13/2.14

Attachments
alpha-thai patch (950 bytes, patch) 2005-12-26 06:45 UTC, Theppitak Karoonboonyanan

Description Theppitak Karoonboonyanan 2005-12-05 08:52:02 UTC

From
http://mail.gnome.org/archives/gnome-doc-devel-list/2005-December/msg00000.html 
I wrote:

I'm going to translate gnome-doc-utils into Thai and have found that
two required Thai numberings are missing. One is Thai alphabetical,
and the other is Thai decimal digits.

Thai alphabetical numbering is run with Thai consonants in the range:

 U+0E01 (THAI CHARACTER KO KAI)
   :
 U+0E2E (THAI CHARACTER HO NOKHUK)

with three characters skipped, namely:

 - U+0E03 (THAI CHARACTER KHO KHUAT)
 - U+0E05 (THAI CHARACTER KHO KHON)
 - U+0E06 (THAI CHARACTER KHO RAKHANG)

(i.e. the sequence is: U+0E01, U+0E02, U+0E04, U+0E07 .. U+0E2E)

This is mainly used for numbering appendixes in Thai
documents, and occasionally used in ordered lists.

Numbering with Thai decimal digits is less common in general,
but appears in most official or military documents. It just uses
Thai digits in the range (U+0E50..U+0E50) for 0..9 respectively.

I'm not sure about the digit behavior described by W3C's XSLT,
nor what has been done in gnome-doc-utils, but let me mention
a common mistake in some implementations: the assumed translation
of digits. We would need an explicit way to specify whether to use
Thai digits in numbering, rather than having them translated automatically.

Thank you for your attention. Any comment would be appreciated.

Hossein Noorikhah also added:

What about Persian (Arabic script) numbering?
۰ ۱ ۲ ۳ ۴ ۵ ۶ ۷ ۸ ۹
Any comment would be appreciated!

Comment 1 Behdad Esfahbod 2005-12-05 09:01:22 UTC

Ok, the Thai requirements are apparently harder than Persian/Arabic.  The case
of simple alternative digit sets is easier to fix.

Comment 2 Shaun McCance 2005-12-05 19:04:11 UTC

Quoth Theppitak: "Thai digits in the range (U+0E50..U+0E50) for 0..9
respectively."  I assume you mean U+0E50 to U+0E59?

So I need to understand how all these numbering systems work.  For Thai decimal
digits (U+0E50) and Persian numbering (U+06F0), I'm assuming the numbering works
just like Western decimal numbering: 0, 1, ..., 9, 10, 11, ..., 19, 20, 21, ....
That is, you have a 0, you begin counting with 1, and when you run out of digits
in a position, you set the digit in the position to 0 and increment the digit to
the left.  Correct?
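
A minimal sketch of that digit-for-digit substitution, in Python for
illustration only (the stylesheets themselves are XSLT).  The names
to_digit_set and code_points are hypothetical, and the sketch assumes each
digit set occupies ten consecutive code points starting at its zero, as
U+0E50..U+0E59 and U+06F0..U+06F9 do:

```python
THAI_ZERO = "\u0e50"     # THAI DIGIT ZERO
PERSIAN_ZERO = "\u06f0"  # EXTENDED ARABIC-INDIC DIGIT ZERO


def to_digit_set(n: int, zero: str) -> str:
    """Render a non-negative integer with the ten digits starting at `zero`.

    Hypothetical helper: ordinary positional decimal, with each ASCII digit
    replaced by the digit at the same offset in the target digit set.
    """
    if n < 0:
        raise ValueError("this sketch only handles non-negative integers")
    offset = ord(zero) - ord("0")
    return "".join(chr(ord(ch) + offset) for ch in str(n))


def code_points(s: str) -> list:
    """Show a string as a list of U+XXXX labels (ASCII-safe output)."""
    return [f"U+{ord(ch):04X}" for ch in s]


print(code_points(to_digit_set(10, THAI_ZERO)))     # ['U+0E51', 'U+0E50']
print(code_points(to_digit_set(19, PERSIAN_ZERO)))  # ['U+06F1', 'U+06F9']
```

The caller names the digit set explicitly; nothing is inferred from the
locale.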

In contrast, alphabetic numbering has no "0" digit.  Rather, when you run out of
digits, you increment the digit to the left and set the current digit to the one
you started counting with.  So we have a, b, c, ..., x, y, z, aa, ab, ....  See
the difference?

So does Thai alphabetic numbering follow this scheme?
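
The zero-less alphabetic scheme can be sketched as follows.  This is
illustrative Python, not the XSLT used by gnome-doc-utils, and the function
name alpha_number is hypothetical.  Because there is no "0" digit, the sketch
subtracts 1 before each division so that the last letter does not carry too
early:

```python
import string


def alpha_number(n: int, letters: str) -> str:
    """Format n >= 1 as a, b, ..., z, aa, ab, ... over the given letters.

    Hypothetical helper: every position uses one of len(letters) symbols
    and none of them stands for zero.
    """
    if n < 1:
        raise ValueError("alphabetic numbering starts at 1")
    base = len(letters)
    out = []
    while n > 0:
        # Shift to 0-based before dividing; this is what removes the zero digit.
        n, rem = divmod(n - 1, base)
        out.append(letters[rem])
    return "".join(reversed(out))


for i in (1, 26, 27, 28, 52, 53):
    print(i, alpha_number(i, string.ascii_lowercase))
# 1 a
# 26 z
# 27 aa
# 28 ab
# 52 az
# 53 ba
```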

Comment 3 Shaun McCance 2005-12-05 19:37:56 UTC

Does the Thai alphabet have upper- and lowercase letters?  For English
alphabetic numbering, we have both a,b,c,... and A,B,C,....

Comment 4 Shaun McCance 2005-12-05 19:53:58 UTC

I don't expect this to matter for anything we do, but how do the Thai and
Persian decimal systems indicate a negative value?

Comment 5 Behdad Esfahbod 2005-12-05 20:47:03 UTC

Shaun,

Leave alphabetic case aside for now.  For decimal, AFAIK, all digit sets encoded
in Unicode behave the same way.  But that's not even relevant here:  glibc
supports the 'I' (i18n) modifier in printf format strings for decimal and
floating-point numbers, which uses the locale-specific digit set instead of the
ASCII one.  So you write printf ("%Id", x) and, in a Persian locale, get the
number in Persian numerals.

Now as a portability layer, we have worked with the GNU gettext maintainers, and
gettext >= 0.14 takes care of the i18n flag.  That means that if the runtime
doesn't support it, gettext removes the i18n flag from the translated message.
So a rather safe way to use locale-specific digits is to printf (_("%d"), x).
That also takes care of negative numbers, etc.  Hope that helps.

I don't have many ideas about alphabetic numbering at this point.  I just know
that CSS3 has some stuff for enumerating like that.

Comment 6 Shaun McCance 2005-12-05 20:58:28 UTC

This is all done in XSLT, so I don't have direct access to libc functionality. 
Numbering systems can either be implemented in libxslt as an additional
xsl:number format, or they can be implemented in high-level XSLT code.  Doing
things in libxslt means we can use libc; however, having the numbering system
automatically determined by libc from the environment isn't an acceptable
solution.  The XSLT will explicitly tell libxslt which number formatter to use,
and libxslt would need to respect that absolutely.

We also can't shunt the work off onto CSS, because most of these numbers are
used inside of block text, rather than as prefixes on the lists.

Comment 7 Behdad Esfahbod 2005-12-05 21:11:56 UTC

Ok, then I guess reassigning to libxslt is the best way to go.  I was discussing
locale-dependent sorting order with Daniel Veillard this summer, and that's quite
possible.  Digits should not be any harder.

Comment 8 Shaun McCance 2005-12-05 21:32:24 UTC

It's also not too difficult to implement these directly in XSLT.  That has the
advantage of making the stylesheets portable across different XSLT implementations.

Comment 9 Danilo Segan 2005-12-06 11:32:39 UTC

Shaun, also check

 http://mail.gnome.org/archives/gnome-doc-devel-list/2005-September/msg00000.html

I've implemented something similar for Serbian in XSLT, and it seems it applies
directly to this as well. 

Easy enough to make it more general, I'd say.  Provided you find it useful, of
course :)

Comment 10 Theppitak Karoonboonyanan 2005-12-06 15:27:07 UTC

Shaun,

More information on Thai numbering:

- Thai decimal numbers behave just like Western ones, including negative
numbers, as Behdad explained in #5. Just replace 0..9 with the corresponding
digits. For floating-point numbers, we use a comma as the thousands separator
and a period as the decimal point.

- The Thai alphabet does not have case. And, yes, its behavior is just like what
you described at the end of #2.

Comment 11 Theppitak Karoonboonyanan 2005-12-26 06:43:26 UTC

Oops! Testing it with i18n/test-numbers in recent CVS, I found I missed something
when describing Thai alphabetic numbering.

Actually, there are two more skipped chars:
  - U+0E24  THAI CHARACTER RU
  - U+0E26  THAI CHARACTER LU

So, the sequence should be:
  U+0E01, U+0E02, U+0E04, U+0E07 .. U+0E23, U+0E25, U+0E27 .. U+0E2E
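
For reference, the corrected sequence can be built and exercised like this.
This is an illustrative Python sketch, not the content of the attached patch.
The names THAI_SKIPPED, THAI_ALPHA and thai_alpha are hypothetical, and the
behavior past the last letter follows the a, ..., z, aa, ab scheme from #2:

```python
# Consonants U+0E01..U+0E2E, minus the five skipped characters named in
# the description and in this comment.
THAI_SKIPPED = {0x0E03, 0x0E05, 0x0E06, 0x0E24, 0x0E26}
THAI_ALPHA = "".join(
    chr(cp) for cp in range(0x0E01, 0x0E2E + 1) if cp not in THAI_SKIPPED
)


def thai_alpha(n: int) -> str:
    """Format n >= 1 using THAI_ALPHA as a zero-less alphabetic numbering."""
    if n < 1:
        raise ValueError("alphabetic numbering starts at 1")
    base = len(THAI_ALPHA)
    out = []
    while n > 0:
        # Subtract 1 before dividing because no letter stands for zero.
        n, rem = divmod(n - 1, base)
        out.append(THAI_ALPHA[rem])
    return "".join(reversed(out))


def code_points(s: str) -> list:
    """Show a string as a list of U+XXXX labels (ASCII-safe output)."""
    return [f"U+{ord(ch):04X}" for ch in s]


print(len(THAI_ALPHA))              # 41
print(code_points(thai_alpha(3)))   # ['U+0E04']
print(code_points(thai_alpha(41)))  # ['U+0E2E']
print(code_points(thai_alpha(42)))  # ['U+0E01', 'U+0E01']
```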

Comment 12 Theppitak Karoonboonyanan 2005-12-26 06:45:47 UTC

Created attachment 56395
alpha-thai patch

Comment 13 Theppitak Karoonboonyanan 2006-09-11 14:39:07 UTC

*ping*

Can I commit the last patch?

Comment 14 Shaun McCance 2006-09-11 16:19:51 UTC

Yes, please commit to HEAD.

Comment 15 Theppitak Karoonboonyanan 2006-09-11 16:47:11 UTC

Patch committed to HEAD.