Bug 553560 - Sort items according to @lang xsl:sort attribute
Status: RESOLVED OBSOLETE
Product: libxslt
Classification: Platform
Component: general
Version: unspecified
Hardware: Other All
Importance: Normal normal
Target Milestone: ---
Assigned To: Daniel Veillard
QA Contact: libxml QA maintainers
Depends on:
Blocks:
 
 
Reported: 2008-09-24 11:39 UTC by Gabor Kelemen
Modified: 2021-07-05 11:01 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
eo-sample vendor patch (1011 bytes, patch)
2009-02-21 13:51 UTC, Roumen Petrov

Description Gabor Kelemen 2008-09-24 11:39:25 UTC
Please describe the problem:
I translated the 2.24 release notes to Hungarian, and the supported languages' list is now sorted alphabetically in the translation:
http://library.gnome.org/misc/release-notes/2.24/index.html.hu#rni18

However, it seems that it is not sorted based on the Hungarian sorting rules, but on the English ones. 

The visible part of the problem is that Estonian (Észt) is at the end of the list instead of between Danish (Dán) and Finnish (Finn).


Comment 1 Frederic Peters 2008-09-24 12:08:41 UTC
That would be an xsltproc bug, as the XSLT file has a correct lang attribute.

FWIW I reduced it to:

<languages>
  <lang>Dán</lang>
  <lang>Észt</lang>
  <lang>Finn</lang>
</languages>

And 

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="text" encoding="UTF-8" omit-xml-declaration="yes"/>

<xsl:template match="languages">
  <xsl:for-each select="lang">
    <xsl:sort lang="hu"/>
    <xsl:value-of select="."/><xsl:text>&#10;</xsl:text>
  </xsl:for-each>
</xsl:template>
</xsl:stylesheet>

Output is:

Dán
Finn
Észt

And for the record, French would sort in the same way and has the same problem.
Comment 2 Wieland Pusch 2009-02-17 20:26:00 UTC
I have the same bug.  :-)
Comment 3 Wieland Pusch 2009-02-18 17:45:00 UTC
I retried with the latest SVN snapshot on Linux.
Also, an illegal lang value gives no warning:
<xsl:sort lang="eoxxZZ" select="."/>
Comment 4 Roumen Petrov 2009-02-18 20:10:55 UTC
Wieland informed me about this issue. I am the author of the WinAPI port; Nick Wellnhofer is the author of the xsl:sort lang support. For example, French sorting may not work as expected: the GNU libc collating sequence does not define accented symbols, so the order is based on position in the Unicode table. Bulgarian does not get a dictionary sort either, because GNU libc defines a non-dictionary order for it; let's call it "phonetic".

So I would like to know your platform: GNU libc based, WinAPI, or OS X?
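
To illustrate the difference between code point order and locale collation described above, here is a minimal C sketch. It is not libxslt code. It assumes UTF-8 input; the names sort_by_codepoint and sort_by_locale are hypothetical, and a locale name such as "hu_HU.utf8" is only an assumption that depends on what `locale -a` lists on the machine.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator: byte order, which for UTF-8 equals code point order. */
static int cmp_codepoint(const void *a, const void *b)
{
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* qsort comparator: order given by the active LC_COLLATE locale. */
static int cmp_collate(const void *a, const void *b)
{
    return strcoll(*(const char *const *)a, *(const char *const *)b);
}

/* Sort by code point; needs no locale support at all. */
void sort_by_codepoint(const char **items, size_t n)
{
    qsort(items, n, sizeof *items, cmp_codepoint);
}

/*
 * Sort with the collation rules of locale_name (hypothetical helper).
 * Returns 0 on success, or -1 if the locale cannot be loaded, in which
 * case items is left untouched.
 */
int sort_by_locale(const char **items, size_t n, const char *locale_name)
{
    char saved[128];
    const char *cur = setlocale(LC_COLLATE, NULL);

    /* Copy the current setting: later setlocale calls may reuse the buffer. */
    if (cur == NULL || strlen(cur) >= sizeof saved)
        return -1;
    strcpy(saved, cur);

    if (setlocale(LC_COLLATE, locale_name) == NULL)
        return -1;

    qsort(items, n, sizeof *items, cmp_collate);
    setlocale(LC_COLLATE, saved);   /* restore the previous collation */
    return 0;
}
```

With the three strings "Dán", "Észt" and "Finn", sort_by_codepoint puts "Észt" last, because the first UTF-8 byte of "É" is larger than any ASCII letter. What sort_by_locale produces depends entirely on the collation data of the locale that was loaded.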
Comment 5 Wieland Pusch 2009-02-18 23:07:40 UTC
I use a GNU libc based system (Debian Linux).
I sort languages in Esperanto:
xml:
<?xml version="1.0" encoding="UTF-8"?> 
<?xml-stylesheet type="text/xsl" href="lingvojsort.xsl"?>
<lingvoj>
  <lingvo>dana</lingvo>
  <lingvo>ĝuanga</lingvo>
  <lingvo>ĉina</lingvo>
  <lingvo>zulua</lingvo>
</lingvoj>

xsl:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method='html' version='1.0' encoding='UTF-8' indent='yes'/>
<xsl:template match="/">
  <html><body>
  <h2>Lingvoj</h2>
    <table border="1">
      <tr bgcolor="#9acd32">
        <th align="left">Nomo</th>
      </tr>
      <xsl:for-each select="lingvoj/lingvo">
      <xsl:sort lang="eo" select="."/>
      <tr>
        <td><xsl:value-of select="."/></td>
      </tr>
      </xsl:for-each>
    </table>
  </body>
  </html>
</xsl:template>
</xsl:stylesheet>
Comment 6 Wieland Pusch 2009-02-18 23:50:10 UTC
Maybe the bug is not related to xsltproc but to glibc/locale:
$ /bin/echo -e "ĉina\nzulua" | env LC_COLLATE=eo sort
also sorts incorrectly.
Comment 7 Wieland Pusch 2009-02-19 08:52:11 UTC
I found by googling:
sudo apt-get install language-pack-eo

Now
$ /bin/echo -e "ĉina\nzulua" | env LC_COLLATE=eo sort
works fine.

And
$ locale -a
now also shows eo and eo.utf8

There is now also a directory
/usr/lib/locale/eo
and not only the file /usr/share/i18n/locales/eo
and this file is new or has been changed.

But 
/usr/local/bin/xsltproc lingvojsort.xsl lingvoj.xml > lingvoj.html
still sorts wrong.

Any help?
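
A check like the `locale -a` one above can also be done from C, independent of xsltproc. The following is a minimal sketch, not part of libxslt: it asks the C library, via setlocale, which of several candidate names it can actually load. The function name first_available_locale is hypothetical. Note the assumption that setlocale returns NULL for a missing locale; glibc behaves that way, but some C libraries accept arbitrary names.

```c
#include <locale.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the index of the first candidate that setlocale() accepts for
 * LC_COLLATE, or -1 if none can be loaded (hypothetical helper).
 * The previous LC_COLLATE setting is restored before returning.
 */
int first_available_locale(const char *const *candidates, size_t n)
{
    char saved[128];
    const char *cur = setlocale(LC_COLLATE, NULL);
    int found = -1;
    size_t i;

    /* Copy the current setting: later setlocale calls may reuse the buffer. */
    if (cur == NULL || strlen(cur) >= sizeof saved)
        return -1;
    strcpy(saved, cur);

    for (i = 0; i < n; i++) {
        if (setlocale(LC_COLLATE, candidates[i]) != NULL) {
            found = (int)i;
            break;
        }
    }

    setlocale(LC_COLLATE, saved);
    return found;
}
```

Calling it with the candidates "eo.utf8" and "eo" would mirror the two names reported by `locale -a`; a result of -1 means a caller should warn instead of silently sorting by code point.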

Comment 8 Wieland Pusch 2009-02-19 09:44:23 UTC
It should have been:
$ /bin/echo -e "ĉina\nzulua" | LC_COLLATE=eo sort

I made a simple text test:
l.xml:
<?xml version="1.0" encoding="UTF-8"?> 
<?xml-stylesheet type="text/xsl" href="l.xsl"?>
<lingvoj>
  <lingvo>jida</lingvo>
  <lingvo>joruba</lingvo>
  <lingvo>ĝuanga</lingvo>
  <lingvo>ĉina</lingvo>
  <lingvo>zulua</lingvo>
</lingvoj>

l.xsl:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method='text' encoding='UTF-8'/>
<xsl:template match="/">Lingvoj:<xsl:for-each select="lingvoj/lingvo">
      <xsl:sort lang="eo" select="."/>
        <xsl:text>&#10;&#13;      </xsl:text>
        <xsl:value-of select="."/>
      </xsl:for-each>
      <xsl:text>&#10;&#13;</xsl:text>
</xsl:template>
</xsl:stylesheet>

To run the test:
$ /usr/local/bin/xsltproc l.xsl l.xml
Lingvoj:
      jida
      joruba
      zulua
      ĉina
      ĝuanga
$ /usr/local/bin/xsltproc -V
Using libxml 20631, libxslt 10122 and libexslt 813
xsltproc was compiled against libxml 20631, libxslt 10124 and libexslt 813
libxslt 10122 was compiled against libxml 20631
libexslt 813 was compiled against libxml 20631

Maybe I have to read the source of libxslt?
Comment 9 Wieland Pusch 2009-02-19 20:46:43 UTC
Looking into the source:
$ grep -n "xsl:sort lang attribute" libxslt/xsltutils.c
988:    /* TODO: xsl:sort lang attribute */

Is this not yet implemented?
That would explain why it does not work correctly.
Comment 10 Roumen Petrov 2009-02-19 21:17:56 UTC
Wieland, thanks for your efforts.
I would like to see the LC_COLLATE section from your /usr/share/i18n/locales/eo file.
Comment 11 Wieland Pusch 2009-02-20 07:25:48 UTC
Maybe the content was not changed, only the mod-time.

LC_COLLATE
copy "iso14651_t1"

collating-symbol <ccirc>
collating-symbol <gcirc>
collating-symbol <hcirc>
collating-symbol <jcirc>
collating-symbol <scirc>
collating-symbol <ubreve>

reorder-after <c>
<ccirc>
reorder-after <g>
<gcirc>
reorder-after <h>
<hcirc>
reorder-after <j>
<jcirc>
reorder-after <s>
<scirc>
reorder-after <u>
<ubreve>

reorder-after <U0043>
<U0108> <ccirc>;<CIR>;<CAP>;IGNORE % Ĉ
reorder-after <U0063>
<U0109> <ccirc>;<CIR>;<MIN>;IGNORE % ĉ
reorder-after <U0047>
<U011C> <gcirc>;<CIR>;<CAP>;IGNORE % Ĝ
reorder-after <U0067>
<U011D> <gcirc>;<CIR>;<MIN>;IGNORE % ĝ
reorder-after <U0048>
<U0124> <hcirc>;<CIR>;<CAP>;IGNORE % Ĥ
reorder-after <U0068>
<U0125> <hcirc>;<CIR>;<MIN>;IGNORE % ĥ
reorder-after <U004A>
<U0134> <jcirc>;<CIR>;<CAP>;IGNORE % Ĵ
reorder-after <U006A>
<U0135> <jcirc>;<CIR>;<MIN>;IGNORE % ĵ
reorder-after <U0053>
<U015C> <scirc>;<CIR>;<CAP>;IGNORE % Ŝ
reorder-after <U0073>
<U015D> <scirc>;<CIR>;<MIN>;IGNORE % ŝ
reorder-after <U0055>
<U016C> <ubreve>;<BRE>;<CAP>;IGNORE % Ŭ
reorder-after <U0075>
<U016D> <ubreve>;<BRE>;<MIN>;IGNORE % ŭ

reorder-end

END LC_COLLATE
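
The reorder-after rules above place each Esperanto letter directly after its base letter (c < ĉ < d, g < ĝ < h, and so on). As a rough illustration of that primary ordering only, here is a hand-written C sketch. It is not how glibc implements collation, it ignores the secondary and tertiary levels (<CIR>, <CAP>, <MIN>), and the names eo_compare and eo_sort are hypothetical.

```c
#include <stdlib.h>

/* Decode one UTF-8 sequence of up to two bytes (enough for the Esperanto
 * letters); longer sequences are returned byte by byte. Advances *s. */
static unsigned next_cp(const unsigned char **s)
{
    unsigned c = *(*s)++;
    if ((c & 0xE0) == 0xC0 && (**s & 0xC0) == 0x80)
        c = ((c & 0x1F) << 6) | (*(*s)++ & 0x3F);
    return c;
}

/* Primary rank: every code point gets an even slot, and each Esperanto
 * letter takes the odd slot right after its base letter. */
static unsigned eo_rank(unsigned cp)
{
    switch (cp) {
    case 0x0108: case 0x0109: return 2 * 'c' + 1;   /* Ĉ ĉ */
    case 0x011C: case 0x011D: return 2 * 'g' + 1;   /* Ĝ ĝ */
    case 0x0124: case 0x0125: return 2 * 'h' + 1;   /* Ĥ ĥ */
    case 0x0134: case 0x0135: return 2 * 'j' + 1;   /* Ĵ ĵ */
    case 0x015C: case 0x015D: return 2 * 's' + 1;   /* Ŝ ŝ */
    case 0x016C: case 0x016D: return 2 * 'u' + 1;   /* Ŭ ŭ */
    }
    if (cp >= 'A' && cp <= 'Z')
        cp += 'a' - 'A';        /* case is ignored at the primary level */
    return 2 * cp;
}

/* Compare two UTF-8 strings by primary rank; shorter prefixes sort first. */
int eo_compare(const char *a, const char *b)
{
    const unsigned char *p = (const unsigned char *)a;
    const unsigned char *q = (const unsigned char *)b;

    while (*p && *q) {
        unsigned ra = eo_rank(next_cp(&p));
        unsigned rb = eo_rank(next_cp(&q));
        if (ra != rb)
            return ra < rb ? -1 : 1;
    }
    return (*p != 0) - (*q != 0);
}

static int eo_qsort_cmp(const void *a, const void *b)
{
    return eo_compare(*(const char *const *)a, *(const char *const *)b);
}

/* Sort an array of UTF-8 strings in (primary-level) Esperanto order. */
void eo_sort(const char **items, size_t n)
{
    qsort(items, n, sizeof *items, eo_qsort_cmp);
}
```

Applied to the five test words from l.xml, this ordering yields ĉina, ĝuanga, jida, joruba, zulua, which is the result one would expect from a working lang="eo" sort.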
Comment 12 Wieland Pusch 2009-02-20 23:06:05 UTC
The function xsltNewLocale in libxslt/xsltlocale.c
expects something like "pt-br" (<language>-<country>) and tries to convert it
into "pt_BR.utf8".
The comment in the code says:  /* Convert something like "pt-br" to "pt_BR.utf8" */

But if the lang attribute is only "eo" or "hu", the code has a bug.
The result becomes "eo........." with no terminating '\0', just random garbage.

Now I use "eo-us" and it works fine.
Maybe Frederic can try "hu-hu" instead of "hu".
The locale "hu_HU.utf8" must be working.

Of course this is a bug.
But fixing it seems easy to me.
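
The conversion described above can be sketched as a small standalone C function. This is not the libxslt implementation, and the name lang_to_locale_name is hypothetical. It assumes that a language-only tag should simply get no region part ("eo" becomes "eo.utf8"), and that the result is always NUL-terminated, which is the property the buggy path lacked.

```c
#include <stddef.h>
#include <string.h>

static int is_ascii_alpha(char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

/*
 * Build a glibc-style locale name from an xsl:sort lang value:
 *   "pt-br" -> "pt_BR.utf8",  "eo" -> "eo.utf8"
 * Returns 0 on success, or -1 if the tag does not have the form
 * <letters> or <letters>-<two letters>, or if out is too small.
 * On failure the contents of out are unspecified.
 */
int lang_to_locale_name(const char *lang, char *out, size_t outsz)
{
    static const char suffix[] = ".utf8";
    size_t i = 0, n = 0;

    /* Language subtag: ASCII letters, lower-cased. */
    while (is_ascii_alpha(lang[i])) {
        if (n + 1 >= outsz)
            return -1;
        out[n++] = (char)(lang[i++] | 0x20);
    }
    if (n == 0)
        return -1;

    if (lang[i] == '-') {
        /* Region subtag: exactly two ASCII letters, upper-cased. */
        if (!is_ascii_alpha(lang[i + 1]) || !is_ascii_alpha(lang[i + 2]) ||
            lang[i + 3] != '\0' || n + 3 >= outsz)
            return -1;
        out[n++] = '_';
        out[n++] = (char)(lang[i + 1] & ~0x20);
        out[n++] = (char)(lang[i + 2] & ~0x20);
    } else if (lang[i] != '\0') {
        return -1;
    }

    /* Language-only tags skip the region; the copy includes the NUL. */
    if (n + sizeof suffix > outsz)
        return -1;
    memcpy(out + n, suffix, sizeof suffix);
    return 0;
}
```

Whether a name such as "eo.utf8" can then actually be loaded is a separate question that depends on the locales installed on the system.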
Comment 13 Roumen Petrov 2009-02-20 23:37:53 UTC
Wieland,
I missed one important part of your report: libxslt 10122 (!). The initial sort
lang support was added after release 1.1.24. Quote from the change log:
-----
Tue Jun  3 18:26:26 CEST 2008 Daniel Veillard <daniel@veillard.com>
  ... patch from Nick Wellnhofer adding xsl:sort lang support using the locale
  support from the C library.
...
Tue May 13 17:51:05 CEST 2008 Daniel Veillard <daniel@veillard.com>
  * configure.in doc/*: release of 1.1.24
-----

Next, about Esperanto. It is a vendor-specific locale and is not in GNU libc
HEAD.
The "eo" file is not from "language-pack-eo". In a Debian-based distribution
this package contains the file /var/lib/locales/supported.d/eo with the content:
  eo_US.UTF-8 UTF-8
  eo.UTF-8 UTF-8
Note eo_US. "US" as the region ("subtag" in rfc3066.txt) is not appropriate.
I don't know how to handle this. The simplest solution is to modify xsltlocale.c
so that, after the call to xsltDefaultRegion, it does not return NULL if the
default region is missing.
For example:
+++++
    if (region == NULL)
        q--;
    else {
        *q++ = region[0];
        *q++ = region[1];
    }
+++++
    memcpy(q, ".utf8", 6);
Another option is to modify xsltDefaultRegion to return "US" for
"eo" :( but in that case the locale command has to list eo_us.
For now I won't propose a patch for libxslt, not before Esperanto is accepted
in GNU libc. Will glibc support a language without a region, or will it assign a
neutral region, for example "UN"?
Comment 14 Roumen Petrov 2009-02-20 23:52:53 UTC
About Hungarian: it works for me with xsltproc --version ... libxslt 10124-SVN1494 ..., i.e. trunk.

Input file:
<?xml version="1.0" encoding="UTF-8"?>
<languages>
  <lang>É</lang>
  <lang>E</lang>
  <lang>F</lang>
  <lang>D</lang>
</languages>

Output file:
D
E
É
F

I would like to propose that the bug be closed as invalid.
Comment 15 Wieland Pusch 2009-02-21 00:37:12 UTC
I later compiled a newer version. Sorry.
$ xsltproc -V
Using libxml 20631, libxslt 10124 and libexslt 813
xsltproc was compiled against libxml 20631, libxslt 10124 and libexslt 813
libxslt 10124 was compiled against libxml 20631
libexslt 813 was compiled against libxml 20631

The bug is that other XSLT processors can handle lang="eo".
E.g. Firefox and Saxon can handle it.

I would rather not change to lang="eo-us".
It is better to use "eo" or "eo.utf8" than to use no locale without warning.

But if you want to close the bug, it's ok.

Thanks a lot for your help and patience.
Comment 16 Roumen Petrov 2009-02-21 13:51:52 UTC
Created attachment 129213 [details] [review]
eo-sample vendor patch

Wieland, maybe your Linux vendor will accept a patch similar to the attached "eo-sample vendor patch" for the next libxslt version. With the patch, Esperanto works for me. For now I don't know how to deal properly with locales that have no region.
Comment 17 Roumen Petrov 2009-02-26 21:47:31 UTC
Wieland,
your issue is not the same as the reporter's (Gabor's).
It is a feature request, as GNU libc does not support Esperanto as a locale. May I suggest that you open a new request for Esperanto?

About the original report (Hungarian): it works for me with the trunk version. This is the reason to propose closing the bug as invalid.

Daniel, what do you think?
Comment 18 Wieland Pusch 2009-02-26 22:29:12 UTC
I cloned this to 
http://bugzilla.gnome.org/show_bug.cgi?id=573327
Comment 19 Gabor Kelemen 2009-02-27 00:15:06 UTC
I am probably doing something wrong, but the result is still incorrect, even after compiling from trunk:

$ xsltproc srt.xslt langs.xml 
D
E
F
É
gabor@gabor-desktop:~/Asztal$ cat srt.xslt
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="text" encoding="UTF-8" omit-xml-declaration="yes"/>

<xsl:template match="languages">
  <xsl:for-each select="lang">
    <xsl:sort lang="hu"/> <!-- hu-hu or hu_HU.utf8 gives the same result -->
    <xsl:value-of select="."/><xsl:text>&#10;</xsl:text>
  </xsl:for-each>
</xsl:template>
</xsl:stylesheet>
gabor@gabor-desktop:~/Asztal$ cat langs.xml 
<?xml version="1.0" encoding="UTF-8"?>
<languages>
  <lang>É</lang>
  <lang>E</lang>
  <lang>F</lang>
  <lang>D</lang>
</languages>
gabor@gabor-desktop:~/Asztal$ which xsltproc 
/usr/local/bin/xsltproc
gabor@gabor-desktop:~/Asztal$ xsltproc -V
Using libxml 20632, libxslt 10124 and libexslt 813
xsltproc was compiled against libxml 20632, libxslt 10124 and libexslt 813
libxslt 10124 was compiled against libxml 20632
libexslt 813 was compiled against libxml 20632

This is on Ubuntu Jaunty, if that matters. However, sort works as expected:
$ echo -e "é\ne\nf\nd" | sort
d
e
é
f
Comment 20 Roumen Petrov 2009-02-28 11:19:16 UTC
Gabor, my output is: $ ./xsltproc --version
Using libxml 20632, libxslt 10124-SVN1494 and libexslt 813
...

It seems to me that your program uses the system libraries, not the newly built ones.
Comment 21 GNOME Infrastructure Team 2021-07-05 11:01:04 UTC
GNOME is going to shut down bugzilla.gnome.org in favor of gitlab.gnome.org.
As part of that, we are mass-closing older open tickets in bugzilla.gnome.org
which have not seen updates for a longer time (resources are unfortunately
quite limited so not every ticket can get handled).

If you can still reproduce the situation described in this ticket in a recent
and supported software version, then please follow
  https://wiki.gnome.org/GettingInTouch/BugReportingGuidelines
and create a new ticket at
  https://gitlab.gnome.org/GNOME/libxslt/-/issues/

Thank you for your understanding and your help.