Bug 768074 - Japanese Kana are ignored
Status: RESOLVED OBSOLETE
Product: nautilus
Classification: Core
Component: Views: All
Version: 3.20.x
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: Nautilus Maintainers
QA Contact: Nautilus Maintainers
Depends on:
Blocks:
Reported: 2016-06-26 22:00 UTC by Harald Brunner
Modified: 2021-06-18 15:50 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
file: repurpose compare_display_name() (3.06 KB, patch)
2016-06-27 13:03 UTC, Ernestas Kulik
committed

Description Harald Brunner 2016-06-26 22:00:28 UTC
When sorting files and folders, the algorithm appears to disregard Japanese characters and sorts kana-only entries arbitrarily. For example, this is the sorted result in Nautilus:

きた
にし
きち
にさ
にし 1
にさ 2
きた 3
きち 5
みなみ
ひがし
みなみ 4
ひがし 6

So the result is a mixture of sorting by length and then by the digits, with everything else rather arbitrary. A more correct order would be either of these schemes:

Apache:
きた 3
きた
きち 5
きち
にさ 2
にさ
にし 1
にし
ひがし 6
ひがし
みなみ 4
みなみ

JavaScript, C#, Python array/list sort:
きた
きた 3
きち
きち 5
にさ
にさ 2
にし
にし 1
ひがし
ひがし 6
みなみ
みなみ 4
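
For reference, the second ordering is what a plain code point comparison produces. A minimal sketch in Python (used here only for illustration, since Nautilus itself is written in C), with the names taken from the listing above:

```python
# The twelve names from the report, in the order Nautilus displayed them.
names = [
    "きた", "にし", "きち", "にさ",
    "にし 1", "にさ 2", "きた 3", "きち 5",
    "みなみ", "ひがし", "みなみ 4", "ひがし 6",
]

# Python compares str values code point by code point; no locale is involved.
by_code_point = sorted(names)

for name in by_code_point:
    print(name)
```

A bare name sorts before the same name with a suffix because it is a strict prefix of it.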

(I saw another report where setting locales was mentioned, but that does not make this any less of a bug; even sorting by Unicode code point would be more reasonable than this. Most sort algorithms also seem to do just fine by default. Users cannot be expected to switch their locale and restart their browsers/system just to work with a different set of files. And if the files are mixed, there is no way to reasonably deal with the situation at all.)

While setting up a minimal sorting example, I ran into another issue: the create new folder dialog falsely claims there is a conflict between names that consist only of kana. Maybe that has the same root cause.

I tried to create the folders:

みなみ
きた
ひがし

The last one blocked.
Comment 1 Ernestas Kulik 2016-06-26 23:15:32 UTC
(In reply to Harald Brunner from comment #0)
> While setting up a minimal sorting example I ran into another issue: The
> create new folder dialog falsely claims there is a conflict between names
> that consist only of kana, maybe that has the same root cause.
> 
> I tried to create the folders:
> 
> みなみ
> きた
> ひがし
> 
> The last one blocked.

This one I figured out.

When creating directories or renaming directories or files, the code calls nautilus_directory_get_file_by_name() to look for duplicates. This, of course, does not work, as g_utf8_collate() is the deciding function (which is also what causes the sorting issue, due to it being locale-dependent).
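
Both symptoms can be reproduced with a toy model. The sketch below is Python, not the Nautilus code, and it rests on an assumption inferred from the reported output rather than confirmed anywhere in this report: that the active locale gives every hiragana character the same collation weight. `toy_collate_key()` and `find_by_collation()` are hypothetical names for illustration.

```python
import re
from typing import List, Optional

# ASSUMPTION: the locale's collation table gives every hiragana character
# the same weight. This mirrors the reported symptoms; it is not the
# actual GLib or libc implementation.
HIRAGANA = re.compile(r"[\u3041-\u309f]")


def toy_collate_key(name: str) -> str:
    """Replace each hiragana character with one shared placeholder."""
    return HIRAGANA.sub("\u3042", name)


def find_by_collation(existing: List[str], name: str) -> Optional[str]:
    """Return the first existing name that collates equal to `name`."""
    key = toy_collate_key(name)
    for candidate in existing:
        if toy_collate_key(candidate) == key:
            return candidate
    return None


# Folder creation: the third name is reported as a duplicate of the first.
conflict = find_by_collation(["みなみ", "きた"], "ひがし")
print(conflict)

# Sorting: the listing from comment #0 is already "sorted" under this key.
reported = [
    "きた", "にし", "きち", "にさ",
    "にし 1", "にさ 2", "きた 3", "きち 5",
    "みなみ", "ひがし", "みなみ 4", "ひがし 6",
]
print(sorted(reported, key=toy_collate_key) == reported)
```

Under this model only the number of kana and the ASCII suffix take part in the comparison, which matches the "by length, then by the digits" pattern described in comment #0.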

I have fixed it (read: used HAX) locally by adding an additional check in case an existing file has been found.

Will attach a demo patch in a bit (so whoever is concerned in the morning can take a look).

Funnily enough, I fixed a different bug in the same code not too long ago.
Comment 2 Ernestas Kulik 2016-06-26 23:41:49 UTC
Nope, my quick hack does not fix it fully.

Message to Carlos: would it not be better for nautilus_file_compare_display_name() to use g_strcmp0() (or friends) instead of g_utf8_collate() (/and/ friends)? I see it’s only used by nautilus_directory_get_file_by_name().
Comment 3 Carlos Soriano 2016-06-27 07:32:06 UTC
(In reply to Ernestas Kulik from comment #2)
> Nope, my quick hack does not fix it fully.
> 
> Message to Carlos: would it not be better for
> nautilus_file_compare_display_name() to use g_strcmp0() (or friends) instead
> of g_utf8_collate() (/and/ friends)? I see it’s only used by
> nautilus_directory_get_file_by_name().

No, collate actually does some smart sorting. For instance, "file1", "file10" and "file5" are ordered correctly. Also, the dots in extensions are treated specially, so the order is not strictly alphabetical.

Also performance.

What's needed here is for GLib to support Asian languages better, and there are a few reports about it already.
Comment 4 Ernestas Kulik 2016-06-27 08:43:33 UTC
(In reply to Carlos Soriano from comment #3)
> (In reply to Ernestas Kulik from comment #2)
> > Nope, my quick hack does not fix it fully.
> > 
> > Message to Carlos: would it not be better for
> > nautilus_file_compare_display_name() to use g_strcmp0() (or friends) instead
> > of g_utf8_collate() (/and/ friends)? I see it’s only used by
> > nautilus_directory_get_file_by_name().
> 
> No, collate actually does some smart sorting. For instance, "file1", "file10"
> and "file5" are ordered correctly. Also, the dots in extensions are treated
> specially, so the order is not strictly alphabetical.
> 
> Also performance.
> 
> What's needed here is for GLib to support Asian languages better, and there
> are a few reports about it already.

But there is no need to do any kind of sorting if we’re only looking for an exact match (talking about the renaming issue specifically).
Comment 5 Ernestas Kulik 2016-06-27 13:03:49 UTC
Created attachment 330435
file: repurpose compare_display_name()

nautilus_file_compare_display_name() is only used by
nautilus_directory_get_file_by_name() nowadays and it was written with
sorting in mind. As g_utf8_collate() and its locale dependence do not
work well with finding matching files by name, it makes sense to replace
the call with one to g_strcmp0(). That, however, makes the function less
suitable
for sorting. This commit changes its purpose as described.
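
The idea behind the change can be sketched as follows. This is a Python illustration, not the committed patch: `strcmp0()` is a rough analogue of g_strcmp0() (locale-independent comparison that also tolerates missing values), and `find_exact()` is a hypothetical helper standing in for the duplicate lookup.

```python
from typing import Iterable, Optional


def strcmp0(a: Optional[str], b: Optional[str]) -> int:
    """Rough analogue of g_strcmp0(): None sorts before any string,
    otherwise the UTF-8 bytes are compared, independent of locale."""
    if a is None or b is None:
        return (a is not None) - (b is not None)
    a_bytes, b_bytes = a.encode("utf-8"), b.encode("utf-8")
    return (a_bytes > b_bytes) - (a_bytes < b_bytes)


def find_exact(existing: Iterable[str], name: str) -> Optional[str]:
    """Hypothetical lookup helper: return the entry that matches exactly."""
    for candidate in existing:
        if strcmp0(candidate, name) == 0:
            return candidate
    return None


print(find_exact(["みなみ", "きた"], "ひがし"))  # prints None: no false conflict
print(find_exact(["みなみ", "きた"], "きた"))    # prints きた: a real duplicate
```

An exact comparison only answers "same name or not", which is all a duplicate check needs; the trade-off is that the result no longer reflects any locale-aware ordering.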
Comment 6 Ernestas Kulik 2016-06-27 13:04:41 UTC
Unless I am wrong and there are extensions depending on it. Realized that a bit too late.
Comment 7 Ernestas Kulik 2016-06-27 13:19:36 UTC
(In reply to Ernestas Kulik from comment #6)
> Unless I am wrong in that there are extensions, depending on it. Realized
> that a bit too late.

Never mind, the API doesn’t call it.
Comment 8 Carlos Soriano 2016-06-27 13:24:27 UTC
Review of attachment 330435:

I think this is fine, thanks!
Comment 9 Ernestas Kulik 2016-06-27 13:46:47 UTC
Comment on attachment 330435
file: repurpose compare_display_name()

Attachment 330435 pushed as fc25f7e - file: repurpose compare_display_name()
Comment 10 André Klapper 2021-06-18 15:50:38 UTC
GNOME is going to shut down bugzilla.gnome.org in favor of gitlab.gnome.org.
As part of that, we are mass-closing older open tickets in bugzilla.gnome.org
which have not seen updates for a longer time (resources are unfortunately
quite limited so not every ticket can get handled).

If you can still reproduce the situation described in this ticket in a recent
and supported software version of Files (nautilus), then please follow
  https://wiki.gnome.org/GettingInTouch/BugReportingGuidelines
and create a new ticket at
  https://gitlab.gnome.org/GNOME/nautilus/-/issues/

Thank you for your understanding and your help.