Bug 355152 - Improve sorting by name for digits and case-sensitiveness
Status: RESOLVED OBSOLETE
Product: nautilus
Classification: Core
Component: Views: All
Version: 3.1.x
Hardware: Other All
Importance: Normal normal
Target Milestone: ---
Assigned To: Nautilus Maintainers
QA Contact: Nautilus Maintainers
Duplicates: 316960 328383 338154 398729 435505 605598 609991 619612 621245 631376 633406 638839 654721 681176 700950
Depends on: 761806 793747
Blocks:
 
 
Reported: 2006-09-09 16:51 UTC by Marcus Zurhorst
Modified: 2021-06-18 15:55 UTC
See Also:
GNOME target: ---
GNOME version: 2.31/2.32


Attachments
Istanbul captured the nautilus sorting bug! (849.05 KB, application/ogg)
2006-09-09 16:53 UTC, Marcus Zurhorst
The bug (38.21 KB, image/png)
2009-09-08 01:22 UTC, Harrison T Smith
Observing sorting behavior under Linux (22.35 KB, application/vnd.oasis.opendocument.text)
2017-05-11 13:38 UTC, Olivier Cailloux
Observing sorting behavior under Linux (43.75 KB, application/pdf)
2017-05-11 13:39 UTC, Olivier Cailloux

Description Marcus Zurhorst 2006-09-09 16:51:25 UTC
Please describe the problem:
Hello all, 
I found a really strange behaviour in the latest nautilus file browser.
The sorting of the files in the list view is not working!

It's hard to explain, but I captured the bug with istanbul.
I love this tool!!  ;-)


Kind regards,
Marcus

Steps to reproduce:
1. Edit some filenames (at the end of the filename).
2. Watch how they are re-sorted!
3. 


Actual results:


Expected results:


Does this happen every time?
Yes!

Other information:
I created a Bugzilla report in Mandriva's system, too. (#25447)
However, this is severe enough to get to the GNOME developers asap!
Comment 1 Marcus Zurhorst 2006-09-09 16:53:07 UTC
Created attachment 72457 [details]
Istanbul captured the nautilus sorting bug!
Comment 2 Markus Nagler 2007-01-18 19:10:49 UTC
Replicated.
For those not wanting to watch the video:
Name 4 files, e.g. like this:
bug25551, bug25552, bug25553, bug26554
Any sort by name should always sort the last item last because of the *26* bit.
However, changing the closing 4 to a, i.e. mv bug26554 bug2655a, causes it to be sorted _before_ the files starting with bug25*.

Additional info:
Works on completely empty (text) files.
Keeps working no matter how many letters or numbers are appended to the end of the filename.
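
The reordering in this comment is what a "natural" sort would produce if runs of digits are compared by numeric value: after the rename the digit run is 2655, which is smaller than 25551. The following is a minimal sketch under that assumption; `natural_key` is a hypothetical helper written for illustration, not the code Nautilus uses.

```python
import re

def natural_key(name):
    """Hypothetical natural-sort key: digit runs compare by numeric value."""
    key = []
    for chunk in re.findall(r"\d+|\D+", name):
        if chunk[0].isdecimal():
            key.append((0, int(chunk)))        # digit run, compared as a number
        else:
            key.append((1, chunk.casefold()))  # other text, compared ignoring case
    return key

before = ["bug25551", "bug25552", "bug25553", "bug26554"]
after = ["bug25551", "bug25552", "bug25553", "bug2655a"]

print(sorted(before, key=natural_key))  # bug26554 stays last (26554 is the largest number)
print(sorted(after, key=natural_key))   # bug2655a moves first (2655 < 25551)
```

With a plain character-by-character comparison, bug2655a would stay last, because "6" sorts after "5" at the fifth character.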
Comment 3 A. Walton 2008-05-18 16:45:03 UTC
*** Bug 398729 has been marked as a duplicate of this bug. ***
Comment 4 A. Walton 2008-05-18 16:58:00 UTC
*** Bug 435505 has been marked as a duplicate of this bug. ***
Comment 5 Harrison T Smith 2009-09-08 01:22:05 UTC
Created attachment 142663 [details]
The bug

This bug is still present in nautilus 2.26.3. Here, I have three folders named 2009.0, 2009.05, and 2009.1, and 2009.05 is ordered after 2009.1.
Comment 6 R Duke 2010-02-02 08:43:16 UTC
Bug 605598 also seems to be a duplicate of this bug.
Comment 7 Cosimo Cecchi 2010-02-20 14:10:00 UTC
*** Bug 609991 has been marked as a duplicate of this bug. ***
Comment 8 Cosimo Cecchi 2010-02-20 14:10:08 UTC
*** Bug 605598 has been marked as a duplicate of this bug. ***
Comment 9 Marcello Romani 2010-04-27 15:10:27 UTC
Hi, I'm seeing this bug on a fully updated (2010-04-27) Ubuntu 9.10 system, Nautilus 2.28.1.

I see this bug is still "unconfirmed", but IME it should be confirmed instead.
Comment 10 Cosimo Cecchi 2010-05-26 09:45:55 UTC
*** Bug 328383 has been marked as a duplicate of this bug. ***
Comment 11 Cosimo Cecchi 2010-05-26 09:46:01 UTC
*** Bug 619612 has been marked as a duplicate of this bug. ***
Comment 12 Cosimo Cecchi 2010-05-26 09:47:13 UTC
*** Bug 316960 has been marked as a duplicate of this bug. ***
Comment 13 Tom 2010-05-26 12:30:58 UTC
Oh for crying out loud, it has been around since 2006, we are putting all these duplicates against it, and it is _unconfirmed_?  Give me a fricking break.

3aa
4aa
20a

Sort... is just... SO WRONG...

This is killing me, Lucid Lynx forced me from Gqview to Geeqie, which uses whatever Gnomefrastructure that has the bug, and I have zillions of files with md5 sums as their names.

Guess what this bug does to MD5 sums as filenames?  Hash!

Guess I'm going to have to fix it myself...

Or switch to freaking Lubuntu (pcman-fm) or Kubuntu (just shoot me if it comes to that... I guess something called "Konqueror"...)

As mentioned, this bug breaks Geeqie, also...
Comment 14 Cosimo Cecchi 2010-05-26 12:33:51 UTC
Confirming.
Comment 15 Tom 2010-06-10 20:37:37 UTC
*** Bug 621245 has been marked as a duplicate of this bug. ***
Comment 16 Sven Witterstein 2010-07-19 10:45:16 UTC
As I wrote in Bug 458707, please do consider the following:

I have now read tons and tons of pro and con of it.
Just my 2 cents:

Cheap, general solution:
- Make an option available in Gconf to change between "ascii", "natural" (= the
current behaviour with respect to numbers that are NOT prefixed with "0") and
"dos/win-style" (names prefixed with "." in front of the rest, folders separated)

Better solution:
- offer a per-directory config option in addition.
Reason: on my "linux" files, I can live perfectly happy with "natural" sort
order though it "feels" sometimes "weird" and "buggy" - but I do have to mount
a lot of stuff created under windumb - so "win-style" Order should be default
for ntfs-partitions and be selectable for all those dirs from my co-workers
that have all the "!" "__" "_" etc. prefixing to put important stuff, such as
"_main.cpp" on top of a source dir.

It would be so nice to find a solution that offers - though hidden in gconf -
a choice instead of the flame war going on here...

Please change status back to "Enhancement"!

(Actually, if "natural" is so buggy and broken, why not make "ascii" or "winstyle" the default, and "natural" an optional "innovation"?)
Comment 17 Sven Witterstein 2010-07-19 11:11:24 UTC
Actually, on thinking about it, a good filesystem should give a per-user and per-directory configurability how things should be sorted...
Comment 18 Sven Witterstein 2010-07-19 11:14:36 UTC
That is, all this sorting that an OS (or tool) does just compensates for the lack of this information at a lower level. Moreover, when tarring, zipping, 7-zipping or whatever, the sort order of the whole tree should be retained as intended by the archive creator as well, so that when archives are mailed and unpacked elsewhere the order remains. It is the same with the timestamps: on win, they refer to the "last save time" and are retained across filesystems; on linux they get the date when they were unpacked. Sad. Sad.
Comment 19 mail 2010-09-16 04:04:12 UTC
As mentioned above I'd also really like to see the current behaviour of skipping non-alphanumeric characters addressed.

Current gnome scenario:

foldera
_importantfolder_
somefolder
=specialfolder=
yetanotherfolder


Proposed example:

_importantfolder_
=specialfolder=
foldera
somefolder
yetanotherfolder


I can understand that the alphanumeric sort order for files and folders may in certain cases need to be different from their mac and windows counterparts but contend that for non-alphanumeric instances it primarily results in: 

a) An unnecessary adjustment to muscle memory when moving between file managers

b) A most irritating inability for people to simply and effectively (albeit somewhat crudely) order by importance


For those striving to familiarise themselves with desktop linux I believe this is a significant usability hurdle many struggle to overcome.

Lastly I propose that any alterations to the ordering of non-alphanumeric characters as described would have only a marginal effect on present gnome users. These users are unlikely to place any emphasis on non-alphanumerical ordering since it is currently a tactic that they are unable to employ.


Possible solutions for *non-alphanumeric* ordering:

a) Research current mac, windows and even kde ordering, and if considered an appropriate fit incorporate into default alphanumeric ordering.

b) Research windows/mac specifications. Implement ability to sort according to those standards either as an option under 'arrange icons' or via a windows/mac style sorting preference.


Should any developer that wishes to assign themselves to this bug want assistance I would be happy to conduct a thorough analysis of available options to present to the community for review. I would be interested in helping map and resolve both alphanumeric and non-alphanumeric issues. If required, I would also be happy to approach the KDE developers regarding consistency should such action be helpful. Hopefully with such information to hand any developer effort required would be minimal.
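
The two listings in this comment can be reproduced with a small sketch. It assumes the current ordering behaves as if non-alphanumeric characters were skipped before comparing; `ignoring_punctuation` is a hypothetical key written for illustration, not the collation code GNOME actually uses.

```python
def ignoring_punctuation(name):
    """Hypothetical key: compare only letters and digits, ignoring case."""
    return "".join(ch for ch in name if ch.isalnum()).casefold()

folders = ["yetanotherfolder", "=specialfolder=", "somefolder",
           "_importantfolder_", "foldera"]

# Symbols are skipped, so "_importantfolder_" sorts under "i" and
# "=specialfolder=" under "s", as in the "current gnome scenario" listing.
print(sorted(folders, key=ignoring_punctuation))

# Plain code-point comparison puts names that start with a symbol first.
print(sorted(folders))
```

Note that plain code-point order places "=" before "_", so it moves both prefixed folders to the top but does not give exactly the order of the proposed example. Matching Windows or macOS would need the research suggested in this comment.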
Comment 20 Tom 2010-10-19 19:57:53 UTC
Could we increase the priority of this bug?

I'm dealing with directories with lots of files named with base 16 numbers.

The trivial reproducing test where it sorts like this:

3aa.txt
4aa.txt
20a.txt

is not really my problem.  I'm dealing with trying to use directories with hundreds of md5 sum named files.

It would almost be an improvement if it just used rand() to sort, then I wouldn't have false expectations of useful ordering...
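
A short sketch of why hex-named files suffer in particular. For lowercase hex names of equal length, plain character order is also the order of the numeric values, whereas a natural sort treats a-f as letters and breaks each name into decimal numbers and text. The `natural_key` helper below is hypothetical and only approximates the behaviour described in this bug.

```python
import re

def natural_key(name):
    """Hypothetical natural-sort key: decimal digit runs compare by value."""
    return [(0, int(c)) if c[0].isdecimal() else (1, c.casefold())
            for c in re.findall(r"\d+|\D+", name)]

names = ["3aa.txt", "4aa.txt", "20a.txt"]

plain = sorted(names)                                              # character order
by_hex_value = sorted(names, key=lambda n: int(n.split(".")[0], 16))
natural = sorted(names, key=natural_key)

print(plain)         # 20a, 3aa, 4aa
print(by_hex_value)  # same order: 0x20a < 0x3aa < 0x4aa
print(natural)       # 3aa, 4aa, 20a, because 3 < 4 < 20 as decimal numbers
```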
Comment 21 Olivier Cailloux 2010-10-19 21:27:38 UTC
Apparently there are several problems related to the sorting order on linux. Please see my detailed tries on
http://ubuntuforums.org/showthread.php?t=1588316 .
Comment 22 Fabio Durán Verdugo 2010-11-16 23:05:52 UTC
*** Bug 635017 has been marked as a duplicate of this bug. ***
Comment 23 gkinal 2010-11-16 23:51:36 UTC
HOW CAN A BUG LIKE THIS REMAIN UNFIXED AFTER FOUR FRIGGIN YEARS ?????????

BY THE WAY, IT IS NOT JUST SORTING WITH DIGITS !!!!

Two files,   aa.ext    and    bb.ext, when renamed  to   dd.ext and cc.ext, also do not show up with the rearranged order.

REPEAT, FOUR YEARS AND COUNTING.  WTF ??????


GK
Comment 24 Marcus Zurhorst 2010-11-19 06:45:43 UTC
There was another occurrence of this bug yesterday; it was really annoying:

For my mp3 collection, I'm using a folder per album.
The naming scheme is: "<artist> - <album title>"

Yesterday, I was looking at the hard disk and saw this:

   ..
   ..
   Madonna - Frozen
   Madonna - Greatest Hits
   Madonna Hiphop Massaker - Heavy Rotation
   Madonna - Like A Prayer
   ...

This is just wrong.  Any user intent is completely nuked by a strange sorting algorithm.


Please, get this fixed. One of the main things preached in the linux world has always been:  full flexibility and power to the users.  
Here, Nautilus doesn't leave any choice.


Maybe you could make this configurable for the users.  If so, you might instrument this switch to get user feedback.  I'm not convinced that a lot of users love this behaviour at all.


Cheers,
Marcus
Comment 25 onetimeposter 2011-07-06 03:50:56 UTC
Guys,

i can still reproduce this bug with Comment 24 (from 2010-11-19!!!) on current Nautilus 2.32.2.1 compiled in May 2011:

Madonna - Greatest Hits
Madonna Hiphop Massaker - Heavy Rotation
Madonna - Like A Prayer

...created files still stay ordered like this!

PLEASE IMPLEMENT A SIMPLE ORDERING-ALGORITHM AS DESCRIBED IN THE "Proposed example:" by the guy on Comment 19.
I just want a simple naming scheme to keep files at the top of a directory (like on Windows since 95 with "_" or other special symbols at the beginning of a filename...)

Why is it not possible to implement something as default if many linux-users got frickled up locales, or at least give us a place where we can define ordering under nautilus itself?

PLEASE:
If it is a matter of money, just write it here, and i will contact you (probably one of the devs) directly: 
I would pay 50 EUR if a change goes live which lists files and directories at the very top of the current directory if they start with an underscore ("_"), with all the other dirs/files sorted after that as " ",0,1,...,9,a,ä,...,o,ö,...,s,ß,...,u,ü,...,z.
It can't be that hard to apply that rule along the whole length of the longest file or directory name in the current directory, and to sort all directories before files, and files before links!
At least this solution is less awkward than the current situation.
Comment 26 André Klapper 2011-07-07 11:35:17 UTC
(In reply to comment #25)
> i can still reproduce this bug

That is very likely as this bug report is not in RESOLVED FIXED state, but in NEW state.


> PLEASE IMPLEMENT

No need for capital letters, plus please avoid forum-style comments if possible.

> If it is a matter of money, just write it here, and i will contact you
> (probably one of the devs) directly: 

There are several platforms for open-source bounties available. Plus feel free to contact the nautilus mailing list to reach a bigger audience. Thanks!
Comment 27 Cosimo Cecchi 2011-09-13 02:22:12 UTC
*** Bug 654721 has been marked as a duplicate of this bug. ***
Comment 28 Cosimo Cecchi 2012-08-09 14:49:21 UTC
*** Bug 338154 has been marked as a duplicate of this bug. ***
Comment 29 Cosimo Cecchi 2012-08-09 14:52:09 UTC
*** Bug 681176 has been marked as a duplicate of this bug. ***
Comment 30 Cosimo Cecchi 2012-08-09 14:52:58 UTC
*** Bug 631376 has been marked as a duplicate of this bug. ***
Comment 31 Cosimo Cecchi 2012-08-16 16:19:00 UTC
*** Bug 638839 has been marked as a duplicate of this bug. ***
Comment 32 William Jon McCann 2012-08-17 11:58:55 UTC
This appears to be due to the behavior of g_utf8_collate_key_for_filename. In the test case in comment #20 what seems to be happening is that it determines 3 < 4 < 20. Possible dup of bug 352237.
Comment 33 Cosimo Cecchi 2012-10-18 19:46:56 UTC
*** Bug 633406 has been marked as a duplicate of this bug. ***
Comment 34 Charles M. 2012-10-19 02:28:40 UTC
I'm the submitter of bug 633406. I admit there is some similarity to the bug 355152, but it is in my opinion a different bug. From what I can see in the comments, 355152 is about problems caused by sorting using numeric ordering vs lexicographic, i.e. 3, 4, 20 vs 20, 3, 4, and also about dropping punctuation symbols from sort keys. I'm not a fan of either of those things, but they do seem to be deliberate choices and some users would find the behaviors desirable.

My bug, I believe, is not something the developers wanted or which any user would consider desirable. It is a definite flaw in the implementation of the natural numeric sorting. So the nautilus developers might want to consider addressing this problem even if they decide to accept the other parts of 355152 as desired behavior. On the other hand, maybe it is preferred to keep all the sort problems together on one bug. A configuration setting allowing users to have case-insensitive, but otherwise lexicographic, sorting would solve this problem for me since I would prefer that, but users who continued to use the natural numeric sorting would still experience this problem.

The problem results from a bad interaction between the implementation of case-insensitive sorting and the natural numeric order sorting, which causes case-insensitivity to fail when file names have letters differing only by case followed by differing numbers. For example, files sort like this:

y1, y3, Y2

Instead of:

y1, Y2, y3.

Some time ago I looked into the source code and I believe the behaviour also was in g_utf8_collate_key_for_filename. I'm going to describe this from memory so there may be some details not exactly right, but I'm pretty sure I recall the basic problem.

Nautilus is using that function to generate collation keys which provide case-insensitivity, ignore punctuation symbols, and also provide natural numeric ordering. It did this by splitting file names into separate chunks depending on whether they were numbers or letters. Punctuation characters were just dropped. A collation key was generated for each chunk, then those collation keys were concatenated together to create the final overall collation key returned by the function.

The collation key parts for chunks containing letters were created by passing the chunk to a collation key generator from a lower level library. I don't recall exactly the function name.

With LC_COLLATE=en_US.UTF-8 what I was seeing was that the lower level collation key function would return a string containing three fields. The first field was the passed string, all converted to a consistent case. Following this was a field representing the casing, like a 1 for upper or a 0 for lower, then a field representing diacriticals I think.

So AbC would come back as something like ABC101000, but aBc would come back ABC010000.

The extra fields for case and diacriticals would give you some consistency when sorting strings which only differ by case or diacriticals. 

For chunks of numbers the collation key segment was created using some kind of coding to make them sort in natural numeric order. (I don't think it was just simple zero padding, but I can't remember it exactly and for my example it doesn't really matter, so let's say it was zero padding to 3 digits for the sake of the example.)

So what happens in my example is Y2 is split into two chunks, Y and 2. Chunk #1 generates collation key Y10, then Chunk #2 is padded to 002. They get appended together, to become Y10002. For y3, the string is split to y and 3, then to Y00 and 003, concatenated to Y00003. The sorting using these collation keys goes like this:
y3 (Y00003)
Y2 (Y10002)

So y3 and Y2 end up in an unexpected order, which I don't think anyone wants.

Basically it seems like  g_utf8_collate_key_for_filename is not expecting the lower level function's returned collation key to include those extra fields for the case and diacriticals fields. I doubt the authors of that lower level function returning collation keys expected that their output would be concatenated with other stuff like this.

Ideally you would want to split the returned collation key segments for letter chunks into the 3 separate fields and aggregate the case and diacritical fields at the end of the overall collation key.

So Y2 goes to Y and 2, then to Y10 and 002, then break Y10 into Y, 1, and 0, concatenate the Y, then the 002 then the 1 and the 0, resulting in Y00210.

Then the sort order would look like this
Y2 (Y00210)
y3 (Y00300)

With multiple letter chunks you would place all the case fields, followed by all the diacriticals fields at the end, so Y1A would be Y001A1100.

I'm not sure however if that lower level function always returns three fields for all possible values of LC_COLLATE so it might be complicated trying to deal with that.
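
The example in this comment can be written out as a toy model. This is only a sketch of the description above, which is itself given from memory: `letter_fields`, `naive_key` and `regrouped_key` are hypothetical names, the three-field key layout and the zero padding to 3 digits are the simplifications used in the comment, and none of this is the real g_utf8_collate_key_for_filename implementation.

```python
import re

CHUNKS = re.compile(r"[0-9]+|[A-Za-z]+")  # punctuation is simply dropped

def letter_fields(chunk):
    """Toy stand-in for the lower-level key of a letter chunk (3 fields)."""
    primary = chunk.upper()                                       # consistent case
    case = "".join("1" if ch.isupper() else "0" for ch in chunk)  # 1 = upper
    diacritics = "0" * len(chunk)                                 # toy: no accents
    return primary, case, diacritics

def naive_key(name):
    """Concatenate each chunk's key as it comes (the problematic behaviour)."""
    parts = []
    for chunk in CHUNKS.findall(name):
        if chunk.isdigit():
            parts.append(chunk.zfill(3))  # toy padding, only valid up to 999
        else:
            parts.append("".join(letter_fields(chunk)))
    return "".join(parts)

def regrouped_key(name):
    """Primary fields first, then all case fields, then all diacritic fields."""
    primary, case, diacritics = [], [], []
    for chunk in CHUNKS.findall(name):
        if chunk.isdigit():
            primary.append(chunk.zfill(3))
        else:
            p, c, d = letter_fields(chunk)
            primary.append(p)
            case.append(c)
            diacritics.append(d)
    return "".join(primary + case + diacritics)

files = ["y1", "Y2", "y3"]
print(sorted(files, key=naive_key))      # ['y1', 'y3', 'Y2']
print(sorted(files, key=regrouped_key))  # ['y1', 'Y2', 'y3']
```

In this model `naive_key("Y2")` is "Y10002" and `regrouped_key("Y2")` is "Y00210", matching the keys worked out by hand above. The model says nothing about whether the real lower-level function returns three fields for every value of LC_COLLATE.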
Comment 35 André Klapper 2013-05-25 09:35:04 UTC
*** Bug 700950 has been marked as a duplicate of this bug. ***
Comment 36 James Murray 2015-08-27 16:04:17 UTC
This bug (or design decision) renders the file viewer useless for me too and I have to use "ls" instead.

I have a directory of photos of stock which have varying length part numbers.

I want to find a part no. starting say 4677 but I don't know how long it is, so where do I find it?

PLEASE give us back the option to sort as we've been able to since the 1980s.
Comment 37 lmanning17 2017-05-07 08:22:39 UTC
I'm just gonna throw my two cents in here. As someone who has used computers for over 20 years, this is the first time I've ever had trouble with sorting. Denounce Windows all you want, but the name sorting in Explorer is simply how everyone on earth expects it to be: symbols before numbers before letters. Windows added a neat feature back in Vista, I believe, where numbers that were delimited by spaces were treated as whole numbers, which solved the zero-padding issue of days gone by.

_00
_AA
_aa
_BB
_bb
000
0AA
0aa
0BB
0bb
111
AAA
aaa
BBB
bbb
ccc 0
ccc 00
ccc 000
ccc 1
ccc 10
ccc 20
ccc 100
Comment 38 James Murray 2017-05-09 09:50:13 UTC
Ref comment 37. That might be nice for small numbers like that.

However, I have directories of photos of stock part numbers. These numbers vary in length. With this new "windowsy" sorting, it is next to impossible to find anything as I have to work out the _value_ of the part code instead of looking for it a digit at a time.

e.g.
71772198.jpg  would come before
123456789.jpg
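
The difficulty described here can be illustrated with a small sketch: in a digit-by-digit (plain character) order, every name sharing a prefix sits in one contiguous block, while an order by numeric value scatters them by length. Apart from the two file names quoted above, the part numbers are made up for illustration, and `positions_with_prefix` is a hypothetical helper.

```python
def positions_with_prefix(ordered, prefix):
    """Indices of the names in `ordered` that start with `prefix`."""
    return [i for i, name in enumerate(ordered) if name.startswith(prefix)]

# Made-up part-number photos, plus the two names from the comment above.
photos = ["5000.jpg", "467712.jpg", "4677.jpg", "50001.jpg",
          "46771.jpg", "71772198.jpg", "123456789.jpg"]

digit_by_digit = sorted(photos)  # plain character order
by_value = sorted(photos, key=lambda name: int(name.split(".")[0]))

print(positions_with_prefix(digit_by_digit, "4677"))  # [1, 2, 3]: one block
print(positions_with_prefix(by_value, "4677"))        # [0, 2, 4]: scattered
```

The numeric-value key used here assumes every file name stem consists only of digits; it is a simplification of natural sorting, chosen to keep the sketch short.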
Comment 39 lmanning17 2017-05-09 14:17:16 UTC
Yes, I can see why that could be annoying in some cases. If you are working with a whole bunch of photos, might it be worthwhile to look into using a photo application where you can tag photos with metadata so you can organize them away from the filesystem?
Comment 40 James Murray 2017-05-11 08:47:26 UTC
That seems like a sledgehammer to crack a nut. Leave file sorting alone and there's no need.

I access the files by "Add photos" in Ebay etc. which uses a regular file browser window, no idea how any metadata would help there?
Comment 41 Olivier Cailloux 2017-05-11 09:06:23 UTC
I suppose international organizations have specified standards for sorting (including sorting for specific locales). The best thing would be for the (major) software in the linux ecosystem to implement these standards. This does not forbid some developer of some specific software to (try to) be more “creative”, but in any case standard sorting should be available as an option (and should even be the default, IMHO).
Just my two cents.
Comment 42 Tom 2017-05-11 11:09:52 UTC
3aa
4aa
20a

Come on.

At a _minimum_ on a *nix system I should be able to make it match "ls".

ls, of course, gives

20a
3aa
4aa

It continues to be a problem to be stuck with

3aa
4aa
20a
Comment 43 James Murray 2017-05-11 11:21:02 UTC
@42 - agreed.
Comment 44 Olivier Cailloux 2017-05-11 13:38:34 UTC
Created attachment 351637 [details]
Observing sorting behavior under Linux

Tests showing, under two locales, how three programs sort a set of files according to their name.
Comment 45 Olivier Cailloux 2017-05-11 13:39:34 UTC
Created attachment 351639 [details]
Observing sorting behavior under Linux

Tests showing, under two locales, how three programs sort a set of files according to their name.
Comment 46 Olivier Cailloux 2017-05-11 13:50:22 UTC
I attached my tests, as detailed in #21. (These are old tests, that I did not try to reproduce recently, but I suspect nothing has changed.)

Note that even ls does not seem to exhibit correct sorting behavior (for French locale at least).
Comment 47 André Klapper 2021-06-18 15:55:48 UTC
GNOME is going to shut down bugzilla.gnome.org in favor of gitlab.gnome.org.
As part of that, we are mass-closing older open tickets in bugzilla.gnome.org (resources are unfortunately quite limited so not every ticket can get handled).

If you can still reproduce the situation described in this ticket in a recent
and supported software version of Files (nautilus), then please follow
  https://wiki.gnome.org/GettingInTouch/BugReportingGuidelines
and create a new ticket at
  https://gitlab.gnome.org/GNOME/nautilus/-/issues/

Thank you for your understanding and your help.