After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 661100 - Rescan Music Library changes import date (Date Added column) on files with special characters
Rescan Music Library changes import date (Date Added column) on files with sp...
Status: RESOLVED FIXED
Product: banshee
Classification: Other
Component: general
2.2.0
Other Linux
: Normal normal
: ---
Assigned To: Andrés G. Aragoneses (IRC: knocte)
Banshee Maintainers
: 652953 662650 663306 (view as bug list)
Depends on:
Blocks:
 
 
Reported: 2011-10-06 16:01 UTC by Matthias Puech
Modified: 2011-12-11 21:04 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
patch with "System.IO" namespace (1.48 KB, patch)
2011-11-04 11:26 UTC, Andrés G. Aragoneses (IRC: knocte)
reviewed Details | Review
db migration (2.46 KB, patch)
2011-11-09 20:45 UTC, Andrés G. Aragoneses (IRC: knocte)
committed Details | Review

Description Matthias Puech 2011-10-06 16:01:22 UTC
When I hit Tools/Rescan Music Library, *some* of the files in my library get their Import date set to the current date. It appears to select random files for this reset. The files in question were already in my library, and weren't moved or copied or modified.

It is not the behavior I'd expect since these are old files, and it pollutes my 'Recently added' smart playlist.

* Steps to reproduce:
  - With a big music library, hit Rescan Music Library.
  - Watch the count of items in the Recently Added playlist.
  - With a little bit of bad luck, it should increase.

Note: if I rescan twice, the second time I get no new new files.
Comment 1 Age Bosma (IRC: Forage) 2011-10-06 16:12:30 UTC
I can confirm this issue. The exact same thing happened to me today, though I did not have the time to file an issue yet.

It appears to affect those files that have non-alphanumeric characters in the track title and/or artist name (,&@$=+ etc).
Comment 2 Matthias Puech 2011-10-07 14:34:08 UTC
Well observed! Indeed, all the tracks reset have punctuations (&,'=?) in their *track titles*. Spaces are not in this list. Album and artist with punctuations are not reset.
Comment 3 Andrés G. Aragoneses (IRC: knocte) 2011-10-11 23:27:37 UTC
I cannot reproduce this problem with master (and I have tried with non-alphanumeric characters).

Bertrand, did you confirm the bug because you reproduced it, or because 2 people reproduced it? If the former, can you tell me how you did it? Or anyone else can extract an exact pattern of steps to do it?
Comment 4 Matthias Puech 2011-10-12 09:45:26 UTC
I just reproduced it on 2.2.0:
- move the content of ~/Music/ somewhere else, leaving it empty
- start with an empty library (--db=blabla)
- import two tracks (one with a title containing a comma, one without any punctuation)
- they are copied to the ~/Music
- wait a couple of secs
- hit Rescan Library
- the 'Imported date' field in the properties is now different for the two files: for the first file it is when you hit Rescan, for the second it's when you actually imported
Comment 5 Andrés G. Aragoneses (IRC: knocte) 2011-10-14 22:46:24 UTC
Matthias, thanks for the information.

I did all the steps you mention except (1) and (4). Can you tell me if those steps are required to reproduce the bug please?
Comment 6 Christoph Wolk 2011-10-16 16:36:07 UTC
I just ran into this problem (losing a couple of thousand songs from my playlists :-/ )

This issue may be related to #652953. I had a look at all the newest tracks to see whether I can find some commonalities.

All the new tracks have symbols in their filenames that are percent-encoded. Comparing the tracks that got readded as new with the counterparts that remained as their old tracks, I only found examples for '&', ',' and "="; "?" and the apostrophe seem unaffected.

Going by #652953, I assume that the symbol got replaced by the percent-encoding. This applies only to the file name, not the directory name. For example, the following URI still has a & in the directory of the URI, but not in the file name:
file:///home/wolki/Musik/Einzel/The%20Mamas%20&%20the%20Papas/The%20Mamas%20%26%20the%20Papas%20-%20California%20Dreamin'.mp3

This seems to hold for all of the thousands of new entries: all of them have percent-encodings for each instance of '?,=' in the file name, but not in the directory. Also, almost all of them have one of these symbols (and the ones that don't are genuinely new files). 

I tried to reproduce it on a fresh database, but failed: apparently, the percent-encodings were used from the beginning. In #652953, apparently adding as files instead of folders is the cause, but I'm pretty sure I added most of the files by refreshing the library in the past. As this only affects some of the files within an album, I can compare the date they were added, for example I have an album that's in the database since February 2009, but the one track that contains a comma in its name is now listed as having been added as yesterday.

Also, while refreshing the library again, some tracks were added as new another time,, including some that I added for the first time in the refresh that recreated thousands of files. 
(e.g. file:///home/wolki/Musik/Alben/Various%20Artists/Blues%20Heroes/04%20-%20John%20Lee%20Hooker%20-%20One%20Bourbon%2C%20One%20Scotch%2C%20One%20Beer.flac , while the tracks that did not contain a comma stayed at their old versions). It also affected a small number of older files, unfortunately I can't say whether they survived the first rescan and were hit by the second one, or whether they were hit by both.
Comment 7 Bertrand Lorentz 2011-10-25 19:04:53 UTC
*** Bug 662650 has been marked as a duplicate of this bug. ***
Comment 8 Andrés G. Aragoneses (IRC: knocte) 2011-11-04 08:43:50 UTC
*** Bug 663306 has been marked as a duplicate of this bug. ***
Comment 9 Andrés G. Aragoneses (IRC: knocte) 2011-11-04 08:57:08 UTC
I might have good news: there is a high chance that this is indeed a duplicate of bug 631970. To confirm this, can anyone affected by this bug try the last patch that is proposed in that bug to check if it fixes the problem?

Thanks in advance for testing!
Comment 10 Matthias Puech 2011-11-04 11:14:27 UTC
I tried to build an ubuntu 2.2.0 package with the patch but I get a compile error

./Banshee.IO.Gio/Directory.cs(121,44): error CS0103: The name `Path' does not exist in the current context

I'm not fluent in C# and I have no idea why Path can't be found (apparently it's part of the standard library isn't it?)
Comment 11 Andrés G. Aragoneses (IRC: knocte) 2011-11-04 11:26:22 UTC
Created attachment 200682 [details] [review]
patch with "System.IO" namespace

Thanks for testing!

(In reply to comment #10)
> I tried to build an ubuntu 2.2.0 package with the patch but I get a compile
> error

That's weird! Please test this patch instead.
Comment 12 Jared Wiltshire 2011-11-04 12:09:14 UTC
I'd say that it is a duplicate. Its definitely related to inconsistencies in the way Banshee is handling Uri encoded paths. My database contains Uris which do not have any percent encoded commas or ampersands.

It seems that when it is doing the re-scan it is encoding the Uris with percent encoded symbols in the file name only and not the directory name which is why the SQL statements I posted in Bug 663306 dont work.

So yeah these lines from DatabaseImportManager.cs should return NULL but dont.

if (DatabaseTrackInfo.ContainsUri (uri, PrimarySourceIds)) {
  return null;
}
Comment 13 Xepher 2011-11-04 19:02:12 UTC
Review of attachment 200682 [details] [review]:

I was having the same problem here as well as those in 631970. I could simply edit a track's info with save metadata on, then a rescan would pick it up as a "new" track, causing it to show up with a new import date. The cause (and fix) suggested in 631970 is part of the above patch and I can confirm is solves this issue as well.

SafeUri directly uses the glib function g_filename_to_uri (and a couple other related glib functions) which do NOT encode commas and ampersands it seems. I confirmed this with with a bit of C code just to make sure, and that is how glib itself is working. If this is "correct" or not, I can't say. It is however different than Uri.EscapeDataString (originally used as part of the directory walk.) What's important is that this patch should at least make it consistent, and prevent it from finding existing tracks as "new" and therefore losing any metadata.
Comment 14 Andrés G. Aragoneses (IRC: knocte) 2011-11-06 14:18:32 UTC
Xepher, thanks for your testing!

I wanted a 2nd person to test the patch and I was hanging out yesterday on IRC with Age Bosma (Forage), and suddenly he couldn't reproduce the bug anymore, so he wasn't able to test the patch.

But now I just got up in the morning and had a revelation! And I now know what's the root cause of the issue (and I can reproduce the bug), so I'll be committing shortly your patch and maybe some other patch to update the database of people who may still have records imported with a buggy Banshee (with buggy I mean, a Banshee version that doesn't have your patch applied).
Comment 15 Jared Wiltshire 2011-11-07 01:54:05 UTC
So what is the correct form of URI that we want stored in the database? I believe it should just be the g_filename_to_uri converted version and this should be used everywhere.

I'm not sure why the code currently percent-encodes the filename separately from the path.

g_filename_to_uri calls g_escape_uri_string with the the UNSAFE_PATH parameter which means that '/', '&', '=', ':', '@', '+', '$' and ',' will be allowed and all other non alpha-numeric characters will be percent encoded.
Comment 16 Matthias Puech 2011-11-07 10:30:21 UTC
Hi,
I've been using banshee for a couple of days with the patch, and the bug didn't happen yet. Since it's rather nondeterministic when it happens, I'm waiting for a few more days, but I'm optimistic. Well done guys!
Comment 17 Matthias Puech 2011-11-09 12:38:13 UTC
This morning I compiled Banshee with the patch on my main computer (the one with the biggest library), launched it and hit Rescan. The bug was still there apparently: Banshee modified the 'added date' of about 800 files with punctuations in their names (among the ~20000 in my library) to the current date.
Does it mean that the patch doesn't solve the bug? I don't know, maybe there is something to update first in the database... What do you think? Let me know if you need me to test something else.
Comment 18 Andrés G. Aragoneses (IRC: knocte) 2011-11-09 20:45:09 UTC
Created attachment 201094 [details] [review]
db migration

(In reply to comment #17)
> This morning I compiled Banshee with the patch on my main computer (the one
> with the biggest library), launched it and hit Rescan. The bug was still there
> apparently: Banshee modified the 'added date' of about 800 files with
> punctuations in their names (among the ~20000 in my library) to the current
> date.
> Does it mean that the patch doesn't solve the bug? I don't know, maybe there is
> something to update first in the database... What do you think? Let me know if
> you need me to test something else.

Hey Matthias, yes I'm aware of that and I made this additional patch to solve it.
Comment 19 Xepher 2011-11-10 06:02:57 UTC
Sorry if I wasn't clear on what I'd tested myself either... After my library got corrupted by the original issue (and showed over 2000 new tracks) I rolled back to a backup I had from a few months prior and then did almost all of my tests and troubleshooting from that known-good point. It seems if the database had already been polluted with "bad" data, then fixing it back to the prior "correct" stuff caused almost the same problem for the first rescan... I think it should/would have remained correct after that though. However, Andres' patch above is a much more in-depth fix to correct errors already stored in the database, and is thus a much better solution.
Comment 20 Bertrand Lorentz 2011-11-10 19:55:34 UTC
Review of attachment 201094 [details] [review]:

Thanks for the patch !
I committed it with some changes:
- Prefix the name of the DB function with BANSHEE_
- Use static methods instead of allocating 2 SafeUri instances that are immediately discarded. This should be safe as we're only looking at URIs that start with file://
Comment 21 Bertrand Lorentz 2011-11-10 19:55:48 UTC
This problem has been fixed in the development version. The fix will be available in the next major software release. Thank you for your bug report.
Comment 22 Bertrand Lorentz 2011-12-11 21:04:05 UTC
*** Bug 652953 has been marked as a duplicate of this bug. ***