GNOME Bugzilla – Bug 521401
Won't import filenames with accented characters
Last modified: 2011-07-18 19:27:46 UTC
Tracks 02, 04 and 06 won't import. Output from ls -l | less:

total 55892
-rw-r--r-- 1 pcor cachefs 5118811 2008-03-08 19:22 01 Queremos Paz.m4a
-rw-r--r-- 1 pcor cachefs 4346721 2006-09-30 12:14 02 <C9>poca.m4a
-rw-r--r-- 1 pcor cachefs 4888574 2008-03-08 19:38 03 Chunga's Revenge.m4a
-rw-r--r-- 1 pcor cachefs 8194205 2006-09-30 12:14 04 Tr<ED>ptico.m4a
-rw-r--r-- 1 pcor cachefs 5795759 2008-03-08 19:45 05 Santa Maria (Del Buen Ayre).m4a
-rw-r--r-- 1 pcor cachefs 4078679 2006-09-30 12:14 06 Una M<FA>sica Brutal.m4a
-rw-r--r-- 1 pcor cachefs 6041029 2008-03-08 19:52 07 El Capitalismo Foraneo.m4a
-rw-r--r-- 1 pcor cachefs 5672105 2008-03-08 20:00 08 Last Tango In Paris.m4a
-rw-r--r-- 1 pcor cachefs 6195490 2008-03-08 20:07 09 La Del Ruso.m4a
-rw-r--r-- 1 pcor cachefs 6773063 2008-03-08 20:15 10 Vuelvo Al Sur.m4a
What characters are those? I re-imported some files with special characters (Icelandic) without problem. Also, anything in "Import Errors" or any output from 'banshee --debug'? Also, can you test the situation with other types of files?
Here are the filenames again. Hopefully the characters will show up correctly. They are, in this input box:

02 Época.m4a
04 Tríptico.m4a
06 Una Música Brutal.m4a

I believe the accented characters are Spanish or Portuguese. No Import Errors show up. I tested some more, doing the following:

Copied "~/Music/La Revancha Del Tango" to "~/Music/LaRevanchaDelTango"
Renamed tracks 02, 04 and 06
Ran banshee --debug
Imported "~/Music/LaRevanchaDelTango"
  results: all 10 tracks were imported, with the original Titles (accented characters included). See attached screenshot.
Deleted the 10 tracks from Banshee.
Imported "~/Music/La Revancha Del Tango"
  results: only 7 tracks were imported. tracks 02, 04 and 06 were missing. no error messages in terminal output. See attached screenshots.
Created attachment 107098
results of importing "~/Music/LaRevanchaDelTango"
all 10 tracks were imported. the titles used the accented characters.

Created attachment 107099
results of importing "~/Music/La Revancha Del Tango"
tracks with accented characters in the file name are not imported.

Created attachment 107100
debug output from two import operations
no obvious error messages from importing the files with the accented characters. if there were any, i think they would've shown up between the two timer messages.

Created attachment 107101
directory listings
It appears that something's messed up in your locale; I see all my characters just fine in the terminal. What output do you get for the command 'echo $LANG'? Also, any possibility you could test this in SVN trunk?
pcor@zonbu-laptop ~ $ echo $LANG
en_US.UTF-8
pcor@zonbu-laptop ~ $

w.r.t. SVN trunk testing: probably. i've used SVN before, but would need some instructions on what & how to test.
The only other question I have is what happens with other file types--MP3, OGG, FLAC. Instructions for building from trunk are here: http://banshee-project.org/OnePointEx/BuildingAndRunning. If you choose to do it, please address any questions on the mailing list (http://mail.gnome.org/mailman/listinfo/banshee-list); keep the discussion here to your bug.
while testing other file types, i stumbled on an important clue.

some background: the files i'm trying to import were originally created on a mac, then transferred to a USB thumb drive (FAT16), then transferred to two linux boxes.

here's some test output:

$ pwd
/Documents/Music/La Revancha Del Tango
$
$ ls
01 Queremos Paz.m4a  03 Chunga's Revenge.m4a  05 Santa Maria (Del Buen Ayre).m4a  07 El Capitalismo Foraneo.m4a  09 La Del Ruso.m4a
02 ?poca.m4a         04 Tr?ptico.m4a          06 Una M?sica Brutal.m4a            08 Last Tango In Paris.m4a     10 Vuelvo Al Sur.m4a
$
$ ls | wc
wc: standard input:2: Invalid or incomplete multibyte or wide character
wc: standard input:4: Invalid or incomplete multibyte or wide character
wc: standard input:6: Invalid or incomplete multibyte or wide character
     10      37     230
$
$ ls 02*
02 ?poca.m4a
$
$ ls 02* | wc
wc: standard input:1: Invalid or incomplete multibyte or wide character
      1       2      13
$
$ ls 02* | od -x
0000000 3230 c920 6f70 6163 6d2e 6134 000a
0000015
$
$
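The od -x dump above can be read back into raw bytes to see exactly what wc is complaining about. The following is a minimal sketch, not anything Banshee does. It assumes the dump came from a little-endian machine (so each 16-bit word is byte-swapped) and that the legacy encoding is ISO-8859-1 (Latin-1), which the thread does not confirm. The helper names are made up for illustration.

```python
def od_x_to_bytes(words, total_len):
    """Undo the 16-bit little-endian grouping of an `od -x` dump.

    `total_len` is the byte count from the final offset line, used to
    drop the zero padding od adds to an odd-length input.
    """
    raw = b"".join(int(w, 16).to_bytes(2, "little") for w in words)
    return raw[:total_len]


def is_valid_utf8(data):
    """Return True if `data` decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Words copied from the dump above; the final offset 0000015 is octal for 13.
bad = od_x_to_bytes("3230 c920 6f70 6163 6d2e 6134 000a".split(), 0o15)

print(bad)                  # the raw bytes, with a lone 0xC9 after "02 "
print(is_valid_utf8(bad))   # 0xC9 starts a 2-byte UTF-8 sequence, but "p" follows
print(bad.decode("latin-1"))  # assumption: Latin-1, where 0xC9 is É
```

Under a UTF-8 locale, a lone 0xC9 followed by an ASCII letter is an incomplete multibyte sequence, which would explain the wc warning and the "?" shown by ls.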
running the previous test in a directory where i've fixed the file names, i get the following output. notice how the 02 track filename is 14 bytes instead of 13:

$ cd /media/La\ Revancha\ Del\ Tango/
$ ls
01 Queremos Paz.m4a      04 Tríptico.m4a                   07 El Capitalismo Foraneo.m4a  10 Vuelvo Al Sur.m4a
02 Época.m4a             05 Santa Maria (Del Buen Ayre).m4a  08 Last Tango In Paris.m4a     ls.txt
03 Chunga's Revenge.m4a  06 Una Música Brutal.m4a          09 La Del Ruso.m4a
$
$ ls | wc
     11      38     240
$
$ ls 02*
02 Época.m4a
$
$ ls 02* | wc
      1       2      14
$
$ ls 02* | od -x
0000000 3230 c320 7089 636f 2e61 346d 0a61
0000016
$
$
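The 13 versus 14 byte difference can be reproduced by encoding the same name two ways. This is only a sketch: it assumes the "bad" name is Latin-1 (not confirmed in the thread) and that od ran on a little-endian machine. The od_x_words helper is hypothetical and merely mimics how od -x groups bytes.

```python
def od_x_words(data):
    """Render bytes the way `od -x` does on a little-endian machine."""
    if len(data) % 2:
        data += b"\x00"  # od pads an odd-length input with a zero byte
    return " ".join(
        f"{int.from_bytes(data[i:i + 2], 'little'):04x}"
        for i in range(0, len(data), 2)
    )


name = "02 \u00c9poca.m4a\n"     # the line `ls 02*` writes, newline included

latin1 = name.encode("latin-1")  # É becomes the single byte 0xC9
utf8 = name.encode("utf-8")      # É becomes the two bytes 0xC3 0x89

print(len(latin1), od_x_words(latin1))
print(len(utf8), od_x_words(utf8))
```

The two printed word lists can be compared against the two od -x dumps in the thread; the one extra byte is the second byte of the UTF-8 encoding of É.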
and finally, i'd like to note that 2 text editors i use, mousepad and vile, handle the "bad" filenames just fine, no complaints like you see from wc, and they both somehow figure out the correct character to display (this might have something to do with precomposed characters and combining diacritical characters.) rhythmbox, on my ubuntu machine, can import the files and play them, but if you look at the details of the files, the file name is listed as "Unknown file name". i'm not sure this is a bug.
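On the precomposed versus combining character point: Unicode allows É to be written either as one code point (U+00C9) or as a plain E followed by a combining acute accent (U+0301). A small sketch using Python's standard unicodedata module shows the two forms and their UTF-8 bytes. Whether either form is involved in this particular failure is an open question here; the single 0xC9 byte in the problem files is neither of them.

```python
import unicodedata

precomposed = "\u00c9"   # É as a single code point (NFC form)
decomposed = "E\u0301"   # E + COMBINING ACUTE ACCENT (NFD form)

# The two strings render the same but compare unequal until normalized.
print(precomposed == decomposed)
print(unicodedata.normalize("NFC", decomposed) == precomposed)
print(unicodedata.normalize("NFD", precomposed) == decomposed)

# Their UTF-8 encodings differ in both content and length.
print(precomposed.encode("utf-8").hex())
print(decomposed.encode("utf-8").hex())
```

Both encodings are valid UTF-8, unlike the lone 0xC9 byte, which is why a tool that expects UTF-8 can accept either of them and still reject the original names.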
also, i don't think i can properly test other file types. i don't know how to create the problematic filenames.
not a bug, it's an encoding problem. if i rename the 02 file with either 0xC389 or 0x45CC81 encoding the É, then banshee imports the file correctly. something encoded it as 0xC9, but it wasn't banshee.
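The manual rename described above could be scripted for a whole directory. What follows is a rough sketch under stated assumptions, not a tested tool: it assumes every name that fails UTF-8 decoding is really Latin-1 (the thread only shows this for three files), and both function names are invented for this example. It defaults to a dry run so nothing is renamed by accident.

```python
import os


def reencode_name(raw, source_encoding="latin-1"):
    """Return the UTF-8 form of a byte filename, or None if it is already valid UTF-8."""
    try:
        raw.decode("utf-8")
        return None
    except UnicodeDecodeError:
        # Assumption: the name was written in `source_encoding`.
        return raw.decode(source_encoding).encode("utf-8")


def fix_names(directory, source_encoding="latin-1", dry_run=True):
    """List (and optionally apply) renames for non-UTF-8 names in `directory`."""
    base = os.fsencode(directory)
    planned = []
    # Passing bytes to os.listdir returns the names as raw bytes, undecoded.
    for raw in sorted(os.listdir(base)):
        fixed = reencode_name(raw, source_encoding)
        if fixed is None:
            continue
        src = os.path.join(base, raw)
        dst = os.path.join(base, fixed)
        if os.path.exists(dst):
            continue  # never overwrite an existing file
        if not dry_run:
            os.rename(src, dst)
        planned.append((raw, fixed))
    return planned


print(reencode_name(b"02 \xc9poca.m4a"))
```

Calling fix_names on the album directory with dry_run=True returns the planned renames for review; passing dry_run=False applies them. Similar results may be possible with existing tools such as convmv, if installed.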
i suppose it could be considered a bug if the UTF-8 encoding should only affect how the filename is displayed, not how the file is imported. your call.
I think the problem is that these aren't encoded in UTF-8 (despite your locale) because of originating from a Mac. (I'm not sure about that, though.) I also think this really isn't a Banshee bug, but a case could be made that it is since Rhythmbox imports them (successfully?). Anyway, I doubt there's much to be done in Banshee on this; I expect that the real issue (if any) might be lower-level, maybe Mono.
I have this issue too. My computer $LANG is pt-BR.UTF-8. My iTunes play list is UTF-8 and was created on a Mac. The songs I'm trying to import are in Brazilian Portuguese and they appear normal on my .xml library and in my OS. Is there anything I can do to import these songs??
(In reply to comment #17)
> I have this issue too.
>
> My computer $LANG is pt-BR.UTF-8.
> My iTunes play list is UTF-8 and was created on a Mac.
> The songs I'm trying to import are in Brazilian Portuguese and they appear
> normal on my .xml library and in my OS.
>
> Is there anything I can do to import these songs??

This bug report is pretty old (and closed). You may want to open a new report for your issue. If you don't mind, it may be helpful if you could attach the playlist file as well.