GNOME Bugzilla – Bug 652455
Rhythmbox fails to detect the charset of ID3 tags that are not encoded in UTF-8
Last modified: 2011-06-14 05:07:10 UTC
Created attachment 189826 [details] Screenshot of the bug. It shows the interface after Rhythmbox has failed to decode the tag strings correctly, which leaves Asian users with garbled text.
I have many mp3 files in Asian languages such as Chinese, Japanese, and Korean. The filenames and ID3 tags (artist, album, etc.) are all in those languages. For historical reasons, not all of them are encoded in UTF-8; actually, about 70% of my mp3 files are not. Many of them are encoded in GBK, Big5, or JIS. I believe this is a very common situation for Asian users.
The problem is that Rhythmbox can only decode the tags as UTF-8 (or maybe using the current locale), which garbles the interface and definitely makes it unusable for Asian users. As the attached picture shows, I cannot find music by song name, artist, or album.
There are two possible solutions. First, we can try to decode the string as UTF-8 and then run detection code to check whether the decoded string is unlikely to be correct. Many browsers have this kind of statistics-based encoding detection. If the detection function is intelligent enough, we can decode the string again with the detected encoding and display the result. Second, if the detection function is not as good as expected, we can add an option to the user interface that lets the user choose which charset to use whenever decoding as UTF-8 produces an error.
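To make the proposal above concrete, here is a minimal Python sketch of the fallback idea. It is not Rhythmbox or GStreamer code. The function name `decode_tag`, the `user_choice` parameter, and the fallback order are all assumptions made for illustration, and it replaces the statistics-based detection with a much simpler rule: accept the first encoding that decodes without an error.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical fallback order (an assumption, not taken from any real player).
DEFAULT_FALLBACKS = ("gbk", "big5", "shift_jis")


def decode_tag(
    raw: bytes,
    fallbacks: Sequence[str] = DEFAULT_FALLBACKS,
    user_choice: Optional[str] = None,
) -> Tuple[str, str]:
    """Decode raw tag bytes, returning (text, encoding_used).

    UTF-8 is tried first. If that fails, the charset chosen by the user
    (if any) is tried, then each fallback in order.
    """
    candidates = ["utf-8"]
    if user_choice:
        candidates.append(user_choice)
    candidates.extend(fallbacks)

    for encoding in candidates:
        try:
            return raw.decode(encoding), encoding
        except (UnicodeDecodeError, LookupError):
            # Invalid bytes for this charset, or an unknown charset name.
            continue

    # Nothing decoded cleanly: keep what we can and mark the rest.
    return raw.decode("utf-8", errors="replace"), "utf-8 (lossy)"


if __name__ == "__main__":
    print(decode_tag("周杰伦".encode("gbk")))
    print(decode_tag("音樂".encode("big5"), user_choice="big5"))
```

Legacy multi-byte charsets overlap heavily, so a string in one of them will often decode "successfully" as another and still come out as the wrong characters. That is why this sketch lets `user_choice` take priority over the built-in fallbacks, and why a real implementation would probably still want the statistical check described above.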
Thanks for the bug report. This particular bug has already been reported into our bug tracking system, but please feel free to report any further bugs you find. *** This bug has been marked as a duplicate of bug 451565 ***
The idea of automatic charset detection is similar to bug 451565. However, bug 451565 is about the encoding of movie subtitles, while my report is about mp3 ID3 tags such as song name, artist, and album, which are used to identify the songs in the library. Without subtitle encoding auto-detection, we can still identify a movie and play it. Without ID3 tag encoding support, we cannot use this software to identify songs, so we do not know which song we are about to play. It is like clicking at random, which is unacceptable for a music player. So I think these two bugs, bug 451565 and this one, bug 652455, are related to some extent, but they are not duplicates.
See bug 647140 comment 3. GStreamer developers consider these two issues to be the same and have marked a number of 'detect id3 tag encoding' bugs as duplicates of bug 451565.