After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 741562 - local-metadata: Broken title with another TV episode format
local-metadata: Broken title with another TV episode format
Status: RESOLVED FIXED
Product: grilo
Classification: Other
Component: plugins
unspecified
Other All
: Normal normal
: ---
Assigned To: grilo-maint
grilo-maint
Depends on:
Blocks:
 
 
Reported: 2014-12-15 16:07 UTC by Bastien Nocera
Modified: 2015-01-06 00:14 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
local-metadata: Add a test for more episode formats (1.23 KB, patch)
2014-12-15 16:07 UTC, Bastien Nocera
none Details | Review
local-metadata: Improve string sanitise (2.13 KB, patch)
2014-12-18 20:42 UTC, Victor Toso
reviewed Details | Review
local-metadata: Improve regexp for tv shows (1.62 KB, patch)
2014-12-18 20:42 UTC, Victor Toso
accepted-commit_now Details | Review
changes since v1: (2.19 KB, patch)
2014-12-19 19:33 UTC, Victor Toso
committed Details | Review
local-metadata: improve regexp for tv shows (2.55 KB, patch)
2014-12-19 19:38 UTC, Victor Toso
committed Details | Review

Description Bastien Nocera 2014-12-15 16:07:00 UTC
This episode title came from a different source and isn't being parsed
properly. It not showing up as a TV show is the biggest problem.
Comment 1 Bastien Nocera 2014-12-15 16:07:03 UTC
Created attachment 292758 [details] [review]
local-metadata: Add a test for more episode formats
Comment 2 Victor Toso 2014-12-17 19:51:48 UTC
Real Humans S01 EP01 [X264] [HD 720p] [FR] [SWE] [SRT FR] [MRPHU].mkv

TV SHOW: Real Humans
Season: 1
Episode: 1
audio (first): FR
audio (second): SWE
subtitles (first) FR

is that right?
We don't have metadata-key for audio-language, do we?
Comment 3 Bastien Nocera 2014-12-18 06:58:03 UTC
(In reply to comment #2)
> Real Humans S01 EP01 [X264] [HD 720p] [FR] [SWE] [SRT FR] [MRPHU].mkv
> 
> TV SHOW: Real Humans
> Season: 1
> Episode: 1
> audio (first): FR
> audio (second): SWE
> subtitles (first) FR
> 
> is that right?

Yes.

> We don't have metadata-key for audio-language, do we?

But I wouldn't bother trying to parse anything but show-name, season/episode and episode title. The subtitles file is a separate file, and metadata about the audio tracks is in the file itself.
Comment 4 Victor Toso 2014-12-18 20:42:34 UTC
Created attachment 293004 [details] [review]
local-metadata: Improve string sanitise

* Do not get substring with a blacklisted word.
* Only allow a few chars to finish the substring.
* Include new blacklist word "x264".
Comment 5 Victor Toso 2014-12-18 20:42:45 UTC
Created attachment 293005 [details] [review]
local-metadata: Improve regexp for tv shows

* Get episode number in the format of "..EP01.."
Comment 6 Bastien Nocera 2014-12-19 17:16:05 UTC
Review of attachment 293004 [details] [review]:

It would be good to have tests which excercise those changes in the same commit.

::: src/local-metadata/grl-local-metadata.c
@@ +273,3 @@
+    gchar last_char = *(line_end - 1);
+
+    /* After removing substrings, ignore non alpha-numeric in the

substrings are what in this case?

@@ +275,3 @@
+    /* After removing substrings, ignore non alpha-numeric in the
+     * end of the string */
+    while (g_ascii_isalnum (last_char) == FALSE &&

You should probably be using unicode functions here, no? g_unichar_isalnum() and g_utf8_next_char() to skip to the next character.
Comment 7 Bastien Nocera 2014-12-19 17:17:32 UTC
Review of attachment 293005 [details] [review]:

Looks good.
Comment 8 Victor Toso 2014-12-19 19:00:29 UTC
(In reply to comment #6)
> Review of attachment 293004 [details] [review]:
> 
> It would be good to have tests which excercise those changes in the same
> commit.

I'll combine the test with the second patch.

> 
> ::: src/local-metadata/grl-local-metadata.c
> @@ +273,3 @@
> +    gchar last_char = *(line_end - 1);
> +
> +    /* After removing substrings, ignore non alpha-numeric in the
> 
> substrings are what in this case?

The substring which contain the blacklisted word.

Real Humans S01 EP01 [X264] [HD 720p] [FR] [SWE] [SRT FR] [MRPHU].mkv
<-- used in regexp --><------------- discarded --------------------->

> 
> @@ +275,3 @@
> +    /* After removing substrings, ignore non alpha-numeric in the
> +     * end of the string */
> +    while (g_ascii_isalnum (last_char) == FALSE &&
> 
> You should probably be using unicode functions here, no? g_unichar_isalnum()
> and g_utf8_next_char() to skip to the next character.

Thanks! Fixing...
Comment 9 Victor Toso 2014-12-19 19:33:47 UTC
Created attachment 293088 [details] [review]
changes since v1:

- using g_utf8_find_prev_char in the end of the string
- using g_unichar_isalnum instead of ascii
- improve comment

local-metadata: Improve string sanitise

* Do not get substring with a blacklisted word.
* Only allow a few chars to finish the substring.
* Include new blacklist word "x264".
Comment 10 Bastien Nocera 2014-12-19 19:36:52 UTC
Review of attachment 293088 [details] [review]:

Looks good.

::: src/local-metadata/grl-local-metadata.c
@@ +276,3 @@
+     * char in the end of the sanitised string */
+    while (g_unichar_isalnum (*line_end) == FALSE &&
+           *line_end != '!' && *line_end != '?' && *line_end != '.') {

Can you please put one statement on each line, so:
while (g_unichar_isalnum (*line_end) == FALSE &&
           *line_end != '!' &&
           *line_end != '?' &&
           *line_end != '.') {

I found it more readable.
Comment 11 Victor Toso 2014-12-19 19:38:58 UTC
Created attachment 293089 [details] [review]
local-metadata: improve regexp for tv shows

changes since v1:

Combined with test case provided by Bastien
Comment 12 Victor Toso 2014-12-19 20:06:09 UTC
Review of attachment 293088 [details] [review]:

git bz fail :(
pushed as 92141ba9dcecacdca3a8469b4edeaed10237d8ac
Comment 13 Victor Toso 2014-12-19 20:06:57 UTC
Review of attachment 293089 [details] [review]:

pushed as bd2e42a796fcb9b5333f669087863ae0bd8723d4
Comment 14 Bastien Nocera 2015-01-06 00:13:52 UTC
Comment on attachment 292758 [details] [review]
local-metadata: Add a test for more episode formats

Test was already merged into the commit with the fix.