After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 755464 - local-metadata: Another bad TV show parsing
local-metadata: Another bad TV show parsing
Status: RESOLVED FIXED
Product: grilo
Classification: Other
Component: lua
git master
Other Mac OS
: Normal normal
: ---
Assigned To: grilo-maint
grilo-maint
Depends on:
Blocks:
 
 
Reported: 2015-09-23 13:24 UTC by Bastien Nocera
Modified: 2015-09-25 12:19 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
tests: local-metadata: Re-add some old tests that used to work (1.48 KB, patch)
2015-09-23 13:25 UTC, Bastien Nocera
committed Details | Review
tests: local-metadata expects movie without suffix (1.35 KB, patch)
2015-09-24 13:27 UTC, Victor Toso
committed Details | Review
lua-factory: improve title parsing for movies (1.58 KB, patch)
2015-09-24 13:27 UTC, Victor Toso
none Details | Review
lua-factory: suffix can't be a valid metadata (914 bytes, patch)
2015-09-24 13:27 UTC, Victor Toso
none Details | Review
lua-factory: improve title parsing for movies (2.56 KB, patch)
2015-09-25 10:32 UTC, Victor Toso
committed Details | Review

Description Bastien Nocera 2015-09-23 13:24:25 UTC
+++ This bug was initially created as a clone of Bug #727181 +++

As we don't parse URLs anymore, I transformed it the way the tracker, filesystem or local-metadata plugins would have done, and tested again:

"My super series.S01E01.mp4" finds that "mp4" is an episode title
and
"140127Mata-16x9 (bug 723166).mp4" finds that "140127Mata" is the show name.
Comment 1 Bastien Nocera 2015-09-23 13:25:03 UTC
Created attachment 311945 [details] [review]
tests: local-metadata: Re-add some old tests that used to work
Comment 2 Victor Toso 2015-09-24 13:27:31 UTC
Created attachment 312051 [details] [review]
tests: local-metadata expects movie without suffix

This way the least we can do is removing the video suffix.
"Test.mp4" expects "Test" and the same should apply to others.
Comment 3 Victor Toso 2015-09-24 13:27:40 UTC
Created attachment 312052 [details] [review]
lua-factory: improve title parsing for movies

When our parser does not work for tv shows nor movies, the default is to
remove the suffix if it is possible.
Comment 4 Victor Toso 2015-09-24 13:27:48 UTC
Created attachment 312053 [details] [review]
lua-factory: suffix can't be a valid metadata

the lua pattern + would allow ".mp4" to be a valid return when removing
the suffix.
Comment 5 Victor Toso 2015-09-24 13:32:05 UTC
regarding "140127Mata-16x9 (bug 723166).mp4"
it is currently catching:
show: 140127Mata
season: 16
episode: 9
title:  (bug 723166)

which is quite good :)
I'm not sure how we could programmatically avoid this being recognize as a tv show or a movie.

16x9 is a good sign for season and episode
what comes before is usually the show and afterwards is the title

Ideas?
Comment 6 Bastien Nocera 2015-09-24 13:41:22 UTC
Review of attachment 312051 [details] [review]:

Sure.
Comment 7 Bastien Nocera 2015-09-24 13:43:26 UTC
Review of attachment 312052 [details] [review]:

::: src/lua-factory/sources/grl-video-title-parsing.lua
@@ +55,3 @@
 
+function remove_suffix(title)
+  return title:gsub("(.+)%..-$", "%1")

Are you sure this will only remove suffixes? We only work with titles, which will have been set by the filesystem, tracker, or UPnP plugins.
Comment 8 Bastien Nocera 2015-09-24 13:44:24 UTC
Review of attachment 312053 [details] [review]:

Right, marked only as reviewed based on the previous review.
Comment 9 Victor Toso 2015-09-24 13:48:16 UTC
(In reply to Bastien Nocera from comment #7)
> Review of attachment 312052 [details] [review] [review]:
> 
> ::: src/lua-factory/sources/grl-video-title-parsing.lua
> @@ +55,3 @@
>  
> +function remove_suffix(title)
> +  return title:gsub("(.+)%..-$", "%1")
> 
> Are you sure this will only remove suffixes? We only work with titles, which
> will have been set by the filesystem, tracker, or UPnP plugins.

it removes the last ".somethinghere" which not necessarily is a suffix. So, should we have a list of suffix to remove?
Comment 10 Bastien Nocera 2015-09-24 13:50:56 UTC
(In reply to Victor Toso from comment #9)
> (In reply to Bastien Nocera from comment #7)
> > Review of attachment 312052 [details] [review] [review] [review]:
> > 
> > ::: src/lua-factory/sources/grl-video-title-parsing.lua
> > @@ +55,3 @@
> >  
> > +function remove_suffix(title)
> > +  return title:gsub("(.+)%..-$", "%1")
> > 
> > Are you sure this will only remove suffixes? We only work with titles, which
> > will have been set by the filesystem, tracker, or UPnP plugins.
> 
> it removes the last ".somethinghere" which not necessarily is a suffix. So,
> should we have a list of suffix to remove?

I think a list of common suffixes would be good enough for now. If needed, in the future, we can always extend lua-factory to help us by using GIO for that.
Comment 11 Victor Toso 2015-09-25 09:43:56 UTC
Review of attachment 311945 [details] [review]:

::: tests/local-metadata/test_local_metadata.c
@@ +99,3 @@
     { "metalocalypse.s02e01.dvdrip.xvid-ffndvd.avi", "metalocalypse", NULL, 2, 1 },
     { "Boardwalk.Empire.S04E01.HDTV.x264-2HD.mp4", "Boardwalk Empire", NULL, 4, 1 },
+    { "My super series.S01E01.mp4", "My super series", NULL, 1, 1 },

This one looks good.

@@ +109,3 @@
     /* These below should not be detected as an episode of a series. */
     { "My.Neighbor.Totoro.1988.1080p.BluRay.X264.mkv", NULL, NULL, 0, 0 },
+    { "140127Mata-16x9 (bug 723166).mp4", NULL, NULL, 0, 0 }

Hey, please check Comment #5 about the expected values of this test.
Either we have a correct way of excluding show-name and/or titles or else this might be somewhat valid :(
Comment 12 Victor Toso 2015-09-25 10:32:58 UTC
Created attachment 312125 [details] [review]
lua-factory: improve title parsing for movies

When our parser does not work for tv shows nor movies, the default is to
remove the suffix if it is possible.


changes:
- we only remove suffix from formats we recognize
- squashed patches related to suffix as one
Comment 13 Victor Toso 2015-09-25 10:35:36 UTC
Comment on attachment 312051 [details] [review]
tests: local-metadata expects movie without suffix

Attachment 312051 [details] pushed as 957f44d - tests: local-metadata expects movie without suffix
Comment 14 Bastien Nocera 2015-09-25 11:38:46 UTC
(In reply to Victor Toso from comment #11)
> Review of attachment 311945 [details] [review] [review]:
> 
> ::: tests/local-metadata/test_local_metadata.c
> @@ +99,3 @@
>      { "metalocalypse.s02e01.dvdrip.xvid-ffndvd.avi", "metalocalypse", NULL,
> 2, 1 },
>      { "Boardwalk.Empire.S04E01.HDTV.x264-2HD.mp4", "Boardwalk Empire",
> NULL, 4, 1 },
> +    { "My super series.S01E01.mp4", "My super series", NULL, 1, 1 },
> 
> This one looks good.
> 
> @@ +109,3 @@
>      /* These below should not be detected as an episode of a series. */
>      { "My.Neighbor.Totoro.1988.1080p.BluRay.X264.mkv", NULL, NULL, 0, 0 },
> +    { "140127Mata-16x9 (bug 723166).mp4", NULL, NULL, 0, 0 }
> 
> Hey, please check Comment #5 about the expected values of this test.
> Either we have a correct way of excluding show-name and/or titles or else
> this might be somewhat valid :(

I guess the output from comment #5 would be good enough.
Comment 15 Bastien Nocera 2015-09-25 11:42:30 UTC
Review of attachment 312125 [details] [review]:

Looks fine otherwise.

::: src/lua-factory/sources/grl-video-title-parsing.lua
@@ +45,3 @@
+-- https://en.wikipedia.org/wiki/Video_file_format
+video_suffixes = {
+  "webm", "mkv", "flv", "vob", "ogv", "ogg",

I think we should just have common ones:

"webm", "mkv", "flv", "ogv", "ogg, "avi", "mov", "wmv", "mp4", "m4v", "mpeg", "mpg"

The rest is too much of an edge case.
Comment 16 Victor Toso 2015-09-25 12:19:00 UTC
Attachment 311945 [details] pushed as 77f8954 - tests: local-metadata: Re-add some old tests that used to work
Attachment 312125 [details] pushed as 26c644e - lua-factory: improve title parsing for movies

- Updated test to have metadata provided by grl-video-title-parsing (Comment #5)
- Removed unecessary video extensions
- tests are working