After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 749653 - dashdemux: Implement binary search for stream_sidx_seek
dashdemux: Implement binary search for stream_sidx_seek
Status: RESOLVED FIXED
Product: GStreamer
Classification: Platform
Component: gst-plugins-bad
1.5.2
Other All
: Normal enhancement
: 1.7.1
Assigned To: GStreamer Maintainers
GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2015-05-20 17:59 UTC by Jimmy Ohn
Modified: 2015-11-19 20:10 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
dashdemux: Implement binary search for stream_sidx_seek (1.58 KB, patch)
2015-05-20 18:02 UTC, Jimmy Ohn
none Details | Review
replace binary search to optimize in stream_sidx_seek. (1.99 KB, patch)
2015-05-21 12:43 UTC, Jimmy Ohn
none Details | Review
Add binary search to optimize in stream_sidx_seek. (2.00 KB, patch)
2015-05-30 09:59 UTC, Jimmy Ohn
none Details | Review
Add binary search to optimize in stream_sidx_seek. (2.00 KB, patch)
2015-06-03 11:28 UTC, Jimmy Ohn
none Details | Review
Add binary search to optimize in stream_sidx_seek (3.33 KB, patch)
2015-07-29 13:57 UTC, Jimmy Ohn
committed Details | Review

Description Jimmy Ohn 2015-05-20 17:59:52 UTC
Attached patch implements binary search for stream_sidx_seek.
Comment 1 Jimmy Ohn 2015-05-20 18:02:56 UTC
Created attachment 303693 [details] [review]
dashdemux: Implement binary search for stream_sidx_seek
Comment 2 Edward Hervey 2015-05-21 05:30:08 UTC
Review of attachment 303693 [details] [review]:

This looks wrong. The lowest entry it will return will be high / 2.

Say you have 200 entries. If your target ideal position is at 20, you will never reach it, it will always return at least 100.

A proper binary search would do the following (while high != low):
1) split the search domain in two regions (low-mid and mid-high)
2) is the requested ts *in* the 'mid' entry? If so, stop here, we've found it
3) if not, figure out where the target ts is (in the lower or higher region ?) and either adjust the 'high' or 'low' position accordingly
3.1) if ts < current mid, we know mid is the highest value, high = mid
3.2) if ts > current mid, we know mid is the lowest value, low = mid
4) recalculate mid = (high - low) / 2
5) goto 1)

That way you'd end up searching for less and less entries (200, 100, 50, 25, 12, 6, 3, 1) i.e. 8 steps instead of a max of 200 steps in the iterative case.
Comment 3 Edward Hervey 2015-05-21 06:00:01 UTC
Note that a slightly better algorithm could be to assume an even distribution of the entries, and weigh the mid-point evaluation. This is called interpolation search.

Example:
low value = entry 0 (ts : 0s)
high value = entry 100 (ts : 600s (10min))
target = 30s

Instead of using:
  mid = (high - low) / 2 (which would give you mid entry 50)

You weigh it:
  difference in entries : high - low : 100
  difference in values : high ts - low ts : 600s
  mid = (target ts) / (difference in values) * (difference in entries)
      = 30 / 600 * 100
      = 5

  If you're lucky it's in that entry (yay, found the entry in one go), if not, you re-adjust either high or low (juste like in steps 3.1 and 3.2 above) and repeat.

sidx entries are likely to be roughly the same duration, so this could allow very fast entry search.
Comment 4 Sebastian Dröge (slomo) 2015-05-21 06:38:19 UTC
Comment on attachment 303693 [details] [review]
dashdemux: Implement binary search for stream_sidx_seek

Just use gst_util_array_binary_search().
Comment 5 Edward Hervey 2015-05-21 08:06:09 UTC
or that :)
Comment 6 Jimmy Ohn 2015-05-21 09:02:58 UTC
(In reply to Edward Hervey from comment #3)
> Note that a slightly better algorithm could be to assume an even
> distribution of the entries, and weigh the mid-point evaluation. This is
> called interpolation search.

Example:
low value = entry 0 (ts : 0s)
high
> value = entry 100 (ts : 600s (10min))
target = 30s

Instead of using:
  mid
> = (high - low) / 2 (which would give you mid entry 50)

You weigh it:
 
> difference in entries : high - low : 100
  difference in values : high ts -
> low ts : 600s
  mid = (target ts) / (difference in values) * (difference in
> entries)
      = 30 / 600 * 100
      = 5

  If you're lucky it's in that
> entry (yay, found the entry in one go), if not, you re-adjust either high or
> low (juste like in steps 3.1 and 3.2 above) and repeat.

sidx entries are
> likely to be roughly the same duration, so this could allow very fast entry
> search.

It's very helpful for me. Thanks for your comment:)
I'll attach new patch.
Comment 7 Sebastian Dröge (slomo) 2015-05-21 09:06:45 UTC
Don't implement yet another binary search, unless it makes a very big difference compared to the standard one that is in libgstreamer already. As you noticed it's not trivial to implement binary search, and having multiple implementations means maintaining more code and making sure nothing breaks whenever someone changes something in there.
Comment 8 Jimmy Ohn 2015-05-21 09:15:29 UTC
(In reply to Sebastian Dröge (slomo) from comment #7)
> Don't implement yet another binary search, unless it makes a very big
> difference compared to the standard one that is in libgstreamer already. As
> you noticed it's not trivial to implement binary search, and having multiple
> implementations means maintaining more code and making sure nothing breaks
> whenever someone changes something in there.

OK I understood:). I'll attach the patch using gst_util_array_binary_search().
Thanks slomo.
Comment 9 Jimmy Ohn 2015-05-21 12:43:53 UTC
Created attachment 303756 [details] [review]
replace binary search to optimize in stream_sidx_seek.
Comment 10 Jimmy Ohn 2015-05-24 15:40:54 UTC
I also test using below command. It's working for me.
If my test is not correct, please let me know.
gst-launch-1.0 playbin uri="http://www-itec.uni-klu.ac.at/ftp/datasets/mmsys12/RedBullPlayStreets/redbull_4s/RedBullPlayStreets_4s_isoffmain_DIS_23009_1_v_2_1c2_2011_08_30.mpd"
Comment 11 Reynaldo H. Verdejo Pinochet 2015-05-28 21:02:08 UTC
Comment on attachment 303756 [details] [review]
replace binary search to optimize in stream_sidx_seek.


> 
>+static gint
>+gst_dash_demux_index_entry_search (GstSidxBoxEntry * entry, GstClockTime * ts,
>+    gpointer user_data)
>+{
>+  if (entry->pts + entry->duration < *ts)
>+    return -1;
>+  else if (entry->pts + entry->duration > *ts)
>+    return 1;
>+  else
>+    return 0;
>+}

It might make sense to avoid computing entry->pts + entry->duration
twice.
Comment 12 Jimmy Ohn 2015-05-30 09:58:48 UTC
(In reply to Reynaldo H. Verdejo Pinochet from comment #11)
> Comment on attachment 303756 [details] [review] [review]
> replace binary search to optimize in stream_sidx_seek.
> 
> 
> > 
> >+static gint
> >+gst_dash_demux_index_entry_search (GstSidxBoxEntry * entry, GstClockTime * ts,
> >+    gpointer user_data)
> >+{
> >+  if (entry->pts + entry->duration < *ts)
> >+    return -1;
> >+  else if (entry->pts + entry->duration > *ts)
> >+    return 1;
> >+  else
> >+    return 0;
> >+}
> 
> It might make sense to avoid computing entry->pts + entry->duration
> twice.

Thanks for your comment. I upload new patch to avoid computing it.
Comment 13 Jimmy Ohn 2015-05-30 09:59:46 UTC
Created attachment 304302 [details] [review]
Add binary search to optimize in stream_sidx_seek.
Comment 14 Edward Hervey 2015-06-02 06:22:23 UTC
Review of attachment 304302 [details] [review]:

::: ext/dash/gstdashdemux.c
@@ +910,3 @@
+  if (entry_ts < *ts)
+    return -1;
+  else if (entry_ts > *ts)

This looks wrong. Shouldn't this second 'if' be:

"else if (entry->pts > *ts)" ?

i.e. "if the timestamp is before the beginning of the entry" as opposed to "if the timestamp is before the end of the entry".

Another way to look at it is that the last "else" branch (the one that returns 0) will only be reached (with the current code) if "entry_ts == *ts", which is wrong, it should be reached if "*ts >= entry->pts && *ts < entry_pts + entry_duration"
Comment 15 Jimmy Ohn 2015-06-03 11:24:04 UTC
(In reply to Edward Hervey from comment #14)
> Review of attachment 304302 [details] [review] [review]:
> 
> ::: ext/dash/gstdashdemux.c
> @@ +910,3 @@
> +  if (entry_ts < *ts)
> +    return -1;
> +  else if (entry_ts > *ts)
> 
> This looks wrong. Shouldn't this second 'if' be:
> 
> "else if (entry->pts > *ts)" ?
> 
> i.e. "if the timestamp is before the beginning of the entry" as opposed to
> "if the timestamp is before the end of the entry".
> 
> Another way to look at it is that the last "else" branch (the one that
> returns 0) will only be reached (with the current code) if "entry_ts ==
> *ts", which is wrong, it should be reached if "*ts >= entry->pts && *ts <
> entry_pts + entry_duration"

I miss understanding! You're 100% right! Thanks for your review:)
Comment 16 Jimmy Ohn 2015-06-03 11:28:30 UTC
Created attachment 304492 [details] [review]
Add binary search to optimize in stream_sidx_seek.
Comment 17 Jimmy Ohn 2015-07-28 13:47:33 UTC
@Edward Hervey
Could you review my patch? I'm waiting for your review.
Comment 18 Thiago Sousa Santos 2015-07-28 15:01:29 UTC
Review of attachment 304492 [details] [review]:

::: ext/dash/gstdashdemux.c
@@ +928,3 @@
+      sizeof (GstSidxBoxEntry),
+      (GCompareDataFunc) gst_dash_demux_index_entry_search,
+      GST_SEARCH_MODE_AFTER, &ts, NULL);

Don't we want GST_SEARCH_MODE_BEFORE here?

Suppose you have the following entries:
entry[0] = 0
entry[1] = 10
entry[2] = 20

If you seek to 5, you want to start from 0, not from 10, right?

@@ +940,2 @@
   else
     dashstream->sidx_current_remaining = 0;

Assuming pts is a non-negative value and if we use GST_SEARCH_MODE_BEFORE, there is still a chance that we are seeking after the last element and we will get the last one. In this case we want to go with the 'else' option here. This should need special handling. Maybe before doing the binary search just check if the ts is already past the last element on the entries array.
Comment 19 Jimmy Ohn 2015-07-29 13:56:08 UTC
(In reply to Thiago Sousa Santos from comment #18)
> Review of attachment 304492 [details] [review] [review]:
> 
> ::: ext/dash/gstdashdemux.c
> @@ +928,3 @@
> +      sizeof (GstSidxBoxEntry),
> +      (GCompareDataFunc) gst_dash_demux_index_entry_search,
> +      GST_SEARCH_MODE_AFTER, &ts, NULL);
> 
> Don't we want GST_SEARCH_MODE_BEFORE here?
> 
> Suppose you have the following entries:
> entry[0] = 0
> entry[1] = 10
> entry[2] = 20
> 
> If you seek to 5, you want to start from 0, not from 10, right?
> 
> @@ +940,2 @@
>    else
>      dashstream->sidx_current_remaining = 0;
> 
> Assuming pts is a non-negative value and if we use GST_SEARCH_MODE_BEFORE,
> there is still a chance that we are seeking after the last element and we
> will get the last one. In this case we want to go with the 'else' option
> here. This should need special handling. Maybe before doing the binary
> search just check if the ts is already past the last element on the entries
> array.

Thanks for your review. I upload the new patch for your comment.
Comment 20 Jimmy Ohn 2015-07-29 13:57:21 UTC
Created attachment 308398 [details] [review]
Add binary search to optimize in stream_sidx_seek
Comment 21 Jimmy Ohn 2015-08-10 13:01:14 UTC
@Edward and Thiago
could you please review my patch?
Comment 22 Thiago Sousa Santos 2015-11-19 20:10:25 UTC
commit 7093ad482bf3ffb88fb4bfc97f29e8b7a042c855
Author: Jimmy Ohn <yongjin.ohn@lge.com>
Date:   Wed Jul 29 22:31:30 2015 +0900

    dashdemux: Add binary search for stream_sidx_seek
    
    Add binary search to optimize in stream_sidx_seek.
    
    https://bugzilla.gnome.org/show_bug.cgi?id=749653