Bug 778592 - souphttpsrc: Add sessions/caching to souphttpsrc
Status: RESOLVED INCOMPLETE
Product: GStreamer
Classification: Platform
Component: gst-plugins-good
Version: unspecified
Hardware: Other All
Importance: Normal enhancement
Target Milestone: git master
Assigned To: GStreamer Maintainers
QA Contact: GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2017-02-14 07:10 UTC by Sean-Der
Modified: 2017-03-28 09:17 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Sean-Der 2017-02-14 07:10:12 UTC
I have a pipeline that fetches a video file from a remote HTTPS server and plays it in a loop (catch the EOS, then remove the souphttpsrc and add a new one). However, because I tear down and recreate souphttpsrc on every loop, the remote file is downloaded again each time.

I would like to add the ability to pass a SoupCache to souphttpsrc, so I can pass around a cache/session and download a file only once even though souphttpsrc has been created and destroyed multiple times.

Would a patch like this be accepted into souphttpsrc? If so, are there any suggestions or requirements to make sure my patch would be accepted?

thanks
Comment 1 Stephan Hesse 2017-03-02 13:33:26 UTC
Hey everybody

This sounds like a cool feature. I think it makes a lot of sense to have a persistent cache for HTTP (or other) resources that survives the pipeline life cycle.

I think the problem could be addressed more generally, and in a less libsoup-specific way.

How about a caching URI handler implementation that internally instantiates existing URI handler implementations and caches previous results on the filesystem? It would serve a resource from a specifically set path when it is cached, and fall back to the actual URI handler to make a request when the cache does not have the resource. Obviously, the cache could be disabled at any point to force refreshing a resource.

This could be made completely HTTP (or any other protocol) implementation agnostic. It would just be about mapping a URI to a file storing the previously downloaded data related to it.
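The URI-to-file mapping described here can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `cache_path_for_uri` and the choice of SHA-256 for file names are hypothetical and are not part of GStreamer or libsoup.

```python
import hashlib
from pathlib import Path
from urllib.parse import urlsplit


def cache_path_for_uri(cache_dir: Path, uri: str) -> Path:
    """Map a URI to a stable file path inside cache_dir (hypothetical helper)."""
    # Hash the full URI so that any scheme, query string or unusual
    # character still yields a safe, fixed-length file name.
    digest = hashlib.sha256(uri.encode("utf-8")).hexdigest()
    # Keep the original extension, if any, purely as a hint for humans
    # browsing the cache directory.
    suffix = Path(urlsplit(uri).path).suffix
    return cache_dir / (digest + suffix)
```

Because the mapping depends only on the URI string, it stays protocol-agnostic: the same URI always resolves to the same file, regardless of which handler originally fetched the data.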

I'd favor such a solution as it would address the problem more generally, solving more issues while staying independent of further HTTP (or other) implementations and libsoup specifics.

What do you think?

Cheers
Comment 2 Sean-Der 2017-03-03 06:49:24 UTC
That sounds like it would solve my problem! No idea what the idiomatic element would look like (would it just be an attribute on uridecodebin, etc.?)

Do you want to start a thread on gstreamer-devel about it (or I can)? I would just really love to get sign-off from a committer before I put the work in. The problem doesn't seem technically challenging, but I don't know if there will be pushback (this can be solved by the application).
Comment 3 Sebastian Dröge (slomo) 2017-03-03 10:10:41 UTC
Sounds like a good plan or at least useful in general. Cache invalidation is a bit tricky in the general case though.

Do you plan on working on this, Stephan?
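On the cache-invalidation point, one possible reading is this: the same URI can start returning different bytes, so a URI-keyed cache needs some way to notice. For HTTP a common validator is the ETag response header, while other protocols may offer nothing comparable, which is part of what makes the general case hard. The sketch below assumes the caller can obtain the server's current ETag somehow; `CacheEntry` and `is_still_valid` are hypothetical names, not an existing API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CacheEntry:
    path: str             # where the cached bytes live on disk
    etag: Optional[str]   # validator stored with the entry, if the server sent one


def is_still_valid(entry: CacheEntry, current_etag: Optional[str]) -> bool:
    """Return True only when the cached copy can be shown to be current."""
    # Without a validator on both sides there is nothing to compare,
    # so the conservative answer is to treat the entry as stale.
    if entry.etag is None or current_etag is None:
        return False
    return entry.etag == current_etag
```

A protocol-agnostic cache has no such validator to rely on, so it would have to leave the decision to the application or accept serving stale data.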
Comment 4 Stephan Hesse 2017-03-03 17:16:07 UTC
(In reply to Sebastian Dröge (slomo) from comment #3)
> Sounds like a good plan or at least useful in general. 

Nice that you agree, let me know if you have any more ideas in this direction!

> Cache invalidation is a bit tricky in the general case though.

Hm, I kind of know what you mean, but I'm not sure. Can you give an example?

These things should probably be left for the application to decide. I think we shouldn't look into the actual protocol or app payload specifics. Either it's enabled, in which case it checks whether the URI is "mapped" on the filesystem at the chosen path and pushes that data (leaving the actual protocol-implementing URI handler inactive), or it's disabled and we just pass all the calls on to the actual URI handler, which may get the data.
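The enabled/disabled behaviour described above could look roughly like the following. It is a sketch only: `CachingFetcher` and its `fetch` callable are hypothetical stand-ins for the caching URI handler and the wrapped protocol-specific handler, and nothing here is a GStreamer API.

```python
import hashlib
from pathlib import Path
from typing import Callable


class CachingFetcher:
    """Wrap a protocol-specific fetch function with an on-disk cache."""

    def __init__(self, fetch: Callable[[str], bytes], cache_dir: Path,
                 enabled: bool = True) -> None:
        self.fetch = fetch          # stand-in for the actual URI handler
        self.cache_dir = cache_dir  # the "chosen path" for cached resources
        self.enabled = enabled

    def _path(self, uri: str) -> Path:
        # File name derived from the URI only, so the cache is protocol-agnostic.
        return self.cache_dir / hashlib.sha256(uri.encode("utf-8")).hexdigest()

    def get(self, uri: str) -> bytes:
        if not self.enabled:
            # Disabled: pass the call straight through to the real handler.
            return self.fetch(uri)
        path = self._path(uri)
        if path.exists():
            # Cache hit: the real handler stays inactive.
            return path.read_bytes()
        # Cache miss: ask the real handler, then remember the result.
        data = self.fetch(uri)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)
        return data
```

Note that this version never decides on its own that an entry is stale; refreshing a resource is left to the application, which can set `enabled` to `False` or delete the cached file.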

What do you think? I am probably missing something, right? :) What would be your approach?

> 
> Do you plan on working on this, Stephan?

Would be a nice little thing to do, but just for fun ;) Can't guarantee this will be production-ready anytime soon. But I'd be happy to churn out a basic version of it.

@Sean-Der What do you mean by "this can be solved by the application"?
Comment 5 Stephan Hesse 2017-03-03 17:24:23 UTC
@Sean-Der Anyway, no idea what result will be accepted in the end, but if you find this a good idea too, let's collaborate.
Comment 6 Sean-Der 2017-03-03 18:47:44 UTC
@Stephan By "this can be solved by the application" I meant that projects sometimes reject uncommon features because they are a maintenance burden and a lot of people wouldn't use them. Poor wording on my part.

I would love to work on it! I have some GStreamer elements I haven't released and VERY tiny patches upstream. So I hopefully have enough knowledge to slow you down :p 

I will also be using this in a real-world application, so I can champion getting it merged/polished and would be using it myself every day.
Comment 7 Sebastian Dröge (slomo) 2017-03-16 13:34:53 UTC
SoupSession sharing is implemented in bug #780140. Should we mark this one as a duplicate of that one, or should we re-purpose this bug for a source-independent caching thing?
Comment 8 Stephan Hesse 2017-03-21 14:30:13 UTC
I'd still be for implementing some sort of source-independent caching thing ;) That is, I think one could even cache based on other stream properties: not only the resource URI but eventually timestamps. What do you think? For example, a demuxer might be able to cache its previous output on a VoD stream and thus could immediately resume playing at any point even if all the downstream queues have been flushed. It might be a new concept of efficient re-buffering? I have often seen that, when seeking in adaptive streaming, the performance issue is the retention of previously played content. But maybe this also goes way beyond the scope of the initial post here.

@Sean-Der So far I have lacked the time to pursue this. Have you made any progress on this? :)
Comment 9 Sebastian Dröge (slomo) 2017-03-21 20:21:06 UTC
Part of what you describe is already handled by queue2 (see the ringbuffer related properties).

How would you rename this bug, what should be implemented here, what's the scope? :)
Comment 10 Stephan Hesse 2017-03-24 12:43:45 UTC
True, now that you say it. 

Let's close this for now. If I make up my mind more clearly about the idea, I will post something.

Question: So queue2 with ringbuffer does keep things around even after a flush?
Comment 11 Sebastian Dröge (slomo) 2017-03-24 12:46:15 UTC
I don't know, but it probably should, in case it has to continue from the same position after the flush. It does keep it around in non-ringbuffer mode, when the whole stream is cached to a file.
Comment 12 Sean-Der 2017-03-24 23:06:42 UTC
Sorry for the late response!

I tried out the SoupSession patch and it works great; however, it doesn't work when setting a SoupCache on the session. Libsoup is writing to my on-disk cache, but not reading from it.

The SoupCache seems to be queried only by the async API:
https://github.com/GNOME/libsoup/blob/master/libsoup/soup-session.c#L4300

If I use soup_session_send_async in a toy program it works great. I am not super familiar with libsoup, however; I submitted a TINY bug fix relating to certs, but beyond that I haven't dug through anything else.

IMO GStreamer provided me with the tools to solve my problem (directly set or manipulate the session so I can set up a SoupCache myself), so I consider this bug fixed. If anyone has suggestions about what I can do next, or about what I have wrong regarding libsoup and caching, I would love to hear them.

@Stephan I am pretty busy as well, unfortunately. I am going to keep going down this path and see what I can get working. It would be awesome if I could just get a change into libsoup or GStreamer and get the caching working, but we will see!

thanks
Comment 13 Sebastian Dröge (slomo) 2017-03-25 08:46:06 UTC
It would seem like a bug if libsoup only makes use of the cache when doing async operations. Please report that to libsoup, thanks!
Comment 14 Sean-Der 2017-03-25 18:14:02 UTC
In case anyone ends up here from a search engine, here is the libsoup bug; it was already filed by somebody else in 2013:
https://bugzilla.gnome.org/show_bug.cgi?id=693967
Comment 15 Sean-Der 2017-03-28 07:23:08 UTC
Hey slomo, 

I submitted a patch to libsoup, and the good news is that setting a SoupCache on the gst.soup.session SoupSession works!

So on my local instance I can loop video files (catch the EOS and re-add the source) with no additional downloads.

Do you have any suggestions for people I should reach out to, or IRC channels I should be in, to get my patch reviewed?

thanks
Comment 16 Sebastian Dröge (slomo) 2017-03-28 09:17:06 UTC
Dan, who reported bug #693967, is the maintainer of libsoup. He'll probably look at your patch sooner or later :)