GNOME Bugzilla – Bug 689460
Can't playback LPCM data via HTTP after playing back something else
Last modified: 2016-05-15 10:26:53 UTC
Situation as follows: 1) Playback audio (MP3, LPCM,...) on a pristine playbin via http 2) Setting playbin state to READY. 3) Setting playbin.uri to a new LPCM audio uri on http. The Content-Type definitely is correct (audio/L16;rate=44100;channels=2), can even be the same track that worked perfectly fine in 1) 4) Setting playbin state to PLAYING 5) Critical on console, no audio played: (lt-rygel:2148): GStreamer-CRITICAL **: gst_segment_to_running_time: assertion `segment->format == format' failed This seems to work with 0.10. Trace:
+ Trace 231250
Can you provide a public HTTP URI with such LPCM audio so I can test this with? Could you attach a (gzipped, if needed) GST_DEBUG=*:6 debug log please? I don't really known how this (audio/L16 content-type) can work properly tbh. souphttpsrc just puts raw audio caps on buffers for audio/L16, but that's not really right and assumes the chunking will line up etc. And souphttpsrc will produce a BYTE segment.
Logfile is at http://rygel-project.org/snapshot/testfiles/pcm-test.log.xz
Created attachment 230827 [details] Self-contained example Sample code with HTTP server, sample audio file and playbin.
info provided
sample data is audio/x-raw-int,channels=2,width=16,depth=16,signed=true,endianness=4321,rate=44100
Confirmed here, the problem is also that souphttpsrc does not necessarily return data in chunks that are a multiple of the sample size: ** (lt-pcm-test:11512): WARNING **: conv: size 3967 is not a multiple of unit size 4 The segment warnings come from the fact that souphttpsrc does not output a TIME segment. And baseaudiosink assumes TIME format. I think this use case is broken by design, there should be an element after the source that properly timestamps and chunks the data.
I guess we need a way to autoplug an audioparse..
This bug breaks playback of audio/L16 (aka LPCM) in gmrender-resurrect: https://github.com/hzeller/gmrender-resurrect/issues/14 audio/L16 is used frequently in UPnP/DLNA streaming. In particular, it is a mandatory format for a device to be DLNA certified. When an UPnP/DLNA device doesn't support a particular audio codec, it is frequent that media servers transcode to audio/L16. It is also used by media servers that generate realtime audio streams, like the output of an audio card. That's why I think fixing this bug is important. As a side not, this mutiple of 4 issue doesn't seem to happen on x86. But it sure do happen on ARM, for example on Raspberry Pi. I suppose that on x86, read() always return data whose size is multiple of 4.
Maybe a quick option would be to add a audio/x-unaligned-raw format, have sources like souphttpsrc/dlnasrc produce that instead of audio/x-raw (because it's not proper audio/x-raw since the samples are cut), then create a subclass of audioparse that only takes audio/x-unaligned-raw as input and audio/x-raw as output and rank that... Or modify demuxers to add a aligned=TRUE to audio/x-raw formats and just rank audioparse ?
I favor the media type idea. But, this seems L16 specific. So instead of audio/x-unaligned-raw I'd just set audio/L16.
Created attachment 322999 [details] [review] Patch to change audio/x-raw to audio/L16 in souphttpsrc This patch modifies the caps that souphttpsrc sets when encountering L16 data. With audio/L16, an L16 parser can be autoplugged.
Created attachment 323002 [details] [review] Adds an L16 parser to rawparse This adds an L16 parser to the rawparse plugin. It is a bin, can be autoplugged, and internally uses audioparse, configured to output S16BE data.
Review of attachment 323002 [details] [review]: This looks like you want to run gst-indent over the .c files. ::: gst/rawparse/gstL16parse.c @@ +42,3 @@ + + +G_DEFINE_TYPE (GstL16Parse, gst_L16_parse, GST_TYPE_BIN) Probably gst-indent after adding a ; at the end of the line here @@ +85,3 @@ + gst_element_factory_make ("audioparse", "L16_inner_parser"); + if (L16_parse->inner_parser == NULL) + return; You can make this an assertion. It should always work. @@ +88,3 @@ + + g_object_set (G_OBJECT (L16_parse->inner_parser), + "format", (gint) 0, The enums are in the same plugin, so use them :) @@ +187,3 @@ + + gst_query_parse_caps (query, &filter); + caps = gst_pad_get_pad_template_caps (GST_PAD (pad)); Get the downstream caps (downstream of audioparse I guess) intersected with your template caps, and only if there is no downstream use the template caps instead.
Comment on attachment 322999 [details] [review] Patch to change audio/x-raw to audio/L16 in souphttpsrc This is strictly speaking a breaking change. But it should finally fix this situation properly... OTOH might break existing applications, and now requires a -bad plugin for proper functioning. Opinions?
Come to think of it, while the main use case right now is L16, having it more generic for unaligned raw data may make sense in the future, also because audioparse provides seeking, right? Also, I did run gst-indent over these files! Apparently it didn't catch everything. And yeah, it is a breaking change. But I think requiring a "parsed" field in the audio/x-raw caps would be worse, because it would have to be introduced in many other elements. Unless of course we assume "parsed=TRUE" to be implicit, and only if "parsed" is FALSE, then a raw parser would kick in. This is unusual though - typically, parsers only get autoplugged if "parsed=TRUE" is missing. So, this would be an inconsistency.
(In reply to Carlos Rafael Giani from comment #15) > Come to think of it, while the main use case right now is L16, having it > more generic for unaligned raw data may make sense in the future, also > because audioparse provides seeking, right? Yes, but maybe audioparse (and videoparse) should already default to using the caps event to configure themselves (or have a property?). > Also, I did run gst-indent over these files! Apparently it didn't catch > everything. You're missing a semicolon after G_DEFINE_TYPE. That's why it is all confused > And yeah, it is a breaking change. But I think requiring a "parsed" field in > the audio/x-raw caps would be worse, because it would have to be introduced > in many other elements. Unless of course we assume "parsed=TRUE" to be > implicit, and only if "parsed" is FALSE, then a raw parser would kick in. > This is unusual though - typically, parsers only get autoplugged if > "parsed=TRUE" is missing. So, this would be an inconsistency. Yes, also it barely worked before without having a parser :)
The scenario that we must not break is people doing a manual pipeline with souphttpsrc ! audioparse
Created attachment 323130 [details] [review] Patch to change audio/x-raw to audio/x-unaligned-raw in souphttpsrc New souphttpsrc patch. The same as the old one, except that it uses the more generic audio/x-unaligned-raw caps instead of audio/L16.
Created attachment 323131 [details] [review] Adds unaligned raw audio data parsing This adds the ability to parse audio/x-unaligned-raw data. audioparse accepts audio/x-unaligned-raw. If "use-sink-caps" is set to TRUE, the input caps are used as the output caps, except that the name "audio/x-unaligned-raw" is replaced with "audio/x-raw". This ensures that "souphttpsrc ! audioparse" still works. The new unalignedaudioparse bin sets up an audioparse element with "use-sink-caps" set to TRUE. Due to audioparse's new support for the unaligned caps, the code of unalignedaudioparse becomes very small; the former event and query handlers are now unnecessary.
(In reply to Tim-Philipp Müller from comment #17) > The scenario that we must not break is people doing a manual pipeline with > souphttpsrc ! audioparse That will always work as audioparse accepts anything. But what might be a problem is people who only dynamically add the audioparse if the caps on souphttpsrc are audio/x-raw. That's my main worry.
(In reply to Sebastian Dröge (slomo) from comment #20) > (In reply to Tim-Philipp Müller from comment #17) > > The scenario that we must not break is people doing a manual pipeline with > > souphttpsrc ! audioparse > > That will always work as audioparse accepts anything. But what might be a > problem is people who only dynamically add the audioparse if the caps on > souphttpsrc are audio/x-raw. That's my main worry. Agreed, but I cannot see a way around it. Introducing autoplugging for audio/x-raw would potentially break a lot of other cases. There is probably no 100% clean way out of this.
If there was an audioparse variant for autoplugging it could simply have an additional aligned=false or such in its template caps, which means it would never be plugged for any normal raw audio caps (besides, decodebin would stop already at that point anyway).
(In reply to Tim-Philipp Müller from comment #22) > If there was an audioparse variant for autoplugging it could simply have an > additional aligned=false or such in its template caps, which means it would > never be plugged for any normal raw audio caps (besides, decodebin would > stop already at that point anyway). But caps with "parsed=FALSE" are a subset of caps with no "parsed" field, aren't they? How would autoplugging work with this? Also, this sort of reverses the typical usage of "parsed" caps. Usually, "parsed=FALSE" is implicit.
You mean aligned=false for audio/x-raw? That won't work: decodebin intersects audio/x-raw,whatever with audio/x-raw, non-empty result, decides it's raw caps and exposes the pad.
Ah, it doesn't do a subset check then, ok.
I ran into this bug while trying to make a UPnP media renderer on top of a Raspberry Pi - see http://cweiske.de/tagebuch/gmediarenderer.htm I'd really like to see this fixed since it prevents me from using my .ogg music library.
If you want to do DLNA, a plain HTTP source is not going to be enough and you need special handling anyway. In your application you could use the audioparse element with correct settings here for example, or make a custom source for DLNA content. DLNA is not HTTP, even if it pretends that.
Note though that my patches would autoplug everything, and no special handling would be necessary.
For one specific case that we should handle, yes. But not for DLNA in general.
Okay. I will re-check my patches, and then request here that they are reviewed for inclusion into the repositories.
I think the patches are valid. Can we review them again for 1.10 ?
Review of attachment 323131 [details] [review]: Mostly looks good to me, just cosmetical below. Also it's a bit unfortunate that this now requires a -bad plugin but what can we do other than moving it asap :) I'll keep it in bugzilla for another day or two, in case someone else has opinions ::: gst/rawparse/Makefile.am @@ +22,2 @@ noinst_HEADERS = \ + unalignedaudio.h \ Why not just put that into the gstunalignedaudioparse.h header, gstrawparse.h or other :)
> ::: gst/rawparse/Makefile.am > @@ +22,2 @@ > noinst_HEADERS = \ > + unalignedaudio.h \ > > Why not just put that into the gstunalignedaudioparse.h header, > gstrawparse.h or other :) I didn't want to include the header of one element (unalignedaudioparse) into the header of another element (audioparse) just for the GST_UNALIGNED_RAW_AUDIO_CAPS definition. So instead, I factored it out and made both audioparse and unalignedaudioparse dependent on that. And you are right, it should be moved. However, I do think this plugin needs an overhaul. It isn't based on baseparse, and apparently does the seeking badly.
(In reply to Carlos Rafael Giani from comment #33) > > ::: gst/rawparse/Makefile.am > > @@ +22,2 @@ > > noinst_HEADERS = \ > > + unalignedaudio.h \ > > > > Why not just put that into the gstunalignedaudioparse.h header, > > gstrawparse.h or other :) > > I didn't want to include the header of one element (unalignedaudioparse) > into the header of another element (audioparse) just for the > GST_UNALIGNED_RAW_AUDIO_CAPS definition. So instead, I factored it out and > made both audioparse and unalignedaudioparse dependent on that. Alright then :) > And you are right, it should be moved. However, I do think this plugin needs > an overhaul. It isn't based on baseparse, and apparently does the seeking > badly. Yes, it's older than baseparse. Porting it to baseparse should be absolutely trivial though as rawparse has a constant frame size. It should be mostly a matter of removing code. Are you going to take a look at that?
OK, I can try.
This still breaks backwards compatibility for people who wait for caps on souphttpsrc before deciding whether to plug audioparse or decodebin, and we know people do that, because they have to do that currently...
You mean it breaks because the caps are different now? I don't see a way around that. Either we break that, or it's not going to work out of the box.
The way around that is to do chunking in souphttpsrc, ugly as it may be.
Right, that would be the ugly solution. Or special casing in decodebin that if it gets raw audio data, it should first plug something else.
There's another option: we turn the raw parsing into a library, rewrite the rawparse plugin to be based on baseparse+this library, and integrate the library into souphttpsrc for chunking.
Or, yet another option: I've read opinions that souphttpsrc needs to be rewritten on top of the new Soup API (not sure what that is, I don't know Soup so well). So, one way would be to develop a souphttpsrc2 ..
So what should we do here? IMHO it's just wrong that souphttpsrc claims to produce audio/x-raw, it's providing the wrong caps. That's as wrong as a pipeline doing filesrc ! "video/x-raw" ! ...
(In reply to Carlos Rafael Giani from comment #41) > Or, yet another option: I've read opinions that souphttpsrc needs to be > rewritten on top of the new Soup API (not sure what that is, I don't know > Soup so well). So, one way would be to develop a souphttpsrc2 .. See https://bugzilla.gnome.org/show_bug.cgi?id=693911
(In reply to Tim-Philipp Müller from comment #36) > This still breaks backwards compatibility for people who wait for caps on > souphttpsrc before deciding whether to plug audioparse or decodebin, and we > know people do that, because they have to do that currently... It occurred to me that the audio/x-unaligned-raw caps actually do *not* break anything, because in the case you described, the code which waits for caps on souphttpsrc before deciding whether or not to manually add audioparse will expect the audio/x-raw caps in order to do that. So, if souphttpsrc doesn't ever set audio/x-raw caps as the media type again, then this code will just never manually plug in audioparse, and always rely on decodebin instead, which will autoplug everything properly. So, the one case that would be broken is if souphttpsrc is used *only* for audio/x-raw data, and for some reason the code checks the media type without using decodebin as a fallback.
(In reply to Carlos Rafael Giani from comment #44) That's true. I think I can live with that corner case breaking, I am not aware of anyone doing that. We'll just have to mention it in the release notes.
commit b415d7b34f5a5d030165f2596e7074cc45bd48a4 Author: Carlos Rafael Giani <dv@pseudoterminal.org> Date: Fri Mar 4 22:10:47 2016 +0100 rawparse: Add unaligned raw audio parsing to audioparse and add new element This helps in cases where raw audio data is being delivered, but the buffers do not come in sample aligned sizes. The new unalignedaudioparse bin can be autoplugged and configures an internal audioparse element to align the data. audioparse itself gets support for audio/x-unaligned-raw input caps; the output caps then contain the same information, except that the name is changed to audio/x-raw (since audioparse aligns the data). This ensures that souphttpsrc ! audioparse still works. https://bugzilla.gnome.org/show_bug.cgi?id=689460 commit e9a795fa8bea7aee6fd15b9865744b61b22c2435 Author: Carlos Rafael Giani <dv@pseudoterminal.org> Date: Fri Mar 4 10:14:00 2016 +0100 souphttpsrc: Use audio/x-unaligned-raw instead of audio/x-raw for L16 data Directly setting audio/x-raw caps leads to problems when the delivered data blocks do not align properly at sample boundaries (for example, a data block with 391 bytes). So, instead, set audio/x-unaligned-raw to let a parser be autoplugged. https://bugzilla.gnome.org/show_bug.cgi?id=689460