GNOME Bugzilla – Bug 160126
multi.ogg is slow
Last modified: 2005-01-08 18:42:12 UTC
Trying to play the multi.ogg file it starts to play then I get: ** (totem:31522): WARNING **: Channel position 0 occurred 3 times, not allowed then Totem hangs
Known. Reason is simple: capsnego sucks. Long version: when a chained ogg finishes, it receives EOS and its pad is destroyed. Playbin will attach a fakesrc to it (so it doesn't hang or so; not sure) which renegotiates and emits an EOS. This renegotiates with the pipeline (queue!audioconvert!audioscale!volume!audiosink) and gets to 3, 4, 5 or so channels for some unknown reason. Fixation definately is wrong here, but that's besides the point. Then it starts fixating channels and randomly sets values for it. This is obviously wrong, it'll fixate each to the first value (0, which is mono). This tries to negotiate with audioconvert which fails because it's not a valid channel configuration (which the above warning implies; a g_warning() is appropriate here). We need either to save caps or to re-apply the caps preference patch from Dave to fix this. I can't fix this.
Ok, so how do we proceed? who decides what to do?
Save-caps would be a crude but working hack. Let's see if Wim is ok with that.
Wim says go for it :)
At least I've found the reason why it fixates to three channels: gst_caps_do_simplify is buggered. Making that function a no-op fixes the warning. Chained ogg works again. Can someone please fix gst_caps_do_simplify so I don't need hacks in playbin?
So if you want a testcase, call gst_caps_do_simplify on "audio/x-raw-int, channels=2; audio/x-raw-int,channels=3,channel-positions=<1,2,3>". It'll return a caps that is the other way around (so first channels=3, then channels=2).
Created attachment 34663 [details] [review] workaround OK, so the current code has no apparent purpose (it makes structures with less members appear later to that it's ordered alphabetically... Yes?!?), so I just kept it that way in such a way that it actually shows simpler structures first (and that has the side-effect of working around this bug). Bah. Hacks are good.
Above hack is committed. I can now play two chains. However, I have hickups between chain transition (caps negotiation really still takes too long) and Totem still hangs on transitioning to the third chain.
Why are you depending on caps ordering?
It's a hack. It clearly says so. When connecting fakesrc ! queue ! ..., the first connection will get to fixate, it'll fixate to the first item in a set of chained caps. That used to be channels=3. Now it's channels=1. That works, because it requires no channel ordering. First post: capsnego is broken. That's reality. This is just a hack that happens to make it work.
With Wim's opt fix the 'Channel position 0 occurred 3 times' message is gone, Totem doesn't really hang either, but playback is not right. It is very very slow it seems.
No, that's my hack, not Wim's fix. Wim's fix doesn't change anything for playback of local files. And yes, it's slow, negotiation takes too long.
*** Bug 156386 has been marked as a duplicate of this bug. ***
OK, so I had a look at callgrind. Tried oprofile too, but we didn't appear to like each other. I simply profile lt-totem multi.ogg, with audio resampling disabled (it biases the whole thing, and it's not the culprit here). Here's some numbers: * probe_triggered()+callees in playbasebin takes 86.92% of the CPU time. * half of this time (50.66%) is spent in gst_value_init_and_copy()+callees. * direct callees taking time appear to be gst_element_set_state() (54.33%) and g_signal_emit() (31.99%). This might imply that it renegotiates during both state_chage *and* the signal emission (which, FYI, results in setup_sinks being called in playbin). That would suck. * gst_value_init_and_copy itself is responsible for 44.22% of the CPU time (in the whole app). This implies that the use of data copying in caps negotiation during chain switches is the bad factor here. Use of gst_value_init_and_copy() outside the context of probe_triggered() is apparently negligible. * gst_caps_is_subset(), which is a simple check (like g_return_if_fail() takes 28.30%. All of this time (98.62%) is spent in gst_value_init_and_copy(). * there's some more similar things... Basically, I see that caps negotiation is horribly slow, mostly due to data copying. The best thing to do, it seems, is to try and keep variables alive while calling functions, rather than making copies. This whole thing is, of course, rather exagerated by the use of arrays (for surround sound), which makes this thing suddenly show off. Short-term, we can decide to either revert surround sound alltogether or spend time optimizing core. I obviously prefer the second. If you want more specific numbers for other parts, let me know.
Another question: givent he amount of copying that is going on, would it help if we used GValueArray instead of GArray for lists/fixedlists in caps? The good thing is that it is incredibly easy to copy (g_memdup ()), or at least if we allow ourselves to use such hacks, because the full struct is transparent, which is not the case for GArray. It might get faster then. Of course, refcounting would help too, but that's gonna be nasty...
One optimization is to remove some of the extraneous copying of caps in gstpad.c to fill out the PadLink structure. I try to look at that today.
Ok, I looked at it. I understand a lot more about this issue. Negotiation itself is pretty efficient about moving caps structures around. (One issue which we'll probably fix in 0.9 is that elements and GstPads tend to each have a copy of the allowed caps, but that's mostly irrelevant here.) The problem is that alsa, combined with multichannel caps, creates *huge* caps structures which then get shuffled around a lot. alsasink creates a caps that contains 46 structures on my system. lists and fixed lists are not partcularly efficient to deal with, in terms of basic operations (compare, intersect, etc.). So there's two slowing factors that are multiplied together. Is there any way to determine what format alsa will use to transfer to the hardware? That's really the only interesting thing for alsa caps.
ALSA? ALSA creates caps with only a few (3) arrays, and no lists. The 46 must come from audioconvert.
No, this is directly from alsasink. 46, which includes depth=18, depth=20, and other interesting oddities.
Oh, I see. So you're basically suggesting to intersect the caps with the peercaps before setting channels on it? Wouldn't that be *more* expensive? Also, you wouldn't be able to cache caps anymore.
Huh? No. I'm wondering why it's necessary to have 46 structures in alsasink's caps. Partly this has to do with multichannel structures not unioning easily.
I added stream selection, which significantly speeds up multi.ogg. It's now so good that I don't hear skips anymore. Marking fixed.