GNOME Bugzilla – Bug 688824
queue: Dropping sticky events on flush without resending them later
Last modified: 2013-05-27 11:06:10 UTC
When trying to play back http://download.blender.org/peach/trailer/trailer_400p.ogg every now and then I get a crash in theora_handle_type_packet (thread 18 in the bt):
+ Trace 231212
Thread 25 (Thread 0x7fffb53f5700 (LWP 25413))
Thread 21 (Thread 0x7fffb5bf6700 (LWP 25400))
Thread 18 (Thread 0x7fffc288f700 (LWP 25397))
Thread 17 (Thread 0x7fffc3998700 (LWP 25396))
Thread 1 (Thread 0x7ffff7fb7800 (LWP 25343))
Every now and then meaning about 1 out of 5 attempts.
No crashes here, any chance you could get a valgrind trace as well? What 1.x version is this with exactly? (gst-inspect-1.0 theoradec | grep Version)
Sorry, just realiyed that theora is in base, not good. Version is 1.1.0.1, git sha 15ee41dfc9dd4d3dc80cc0a6d2f332aa59284ce2 on gst-plugins-base.
Created attachment 229618 [details] Valgrind log
Do you have a test case that does not involve rygel? Or maybe a valgrind run that does not have this in it: ==4104== Invalid write of size 4 ==4104== at 0x5FC8EFC: g_main_context_dispatch (gmain.c:2668) ==4104== by 0x5FC92EF: g_main_context_iterate.isra.25 (gmain.c:3290) ==4104== by 0x5FC96E9: g_main_loop_run (gmain.c:3484) ==4104== by 0x405E8C: rygel_main_run (rygel-main.vala:81) ==4104== by 0x40857C: rygel_main_main (rygel-main.vala:290) ==4104== by 0x4086AD: main (rygel-main.vala:263) ==4104== Address 0xed7a35c is 44 bytes inside a block of size 96 free'd ==4104== at 0x4C2A82E: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==4104== by 0x5FC6973: g_source_unref_internal (gmain.c:1839) ==4104== by 0x6D0BDC0: soup_message_io_stop (soup-message-io.c:144) ==4104== by 0x6D0BE13: soup_message_io_cleanup (soup-message-io.c:100) ==4104== by 0x6D0C066: soup_message_io_finished (soup-message-io.c:163) ==4104== by 0x6D0D4EF: io_run (soup-message-io.c:911) ==4104== by 0x5FC8FA2: g_main_context_dispatch (gmain.c:2715) ==4104== by 0x5FC92EF: g_main_context_iterate.isra.25 (gmain.c:3290) ==4104== by 0x5FC96E9: g_main_loop_run (gmain.c:3484) ==4104== by 0x405E8C: rygel_main_run (rygel-main.vala:81) ==4104== by 0x40857C: rygel_main_main (rygel-main.vala:290) ==4104== by 0x4086AD: main (rygel-main.vala:263) Can you reproduce it with gst-launch-1.0 somehow?
I'll try.
Jens Georg, any news here?
Created attachment 243036 [details] New valgrind file
Can you get a valgrind log with debug symbols for gst-plugins-base (and ideally gstreamer, glib, libc, etc) installed? The latest debug log only gives the function names, not exactly where things go wrong
Weird. Have them installed
Created attachment 243155 [details] backtrace Backtrace with line numbers from within GDB
Yeah, dec->input_state is not set at that point yet.
Which means that the decoder got no caps yet. Either that shouldn't happen, or we should reconstruct the caps/input state in that function.
Jens, can you get a debug log with GST_DEBUG=6 for this?
No. Apparently this changes some timing so it never happens with GST_DEBUG=6. However, I noticed that I never mentioned that I'm actually meant playback via HTTP, sorry. I can reproduce it with gst-launch-1.0 playbin uri=http://download.blender.org/peach/trailer/trailer_400p.ogg
<slomo> phako: your logs are from rygel, not gst-launch. and the multiqueue-5 log looks interesting, there are flush-start/stop events for some reason <slomo> phako: any idea where they could come from? i think there they could drop the caps event from the queue ;) <slomo> phako: so for some reason there are flush events while stream-start,caps,etc events are in the queue but not forwarded yet. the queue is flushed, the events get lost, downstream only gets the segment event that followed the flush So what must be done here is to not flush sticky events from the queues that wouldn't be flushed from the pads by flush-stop either.
Created attachment 243685 [details] [review] 0001-queue-queue2-multiqueue-When-flushing-make-sure-to-n.patch Does this help?
It seems to help rygel, but gst-launch still crashes
+ Trace 231920
Thread 4 (Thread 0x7fa37cbaa700 (LWP 27830))
Inferior 1 [process 27825] will be detached. Quit anyway? (y or n) Detaching from program: /opt/rygel-stable/bin/gst-launch-1.0, process 27825
Also, it seems to take longer now to reproduce the bug, approx. 1 out of 20 tries.
Could it be that this patch makes GstDiscoverer really slow?
I wouldn't know why :) Could you try to get debug logs of discoverer, and also of the crash again?
With this patch it's not possible to get it crashing at any meaningful loglevel :( I'll do some more checks on the discoverer thing, might be that it's master and not just this patch alone.
Also see bug #700806 , which is yet another problem of flushes dropping sticky events that it shouldn't drop Any news on this Jens? Any new observations?
My patch should also make sure that whenever we flush the queues, any pending serialized queries should be woken up. See bug #700342 , which is probably related to this here.
No, nothing new. only that's not possible anymore to make it crash with any meaningful loglevel with your patch.
Should be good with latest git master, otherwise please reopen with commit b6fac17502957bb4e10e73926da87f289a1d94b1 Author: Sebastian Dröge <slomo@circular-chaos.org> Date: Mon May 27 13:01:43 2013 +0200 queue/queue2/multiqueue: When flushing, make sure to not lose any sticky events https://bugzilla.gnome.org/show_bug.cgi?id=688824