GNOME Bugzilla – Bug 796607
souphttpsrc: reference count error after flushing (race condition)
Last modified: 2018-06-20 16:06:16 UTC
This is an error I've encountered several times while running the gst-validate adaptive test suite. It only happens a small fraction of the times a test is run though. I had best luck reproducing it with the following Bash command on GStreamer master: $ true; while [[ $? -eq 0 ]]; do echo "new run"; GST_DEBUG=souphttpsrc:9,adaptivedemux:9,dashdemux:9,queue:9 G_DEBUG=fatal-criticals GST_GL_XINITTHREADS='1' DISPLAY=':0' GST_VALIDATE_SCENARIO='fast_forward' GST_VALIDATE_OVERRIDE='/home/ntrrgc/gst-validate/gst-integration-testsuites/medias/adaptivecontent/BBCRD_ondemand_testcard_1_client_manifest-ctv-events.override' GST_VALIDATE_SCENARIOS_PATH='/home/ntrrgc/Apps/gst-build/prefix/share/gstreamer-1.0/validate/scenarios:/home/ntrrgc/Apps/gst-build/subprojects/gst-devtools/validate/data/scenarios' /home/ntrrgc/Apps/gst-build/build/subprojects/gst-devtools/validate/tools/gst-validate-1.0 playbin uri=http://rdmedia.bbc.co.uk/dash/ondemand/testcard/1/client_manifest-ctv-events.mpd audio-sink=fakesink sync=true video-sink=fakesink sync=true qos=true max-lateness=20000000 --set-media-info /home/ntrrgc/gst-validate/gst-integration-testsuites/medias/adaptivecontent/BBCRD_ondemand_testcard_1_client_manifest-ctv-events.stream_info > ~/tmp/gst-log.txt 2>&1; done It usually takes around 10 tries to manifest in my case. For some reason I had no luck attempting to run the same test with `gst-validate-launcher -m --forever adaptive.dash.playback.fast_forward.BBCRD_ondemand_testcard_1_client_manifest-ctv-events` even after 150 tries, although I have seen it many times by chance when running the entire adaptive suite. Start buffering Buffering... 0% ^M0:00:05.303023920 35680 0x16a4010 TRACE adaptivedemux gstadaptivedemux.c:1683:gst_adaptive_demux_handle_seek_event: Locking from thread 0x16a4010 0:00:05.303073512 35680 0x16a4010 TRACE adaptivedemux gstadaptivedemux.c:1683:gst_adaptive_demux_handle_seek_event: Locked from thread 0x16a4010 0:00:05.303095751 35680 0x16a4010 LOG adaptivedemux gstadaptivedemux.c:2078:gst_adaptive_demux_stop_tasks:<dashdemux0> Stopping tasks 0:00:05.303127989 35680 0x16a4010 TRACE adaptivedemux gstadaptivedemux.c:2097:gst_adaptive_demux_stop_tasks: Unlocking from thread 0x16a4010 0:00:05.303145020 35680 0x16a4010 TRACE adaptivedemux gstadaptivedemux.c:2101:gst_adaptive_demux_stop_tasks: Locking from thread 0x16a4010 0:00:05.303142650 35680 0x7fcbf0541720 DEBUG adaptivedemux gstadaptivedemux.c:3210:gst_adaptive_demux_stream_download_uri:<dashdemux0:audio_00> Finished Waiting for fragment download: http://rdmedia.bbc.co.uk/dash/ondemand/testcard/1/audio/128kbps/000023.m4s 0:00:05.303164565 35680 0x16a4010 TRACE adaptivedemux gstadaptivedemux.c:2101:gst_adaptive_demux_stop_tasks: Locked from thread 0x16a4010 0:00:05.303185940 35680 0x7fcbf0541720 TRACE adaptivedemux gstadaptivedemux.c:3212:gst_adaptive_demux_stream_download_uri: Locking from thread 0x7fcbf0541720 0:00:05.303204570 35680 0x16a4010 TRACE adaptivedemux gstadaptivedemux.c:2118:gst_adaptive_demux_stop_tasks: Unlocking from thread 0x16a4010 0:00:05.303263193 35680 0x7fcbf0541720 TRACE adaptivedemux gstadaptivedemux.c:3212:gst_adaptive_demux_stream_download_uri: Locked from thread 0x7fcbf0541720 0:00:05.303286738 35680 0x7fcbf0541720 DEBUG adaptivedemux gstadaptivedemux.c:3438:gst_adaptive_demux_stream_download_fragment:<dashdemux0:audio_00> Fragment download result: -2 (200) flushing 0:00:05.303300415 35680 0x7fcbf0541720 DEBUG adaptivedemux gstadaptivedemux.c:3922:gst_adaptive_demux_stream_download_loop:<dashdemux0:audio_00> Stream has been cancelled 0:00:05.303301583 35680 0x16a4010 TRACE adaptivedemux gstadaptivedemux.c:2130:gst_adaptive_demux_stop_tasks: Locking from thread 0x16a4010 0:00:05.303319092 35680 0x7fcbf0541720 TRACE adaptivedemux gstadaptivedemux.c:3916:gst_adaptive_demux_stream_download_loop: Unlocking from thread 0x7fcbf0541720 0:00:05.303354321 35680 0x7fcbf0541720 LOG adaptivedemux gstadaptivedemux.c:3917:gst_adaptive_demux_stream_download_loop:<dashdemux0:audio_00> download loop end 0:00:05.303361554 35680 0x16a4010 TRACE adaptivedemux gstadaptivedemux.c:2130:gst_adaptive_demux_stop_tasks: Locked from thread 0x16a4010 0:00:05.303377419 35680 0x16a4010 TRACE adaptivedemux gstadaptivedemux.c:2118:gst_adaptive_demux_stop_tasks: Unlocking from thread 0x16a4010 0:00:05.303753736 35680 0x16a4010 DEBUG souphttpsrc gstsouphttpsrc.c:1929:gst_soup_http_src_unlock:<souphttpsrc0> unlock() 0:00:05.304608040 35680 0x7fcbf05f1990 ERROR validate gst-validate-reporter.c:195:gst_validate_report_valist: <playbin0> 2335 (critical) : g-log: We got a g_log critical issue : g_main_context_ref: assertion 'g_atomic_int_get (&context->ref_count) > 0' failed (gdb) bt
+ Trace 238643
Created attachment 372701 [details] Full stack trace
Someone will have to figure out a way to trace when maincontext refcount goes back to zero. Don't have any other idea on how to debug this.
Thanks for taking the time to report this. This particular bug has already been reported into our bug tracking system, but please feel free to report any further bugs you find. *** This bug has been marked as a duplicate of bug 785110 ***
Sebastian, your marked as duplicate of https://bugzilla.gnome.org/show_bug.cgi?id=785110 but in that bug, you state that this particular issue has gone, are you sure we should still treat it as a duplicate?
(In reply to Thibault Saunier from comment #4) > Sebastian, your marked as duplicate of > https://bugzilla.gnome.org/show_bug.cgi?id=785110 but in that bug, you state > that this particular issue has gone, are you sure we should still treat it > as a duplicate? One specific instance of the problems is gone, but there were still assertions it just did not crash for me anymore. This crash above looks like a very similar one from the original problem. The bug report is about multiple related issues all over the place in libsoup, one of them being solved at this point.
This particular issue actually seems to be https://bugzilla.gnome.org/show_bug.cgi?id=762138
(In reply to Edward Hervey from comment #6) > This particular issue actually seems to be > https://bugzilla.gnome.org/show_bug.cgi?id=762138 That's the part of https://bugzilla.gnome.org/show_bug.cgi?id=785110 that is fixed already, yes. It looks very similar but AFAIU Alicia has a version with that fix?
(In reply to Sebastian Dröge (slomo) from comment #7) > (In reply to Edward Hervey from comment #6) > > This particular issue actually seems to be > > https://bugzilla.gnome.org/show_bug.cgi?id=762138 > > That's the part of https://bugzilla.gnome.org/show_bug.cgi?id=785110 that is > fixed already, yes. It looks very similar but AFAIU Alicia has a version > with that fix? I was using system's libsoup (2.62.2) which according to https://bugzilla.gnome.org/show_bug.cgi?id=762138#c19 does not have the patch (unless it has been added downstream by Fedora). In any case, I built libsoup master and used LD_LIBRARY_PATH to repeat the testcase above with it but noticed no improvement. Same exact error after a number of retries.
(In reply to Alicia Boya García from comment #8) > (In reply to Sebastian Dröge (slomo) from comment #7) > I was using system's libsoup (2.62.2) which according to > https://bugzilla.gnome.org/show_bug.cgi?id=762138#c19 does not have the > patch (unless it has been added downstream by Fedora). Sure it does. "Created commit d28962ff in master (2.62.1+)" <== That's in 2.62.2 > > In any case, I built libsoup master and used LD_LIBRARY_PATH to repeat the > testcase above with it but noticed no improvement. Same exact error after a > number of retries. So we should reopen that bug ?
(In reply to Edward Hervey from comment #9) > (In reply to Alicia Boya García from comment #8) > > (In reply to Sebastian Dröge (slomo) from comment #7) > > I was using system's libsoup (2.62.2) which according to > > https://bugzilla.gnome.org/show_bug.cgi?id=762138#c19 does not have the > > patch (unless it has been added downstream by Fedora). > > Sure it does. "Created commit d28962ff in master (2.62.1+)" <== That's in > 2.62.2 Right, I misread the version at some point; it's as you point out.
Here's the only remaining mis-handling of references I found in the actual element. The remainder seems to really be an issue *within* libsoup :( https://bugzilla.gnome.org/show_bug.cgi?id=796639
Created attachment 372734 [details] adress sanitizer report This is an address-sanitizer (asan) backtrace of the actual failure. We do indeed hit a use-after-free issue :( validate.dash.playback.scrub_forward_seeking.dash_exMPD_BIP_TC1
And the actual fix : https://gitlab.gnome.org/GNOME/libsoup/merge_requests/7
(In reply to Edward Hervey from comment #13) > And the actual fix : https://gitlab.gnome.org/GNOME/libsoup/merge_requests/7 This seems to fix the bug shown here. After applying the patch in libsoup the bug has not appeared in 627 runs.