Bug 624338 – [pulsesink] Handle pulse context separate from the ringbuffers and share them

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 624338 - [pulsesink] Handle pulse context separate from the ringbuffers and share them


Summary:	[pulsesink] Handle pulse context separate from the ringbuffers and share them


Status:	RESOLVED FIXED

Product:	GStreamer
Classification:	Platform
Component:	gst-plugins-good
Version:	git master
Hardware:	Other Linux

Importance:	Normal normal
Target Milestone:	0.10.26
Assigned To:	GStreamer Maintainers
QA Contact:	GStreamer Maintainers

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2010-07-14 12:46 UTC by Philippe Normand
Modified:	2010-09-13 18:00 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
stack trace (24.16 KB, text/plain) 2010-07-14 12:46 UTC, Philippe Normand		Details
test-case (421 bytes, text/plain) 2010-07-14 12:48 UTC, Philippe Normand		Details
pulsesink: use G_TYPE_DEFINE (2.23 KB, patch) 2010-07-15 08:17 UTC, Philippe Normand	none	Details \| Review
pulsesink: use G_TYPE_DEFINE (2.59 KB, patch) 2010-07-15 13:31 UTC, Philippe Normand	committed	Details \| Review
pulsesink: share the PA context between all clients with the same name (22.52 KB, patch) 2010-08-16 07:27 UTC, Philippe Normand	none	Details \| Review
pulsesink: share the PA context between all clients with the same name (24.00 KB, patch) 2010-08-16 17:00 UTC, Philippe Normand	none	Details \| Review
pulsesink: share the PA context between all clients with the same name (22.67 KB, patch) 2010-08-16 17:12 UTC, Philippe Normand	none	Details \| Review
First backtrace when using the latest patch (14.68 KB, text/plain) 2010-08-16 21:48 UTC, Colin Guthrie		Details
Second backtrace when using the latest patch (12.10 KB, text/plain) 2010-08-16 21:49 UTC, Colin Guthrie		Details
Debug output before the patch (750.84 KB, text/plain) 2010-08-17 11:26 UTC, Colin Guthrie		Details
Debug output after the patch (15.96 KB, text/plain) 2010-08-17 11:27 UTC, Colin Guthrie		Details
pulsesink: clear the PA mainloop if baseaudiosink failed to open the ring_buffer (1.33 KB, patch) 2010-08-17 11:45 UTC, Philippe Normand	committed	Details \| Review
pulsesink: share the PA context between all clients with the same name (23.63 KB, patch) 2010-08-17 13:13 UTC, Philippe Normand	none	Details \| Review
pulsesink: share the PA context between all clients with the same name (23.35 KB, patch) 2010-08-17 15:38 UTC, Philippe Normand	committed	Details \| Review

Description Philippe Normand 2010-07-14 12:46:44 UTC

Created attachment 165873 [details]
stack trace

If I use 2 playbin2 in the same process and if PulseAudio is not installed, a double-free exception occurs. It doesn't happen 100% of the time. Try the attached test-case with a URI in parameter.

Comment 1 Philippe Normand 2010-07-14 12:48:30 UTC

Created attachment 165874 [details]
test-case

Comment 2 Philippe Normand 2010-07-14 13:19:04 UTC

Also reported on PA Trac: http://pulseaudio.org/ticket/841

Comment 3 Philippe Normand 2010-07-15 07:06:08 UTC

Sometimes I also get this one. Registration of already existing GstPulseSinkRingBuffer:


test2.py:18: Warning: cannot register existing type `GstPulseSinkRingBuffer'
  loop.run()

GLib-GObject-CRITICAL **: g_object_new: assertion `G_TYPE_IS_OBJECT (object_type)' failed
aborting...

Program received signal SIGABRT, Aborted.

+ Trace 222852

Thread 140737152808720 (LWP 20355)

#0 *__GI_raise
at ../nptl/sysdeps/unix/sysv/linux/raise.c line 64
#1 *__GI_abort
at abort.c line 92
#2 IA__g_logv
at gmessages.c line 549
#3 IA__g_log
at gmessages.c line 569
#4 IA__g_object_new
at gobject.c line 1088
#5 gst_pulsesink_create_ringbuffer
at pulsesink.c line 1720
#6 gst_base_audio_sink_create_ringbuffer
at gstbaseaudiosink.c line 1673
#7 gst_base_audio_sink_change_state
at gstbaseaudiosink.c line 1825
#8 gst_element_change_state
at gstelement.c line 2546
#9 gst_element_set_state_func
at gstelement.c line 2502
#10 gst_auto_audio_sink_find_best
at gstautoaudiosink.c line 289
#11 gst_auto_audio_sink_detect
at gstautoaudiosink.c line 343
#12 gst_auto_audio_sink_change_state
at gstautoaudiosink.c line 391
#13 gst_element_change_state
at gstelement.c line 2546
#14 gst_element_set_state_func
at gstelement.c line 2502
#15 try_element
at gstplaysink.c line 1019
#16 gen_audio_chain
at gstplaysink.c line 1580
#17 gst_play_sink_reconfigure
at gstplaysink.c line 2232
#18 no_more_pads_cb
at gstplaybin2.c line 2828
#19 IA__g_closure_invoke
at gclosure.c line 767
#20 signal_emit_unlocked_R
at gsignal.c line 3248
#21 IA__g_signal_emit_valist
at gsignal.c line 2981
#22 IA__g_signal_emit
at gsignal.c line 3038
#23 IA__g_closure_invoke
at gclosure.c line 767
#24 signal_emit_unlocked_R
at gsignal.c line 3248
#25 IA__g_signal_emit_valist
at gsignal.c line 2981
#26 IA__g_signal_emit
at gsignal.c line 3038
#27 gst_decode_bin_expose
at gstdecodebin2.c line 3148
#28 source_pad_blocked_cb
at gstdecodebin2.c line 3267
#29 handle_pad_block
at gstpad.c line 3986
#30 gst_pad_push_event
at gstpad.c line 4931
#31 gst_ffmpegdec_sink_event
at gstffmpegdec.c line 2460
#32 gst_pad_send_event
at gstpad.c line 5098
#33 gst_pad_push_event
at gstpad.c line 4954
#34 gst_single_queue_push_one
at gstmultiqueue.c line 942
#35 gst_multi_queue_loop
at gstmultiqueue.c line 1101
#36 gst_task_func
at gsttask.c line 271
#37 g_thread_pool_thread_proxy
at gthreadpool.c line 315
#38 g_thread_create_proxy
at gthread.c line 1893
#39 start_thread
at pthread_create.c line 300
#40 clone
at ../sysdeps/unix/sysv/linux/x86_64/clone.S line 112
#41 ??

Comment 4 Sebastian Dröge (slomo) 2010-07-15 07:51:20 UTC

Yes, type registration of the ring buffer is not threadsafe. Patch coming soon

Comment 5 Philippe Normand 2010-07-15 08:17:33 UTC

Created attachment 165945 [details] [review]
pulsesink: use G_TYPE_DEFINE

Comment 6 Tim-Philipp Müller 2010-07-15 08:26:10 UTC

Out of curiosity: why doesn't the g_type_class_ref (GST_TYPE_PULSERING_BUFFER) in gst_pulseringbuffer_class_init() take care of this already?

Comment 7 Philippe Normand 2010-07-15 13:04:58 UTC

(In reply to comment #6)
> Out of curiosity: why doesn't the g_type_class_ref (GST_TYPE_PULSERING_BUFFER)
> in gst_pulseringbuffer_class_init() take care of this already?

In a normal case get_type() is first called and then class_init() triggers a new call to get_type() via that g_type_class_ref (GST_TYPE_PULSERING_BUFFER), right?

I think the race is occurring while the 2 threads are calling the first get_type() so the g_type_class_ref() happens too late.

I'm quite new to this, I'm probably all wrong anyway ;)

Comment 8 Tim-Philipp Müller 2010-07-15 13:17:40 UTC

D'oh. That class_ref() should've been in gst_pulsesink_class_init() of course, not gst_pulseringbuffer_class_init() :)

Comment 9 Philippe Normand 2010-07-15 13:31:26 UTC

Created attachment 165956 [details] [review]
pulsesink: use G_TYPE_DEFINE

The previous get_type() implementation was racy. Also removed the
g_type_class_ref() hack which was misplaced anyway ;)

See bug 624338

Comment 10 Colin Guthrie 2010-08-04 14:52:19 UTC

Hi there,

Does the above patch fix this bug completely or just the "Registration of already existing GstPulseSinkRingBuffer:" bit of it?

Comment 11 Tim-Philipp Müller 2010-08-04 15:41:32 UTC

Committed this, but changed the commit message:

 commit 864a52d8aa938fbc308a5bf027cbc4bd8a23d45e
 Author: Philippe Normand <pnormand@igalia.com>
 Date:   Thu Jul 15 10:10:31 2010 +0200

    pulsesink: use G_TYPE_DEFINE to define ring buffer type
    
    The existing get_type() implementation is racy, and the
    g_type_class_ref() workaround didn't actually work because
    it was in the wrong function. Since class creation in GObject
    is thread-safe these days (since 2.16), the class_ref workaround
    is no longer needed and it is sufficient to ensure the _get_type()
    function is thread-safe, which G_TYPE_DEFINE does.
    
    https://bugzilla.gnome.org/show_bug.cgi?id=624338


The only reason this is actually ok as-is is that now class creation in gobject is thread-safe, which wasn't the case when the g_type_class_ref() was added (IIRC) ;-)

Setting NEEDINFO for answer to comment #10, although the stack trace looks like there is something else going on.

Comment 12 Sebastian Dröge (slomo) 2010-08-06 17:20:27 UTC

Philipe said on IRC that this doesn't fix the real problem

Comment 13 Philippe Normand 2010-08-12 07:03:28 UTC

I will check again today if I can reproduce this bug but IIRC the committed patch was only a first step ;)

Comment 14 Philippe Normand 2010-08-16 07:27:57 UTC

Created attachment 167933 [details] [review]
pulsesink: share the PA context between all clients with the same name

Avoid to create a new PA context for each new client by using a hash
table containing the list of ring-buffers and the shared PA context
for each client. Doing this will improve application memory usage in
the cases where multiple pipelines involving multiple pulsesink
elements are used.

Fixes bug #624338.

Comment 15 Sebastian Dröge (slomo) 2010-08-16 08:55:17 UTC

Looks good in general but you should protected the hash table with a mutex... everywhere.

Then it might make sense to add a property to set the pulse client name. E.g. in webkit you might want to have all pulsesinks of the same tab in the same pulse client but not all pulsesinks of all tabs.

Also, gst_pulse_client_name() should maybe be improved and return something like "GStreamer ($PID)" when the other two ways to get a name fail.

Comment 16 Philippe Normand 2010-08-16 09:00:11 UTC

(In reply to comment #15)
> Looks good in general but you should protected the hash table with a mutex...
> everywhere.
> 

Right, I did it only at one place, sorry ;)

> Then it might make sense to add a property to set the pulse client name. E.g.
> in webkit you might want to have all pulsesinks of the same tab in the same
> pulse client but not all pulsesinks of all tabs.
> 

Good idea indeed, could that go in a separate bug/patch? I'd be fine with doing it here but the current patch is already big I think.

> Also, gst_pulse_client_name() should maybe be improved and return something
> like "GStreamer ($PID)" when the other two ways to get a name fail.

Ok, will do that in next iteration of the patch.

Comment 17 Colin Guthrie 2010-08-16 09:02:27 UTC

Awesome, thanks for your work here guys. I've already tested the first version and I'm quietly hoping it'll help in some KDE Phonon related bugs I have that seem to relate to similar cases... perhaps they wont, but I live in hope and it seemed to help in my test setup. I look forward to the next iteration of the patch :)

Comment 18 Sebastian Dröge (slomo) 2010-08-16 10:23:42 UTC

(In reply to comment #16)

> > Then it might make sense to add a property to set the pulse client name. E.g.
> > in webkit you might want to have all pulsesinks of the same tab in the same
> > pulse client but not all pulsesinks of all tabs.
> > 
> 
> Good idea indeed, could that go in a separate bug/patch? I'd be fine with doing
> it here but the current patch is already big I think.

No, do it as a separate patch. The patch, after fixing the thread safety issue, is already big enough as you said. I always prefer many small patches over one large.

> > Also, gst_pulse_client_name() should maybe be improved and return something
> > like "GStreamer ($PID)" when the other two ways to get a name fail.
> 
> Ok, will do that in next iteration of the patch.

But do it as a separate patch too please. It's related to this change but not really the same

Comment 19 Philippe Normand 2010-08-16 17:00:41 UTC

Created attachment 167977 [details] [review]
pulsesink: share the PA context between all clients with the same name

Avoid to create a new PA context for each new client by using a hash
table containing the list of ring-buffers and the shared PA context
for each client. Doing this will improve application memory usage in
the cases where multiple pipelines involving multiple pulsesink
elements are used.

Fixes bug #624338.

Comment 20 Philippe Normand 2010-08-16 17:02:17 UTC

Hopefully the thread-safety issues are fixed now, please test :)

Comment 21 Philippe Normand 2010-08-16 17:12:17 UTC

Created attachment 167980 [details] [review]
pulsesink: share the PA context between all clients with the same name

Avoid to create a new PA context for each new client by using a hash
table containing the list of ring-buffers and the shared PA context
for each client. Doing this will improve application memory usage in
the cases where multiple pipelines involving multiple pulsesink
elements are used.

Fixes bug #624338.

Added a gst_pulsering_get_context() function to avoid copy/pastes.

Comment 22 Colin Guthrie 2010-08-16 17:46:14 UTC

Hmm, I'm not sure it's related, but if I apply this patch, then I can trigger the assertion in the gst_pulsesink_change_state() function:

  switch (transition) {
    case GST_STATE_CHANGE_NULL_TO_READY:
      g_assert (pulsesink->mainloop == NULL);


By sharing contexts, are mainloops somehow shared too? (or are they meant to be)?

If you are on IRC, I'm coling, I'll try and be around this evening if you want to debug (or if this is totally unrelated and just triggered as a coincidence, you can tell me to stop bothering you :D)

Comment 23 Philippe Normand 2010-08-16 18:08:12 UTC

Can you please post a test-case triggering this? If not it'd be nice to at least have a stack-trace.

I didn't change anything related to the mainloop, only the pa_context should be shared. So I'd be happy to debug this potential regression, tomorrow though ;)

Comment 24 Colin Guthrie 2010-08-16 18:43:48 UTC

I'll post a proper stack trace in a bit - I'm using a slightly patched up version of 0.10.24 (your two patches on top) so I'll do it from git master to be sure of a clean line number reference etc.

The test case is trivial but the loops it jumps through are not! I'm technically using the command:
 kdialog --error "Foo" & kdialog --error "Bar"

to trigger the error. This goes via dbus to knotifyd (a daemon), which then uses KDE's Phonon library which then uses a gstreamer backend which then uses pulsesink....

The bug could easily be in phonon-gstreamer code, but only exposed by this change, but not knowing gstreamer overly well I'm not really qualified to comment on that possibility. On potential issue that does crop up is the fact that I also have another PA context running in the knotifyd daemon. This is my "control" connection to PA that I establish to do all the housekeeping work that is needed to get nice PA integration with Phonon. Obviously your patch is all about reducing the number of contexts, but my control connection will have it's own one which could mess things up a bit.

Anyway, like I say I'll get a full stack-trace soon.

Thanks again :)

Comment 25 Colin Guthrie 2010-08-16 21:48:36 UTC

Created attachment 168013 [details]
First backtrace when using the latest patch

Here is some debug output and a backtrace showing the problem reported. As I said previously, the error may be in phonon-gstreamer code and only working currently due to the alignment of the planets being just right, or it could be indicative of a real problem. The test case (as previously reported) is easy to trigger, but the whole stack is a bit complex if you're not running it all already!

Comment 26 Colin Guthrie 2010-08-16 21:49:22 UTC

Created attachment 168014 [details]
Second backtrace when using the latest patch

Here is a second backtrace. Different set of circumstances leading up to it, but same crash point.

Comment 27 Philippe Normand 2010-08-17 07:51:32 UTC

In http://gitorious.org/phonon/phonon/blobs/master/gstreamer/devicemanager.cpp#line144

the first set_state call can fail (but the pulsesink mainloop is created and not freed) and the second set_state call triggers the assertion error, I think ;) Can you confirm this?

Comment 28 Colin Guthrie 2010-08-17 08:38:48 UTC

That was certainly my guess at what was going on, but I hadn't quite managed to work out the place where the mainloop needed to be free'd in such a scenario.

I'll try and do some tests to see why the patch makes a difference. Perhaps the old code is such that the first state transition always succeeds and the crash trigger would have always been there, but now, the first transition can fail for some reason and that then means the code that causes the crash is run?

As you didn't touch the mainloop code, I'm guessing that this is actually where the issue lies.... but I could be totally wrong :D

Comment 29 Colin Guthrie 2010-08-17 09:33:55 UTC

Just to confirm this behaviour, I put a filthy great bit assert() right after that initial call, and prior to your patch, it was never hit with my testcase. After applying your patch, it is hit.

So my suspicions above are correct, in that your patch in some way makes this state change fail (although I strongly suspect the same actual gst assert could be hit by the phonon code with the current gst-good version in the event that the first state transition would fail - I guess that just hasn't happened before!)

So not sure what to do next. Any ideas?

Comment 30 Philippe Normand 2010-08-17 11:14:28 UTC

Can you please provide GST_DEBUG=pulse:5 debug output of both runs?

Comment 31 Colin Guthrie 2010-08-17 11:26:45 UTC

Created attachment 168051 [details]
Debug output before the patch

Comment 32 Colin Guthrie 2010-08-17 11:27:17 UTC

Created attachment 168052 [details]
Debug output after the patch

Comment 33 Philippe Normand 2010-08-17 11:45:42 UTC

Created attachment 168055 [details] [review]
pulsesink: clear the PA mainloop if baseaudiosink failed to open the ring_buffer

If the application requests a state-change and pulsesink fails to open
the ring_buffer device the mainloop attribute of the sink should be
cleaned up to avoid future state-change (NULL->READY) failures.

Colin, can you test this patch please? It should probably go in a
separate bugreport but I first want to confirm it fixes your issue.

Comment 34 Philippe Normand 2010-08-17 11:57:51 UTC

I wonder why the reused pa_context fails, will investigate some more. I think that with the above patch only one sound will be played.

Comment 35 Philippe Normand 2010-08-17 12:08:02 UTC

Found another issue with the hash-table patch. The server name is not taken in account so in the same process I can have 2 pulsesinks that will share the same pa_context and try to connect to different PA servers. This is probably not a good idea ;)

Will work some more on the patch.

Comment 36 Colin Guthrie 2010-08-17 12:58:02 UTC

Ahh yes, different servers would likely break it. Not a common case, but still should be supported I guess!

As for your other patch, yes, this seems to resolve the issue for me. I get both sounds (or at least I think I do, I guess the first could be stopped by the second - kinda hard to tell from just listening as my test trigger case is just relatively short "bing-bongs") but either way I'm happy :D



While testing I did manage to trigger a second bug that is unrelated to this, but which has been causing me major headaches for several months. FYI the other bug is this one https://bugs.kde.org/show_bug.cgi?id=232068 and the crash happens in alsasink, not pulsesink: for some reason PA is not detected as running (something that in this case is done outside of gst for other reasons) but the use of the alsasink with the alsa-pulse plugin ultimately cases a crash... not sure why. Anyway, that is very much off-topic for this bug, but after this is out of the way I'd really appreciate any insights you may have to the issue, but I'll leave that for another day! :)

Many thanks again :)

Comment 37 Philippe Normand 2010-08-17 13:13:10 UTC

Created attachment 168068 [details] [review]
pulsesink: share the PA context between all clients with the same name

Avoid to create a new PA context for each new client by using a hash
table containing the list of ring-buffers and the shared PA context
for each client. Doing this will improve application memory usage in
the cases where multiple pipelines involving multiple pulsesink
elements are used.

Fixes bug #624338.

The hash table keys are now context_name@server_name. So this should
ensure that we can have shared multiple contexts connected to multiple
servers (potentially).

Comment 38 Philippe Normand 2010-08-17 15:38:29 UTC

Created attachment 168102 [details] [review]
pulsesink: share the PA context between all clients with the same name

Avoid to create a new PA context for each new client by using a hash
table containing the list of ring-buffers and the shared PA context
for each client. Doing this will improve application memory usage in
the cases where multiple pipelines involving multiple pulsesink
elements are used.

Fixes bug #624338.

Simpler patch, also fixes invalid memory access when destroy the context.

Comment 39 Philippe Normand 2010-08-17 16:05:43 UTC

(In reply to comment #15)
> 
> Then it might make sense to add a property to set the pulse client name. E.g.
> in webkit you might want to have all pulsesinks of the same tab in the same
> pulse client but not all pulsesinks of all tabs.
> 

See bug 627174

> Also, gst_pulse_client_name() should maybe be improved and return something
> like "GStreamer ($PID)" when the other two ways to get a name fail.

See bug 627162

Comment 40 Sebastian Dröge (slomo) 2010-08-19 10:34:55 UTC

Comment on attachment 168102 [details] [review]
pulsesink: share the PA context between all clients with the same name

There's a small memory leak here, you have to free the GstPulseContext.

I've fixed this locally and will push your changes after the release

Comment 41 Sebastian Dröge (slomo) 2010-09-04 13:14:51 UTC

commit 69a397c32f4baf07a7b2937c610f9e8f383e9ae9
Author: Philippe Normand <pnormand@igalia.com>
Date:   Mon Aug 16 09:12:04 2010 +0200

    pulsesink: share the PA context between all clients with the same name
    
    Avoid to create a new PA context for each new client by using a hash
    table containing the list of ring-buffers and the shared PA context
    for each client. Doing this will improve application memory usage in
    the cases where multiple pipelines involving multiple pulsesink
    elements are used.
    
    Fixes bug #624338.

Comment 42 Stefan Sauer (gstreamer, gtkdoc dev) 2010-09-13 14:05:31 UTC

(In reply to comment #38)
> Created an attachment (id=168102) [details] [review]
> pulsesink: share the PA context between all clients with the same name
> 
> Avoid to create a new PA context for each new client by using a hash
> table containing the list of ring-buffers and the shared PA context
> for each client. Doing this will improve application memory usage in
> the cases where multiple pipelines involving multiple pulsesink
> elements are used.
> 
> Fixes bug #624338.
> 
> Simpler patch, also fixes invalid memory access when destroy the context.

Have you done any measurements on the memory saving? I am a bit lost what problem the patch actually tried to solve. I have reworked the patch under bug #628996 - please check that the original goals are still covered.

Comment 43 Philippe Normand 2010-09-13 18:00:30 UTC

I checked that on PA-side the daemon was reusing the same context and thus not allowing extra shared memory.

I will test the latest changes you commited, sorry again for the trouble.