After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 761586 - systemclock: regression causing audiomixer to stall
systemclock: regression causing audiomixer to stall
Status: RESOLVED FIXED
Product: GStreamer
Classification: Platform
Component: gstreamer (core)
git master
Other All
: Normal blocker
: 1.7.2
Assigned To: GStreamer Maintainers
GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2016-02-05 12:36 UTC by Thiago Sousa Santos
Modified: 2018-03-27 08:20 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
systemclock: handle unschedule of late entries (3.91 KB, patch)
2016-02-05 18:38 UTC, Thiago Sousa Santos
none Details | Review
systemclock: handle unschedule of late entries (3.57 KB, patch)
2016-02-05 19:00 UTC, Thiago Sousa Santos
committed Details | Review
The previously proposed patch improves the situation, but is not complete. (5.67 KB, patch)
2016-02-10 11:22 UTC, Stian Selnes (stianse)
committed Details | Review
gst_system_clock_id_wait_jitter not returning (2.29 MB, text/plain)
2018-03-27 08:20 UTC, Bhavya
  Details

Description Thiago Sousa Santos 2016-02-05 12:36:49 UTC
commit f0e94b4cdf70955dc2b002148b1636f0dc18f459 for https://bugzilla.gnome.org/show_bug.cgi?id=760757 caused a regression and is making audiomixer tests timeout.

It seems like some clock entry ends up waiting forever even after trying to be unscheduled.
Comment 1 Thiago Sousa Santos 2016-02-05 13:58:12 UTC
What seems to be the relevant and last log lines for the clock:

0:00:00.383852455 16039      0x14ae450 DEBUG              GST_CLOCK gstclock.c:1056:gst_clock_get_internal_time:<GstAudioSrcClock> internal time 0:00:00.253696144
0:00:00.383865314 16039      0x14ae450 DEBUG              GST_CLOCK gstclock.c:1101:gst_clock_get_time:<GstAudioSrcClock> adjusted time 0:00:00.253696144
0:00:00.383878472 16039      0x14ae450 DEBUG              GST_CLOCK gstclock.c:253:gst_clock_entry_new:<GstAudioSrcClock> created entry 0x7f8e100042b0, time 0:00:00.273061224
0:00:00.383895595 16039      0x14ae450 DEBUG              GST_CLOCK gstclock.c:542:gst_clock_id_wait:<GstAudioSrcClock> waiting on clock entry 0x7f8e100042b0
0:00:00.383912717 16039      0x14ae450 DEBUG             audioclock gstaudioclock.c:172:gst_audio_clock_get_internal_time:<GstAudioSrcClock> result 0:00:00.253786847, last_time 0:00:00.253786847
0:00:00.383929110 16039      0x14ae450 DEBUG              GST_CLOCK gstclock.c:1056:gst_clock_get_internal_time:<GstAudioSrcClock> internal time 0:00:00.253786847
0:00:00.383942963 16039      0x14ae450 DEBUG              GST_CLOCK gstclock.c:1101:gst_clock_get_time:<GstAudioSrcClock> adjusted time 0:00:00.253786847
0:00:00.383955129 16039      0x14ae450 DEBUG              GST_CLOCK gstsystemclock.c:707:gst_system_clock_id_wait_jitter_unlocked: entry 0x7f8e100042b0 time 0:00:00.273061224 now 0:00:00.253786847 diff (time-now) 19274377
0:00:00.383980898 16039      0x14ae450 DEBUG              GST_CLOCK gstsystemclock.c:735:gst_system_clock_id_wait_jitter_unlocked: entry 0x7f8e100042b0 unlocked, status 3, ret 1
0:00:00.646847957 16039      0x14ae320 DEBUG             audioclock gstaudioclock.c:172:gst_audio_clock_get_internal_time:<GstAudioSrcClock> result 0:00:00.507188208, last_time 0:00:00.507188208
0:00:00.646899899 16039      0x14ae320 DEBUG              GST_CLOCK gstclock.c:1056:gst_clock_get_internal_time:<GstAudioSrcClock> internal time 0:00:00.507188208
0:00:00.646916823 16039      0x14ae320 DEBUG              GST_CLOCK gstclock.c:1101:gst_clock_get_time:<GstAudioSrcClock> adjusted time 0:00:00.507188208
0:00:00.646980370 16039      0x14ae320 DEBUG              GST_CLOCK gstsystemclock.c:1000:gst_system_clock_id_unschedule: unscheduling entry 0x7f8e100042b0
0:00:00.887998162 16039      0x14ae320 DEBUG              GST_CLOCK gstsystemclock.c:1000:gst_system_clock_id_unschedule: unscheduling entry 0x7f8e100042b0
Comment 2 Tim-Philipp Müller 2016-02-05 14:04:43 UTC
I also notice the gst/gstsystemclock test in core timing out often these days, in the new test.
Comment 3 Thiago Sousa Santos 2016-02-05 18:07:29 UTC
This seems to be the root cause for the issue later, the wakeup_count ends up being messed:

0:00:01.040926341 26812      0x1cc0850 DEBUG              GST_CLOCK gstclock.c:253:gst_clock_entry_new:<GstAudioSrcClock> created entry 0x7f86c0002e70, time 0:00:00.779297051
0:00:01.040943389 26812      0x1cc0850 DEBUG              GST_CLOCK gstclock.c:542:gst_clock_id_wait:<GstAudioSrcClock> waiting on clock entry 0x7f86c0002e70
0:00:01.040957159 26812      0x1cc0850 DEBUG             audioclock gstaudioclock.c:172:gst_audio_clock_get_internal_time:<GstAudioSrcClock> result 0:00:00.780136054, last_time 0:00:00.780136054
0:00:01.040954297 26812      0x1cc0720 DEBUG              GST_CLOCK gstsystemclock.c:1019:gst_system_clock_id_unschedule: unscheduling entry 0x7f86c0002e70
0:00:01.040971786 26812      0x1cc0850 DEBUG              GST_CLOCK gstclock.c:1056:gst_clock_get_internal_time:<GstAudioSrcClock> internal time 0:00:00.780136054
0:00:01.040997865 26812      0x1cc0720 DEBUG              GST_CLOCK gstsystemclock.c:1028:gst_system_clock_id_unschedule: unscheduled entry 0x7f86c0002e70 2
0:00:01.041014502 26812      0x1cc0850 DEBUG              GST_CLOCK gstclock.c:1101:gst_clock_get_time:<GstAudioSrcClock> adjusted time 0:00:00.780136054
0:00:01.041025315 26812      0x1cc0720 DEBUG              GST_CLOCK gstsystemclock.c:1035:gst_system_clock_id_unschedule: entry was BUSY, doing wakeup
0:00:01.041030800 26812      0x1cc0850 DEBUG              GST_CLOCK gstsystemclock.c:710:gst_system_clock_id_wait_jitter_unlocked: entry 0x7f86c0002e70 time 0:00:00.779297051 now 0:00:00.780136054 diff (time-now
0:00:01.041042960 26812      0x1cc0720 DEBUG              GST_CLOCK gstsystemclock.c:394:gst_system_clock_add_wakeup: writing control
0:00:01.041060645 26812      0x1cc0850 DEBUG              GST_CLOCK gstclock.c:552:gst_clock_id_wait:<GstAudioSrcClock> done waiting entry 0x7f86c0002e70, res: 2 (unscheduled)
0:00:01.041079436 26812      0x1cc0720 DEBUG              GST_CLOCK gstsystemclock.c:410:gst_system_clock_add_wakeup: wakeup count 1
0:00:01.041136890 26812      0x1cc0850 DEBUG              GST_CLOCK gstclock.c:356:_gst_clock_id_free: freed entry 0x7f86c0002e70

T1 and T2 are the 2 threads

1) T1: The entry is created and a wait for it is done.
2) T2: Tries to unschedule the entry
3) T2: unschedule receives a BUSY return
4) T1: wait finishes and it is too late
5) T2: writes at the poll and increases wakeup count
6) T1: entry is done. and freed.

The wakeup count is never decremented again and subsequent uses will stall
Comment 4 Thiago Sousa Santos 2016-02-05 18:38:21 UTC
Created attachment 320511 [details] [review]
systemclock: handle unschedule of late entries

If the clockentry is too late and is unscheduled before it gets
a change to detect its lateness the wakeup count and the poll are
used but never properly cleaned up. This leaves it in a dirty state
that is going to mess with the next clock entry waiting requests.
Comment 5 Tim-Philipp Müller 2016-02-05 18:50:47 UTC
Comment on attachment 320511 [details] [review]
systemclock: handle unschedule of late entries

Nice work tracking that down.

Mostly commenting on the code style of the refactoring, not the fix itself.

I find the way you have factored out gst_system_clock_check_and_cleanup_unscheduled() a bit confusing, because the 'check' part (status argument and boolean return value) is only used in one case, in the other two cases you ignore the return value and check status outside of the function.

I think in the first chunk you should keep the status checking there and just turn the function into a cleanup function without passing status or returning a return value, unless I missed something.
Comment 6 Thiago Sousa Santos 2016-02-05 19:00:17 UTC
Created attachment 320513 [details] [review]
systemclock: handle unschedule of late entries

Indeed, the many refactorings I did after trying different approaches
led it to that state. Thanks for spotting it, this version should be
cleaner
Comment 7 Tim-Philipp Müller 2016-02-05 19:21:29 UTC
Patch makes the audiomixer test pass reliably for me, systemclock test still passes reliably.
Comment 8 Stian Selnes (stianse) 2016-02-10 11:22:01 UTC
Created attachment 320783 [details] [review]
The previously proposed patch improves the situation, but is not complete.

This patch fixes a possible race on top of that patch and adds a couple of tests
that triggers the issues fixed by both the previous and this patch. They should
probably be squashed before applied?

---

systemclock: Fix wait/unschedule race

Fixes a race where an entry is set to BUSY in
gst_system_clock_id_wait_jitter() and is UNSCHEDULED before
gst_system_clock_id_wait_jitter_unlocked() starts processing it. The
wakeup added by gst_system_clock_id_unschedule() must be cleaned up.

Two stress tests are added. One test that triggers the specific issue
described above. The second stresses the code path where a wait is
rescheduled because the poll returned early.
Comment 9 Thiago Sousa Santos 2016-02-10 13:02:37 UTC
Thanks for spotting the missing piece of the patch and, specially, for the tests!

commit 2b0ff13d070f370286078c5531b2e13006e5ce98
Author: Stian Selnes <stian@pexip.com>
Date:   Wed Feb 10 09:09:29 2016 +0100

    systemclock: Fix wait/unschedule race
    
    Fixes a race where an entry is set to BUSY in
    gst_system_clock_id_wait_jitter() and is UNSCHEDULED before
    gst_system_clock_id_wait_jitter_unlocked() starts processing it. The
    wakeup added by gst_system_clock_id_unschedule() must be cleaned up.
    
    Two stress tests are added. One test that triggers the specific issue
    described above. The second stresses the code path where a wait is
    rescheduled because the poll returned early.
    
    https://bugzilla.gnome.org/show_bug.cgi?id=761586

commit 77a98d98cbed15e0bfdd8fe982b7ac83b1826d13
Author: Thiago Santos <thiagoss@osg.samsung.com>
Date:   Fri Feb 5 15:34:47 2016 -0300

    systemclock: handle unschedule of late entries
    
    If the clockentry is too late and is unscheduled before it gets
    a change to detect its lateness the wakeup count and the poll are
    used but never properly cleaned up. This leaves it in a dirty state
    that is going to mess with the next clock entry waiting requests.
    
    https://bugzilla.gnome.org/show_bug.cgi?id=761586
Comment 10 Bhavya 2018-03-27 08:14:33 UTC
@ Thiago Sousa Santos 
Hello,

Currently we are using gstreamer version 1.6.3 on imx6x platform. We came across same issue. Applied above two patches on 1.6.3. But still issue is reproducing.
Please check logs attached.
gst_clock_id_wait() will never return.
One thread is waiting on the clock and another thread is disposing the clock.


Please suggest if there is any wrong handling done from application side.

Below is the issue logs section :

0:00:11.336737668 [336m  755[00m 0x72e022f0 [37mDEBUG  [00m [00m            basesink gstbasesink.c:2539:gst_base_sink_do_sync:<a_fakesink>[00m possibly waiting for clock to reach 0:00:10.727573696, adjusted 0:00:10.727573696
0:00:11.336880335 [336m  755[00m 0x72e022f0 [37mDEBUG  [00m [00;01;33m           GST_CLOCK gstclock.c:542:gst_clock_id_wait:<GstAudioSinkClock>[00m waiting on clock entry 0x72505ad8
0:00:11.336975668 [336m  755[00m 0x72e022f0 [37mDEBUG  [00m [00;01;33m           GST_CLOCK gstsystemclock.c:857:gst_system_clock_id_wait_jitter:[00m gst_system_clock_id_wait_jitter
0:00:11.337070335 [336m  755[00m 0x72e022f0 [37mDEBUG  [00m [00;01;33m           GST_CLOCK gstclock.c:1019:gst_clock_get_internal_time:<GstAudioSinkClock>[00m internal time 0:00:10.720000000
0:00:11.337216335 [336m  755[00m 0x72e022f0 [37mDEBUG  [00m [00;01;33m           GST_CLOCK gstclock.c:1064:gst_clock_get_time:<GstAudioSinkClock>[00m adjusted time 0:00:10.720000000
0:00:11.337325001 [336m  755[00m 0x72e022f0 [37mDEBUG  [00m [00;01;33m           GST_CLOCK gstsystemclock.c:703:gst_system_clock_id_wait_jitter_unlocked:[00m entry heere asd0x72505ad8 time 0:00:10.727573696 now 0:00:10.720000000 diff (time-now) 7573696

0:00:11.344644001 [336m  755[00m 0x724fdc60 [37mDEBUG  [00m [00;01;33m           GST_CLOCK gstsystemclock.c:383:gst_system_clock_add_wakeup:[00m writing control

0:00:11.344801668 [336m  755[00m 0x72e022f0 [37mDEBUG  [00m [00;01;33m           GST_CLOCK gstsystemclock.c:731:gst_system_clock_id_wait_jitter_unlocked:[00m entry 0x72505ad8 unlocked, status 3, ret 1

0:00:11.344959668 [336m  755[00m 0x724fdc60 [37mDEBUG  [00m [00;01;33m           GST_CLOCK gstsystemclock.c:399:gst_system_clock_add_wakeup:[00m wakeup count 1
0:00:11.345121335 [336m  755[00m 0x724fdc60 [37mDEBUG  [00m [00;01;33m           GST_CLOCK gstsystemclock.c:231:gst_system_clock_dispose:[00m joined thread
Comment 11 Bhavya 2018-03-27 08:20:03 UTC
Created attachment 370175 [details]
gst_system_clock_id_wait_jitter not returning

On top of gstreamer 1.6.3, applied changes from commit 77a98d98cbed15e0bfdd8fe982b7ac83b1826d13 and  2b0ff13d070f370286078c5531b2e13006e5ce98