After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 778515 - Crash in the gio kqueue backend
Crash in the gio kqueue backend
Status: RESOLVED FIXED
Product: glib
Classification: Platform
Component: gio
2.50.x
Other OpenBSD
: Normal normal
: ---
Assigned To: gtkdev
gtkdev
Depends on:
Blocks:
 
 
Reported: 2017-02-12 00:23 UTC by Steven McDonald
Modified: 2017-04-29 02:54 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
gio: Always purge kqueue subs from missing list (1.75 KB, patch)
2017-02-12 00:23 UTC, Steven McDonald
committed Details | Review
Backtrace after crash (1.32 KB, text/plain)
2017-02-12 00:24 UTC, Steven McDonald
  Details

Description Steven McDonald 2017-02-12 00:23:16 UTC
Created attachment 345549 [details] [review]
gio: Always purge kqueue subs from missing list

Hi there,

I have encountered an odd bug in the gio kqueue backend. It is consistently reproducible on one of my machines running OpenBSD -current (an X1 Carbon 4th gen), but none of the others, and I do not know what about this system triggers it.

Steps to reproduce:

 1. Open HexChat (I am using 2.12.3 with glib 2.50.2 + OpenBSD patches)
 2. Click on any link to open it in a web browser (pressing F1 to open the help page is sufficient)
 3. HexChat should immediately crash with SIGABRT

The problem appears to be a logic error in gio/kqueue/kqueue-helper.c. _kh_cancel_sub assumes that it only needs to call _km_remove if sub did not exist in subs_hash_table. This is erroneous because the complementary operation, _km_add_missing, can be called from process_kqueue_notifications, in which context sub can *only* have come from subs_hash_table.

Since _km_remove is implemented using g_slist_remove, which is documented to be a noop if the list does not contain the element to be removed, it is safe to call _km_remove unconditionally.

I will be attaching a backtrace of the crash and a patch implementing the proposed solution. Note that this is my first time code diving in glib, so my patch may be way off the mark.

Let me know if you need more information,
Steven
Comment 1 Steven McDonald 2017-02-12 00:24:19 UTC
Created attachment 345550 [details]
Backtrace after crash
Comment 2 rozhuk.im 2017-03-08 22:51:26 UTC
Did you try: https://bugzilla.gnome.org/show_bug.cgi?id=779777
Comment 3 Philip Withnall 2017-03-09 14:25:01 UTC
(In reply to rozhuk.im from comment #2)
> Did you try: https://bugzilla.gnome.org/show_bug.cgi?id=779777

I don’t think that’s relevant here.
Comment 4 Philip Withnall 2017-03-09 14:30:47 UTC
Review of attachment 345549 [details] [review]:

Your analysis and the patch look reasonable. Have you been running your system with the patched GLib for a while?
Comment 5 Steven McDonald 2017-03-09 17:08:57 UTC
(In reply to Philip Withnall from comment #4)
> Review of attachment 345549 [details] [review] [review]:
> 
> Your analysis and the patch look reasonable. Have you been running your
> system with the patched GLib for a while?

Yes, I've been running with this patch for almost a month. I haven't observed any further crashes, or any other problems with glib at all.
Comment 6 Philip Withnall 2017-03-09 17:32:10 UTC
Review of attachment 345549 [details] [review]:

That’s reassuring. Let’s push it. Thanks for the patch!
Comment 7 Philip Withnall 2017-03-09 17:39:25 UTC
Attachment 345549 [details] pushed as e305fe9 - gio: Always purge kqueue subs from missing list
Comment 8 rozhuk.im 2017-03-09 23:46:09 UTC
(In reply to Philip Withnall from comment #3)
> (In reply to rozhuk.im from comment #2)
> > Did you try: https://bugzilla.gnome.org/show_bug.cgi?id=779777
> 
> I don’t think that’s relevant here.

This is diffrent implementation of kqueue backend: less code - less bugs.
Current implementation big and buggy, IMHO. In past there was deadlock and crashes.
I done it because my CPU spent to much time in polling provided by current kqueue backend.
Comment 9 Philip Withnall 2017-03-10 09:22:46 UTC
(In reply to rozhuk.im from comment #8)
> (In reply to Philip Withnall from comment #3)
> > (In reply to rozhuk.im from comment #2)
> > > Did you try: https://bugzilla.gnome.org/show_bug.cgi?id=779777
> > 
> > I don’t think that’s relevant here.
> 
> This is diffrent implementation of kqueue backend: less code - less bugs.
> Current implementation big and buggy, IMHO. In past there was deadlock and
> crashes.
> I done it because my CPU spent to much time in polling provided by current
> kqueue backend.

Perhaps. But the solution to a simple crasher bug, where someone has already provided a patch, is not generally to rewrite everything.