After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 778263 - YelpMallardDocument frees xmlDocPtr while in threaded use
YelpMallardDocument frees xmlDocPtr while in threaded use
Status: RESOLVED OBSOLETE
Product: yelp
Classification: Applications
Component: Crashers
git master
Other Linux
: Normal major
: ---
Assigned To: Yelp maintainers
Yelp maintainers
Depends on:
Blocks:
 
 
Reported: 2017-02-07 08:05 UTC by Christian Hergert
Modified: 2018-05-22 13:11 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
mallard: protect thread_running with mutex (1.76 KB, patch)
2017-02-09 00:46 UTC, Christian Hergert
none Details | Review
mallard: fix thread data races (16.80 KB, patch)
2017-02-09 00:46 UTC, Christian Hergert
none Details | Review
view: handle NULL result from document read gracefully (1.46 KB, patch)
2017-02-09 00:46 UTC, Christian Hergert
none Details | Review

Description Christian Hergert 2017-02-07 08:05:22 UTC
The YelpMallardDocument seems to track whether or not it is running in a thread, but does not track whether or not it's base class might be running in a thread with data found within the xmlDocPtr.

For example, yelp_transform_start() may have been called with contents from this doc and actively processing a stylesheet.

This results in priv->cache being freed in mallard_monitor_changed() while in use.

Simply commenting out the xmlFreeDoc() seems to improve things. (Although I have a lot of other local changes while trying to track this down, so others may also be responsible).

It's hard to love the design of these layers right now where everything is signaled instead of using async/finish pairs. The lifetime tracking of the structs has me in the twilight zone, and I would strongly encourage updating Yelp*Document and YelpTransform to use modern glib async/finish, autoptr, and some sort of reference-counted wrapper for the libxml2 pointers.
Comment 1 Christian Hergert 2017-02-09 00:46:34 UTC
Created attachment 345275 [details] [review]
mallard: protect thread_running with mutex

TSAN was saying there is a data race with thread_running. It's not super
critical for x86, but relaxed memory models on other systems could be
less reliable.
Comment 2 Christian Hergert 2017-02-09 00:46:47 UTC
Created attachment 345276 [details] [review]
mallard: fix thread data races

There were a few thread data races going on in here due to how data was
accessed from various threads and what was freed when a file monitor
notified that the underlying file changed.

In the instance private data, we need to abstract the reference to the
cache information so that it can outlast being freed during reload.

Additionally, the page data contains the document to be transformed which
could also be freed during the transformation.

Furthermore, some access to the page hashtable was performed without
holding the mutex which could cause inconsistent behavior during the
pointer chases necessary to perform lookups.

This makes a reference counted structure for the cache info (the cache
doc and the cached NS ptr) as well as turns MallardPageData into a
reference counted structure.

This allows us to use the new yelp_transfrom_run_async() API and increase
the reference count during the lifetime of the transform. If we get a
change to the underlying file during this operation, the hashtable will
release it's reference to the page data, but the async transform operation
is still safe to use the xmlDocPtr.

There very well still might be more data races to shake down, but this
fixes my immediate crashes. I would still like to see this code all turned
into modern async code, but that will take more time and effort by
someone who knows the codebase better than me.

I should note that the reloading of mallard documents does not update the
UI still, but I haven't tracked down why. This seems to be fallout still
from the WebKit2 port and this just addresses various crashers.
Comment 3 Christian Hergert 2017-02-09 00:46:50 UTC
Created attachment 345277 [details] [review]
view: handle NULL result from document read gracefully

If the document is being reloaded in the background, we might get a NULL
data read as the page data could have been released between our
notification of data available and our ability to callback.

This seems to be more reliable than trying to check page_id since the
sub-classes can override read_contents() anyway.
Comment 4 GNOME Infrastructure Team 2018-05-22 13:11:38 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to GNOME's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/yelp/issues/120.