GNOME Bugzilla – Bug 665824
librsvg leaks memory when parsing <desc> tags
Last modified: 2012-01-05 11:50:04 UTC
Created attachment 203126 [details] [review] PATCH: Fix memory leak when parsing <desc> tags Whenever librsvg parses a <desc> tag it leaks the string containing the contents of the tag. The string is allocated in rsvg_start_desc() but is never freed. A patch is attached against git master. It fixes the leak by freeing the string when the desc handler is freed. Thanks for working on a very handy library. :)
I don't think this is correct. First, the desc string is needed after parsing (rsvg_handle_get_desc()) so you cannot free it at this point. Furthermore, the RsvgHandle:instance_dispose func does free this string. (on rsvg git master at least). Do you have an actual testcase (or valgrind run of some programme) that shows this leak, using librsvg master?
Sorry Christian, I had misunderstood the way that librsvg was using <desc> here. I now understand this comment in rsvg-base.c: /* This isn't quite the correct behavior - in theory, any graphics element may contain a metadata or desc element */ That describes my SVG document - it has dozens of <desc> tags. If you look at the parsing behaviour, as each an additional <desc> is parsed it will leak the old one. I did have a valgrind leak-check run that showed this behaviour quite clearly but I unfortunately deleted it after I found the bug. If you're still not seeing it then I'm happy to recreate. In the meantime, I've attached an alternative patch that just frees the old 'desc' or 'title' string if one already exists, when parsing the new one. This will give the same behaviour as currently exists for multiple <desc> or <title> tags, except without leaking.
Created attachment 203305 [details] [review] Possible patch: Fix leak when document contains multiple <desc> or <title> tags
Ok; I see the problem. (And <metadata> has the same problem.) I don't really like making the rsvg_handle_get_{title,desc,metadata} API (more) wrong though which the patch does since it'll return the one from the last element to have one, instead of the one from the toplevel element as intended in the API... I guess since it's not really used anywhere (google codesearch finds no real users of rsvg_handle_get_desc/title/metadata) we could just make them always return NULL and just throw away these element's content.
Yeah - well the patch doesn't change the same bad behaviour you describe. At the moment those calls still return the last element that had one of those tags. I was trying to think of a way to guarantee that the API will always return the <desc> or <title> from the top. I suppose you'd need to look at the SAX parser context and see if your parent is <svg> or something else, and decide on that basis. I can try and put together a patch along those lines if you're interested.
I don't think it's worth fixing since there's no users of this API. So unless the other librsvg developers disagree, I think we can just deprecate these APIs (and make them just return NULL (or "" ?) always).
Created attachment 203312 [details] [review] Possible patch: only parse metadata tags <title>, <desc> & <metadata> when immediate children of <svg> Fair enough. I just saw this now, but I also just knocked together this alternative patch which should give the original documented API behaviour. Have run it under valgrind for my basic use case. Obviously I'm not fussed which path the developers take, myself - I'm not using the API either! :) Cheers - Angus
Review of attachment 203312 [details] [review]: Urgh, please disregard this last patch totally. It may run under valgrind but it doesn't work in actual daily use (the metadata tags' content get parsed in as content for any <text> element.) My mistake for rushing it. Given that you've said you want to remove the whole feature, I won't bother amending it unless you've changed your mind.
Fixed on master.
Created attachment 204645 [details] [review] Fix bug introduced by fix, rendering <text> elements Thanks for that Christian. Unfortunately the fix on mainline introduces the same regression as my patch 203312 - specifically, that if a <text> element contains a <desc> element or similar as a child, the contents of the <desc> gets rendered as text. It seems this is caused by setting 'characters' to NULL, which causes the SAX parser to fall back on another handler. The fix is very simple - because 'string' is NULL in these cases the characters handler can be safely called. Patch attached. Thanks again - Angus
Pushed to master; thanks!