GNOME Bugzilla – Bug 127840
gtktextview uses huge amounts of memory on large files
Last modified: 2014-07-25 19:25:39 UTC
I tried loading a 10 MB file consisting of one *huge* line of 'A's in testtext and gedit, and watched it take ages to load. Once it finished reading the file, it spent a huge amount of time allocating memory. After a while it was using 550 MB of RAM and still hadn't reached the point where I could see the file's content. I tried the same thing after adding a line break after every 'A' (a 20 MB file). That made it read the full file faster, but it still consumed over 500 MB of RAM without being able to show the file content. I tried this with gtk-2-2 (gedit) and HEAD (testtext).
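For anyone who wants to reproduce the two inputs described above, a short script along these lines should do. It is only a sketch: the helper names and file names are made up for illustration, and "10 meg" is assumed to mean 10 MiB of 'A' characters.

```python
import os

MEBIBYTE = 1024 * 1024  # assumption: "10 meg" in the report means 10 MiB


def single_long_line(count):
    """Return `count` 'A' bytes with no newline anywhere."""
    return b"A" * count


def one_char_lines(count):
    """Return `count` lines that each hold a single 'A' (two bytes per line)."""
    return b"A\n" * count


def write_inputs(directory, count=10 * MEBIBYTE):
    """Write both pathological inputs into `directory` and return their paths.

    The file names are hypothetical; any names will do.
    """
    paths = {}
    inputs = (
        ("one-line.txt", single_long_line(count)),
        ("one-char-lines.txt", one_char_lines(count)),
    )
    for name, data in inputs:
        path = os.path.join(directory, name)
        with open(path, "wb") as f:
            f.write(data)
        paths[name] = path
    return paths
```

Calling `write_inputs(".")` produces a file of roughly 10 MB and one of roughly 20 MB, which can then be opened in gedit or testtext while watching memory use.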
You picked the two extreme cases at which the text buffer sucks. If you split the 10M 'A's into lines of 128 bytes each, gedit loads the file quickly and uses only 34 MB. To cite docs/text_widget_internal.txt: "The text widget is efficient for huge numbers of paragraphs, but will choke on extremely long blocks of text without intervening newlines." (This explains why a single 10M-character paragraph sucks, but not why 10M one-character paragraphs kill the text view.)
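The 128-byte split mentioned above can be sketched as follows. The function name is hypothetical, and it is an assumption that "128 bytes" refers to the payload of each line, not counting the newline; the input is also assumed to contain no newlines of its own.

```python
def wrap_bytes(data, width=128):
    """Split `data` into newline-terminated chunks of at most `width` bytes.

    Assumes `data` has no newlines already; the newline added to each chunk
    is not counted against `width`.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    chunks = (data[i:i + width] for i in range(0, len(data), width))
    return b"".join(chunk + b"\n" for chunk in chunks)
```

Applied to 10 MiB of 'A's, this gives 81920 lines, which sits comfortably between the two extremes of one line and ten million lines.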
Looking at the structures used to store lines in the B-tree, we seem to have a GtkTextLine (16 bytes) and at least one GtkTextLineSegment (>= 20 bytes) per line, so for your example of 10M one-character lines, we need roughly 36*10M bytes (about 360 MB) for the leaves of the tree. If you add the tree itself, you may well end up with ~500M.
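The back-of-the-envelope estimate above works out as follows. The two struct sizes are the figures quoted in the comment, not values measured from a GTK build, and the line count assumes 10 MiB of one-character lines.

```python
LINE_BYTES = 16      # GtkTextLine size as quoted in the comment above
SEGMENT_BYTES = 20   # lower bound for one GtkTextLineSegment, also as quoted


def leaf_bytes(lines, segments_per_line=1):
    """Lower bound on memory for the B-tree leaves, excluding the tree nodes."""
    return lines * (LINE_BYTES + segments_per_line * SEGMENT_BYTES)


lines = 10 * 1024 * 1024           # assumption: "10M" lines means 10 * 2**20
payload = 2 * lines                # each line stores just "A\n"

print(leaf_bytes(lines) / (1024 * 1024))   # leaf overhead in MiB -> 360.0
print(leaf_bytes(lines) / payload)         # overhead per byte of text -> 18.0
```

So the leaves alone cost at least 18 times the size of the file, before counting the interior nodes of the tree or anything the view allocates on top.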
Maybe this should just be closed since the limitations are well known and documented?
Will this ever be fixed? Any plans to remove these limitations?
It's safe to say that if nobody has worked on the (documented) limitations for the past 10 years, it's unlikely that anybody will. Unless, obviously, you're volunteering.
*** Bug 679724 has been marked as a duplicate of this bug. ***
I'm going to close this. If somebody shows up to work on this, they can simply open a new bug to attach their patch.