After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 160917 - Gedit freezes while performing lenghtly search and replace operations
Gedit freezes while performing lenghtly search and replace operations
Status: RESOLVED FIXED
Product: gedit
Classification: Applications
Component: general
2.8.x
Other Linux
: Normal normal
: ---
Assigned To: Gedit maintainers
gedit QA volunteers
Depends on:
Blocks:
 
 
Reported: 2004-12-10 02:33 UTC by James
Modified: 2005-12-17 10:33 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Profiling data collected during a big "replace all" operation... (24.59 KB, application/x-compressed-tar)
2005-02-09 00:44 UTC, Luis Menina
  Details
proposed patch (1.23 KB, patch)
2005-02-09 09:05 UTC, Paolo Borelli
committed Details | Review
Comparison of the results without and with the patch (2.10 KB, text/plain)
2005-02-10 01:09 UTC, Luis Menina
  Details
Sysprof profiling data for gedit 2.12.1 (98.90 KB, text/plain)
2005-10-27 00:10 UTC, Luis Menina
  Details
Patch to avoid sending GUI update events while in "replace all" (1.32 KB, patch)
2005-10-31 20:00 UTC, Luis Menina
none Details | Review
Sysprof profile with the patch above applied (43.14 KB, application/x-gzip)
2005-10-31 20:03 UTC, Luis Menina
  Details

Description James 2004-12-10 02:33:32 UTC
1.  Get a big text file (such as a 1.5 mb sql file)
2.  Search and replace a text string.
3.  Gedit freezes while performing the task, but when finished, unfreezes.
Comment 1 Paolo Maggi 2004-12-10 08:58:21 UTC
You are right, gedit has quite big performance problems while performing
"Replace All" operations on big files.
This is an area where gedit would need some profiling.
Comment 2 Luis Menina 2005-02-09 00:44:35 UTC
Created attachment 37220 [details]
Profiling data collected during a big "replace all" operation...

gedit_view_update_cursor_position_statusbar and gedit_document_find seem to
have some performance problems...
Comment 3 Paolo Borelli 2005-02-09 09:03:35 UTC
Thanks! this is really valuable information... that's a gprof profile, right? I
don't trust gprof much, is the data reproducible? Can you also describe in
detail how you did the profile? I never had much luck with gprof and GUI
programs, but maybe that's my fault :)


Looking at the data, gedit_view_update_cursor_position_statusbar can definately
be one of the bottlenecks since each time is called has to count how many lines
there are from the beginning of the buffer and how many chars there are from the
beginning of the current line. I don't see an easy way to improve that (keeping
the counter up to date instead of counting from the start every time seems an
abvious optimization, but may become tricky if you consider that the user may
move the cursor at a random place with the mouse).

Much more interesting is to find out why it is called a shitload of times...
here the first thing that comes to my mind is to disable counting while
performing the search&replace.
However just by braking in update_cursor_position_statusbar() from gdb it turns
out that the function is called a lot of times even just when scrolling the text
and that sounds strange to me, so I've come up with the following really simple
patch: it does not remove all the unneeded calls, but it should improve things
and seems pretty simple and logical to me. Any chance you could test again with
the ptach applied?

(PS: in future can you attach the profile data as plain text instead of gzipping
it? It makes things more confortable and bugzilla is not falling short on storage :)
Comment 4 Paolo Borelli 2005-02-09 09:05:06 UTC
Created attachment 37229 [details] [review]
proposed patch
Comment 5 Paolo Borelli 2005-02-09 09:06:36 UTC
Luis, it looks like you forgot to add yoourself to CC :)

Can you read the above comment, in caseyou have missed it?
Comment 6 Paolo Borelli 2005-02-09 15:05:01 UTC
ok here are some numbers obtained with the patch above by s/gedit/pippo in the
profile file you attached above. The numbers are obtained by installing a timer
in gedit-document.c and measuring the elapsed time from the start to the end of
the replace_all function.

without the patch:
paolo@murdock:~/cvs/gnome2/gedit$ gedit/gedit
** Message: start : 0,000
** Message: end : 2,231
** Message: start : 0,000
** Message: end : 2,156
** Message: start : 0,000
** Message: end : 2,202
** Message: start : 0,000
** Message: end : 2,202


with the patch:
paolo@murdock:~/cvs/gnome2/gedit$ gedit/gedit
** Message: start : 0,000
** Message: end : 1,985
** Message: start : 0,000
** Message: end : 2,035
** Message: start : 0,000
** Message: end : 1,750
** Message: start : 0,000
** Message: end : 1,743

A speedup is measurable, even if not huge... this may also be due to the test
file: in the test file the string "gedit" is present on almost every line, so
the percentage of the  the time spent in the actual substitution may be larger
then in an average file.
Comment 7 Luis Menina 2005-02-10 00:30:56 UTC
Thanks for adding me to CC paolo, I'm still learning this all :-)
I tried to do my best, I'm very new to linux programming, and even with the
gnome-love help, the learning curve is still high... 

For the profiling data, yes I used gprof ( I used the information I found at
http://www.network-theory.co.uk/docs/gccintro/gccintro_66.html ), and used a
test file with lots of duplicate lines (~ 10000 lines). On my machine, replacing
a word is about 8-10 seconds... The same action with vim is intantaneous !

Gedit had previously been successfully compiled by jhbuild. I did the profiling
this way : 
1. I performed a "make clean"
2. Compiled with "make CFLAGS=-pg" (do you know if this is the way I should
modify compilation settings in a autotools-driven project ?)
3. I ran Gedit on my test file, then performed a big "replace all"
4. Profiling data was extracted with "gprof /opt/gnome2/bin/gedit >
gedit_replace_all_profiling.txt"

If I compressed the data its because I thought 228KB of text data was a little
too big. I will not do it anymore, I promise :-) . I try your patch ASAP...
Comment 8 Paolo Borelli 2005-02-10 00:49:35 UTC
> 2. Compiled with "make CFLAGS=-pg" (do you know if this is the way I should
> modify compilation settings in a autotools-driven project ?)

Sounds right, but the fact is that theoretically you would need to recompile
also all the underling libraries (glib, atk, gtk etc) with the same flags...
anyway for now it would be great if you could just measure if the above patch
improves things whithout changing other things.


Comment 9 Luis Menina 2005-02-10 01:09:05 UTC
Created attachment 37272 [details]
Comparison of the results without and with the patch

I put it as an attachment since columns are too wide to fit well in a normal
comment... My test file has 9879 lines... you can see that now
gedit_view_update_cursor_position_statusbar is roughly called once per line.
But I'm not sure if it really has to be called at all anyway...
Comment 10 Luis Menina 2005-02-10 01:22:20 UTC
What I don't understand is how gedit_view_update_cursor_position_statusbar can
be called thrice less and still replacement have so poor performance (still
around 8 seconds for me). It still needs improvement, because this is awful
compared to vim.

Otherwise, I saw nothing broken, but only gave it a quick try.
Comment 11 Paolo Maggi 2005-02-12 10:11:43 UTC
Comment on attachment 37229 [details] [review]
proposed patch

The patch looks good to me, please commit.
Comment 12 Paolo Borelli 2005-02-12 10:56:41 UTC
Comment on attachment 37229 [details] [review]
proposed patch

committed the above patch, but further optmization would still be good.
Comment 13 Luis Menina 2005-10-27 00:10:57 UTC
Created attachment 53929 [details]
Sysprof profiling data for gedit 2.12.1

Adding this sysprof data in case someone can understand better the problem from
it (the file is compressed because it was refused by bugzilla: too big). Check
out http://primates.ximian.com/~federico/news-2005-07.html#26 for more info
about sysprof.

The sample file was genererated with:
for i in $(seq 5000); do echo "foo bar foobar" >> gedit-replace-all-sample.txt;
done

I replaced "foo" by "fun" to have on each line "fun bar funbar".

What I understand from this is that too many events are sent while performing
the replacement, while we only need to redraw everything when the global
replacement is done, and not after replacing each word.
But I'm unsure of that. The other guilty one may be
gedit_mdi_child_find_state_changed_handler. May one of the Paolos tell me if
the comportment of this function is affected by the ongoing new_mdi
modifications, and maybe try to check if the results are better in new_mdi ?
Comment 14 Federico Mena Quintero 2005-10-27 03:30:05 UTC
The sysprof log from comment #13 is great; it clearly indicates the problems.  I
just blogged about this; please see
http://primates.ximian.com/~federico/news-2005-10.html#gedit-replace-all-sysprof
for the details.

The summary is this:

1. gedit_document_replace_all() calls gedit_document_find() many times.

2. When gedit_document_find() does its work, it emits a signal that the GUI code
uses to update the sensitivity of menu items.

3. The GUI code changes the sensitivity of a couple menu items. It does this by
asking the BonoboUIComponent to freeze itself, then changing the sensitivity,
then thawing the component. Bonobo then updates the widgets that correspond to
menu items with pending changes.

The problem is that (3) happens for every match in the "replace all" process!
The solution is to update the GUI only once, when the whole process has finished.
Comment 15 Paolo Maggi 2005-10-27 08:30:48 UTC
IIRC, we still have to write "Replace All" in new_mdi... help is welcome.

Your data demonstrate how some time a developer could be wrong when he tries to
guess where the performance problems of a given piece of code are (without
trying to gather real profiling data).
I was quite sure the most important performance problem was the undo_manager
with its bad list management.

Thanks for your help!
Comment 16 Paolo Borelli 2005-10-27 09:56:59 UTC
Luis, Federico: awesome!

As Paolo said, the problem doesn't affect new_mdi (yet), but knowing about it
will surely halp us avoid introducing it.

Paolo: replace_all is there already, but we currently don't implement the 'find
again' functionality (in the sense that you can find again, but sensitivity is
not set and last searched term is not persisted) so we never emit that signal.
By the way, I think that we should get rid of the signal and turn it into a
property, but that obviusly doesn't affect the fact that the property should
just be set once at the end of replace_all.
Comment 17 Luis Menina 2005-10-27 13:14:59 UTC
Thanks federico for your help on this !
Still, I don't know if doing a patch is necessary, if the new_mdi branch is not
affected... Unless the new_mdi release of gedit is not ready for GNOME 2.14...
Is it worth the trouble ?

Paolo said: "I was quite sure the most important performance problem was the
undo_manager with its bad list management."

You may still be wright on this point, Paolo. I noticed yesterday that after
doing the replacements, Gedit was reaaaly slow to quit when modifications are
not saved. So maybe your intuition is correct here about the undo_manager. Does
this comportment also happen in the new_mdi branch ?

Comment 18 Paolo Borelli 2005-10-27 13:35:39 UTC
Luis, here are some clarifications:

 - new_mdi is currently not affected by the problem, but this is because we have
not a system in place to specify the sensitivity of 'find again' (that is to say
that the can_find_again signal is never emitted). This is because the 'search'
code has been changed in many ways, for instance to support find-as-you-type,
the code is still in flux and we probably need some discussion on how to
finalize it. Anyway at some point we need code to set the sensitivity on 'find
again'. And yes, new_mdi is targeted at 2.14.

 - As for the patch for 2.12, I'd say that it's up to you... it's a dying
codebase, but I guess that after the effort you put into this a patch may be fun
 to write and should be pretty easy[1]. It would also help us to quantify the
improvement. If it's small and self contained I'd be ok to include it for
2.12.2, however I haven't discussed this with Paolo, so he may not agree.

 - for the undo manager, see bug #312653. It's a gtksourceview issue, thus it's
still valid for new_mdi


[1]: should be a matter of factoring out a gedit_document_find_real() that
doesn't emit the signal from gedit_document_find and use it in replace_all.
Comment 19 Luis Menina 2005-10-31 19:57:39 UTC
As stated in patch submission reference: 
http://live.gnome.org/GnomeLove_2fSubmittingPatches , I join the changelog in
the comments of the patch... 

Improvement can be noticed: 
without patch: 15 seconds
with patch: 10 seconds

I also attach a gzipped sysprof profile with the patch, showing that now
gedit_document_replace_selected_text seems to be the way to go...
But I still don't understand why using vim, the operations is instantaneous on
the same file...
Comment 20 Luis Menina 2005-10-31 20:00:26 UTC
Created attachment 54151 [details] [review]
Patch to avoid sending GUI update events while in "replace all"

2005-10-31  Luis Menina  <liberforce@fr.st>

	* gedit/gedit-document.c: Improvement for bug #160917
	gedit_document_replace_all now calls gedit_document_find_real
	instead of gedit_document_find to avoid sending useless 
	GUI update events
Comment 21 Luis Menina 2005-10-31 20:03:39 UTC
Created attachment 54153 [details]
Sysprof profile with the patch above applied
Comment 22 Paolo Borelli 2005-12-15 10:35:25 UTC
Ok, this is finally fixed for good on HEAD (2.13.0, based on the new_mdi branch).


Luis, feel free to give it a run just to confirm :)


Paolo, do we close the bug or do you think that it's worth at least apply the
last patch to the 2.12 branch? (patch is easy enough, but I am not sure we are
even going to have another 2.12.X release)
Comment 23 Luis Menina 2005-12-17 01:04:28 UTC
Ok, tried with HEAD... 1.5 second :-) that's much better !
Undoing the action however takes 7.5s, but this must be related with  bug #312653
This one is good for me, ok for closing. It's a pity my first patch wasn't
applied, even if trivial :'-(
Maybe next one :-p
Comment 24 Paolo Maggi 2005-12-17 09:25:31 UTC
Yes, please keep 2.12 alive. You know, we need a fallback in the case HEAD will
not prove itself stable enough.
Comment 25 Paolo Borelli 2005-12-17 10:33:03 UTC
Ok, I committed the last patch to the 2-12 branch. Btw, Luis your profiling work
has been wonderful and would have been very appreciated even if the patch didn't
went in :)

About undoing the replace all being slower: this is due to the fact that at the
moment GtkSourceView's UndoManager sees the replace all as a collection of
deletes and inserts, so when we are performing the real replace all we can
prevent the signal from being emitted, but during the undo there is no way we
can prevent emitting the signal for every change in the buffer.

This is a known issue and not easily solvable with the current undo manager,
unless we move replace all into SourceView itself.

I am closing this bug.