Bug 68720 - gedit performance problems with big files
Status: RESOLVED NOTABUG
Product: gedit
Classification: Applications
Component: general
Version: 0.9.7
Hardware: Other All
Importance: Normal normal
Target Milestone: ---
Assigned To: Gedit maintainers
QA Contact: gedit QA volunteers
Depends on:
Blocks:
 
 
Reported: 2002-01-15 00:44 UTC by Rodd Clarkson
Modified: 2004-12-22 21:47 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Profiler output for 18000+ case-sensitive replacements (164.93 KB, text/plain)
2002-08-12 15:38 UTC, Narayana Pattipati
caching group count (2.46 KB, patch)
2002-08-20 16:03 UTC, Mukund
committed
Does away with varargs processing inside gedit_debug() when debug flags not set (1.59 KB, patch)
2002-08-20 16:32 UTC, Mukund
committed

Description Rodd Clarkson 2002-01-15 00:09:53 UTC
Package: gedit
Severity: normal
Version: 0.9.7
Synopsis: Crash doing search on large text file
Bugzilla-Product: gedit
Bugzilla-Component: general

Description:
I'm working with a couple of large text files (that I can't send for reasons of privacy).

One is 211k, another is 240k.

I opened the find window using F5, and then searched for Girral (it's in there).
gedit then crashed.




Debugging Information:

(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...(no debugging symbols found)...[New Thread 1024 (LWP 23761)]

0x40742989 in __wait4 () from /lib/i686/libc.so.6
  • #0 __wait4
    from /lib/i686/libc.so.6
  • #1 __DTOR_END__
    from /lib/i686/libc.so.6
  • #2 waitpid
    at wrapsyscall.c line 172
  • #3 gnome_segv_handle
    at eval.c line 41
  • #4 pthread_sighandler
    at signals.c line 97
  • #5 <signal handler called>
  • #6 strlen
    from /lib/i686/libc.so.6
  • #7 gedit_search_start
    at eval.c line 41
  • #8 gedit_dialog_replace
    at eval.c line 41
  • #9 gtk_marshal_NONE__NONE
    from /usr/lib/libgtk-1.2.so.0
  • #10 gtk_handlers_run
    from /usr/lib/libgtk-1.2.so.0
  • #11 gtk_signal_real_emit
    from /usr/lib/libgtk-1.2.so.0
  • #12 gtk_signal_emit
    from /usr/lib/libgtk-1.2.so.0
  • #13 gtk_accel_group_activate
    from /usr/lib/libgtk-1.2.so.0
  • #14 gtk_accel_groups_activate
    from /usr/lib/libgtk-1.2.so.0
  • #15 gtk_window_key_press_event
    from /usr/lib/libgtk-1.2.so.0
  • #16 gtk_marshal_BOOL__POINTER
    from /usr/lib/libgtk-1.2.so.0
  • #17 gtk_signal_real_emit
    from /usr/lib/libgtk-1.2.so.0
  • #18 gtk_signal_emit
    from /usr/lib/libgtk-1.2.so.0
  • #19 gtk_widget_event
    from /usr/lib/libgtk-1.2.so.0
  • #20 gtk_propagate_event
    from /usr/lib/libgtk-1.2.so.0
  • #21 gtk_main_do_event
    from /usr/lib/libgtk-1.2.so.0
  • #22 gdk_event_dispatch
    from /usr/lib/libgdk-1.2.so.0
  • #23 g_main_dispatch
    from /usr/lib/libglib-1.2.so.0
  • #24 g_main_iterate
    from /usr/lib/libglib-1.2.so.0
  • #25 g_main_run
    from /usr/lib/libglib-1.2.so.0
  • #26 gtk_main
    from /usr/lib/libgtk-1.2.so.0
  • #27 main
    at eval.c line 41
  • #28 __libc_start_main
    at ../sysdeps/generic/libc-start.c line 129




------- Bug moved to this database by unknown@bugzilla.gnome.org 2002-01-14 19:09 -------

Reassigning to the default owner of the component, chema@celorio.com.

Comment 1 Rodd Clarkson 2002-01-15 00:15:18 UTC
Oops, that was F6, not F5.

I opened both files again and couldn't reproduce the crash.  Bummer.

Hope the debug info helps, chema!
Comment 2 Luis Villa 2002-01-29 20:30:59 UTC
Any chance this information will be usable, guys?
Comment 3 Luis Villa 2002-01-29 20:55:04 UTC
Cleaning up and reassigning. Paolo, you are down to 17 bugs, many of which are
crashers. If you could read through all the bugs, I'd appreciate it. You should
do two things for each bug, especially crashers:
1) If the code path is not relevant for GNOME2, please remove the GNOME2 keyword.
2) If the code is otherwise irrelevant (not in gedit, already fixed, whatever)
please close it out.
If you do these things, hopefully I can help keep the gedit corner of the
bugzilla much cleaner in the future.
Search on 'luis doing GNOME2 work' to filter out this spam.
Comment 4 Yogeesh 2002-05-16 12:50:50 UTC
I tested with 1.118.0 on Linux/Solaris with a text file of approximately
3k lines; this bug could not be reproduced.
Comment 5 Rodd Clarkson 2002-05-17 00:00:47 UTC
I'm using gedit2-1.119.0.0.200205141824-0.snap.ximian.1 and after
loading a 3.2M 1200 line text file, it crashed doing a find and replace.

As a comparison, this file took about a minute to open fully in gedit
and performance wasn't good.  In vi, it opens in less than 5 seconds.

While the search and replace failed in gedit (I left the process
chewing 100% of CPU for a full 10 minutes before shutting the window),
vi completed the same search and replace (':% s/and/AND/g') in less
than five seconds.  Out of interest, there were 16429 substitutions on
506 lines.

Food for thought.
Comment 6 Paolo Maggi 2002-05-17 08:01:15 UTC
About the gedit 1.119.0 problem.

Please, try to reproduce this bug in testtext (in the gtk+ test
directory) and let me know.

I can try to accelerate gedit loading, but we will probably never match
vi's performance. gedit is only a simple editor; vi is more
sophisticated and I think it is a "disk" editor.

Did gedit core dump? Or did you kill it? 
In the first case, could you please attach a backtrace?
Comment 7 Paolo Maggi 2002-05-17 09:46:11 UTC
About gedit 1.119.0

I have made some experiments with a big text file, ~4.5 MB (containing
the same line repeated 120000 times), on an AMD K6-III 450 MHz with
128 MB of RAM running RH Linux 7.2.

Yep, "Replace All" on big files (120000 substitutions) is actually
very, very slow (about 4 minutes).
This is partially due to the way gedit manages undo info for replace
all operations (I should fix this).
But it does not crash for me.

The file loading is quite fast for me (less than 5 secs), even if I
see gedit chewing 100% of CPU for about 1 minute. Note that, during
this time, you can use it.
If you go to the end of the file you will also see strange things
happening to the scroll bar.
BTW, I can reproduce the loading problems also with testtext so I
think it is a gtk+ problem.

Find operations on big files are quite slow in testtext too (but not
as slow as in gedit).
Copying the whole file to the clipboard is also quite slow (both in
gedit and in testtext).

Redoing a Replace All operation (120000 substitutions) is slow as well.

Lowering Severity and Priority since it is not a crasher.

Rodd: is it a crasher for you?
Comment 8 Rodd Clarkson 2002-05-20 00:51:36 UTC
Haven't got the time to test right now, I'll do it tomorrow.

However, gedit didn't core dump, but when I closed the window, a
dialog informed me that it had stopped responding and asked whether I
wanted to kill it.

No backtrace to attach.
Comment 9 Rodd Clarkson 2002-05-20 00:53:07 UTC
Maggi: I'd class this as a crasher for me.
Comment 10 Narayana Pattipati 2002-07-24 11:33:52 UTC
One interesting fact! There is a significant difference in time taken
for Find/Replace between "case-sensitive" and "case-insensitive"
cases. 

I have a file with 85000 lines. When I did a find/replace for a
word (with 3800 instances) on my Linux box, the results were as below:

	case-sensitive			40-42 secs.
	case-insensitive		70-72 secs.

When I went through the code, I could find only one difference in the
code executed for the two cases.

In the case-sensitive case, each line of the file is searched for the
word with strstr
(gedit_document_find -> gedit_text_iter_forward_search ->
gtk_text_iter_forward_search -> lines_match -> strstr).

For case-insensitive find/replace, each line of the file is searched
for the word with a g_utf8_strcasestr call
(gedit_document_find -> gedit_text_iter_forward_search ->
lines_match -> g_utf8_strcasestr).

g_utf8_strcasestr is slow compared with strstr(), and
g_utf8_strcasestr is used to make the caseless search internationalised.

I am just wondering why g_utf8_XXXXXXXX calls are used for
case-insensitive search and NOT for case-sensitive search?
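
To illustrate why the caseless path tends to cost more, here is a minimal sketch. It is not gedit's code: ascii_strcasestr is a hypothetical, ASCII-only helper, whereas the real g_utf8_strcasestr also has to cope with multi-byte UTF-8 text. The point is only that a caseless search folds characters on every comparison, work that strstr() never does.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of gedit or GLib): ASCII-only caseless
 * substring search. Returns a pointer to the first match in haystack,
 * or NULL if there is none. */
const char *ascii_strcasestr(const char *haystack, const char *needle)
{
    assert(haystack != NULL && needle != NULL);

    size_t needle_len = strlen(needle);
    if (needle_len == 0)
        return haystack;

    for (const char *p = haystack; *p != '\0'; p++) {
        size_t i = 0;
        /* Both characters are folded on every comparison; this is the
         * extra per-character work compared with plain strstr(). */
        while (i < needle_len && p[i] != '\0' &&
               tolower((unsigned char)p[i]) == tolower((unsigned char)needle[i]))
            i++;
        if (i == needle_len)
            return p;
    }
    return NULL;
}
```

With this sketch, ascii_strcasestr("Find Girral here", "girral") finds the word, while strstr() with the same arguments returns NULL because the case differs.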
Comment 11 Narayana Pattipati 2002-08-12 15:27:45 UTC
I have profiled find/replace (case-sensitive) for 18030+ instances. I
will attach the profile output shortly.

The gedit_undo_manager_get_number_of_groups() function accounts for
almost all of the time. (18030+ find/replacements took about 360
seconds to finish. The profiler showed a total time of 236 seconds, of
which 234.31 seconds were spent in this function.)
Comment 12 Narayana Pattipati 2002-08-12 15:38:29 UTC
Created attachment 10435
Profiler output for 18000+ case-sensitive replacements
Comment 13 Mukund 2002-08-20 16:02:29 UTC
Proposed patch that does away with linked-list scanning in the
gedit_undo_manager_get_number_of_groups function by caching the group
count.
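
The idea can be sketched roughly as follows. This is not the attached patch: UndoGroup, UndoManager and the function names below are hypothetical stand-ins, and the real undo manager's data layout is not shown in this report. The sketch only contrasts counting by walking the list with returning a counter that is updated whenever the list changes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the undo manager's data; not gedit's real types. */
typedef struct UndoGroup {
    struct UndoGroup *next;
} UndoGroup;

typedef struct {
    UndoGroup *groups;   /* singly linked list of undo groups */
    size_t num_groups;   /* cached length of the list */
} UndoManager;

/* O(n): walks the whole list on every call. */
size_t count_groups_by_scanning(const UndoManager *um)
{
    assert(um != NULL);
    size_t n = 0;
    for (const UndoGroup *g = um->groups; g != NULL; g = g->next)
        n++;
    return n;
}

/* O(1): returns the counter kept in sync by add_group() and clear_groups(). */
size_t count_groups_cached(const UndoManager *um)
{
    assert(um != NULL);
    return um->num_groups;
}

/* Prepends a group and updates the cached count. Returns 0 on success. */
int add_group(UndoManager *um)
{
    assert(um != NULL);
    UndoGroup *g = malloc(sizeof *g);
    if (g == NULL)
        return -1;
    g->next = um->groups;
    um->groups = g;
    um->num_groups++;
    return 0;
}

/* Frees all groups and resets the cached count. */
void clear_groups(UndoManager *um)
{
    assert(um != NULL);
    while (um->groups != NULL) {
        UndoGroup *next = um->groups->next;
        free(um->groups);
        um->groups = next;
    }
    um->num_groups = 0;
}
```

Assuming the count is queried once per replacement, n replacements cost on the order of n*n list steps with scanning but only n steps with the cached counter, which would be consistent with the profile above.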
Comment 14 Mukund 2002-08-20 16:03:38 UTC
Created attachment 10600
caching group count
Comment 15 Mukund 2002-08-20 16:32:02 UTC
Created attachment 10601
Does away with varargs processing inside gedit_debug() when debug flags not set
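
A minimal sketch of that kind of change, under stated assumptions: debug_log, debug_enabled and messages_formatted are hypothetical names, and gedit_debug()'s real signature and flags are not shown in this report. The sketch checks the flag before touching the variable argument list, so a disabled debug call returns without any formatting work.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical flag and counter; gedit's real debug flags are not shown here. */
static int debug_enabled = 0;
static int messages_formatted = 0;

/* Formats a debug message into buf. Returns the number of characters
 * formatted, or 0 without doing any work when debugging is off. */
int debug_log(char *buf, size_t buf_size, const char *format, ...)
{
    assert(buf != NULL && format != NULL);

    /* Check the flag first so the common case skips va_start and vsnprintf. */
    if (!debug_enabled)
        return 0;

    va_list args;
    va_start(args, format);
    int written = vsnprintf(buf, buf_size, format, args);
    va_end(args);

    messages_formatted++;
    return written;
}
```

In a loop that performs thousands of replacements, a debug call on every iteration then costs little more than one flag test when debugging is switched off.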
Comment 16 Paolo Maggi 2002-08-26 08:49:04 UTC
I have committed both the patches.

The second one was ok.
The first one had some problems so I modified it a bit.
Comment 17 Paolo Maggi 2002-08-28 11:34:28 UTC
Did you find other performance problems?
Is "replace all" fast enough now?

Comment 18 Andrew Sobala 2002-09-09 15:23:46 UTC
Is this a problem for the 0.9.7 release that it was originally
reported against or is it limited to GNOME2?
Comment 19 Paolo Maggi 2002-09-09 15:30:12 UTC
The crash was in gedit 0.9.7, but the performance problems are in the
GNOME2 version.
Comment 20 Luis Villa 2004-01-09 00:16:46 UTC
I'm not sure if it is right to note this here or not, but for me the
biggest performance hits come with a single long line. For example, if
I have a single 600-character line, operations get brutally slow, even
if the whole doc is on that line.
Comment 21 Paolo Maggi 2004-01-09 11:03:28 UTC
About the slowness with very long lines, see bug #114337.