Bug 110991 - gedit crashes on a long language file
Status: RESOLVED FIXED
Product: gtksourceview
Classification: Platform
Component: Syntax files
Version: unspecified
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: Gustavo Giráldez
QA Contact: GTK Sourceview maintainers
Depends on:
Blocks:
 
 
Reported: 2003-04-17 02:23 UTC by Alex Duggan
Modified: 2004-12-22 21:47 UTC
See Also:
GNOME target: ---
GNOME version: 2.3/2.4


Attachments
php .lang file (20.61 KB, text/plain)
2003-04-17 02:25 UTC, Alex Duggan
backtrace of crash (10.40 KB, text/plain)
2003-04-17 02:26 UTC, Alex Duggan

Description Alex Duggan 2003-04-17 02:23:03 UTC
I'm in the process of creating a php .lang file and it seems gedit crashes
on a very long .lang file.  Attached are the .lang file and a backtrace of
the crash.
Comment 1 Alex Duggan 2003-04-17 02:25:28 UTC
Created attachment 15788
php .lang file
Comment 2 Alex Duggan 2003-04-17 02:26:19 UTC
Created attachment 15789
backtrace of crash
Comment 3 Alex Duggan 2003-04-17 03:48:08 UTC
Here is another backtrace I got running gedit directly from gdb using
another long .lang file:

[aldug@astrolinux simpleurl-admin]$ gdb gedit
GNU gdb Red Hat Linux (5.2.1-4)
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and
you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for
details.
This GDB was configured as "i386-redhat-linux"...
(gdb) run update_expiration.php
Starting program: /usr/local/gnome2/bin/gedit update_expiration.php
[New Thread 8192 (LWP 19746)]
 
Program received signal SIGSEGV, Segmentation fault.

Thread 8192 (LWP 19746)

  • #0 _int_malloc
    from /lib/i686/libc.so.6
  • #1 malloc
    from /lib/i686/libc.so.6
  • #2 re_node_set_init_1
    from /lib/i686/libc.so.6
  • #3 group_nodes_into_DFAstates
    from /lib/i686/libc.so.6
  • #4 build_trtable
    from /lib/i686/libc.so.6
  • #5 transit_state
    from /lib/i686/libc.so.6
  • #6 check_matching
    from /lib/i686/libc.so.6
  • #7 re_search_internal
    from /lib/i686/libc.so.6
  • #8 re_search_stub
    from /lib/i686/libc.so.6
  • #9 re_search
    from /lib/i686/libc.so.6
  • #10 gtk_source_regex_search
    at gtksourceregex.c line 118
  • #11 search_patterns
    at gtksourcebuffer.c line 2129
  • #12 check_pattern
    at gtksourcebuffer.c line 2216
  • #13 highlight_region
    at gtksourcebuffer.c line 2339
  • #14 ensure_highlighted
    at gtksourcebuffer.c line 2394
  • #15 idle_worker
    at gtksourcebuffer.c line 1310
  • #16 g_idle_dispatch
    at gmain.c line 3164
  • #17 g_main_dispatch
    at gmain.c line 1653
  • #18 g_main_context_dispatch
    at gmain.c line 2197
  • #19 g_main_context_iterate
    at gmain.c line 2278
  • #20 g_main_loop_run
    at gmain.c line 2498
  • #21 gtk_main
    at gtkmain.c line 1092
  • #22 main
    at gedit2.c line 397
  • #23 __libc_start_main
    from /lib/i686/libc.so.6

Comment 4 Gustavo Giráldez 2003-04-17 07:05:13 UTC
Apparently this is a GLIBC bug due to the huge generated regular
expression for "Keywords".  Splitting that pattern into 4 patterns fixes
the problem.

I'll check to see what is causing it, but we'll surely need a
workaround in gtksourcelanguage.c.
Comment 5 Alex Duggan 2003-04-17 15:12:48 UTC
Gustavo,

Is there a temporary workaround that I can do in the .lang file so I
can use php syntax highlighting in gedit?  If not, how long do you
think it will be until a "fix" is implemented in gtksourceview?
Comment 6 Gustavo Giráldez 2003-04-17 23:37:36 UTC
Alex: yes, you can split the keyword-list into 4 separate keyword-lists.
 Just name them distinctly (e.g. "Keywords 1", "Keywords 2", etc).  If
you assign them the same style there will be no difference from a
single keyword-list.  In fact, I'm thinking the workaround will have
to be implemented along these lines... transparently of course.
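
For illustration only, here is a minimal Python sketch of the manual split described above. The helper name `split_keyword_list` and its parameters are hypothetical and not part of gtksourceview; it only shows how one long keyword list could be divided into distinctly named lists ("Keywords 1", "Keywords 2", ...) that would then each be written to the .lang file with the same style.

```python
def split_keyword_list(keywords, parts=4, base_name="Keywords"):
    """Split one keyword list into up to `parts` distinctly named lists.

    Hypothetical helper for illustration; not gtksourceview code.
    """
    if parts < 1:
        raise ValueError("parts must be at least 1")
    # Ceiling division so that every keyword lands in some chunk.
    size = max(1, -(-len(keywords) // parts))
    named = {}
    for index, start in enumerate(range(0, len(keywords), size), start=1):
        named[f"{base_name} {index}"] = keywords[start:start + size]
    return named


if __name__ == "__main__":
    sample = [f"kw{i}" for i in range(10)]
    for name, chunk in split_keyword_list(sample).items():
        print(name, chunk)
```

Each resulting list keeps the original keyword order, so concatenating the chunks gives back the original list.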
Comment 7 Gustavo Giráldez 2003-04-23 04:42:17 UTC
As I suspected it was a GLIBC bug.  I have reported the problem to
bugs.gnu.org
(http://bugs.gnu.org/cgi-bin/gnatsweb.pl?debug=&database=glibc&cmd=view+audit-trail&cmd=view&pr=5006
for interested parties).

So, until somebody fixes that, a new glibc is released, and it is deployed
widely enough that we can depend on it :-), I have committed a
workaround.  The solution is to split the generated regex into
subgroups.  I.e., instead of:

\b\(key1\|key2\|key3\|key4...\)

generate:

\b\(key1\|key2\)\|\b\(key3\|key4\)\|...
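
As a rough sketch of this grouping (not the committed gtksourceview code, which is written in C), the following Python function builds the subgrouped pattern in the same escaped syntax as the two patterns shown here. The name `build_keyword_regex` and the `group_size` parameter are assumptions made for the example, and keywords are assumed to be plain identifiers that need no escaping.

```python
def build_keyword_regex(keywords, group_size=2):
    """Join keywords into several small \\b\\(...\\) groups instead of one big one.

    Hypothetical helper; assumes keywords contain no regex metacharacters.
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    groups = []
    for start in range(0, len(keywords), group_size):
        chunk = keywords[start:start + group_size]
        # One subgroup: \b\(key1\|key2\)
        groups.append(r"\b\(" + r"\|".join(chunk) + r"\)")
    # Subgroups are themselves joined by alternation.
    return r"\|".join(groups)


if __name__ == "__main__":
    print(build_keyword_regex(["key1", "key2", "key3", "key4"]))
```

With a group size equal to the number of keywords, the function degenerates to the single large group that triggered the crash.
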

Anyway, long keyword lists are strongly discouraged because:

- I think they degrade performance, even more than having the same
number of keywords in separate lists (I still have to back this up
with performance measurements)
- the resulting .lang file is harder to read
- they are evil in general ;-)

Nevertheless it should not crash anymore.  Please reopen if you still
have issues.
Comment 8 Gustavo Giráldez 2003-06-23 13:33:50 UTC
I'm getting this crash again with long keyword lists.  It's probably
due to the regular expression syntax change (we used GNU Emacs syntax
before, extended POSIX now).
Comment 9 Gustavo Giráldez 2003-09-25 03:44:05 UTC
I just committed a temporary "fix" for this crasher.  Long keyword
lists are now automatically truncated at 250 elements and a warning
message is printed to the console.  I also added a note about this in
the README file.
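
The truncation behaviour can be sketched roughly as follows. This is an illustrative Python version under stated assumptions, not the actual gtksourceview change: the names `MAX_KEYWORDS` and `truncate_keywords` are hypothetical, and Python's `warnings` module stands in for the console warning.

```python
import warnings

MAX_KEYWORDS = 250  # limit taken from the description in this comment


def truncate_keywords(keywords, limit=MAX_KEYWORDS):
    """Return at most `limit` keywords, warning when the list is cut short.

    Hypothetical helper for illustration; not gtksourceview code.
    """
    if len(keywords) > limit:
        warnings.warn(
            f"keyword list has {len(keywords)} elements; truncating to {limit}",
            stacklevel=2,
        )
        return list(keywords[:limit])
    return list(keywords)


if __name__ == "__main__":
    shortened = truncate_keywords([f"kw{i}" for i in range(300)])
    print(len(shortened))
```

Keywords past the limit are silently left unhighlighted apart from the warning, which is why splitting a long list by hand remains the safer option.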

There is no easy and clean way to solve this with the current
highlighting implementation.  In the future, when we separate regular
expression patterns from tag objects, we will be able to transparently
split the keyword lists.