Bug 708113 - Inconsistent Word Boundary Detection using <keyword-char-class>
Status: RESOLVED FIXED
Product: gtksourceview
Classification: Platform
Component: General
Version: 3.6.x
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: GTK Sourceview maintainers
QA Contact: GTK Sourceview maintainers
Depends on:
Blocks:
Reported: 2013-09-15 14:23 UTC by Mark Corbin
Modified: 2013-09-24 16:13 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Patch for negative lookbehind regex for <keyword-char-class>. (1022 bytes, patch)
2013-09-15 14:26 UTC, Mark Corbin
accepted-commit_after_freeze

Description Mark Corbin 2013-09-15 14:23:44 UTC
Custom word boundary detection using 'keyword-char-class' is not consistent between leading and trailing word boundaries.

This behaviour can be shown using the following gedit language definition file:

<?xml version="1.0" encoding="UTF-8"?>
<language id="test" _name="Test" version="2.0" _section="Sources">
  <metadata>
    <property name="globs">*.tst</property>
  </metadata>
  <styles>
    <style id="keyword" _name="Keyword" map-to="def:keyword"/>
  </styles>

  <keyword-char-class>[A-Za-z0-9_]</keyword-char-class>

  <definitions>
    <context id="test-keyword" style-ref="keyword">
      <keyword>helloworld</keyword>
    </context>
    <context id="test">
      <include>
        <context ref="test-keyword"/>
      </include>
    </context>
  </definitions>
</language>

A test text file, e.g. highlight.tst, illustrates the behaviour as described by the comments:

  helloworldX		# The keyword 'helloworld' isn't highlighted which is correct.
  Xhelloworld		# The keyword 'helloworld' is highlighted which is *incorrect*.
  
  helloworld$		# The keyword 'helloworld' is highlighted which is correct.
  $helloworld		# The keyword 'helloworld' is highlighted which is correct.
Comment 1 Mark Corbin 2013-09-15 14:26:14 UTC
Created attachment 254975
Patch for negative lookbehind regex for <keyword-char-class>.
Comment 2 Mark Corbin 2013-09-15 14:29:20 UTC
The negative lookbehind regex used by 'keyword-char-class' should begin with '?<!' instead of '?!<'.

The attached patch file fixes this issue.
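
The effect of the transposed characters can be reproduced outside gtksourceview. The following is a minimal Python sketch, not the gtksourceview source: it assumes the leading boundary is a lookbehind on the character class followed by a lookahead, and the trailing boundary is the mirror image. The names BUGGY_LEAD, FIXED_LEAD, TRAIL and highlights are made up for this illustration. Python's re module reads '(?!<' as a negative lookahead for a literal '<', which is presumably what the GLib regex engine does too, so the buggy form never inspects the preceding character.

```python
import re

CHAR_CLASS = r"[A-Za-z0-9_]"   # value of <keyword-char-class> in the test file
KEYWORD = "helloworld"

# Assumed shape of the boundaries (illustrative reconstruction only).
# Buggy: '(?!<' is a negative LOOKAHEAD for the literal character '<'.
BUGGY_LEAD = rf"(?!<{CHAR_CLASS})(?={CHAR_CLASS})"
# Fixed: '(?<!' is a negative LOOKBEHIND on the character class.
FIXED_LEAD = rf"(?<!{CHAR_CLASS})(?={CHAR_CLASS})"
# Trailing boundary: previous char is in the class, next char is not.
TRAIL = rf"(?<={CHAR_CLASS})(?!{CHAR_CLASS})"


def highlights(lead, line):
    """Return True if the keyword would match somewhere in `line`."""
    return re.search(lead + KEYWORD + TRAIL, line) is not None


if __name__ == "__main__":
    for line in ["helloworldX", "Xhelloworld", "helloworld$", "$helloworld"]:
        print(f"{line:12} buggy={highlights(BUGGY_LEAD, line)} "
              f"fixed={highlights(FIXED_LEAD, line)}")
```

With the buggy leading boundary, 'Xhelloworld' matches; with the corrected one it does not. The other three test lines give the same result under both patterns.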

As a workaround an equivalent regular expression could be defined for existing versions of gtksourceview:

[SNIP]
  <define-regex id="my-regex">[A-Za-z0-9_]</define-regex>
  <define-regex id="my-word-boundary">((?&lt;!\%{my-regex})(?=\%{my-regex}))|((?&lt;=\%{my-regex})(?!\%{my-regex}))</define-regex>
  <context id="test-keyword" style-ref="keyword">
    <prefix>\%{my-word-boundary}</prefix>
    <suffix>\%{my-word-boundary}</suffix>
    <keyword>helloworld</keyword>
  </context>
[SNIP]
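
The workaround expression can be sanity-checked the same way. This is a rough Python sketch under stated assumptions: it undoes the XML escaping ('&lt;' becomes '<'), substitutes the '\%{my-regex}' reference by hand, and assumes the prefix and suffix are simply placed before and after the keyword. The helper names expand and highlights are hypothetical, and Python's re module stands in for the GLib regex engine.

```python
import html
import re

CHAR_CLASS = "[A-Za-z0-9_]"    # body of <define-regex id="my-regex">

# Body of <define-regex id="my-word-boundary">, copied as written in the XML.
RAW_BOUNDARY = (
    r"((?&lt;!\%{my-regex})(?=\%{my-regex}))"
    r"|((?&lt;=\%{my-regex})(?!\%{my-regex}))"
)


def expand(raw):
    """Undo XML escaping and substitute the \\%{my-regex} reference by hand."""
    return html.unescape(raw).replace(r"\%{my-regex}", CHAR_CLASS)


BOUNDARY = expand(RAW_BOUNDARY)
# Assumption: <prefix> and <suffix> wrap the keyword directly.
KEYWORD_RE = re.compile(f"(?:{BOUNDARY})helloworld(?:{BOUNDARY})")


def highlights(line):
    """Return True if the keyword would match somewhere in `line`."""
    return KEYWORD_RE.search(line) is not None


if __name__ == "__main__":
    print(BOUNDARY)
    for line in ["helloworldX", "Xhelloworld", "helloworld$", "$helloworld"]:
        print(f"{line:12} {highlights(line)}")
```

Under these assumptions only 'helloworld$' and '$helloworld' match, which is the behaviour the test file describes as correct.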
Comment 3 Sébastien Wilmet 2013-09-19 20:50:00 UTC
Thank you for the patch. Indeed, from the GLib doc:

https://developer.gnome.org/glib/stable/glib-regex-syntax.html

> Lookbehind assertions
>
> Lookbehind assertions start with (?<= for positive assertions and (?<! for
> negative assertions.

So I'll push your patch after the freeze (see https://wiki.gnome.org/ThreePointNine).
Comment 4 Sébastien Wilmet 2013-09-19 20:50:53 UTC
Review of attachment 254975:

commit_after_freeze
Comment 5 Sébastien Wilmet 2013-09-24 16:13:52 UTC
Commit pushed to the master branch. It will be available for GtkSourceView 3.10.1.