Bug 316835 - Syntax highlighting not correct with LF instead of CR
Status: RESOLVED FIXED
Product: gtksourceview
Classification: Platform
Component: Syntax files
Version: unspecified
Hardware: Other All
Importance: High major
Target Milestone: ---
Assigned To: GTK Sourceview maintainers
QA Contact: GTK Sourceview maintainers
Duplicates: 340354 343910
Depends on:
Blocks:
Reported: 2005-09-21 09:12 UTC by Mario Bruckschwaiger
Modified: 2014-02-15 12:53 UTC
See Also:
GNOME target: ---
GNOME version: 2.11/2.12



Description Mario Bruckschwaiger 2005-09-21 09:12:14 UTC
Please describe the problem:
After a single-line comment (//), all following lines are highlighted as
comments if the file uses 0x0D (CR) as end of line instead of 0x0A (LF). Only
the syntax highlighting is wrong; the text itself is displayed correctly.

Steps to reproduce:
1. Load a Java file that uses CR (0x0D) instead of LF (0x0A) as end of line
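
A test file for this step can be generated with a short script. The sketch below is only an illustration: the file name, the sample Java source, and the helper `write_cr_only` are hypothetical and not part of the original report. It writes every line with a bare 0x0D terminator, matching the description above.

```python
import os
import tempfile

# Hypothetical sample source: any file with a // comment followed by code will do.
JAVA_SOURCE_LINES = [
    "public class Hello {",
    "    // a single-line comment",
    "    int answer = 42;",
    "}",
]


def write_cr_only(path, lines):
    """Write `lines` to `path`, ending each line with CR (0x0D) only."""
    data = b"".join(line.encode("utf-8") + b"\r" for line in lines)
    with open(path, "wb") as handle:  # binary mode: no newline translation
        handle.write(data)
    return data


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "Hello.java")
    write_cr_only(target, JAVA_SOURCE_LINES)
    print("wrote", target)
```

Opening the resulting file in an affected editor should show whether the lines after the comment are still highlighted as a comment.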


Actual results:
All lines after a single line comment are marked as comment.

Expected results:
The lines after a single-line comment are highlighted as Java code.

Does this happen every time?
Yes.

Other information:
All of my GNOME components are from the Fedora development repository and are
at version 2.12.
Comment 1 Paolo Borelli 2005-09-21 09:40:35 UTC
Syntax highlighting is provided by gtksourceview -> moving the bug to it.

Does the problem also happen with C and other languages? I can't see anything
particular in the java.lang syntax file...
Comment 2 Mario Bruckschwaiger 2005-09-21 09:49:42 UTC
This also happens with C source files (and, I suppose, with other languages).
Comment 3 Mario Bruckschwaiger 2005-09-21 09:51:17 UTC
The gtksourceview version is gtksourceview-1.4.1-1
Comment 4 Paolo Maggi 2006-07-26 12:17:30 UTC
Probably this bug is not specific to the Java language.
In gtksourcetag.c we explicitly use \n as line terminator.
Comment 5 Paolo Maggi 2006-07-26 13:51:03 UTC
*** Bug 340354 has been marked as a duplicate of this bug. ***
Comment 6 Paolo Borelli 2006-07-26 13:52:33 UTC
*** Bug 343910 has been marked as a duplicate of this bug. ***
Comment 7 Paolo Maggi 2006-08-23 15:47:46 UTC
I have tried to fix this bug without success.

As I said in comment #4, gtksourcetag.c explicitly uses \n as line terminator.
But replacing it with "$" or with [\n\r] didn't solve the problem.

One of the problems is that "$" in a regex does not match before "\r", and we
are using "$" in a lot of .lang files.

Furthermore some .lang files also use "\n".

BTW, fixing all the "\n" I have found (at least related to c.lang and core code) 
did not solve the problem, so I gave up.

We will try to investigate this problem again when releasing the new engine.
Maybe PCRE will help us.
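
The "$" behaviour described in this comment can be reproduced outside gtksourceview. The sketch below uses Python's `re` module purely as a stand-in (an assumption: it is not the regex library gtksourceview used). In multi-line mode its "$" matches only at the end of the string or before "\n", and "." matches anything except "\n". The pattern and the helper `comment_spans` are hypothetical.

```python
import re

# Assumption: Python's `re` is only an analogue of the engine discussed here.
# With MULTILINE, "$" matches at end of string or just before "\n", never
# before a bare "\r".
COMMENT = re.compile(r"//.*$", re.MULTILINE)


def comment_spans(text):
    """Return the text of every span matched as a single-line comment."""
    return [match.group(0) for match in COMMENT.finditer(text)]


if __name__ == "__main__":
    # LF line endings: the comment stops at the end of its own line.
    print(comment_spans("// note\nint x;\n"))
    # CR line endings: nothing stops the match, so it swallows the rest.
    print(comment_spans("// note\rint x;\rint y;\r"))
```

With CR-only input the single match runs to the end of the buffer, which mirrors the reported symptom of everything after a comment being highlighted as a comment.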
Comment 8 Yevgen Muntyan 2006-08-23 19:32:41 UTC
From new engine code:

/* Line terminator characters (\n, \r, \r\n, or unicode paragraph separator)
 * are removed from the line text. The problem is that pcre does not understand
 * arbitrary line terminators, so $ in pcre means (?=\n) (not quite, it's also
 * end of matched string), while we really need "((?=\r\n)|(?=[\r\n])|(?=\xE2\x80\xA9)|$)".
 * It could be worked around by replacing line terminator in matched text with
 * \n, but it's a good source of errors, since offsets (not all, unfortunately) returned
 * from pcre need to be compared to line length, and adjusted when necessary.
 * Not using line terminator only means that \n can't be in patterns, it's not a
 * big deal: line end can't be highlighted anyway; if a rule needs to match it, it
 * can use "$" as start and "^" as end.
 */

An example is the trailing backslash rule:

        <context id="line-continue">
            <start>\\$</start>
            <end>^</end>
        </context>

Using '\n' in lang files is broken (and simply won't work).
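
The approach in the quoted comment (strip the terminator, then let "$" mean "end of line text") can be sketched as follows. This is a minimal illustration in Python, not the engine's code: `split_lines` and `continued_lines` are hypothetical helpers, and Python's `re` stands in for PCRE.

```python
import re

# Terminators named in the quoted comment: \r\n, \n, \r and the Unicode
# paragraph separator (U+2029). "\r\n" comes first so that it is consumed
# as one terminator rather than two.
LINE_TERMINATOR = re.compile("\r\n|\n|\r|\u2029")

# Same idea as the <start>\\$</start> rule above: a backslash at line end.
LINE_CONTINUE = re.compile(r"\\$")


def split_lines(text):
    """Return the line texts of `text` with their terminators removed."""
    return LINE_TERMINATOR.split(text)


def continued_lines(text):
    """Return the indices of lines that end with a trailing backslash."""
    return [
        index
        for index, line in enumerate(split_lines(text))
        if LINE_CONTINUE.search(line)
    ]


if __name__ == "__main__":
    source = "#define X \\\r  1\rint y;"
    print(split_lines(source))
    print(continued_lines(source))
```

Because each line is matched without its terminator, "$" works the same way for every line-ending convention, and no pattern needs to mention "\n".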
Comment 9 Yevgen Muntyan 2006-08-23 19:34:17 UTC
(In reply to comment #8)
> From new engine code:
> 
> /* Line terminator characters (\n, \r, \r\n, or unicode paragraph separator)
>  * are removed from the line text. 

Need to make sure that \r followed by \n is handled correctly in pathological cases (a line ending with \r, the next line ending with \n, and the body of the second line being deleted).
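
A small sketch of that pathological case, assuming "\r\n" is treated as a single terminator. The helper `line_count` and the sample text are hypothetical, and this is not the engine's code.

```python
import re

# "\r\n" is listed first so that it counts as one terminator, not two.
LINE_TERMINATOR = re.compile("\r\n|\n|\r|\u2029")


def line_count(text):
    """Count lines in `text`, treating "\\r\\n" as a single terminator."""
    return len(LINE_TERMINATOR.split(text))


# First line ends with "\r", second line ends with "\n".
before = "first\rsecond\nthird"

# Deleting the body of the second line leaves "\r" directly before "\n".
after = before.replace("second", "", 1)

if __name__ == "__main__":
    print(repr(before), line_count(before))
    print(repr(after), line_count(after))
```

After the deletion the two formerly separate terminators read as one "\r\n", so the buffer drops from three lines to two even though only the line body was removed. Any per-line state cached by a highlighter would have to account for that.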
Comment 10 Yevgen Muntyan 2007-05-26 21:09:24 UTC
If some lang file uses "\n" explicitly, then it's a lang file bug which should be fixed (this one, java, is fine).
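
One way to look for such lang files is a simple text scan. The sketch below is a heuristic only: `find_explicit_newlines` and the sample definition are hypothetical, and the scan looks for the two-character sequence backslash + "n" that is not itself preceded by a backslash. It does not parse the XML or the regex syntax.

```python
import re

# Backslash followed by "n", not preceded by another backslash.
# Heuristic: longer runs of backslashes are not analysed.
EXPLICIT_NEWLINE = re.compile(r"(?<!\\)\\n")

# Hypothetical lang fragment: the first context uses "\n", the second does not.
SAMPLE = r"""<context id="line-comment">
    <start>//</start>
    <end>\n</end>
</context>
<context id="line-continue">
    <start>\\$</start>
    <end>^</end>
</context>"""


def find_explicit_newlines(lang_text):
    """Return (line number, stripped line) for lines containing a literal \\n."""
    return [
        (number, line.strip())
        for number, line in enumerate(lang_text.splitlines(), start=1)
        if EXPLICIT_NEWLINE.search(line)
    ]


if __name__ == "__main__":
    for number, line in find_explicit_newlines(SAMPLE):
        print(number, line)
```

Each reported line is only a candidate for replacing "\n" with "$" and still needs a manual check.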