Bug 316835 – Syntax highlighting not correct with LF instead of CR



Summary:	Syntax highlighting not correct with LF instead of CR


Status:	RESOLVED FIXED

Product:	gtksourceview
Classification:	Platform
Component:	Syntax files
Version:	unspecified
Hardware:	Other All

Importance:	High major
Target Milestone:	---
Assigned To:	GTK Sourceview maintainers
QA Contact:	GTK Sourceview maintainers

URL:
Whiteboard:

Duplicates:	340354 343910
Depends on:
Blocks:

Reported:	2005-09-21 09:12 UTC by Mario Bruckschwaiger
Modified:	2014-02-15 12:53 UTC

See Also:
GNOME target:	---
GNOME version:	2.11/2.12

Description Mario Bruckschwaiger 2005-09-21 09:12:14 UTC

Please describe the problem:
After a single-line comment (//) all following lines are "comments" if the file
uses 0x0D as end of line instead of 0x0A. Only the syntax highlighting is wrong.
The presentation of the text itself is correct.

Steps to reproduce:
1. Load a Java file that uses CR (0x0D) instead of LF (0x0A) as end of line
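
A minimal sketch for producing such a test file, assuming Python is available. The file name CrOnly.java and its contents are hypothetical; the only point is that every line ends with a bare 0x0D and the file contains no 0x0A.

```python
from pathlib import Path

# Hypothetical test content: every line will end with a bare CR (0x0D).
JAVA_LINES = [
    "// single-line comment",
    "public class CrOnly {",
    "    int x = 1;",
    "}",
]

def write_cr_only(path):
    """Write JAVA_LINES to path using CR as the only line terminator."""
    data = "".join(line + "\r" for line in JAVA_LINES).encode("ascii")
    path.write_bytes(data)  # binary write, so no newline translation occurs
    return data

if __name__ == "__main__":
    write_cr_only(Path("CrOnly.java"))
```

Opening the resulting file in an affected version should show everything after the first line highlighted as a comment.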


Actual results:
All lines after a single line comment are marked as comment.

Expected results:
The lines after a single-line comment are highlighted as Java code.

Does this happen every time?
Yes.

Other information:
All of my GNOME components are from the Fedora development repository and are
version 2.12.

Comment 1 Paolo Borelli 2005-09-21 09:40:35 UTC

Syntax highlighting is provided by gtksourceview -> moving the bug to it.

Does the problem also happen with C and other languages? I can't see anything
particular in the java.lang syntax file...

Comment 2 Mario Bruckschwaiger 2005-09-21 09:49:42 UTC

This also happens with C source code files (and I suppose with other languages too).

Comment 3 Mario Bruckschwaiger 2005-09-21 09:51:17 UTC

The gtksourceview version is gtksourceview-1.4.1-1

Comment 4 Paolo Maggi 2006-07-26 12:17:30 UTC

Probably this bug is not specific to the Java language.
In gtksourcetag.c we explicitly use \n as line terminator.

Comment 5 Paolo Maggi 2006-07-26 13:51:03 UTC

*** Bug 340354 has been marked as a duplicate of this bug. ***

Comment 6 Paolo Borelli 2006-07-26 13:52:33 UTC

*** Bug 343910 has been marked as a duplicate of this bug. ***

Comment 7 Paolo Maggi 2006-08-23 15:47:46 UTC

I have tried to fix this bug without success.

As I said in comment #4, gtksourcetag.c explicitly uses \n as line terminator. 
But replacing it with "$" or with [\n\r] didn't solve the problem.

One of the problems is that "$" in a regex does not match before "\r", and we
are using "$" in a lot of .lang files.
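
The effect can be illustrated with a small sketch. It uses Python's re module, not the regex engine gtksourceview used at the time, so it is only an analogy: in Python, too, "$" matches before "\n" but not before "\r", and "." matches "\r". The rule and the sample text are simplified, hypothetical stand-ins for a .lang comment rule.

```python
import re

# Text with CR-only line endings, as in the reported file.
CR_TEXT = "int a; // note\rint b;\rint c;\r"
# The same text with LF line endings, for comparison.
LF_TEXT = CR_TEXT.replace("\r", "\n")

# A simplified single-line comment rule: "//" up to end of line.
COMMENT = re.compile(r"//.*$", re.MULTILINE)

def comment_spans(text):
    """Return the text of every match of the comment rule."""
    return [m.group(0) for m in COMMENT.finditer(text)]

if __name__ == "__main__":
    print(comment_spans(LF_TEXT))  # ['// note']
    print(comment_spans(CR_TEXT))  # ['// note\rint b;\rint c;\r']
```

With LF endings the comment stops at the end of its line; with CR endings it runs to the end of the text, which matches the reported symptom.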

Furthermore some .lang files also use "\n".

BTW, fixing all the "\n" I have found (at least related to c.lang and core code) 
did not solve the problem, so I gave up.

We will investigate this problem again when releasing the new engine.
Maybe PCRE will help us.

Comment 8 Yevgen Muntyan 2006-08-23 19:32:41 UTC

From new engine code:

/* Line terminator characters (\n, \r, \r\n, or unicode paragraph separator)
 * are removed from the line text. The problem is that pcre does not understand
 * arbitrary line terminators, so $ in pcre means (?=\n) (not quite, it's also
 * end of matched string), while we really need "((?=\r\n)|(?=[\r\n])|(?=\xE2\x80\xA9)|$)".
 * It could be worked around by replacing line terminator in matched text with
 * \n, but it's a good source of errors, since offsets (not all, unfortunately) returned
 * from pcre need to be compared to line length, and adjusted when necessary.
 * Not using line terminator only means that \n can't be in patterns, it's not a
 * big deal: line end can't be highlighted anyway; if a rule needs to match it, it
 * can use "$" as start and "^" as end.
 */

An example is the trailing backslash rule:

        <context id="line-continue">
            <start>\\$</start>
            <end>^</end>
        </context>

Using '\n' in lang files is broken (and simply won't work).
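
The approach described in that comment can be sketched roughly as follows. This is an illustration in Python, not the engine's actual code: the names TERMINATOR, split_lines and classify are hypothetical, and the two rules are simplified. Terminators are stripped first, so "$" only ever has to mean "end of the line text".

```python
import re

# Line terminators named in the engine comment: \r\n, \r, \n, U+2029.
# \r\n is listed first so it is consumed as one terminator, not two.
TERMINATOR = re.compile("\r\n|\r|\n|\u2029")

# Simplified rules. Without re.MULTILINE, "$" matches at the end of the
# string, which is the end of the line once terminators are stripped.
COMMENT = re.compile(r"//.*$")
LINE_CONTINUE = re.compile(r"\\$")

def split_lines(text):
    """Split text into lines with the terminators removed."""
    lines = TERMINATOR.split(text)
    # A trailing terminator yields a final empty string; drop it.
    if lines and lines[-1] == "":
        lines.pop()
    return lines

def classify(text):
    """Return (line, comment text or None, ends with backslash) per line."""
    result = []
    for line in split_lines(text):
        m = COMMENT.search(line)
        result.append((line,
                       m.group(0) if m else None,
                       LINE_CONTINUE.search(line) is not None))
    return result
```

Because no line text contains a terminator, a pattern with a literal "\n" can never match, which is why such patterns in lang files do not work.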

Comment 9 Yevgen Muntyan 2006-08-23 19:34:17 UTC

(In reply to comment #8)
> From new engine code:
> 
> /* Line terminator characters (\n, \r, \r\n, or unicode paragraph separator)
>  * are removed from the line text. 

Need to make sure that \r followed by \n in pathological cases (line ending with \r, next line ending with \n, deleting second line body) is handled correctly.
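
A small sketch of that pathological case, again in Python and with hypothetical helper names: deleting the body of the second line turns two separate terminators into a single \r\n, so the number of lines changes even though no terminator character was deleted.

```python
import re

# \r\n must come first so it counts as one terminator.
TERMINATOR = re.compile("\r\n|\r|\n|\u2029")

def terminators(text):
    """Return the line terminators found in text, in order."""
    return TERMINATOR.findall(text)

def delete_range(text, start, end):
    """Return text with the characters in [start, end) removed."""
    return text[:start] + text[end:]

if __name__ == "__main__":
    before = "first\rsecond\n"
    # Remove the body of the second line, leaving its \n terminator.
    start = before.index("second")
    after = delete_range(before, start, start + len("second"))
    print(terminators(before))  # ['\r', '\n']
    print(terminators(after))   # ['\r\n']
```

Any cached per-line state has to account for the first line's terminator growing from \r to \r\n after such an edit.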

Comment 10 Yevgen Muntyan 2007-05-26 21:09:24 UTC

If some lang file uses "\n" explicitly, then it's a lang file bug which should be fixed (this one, java, is fine).