GNOME Bugzilla – Bug 620241
Syntax highlighting broken for Sweave documents
Last modified: 2014-02-08 16:13:50 UTC
Sweave documents (*.rnw; *.Rnw; *.snw; *.Snw) are latex documents with embedded R or S code. The current latex.lang file is not Sweave-aware so syntax highlighting of these documents is a mess. A particularly annoying case is when a block of R/S contains code of the form object$property. To the latex.lang file, the $ looks like the start of a maths environment.

Example Sweave document:

\documentclass[a4paper,11pt]{article}
\begin{document}
\title{Allometry of field metabolic rates}
\maketitle
<<echo=>>=
d <- data.frame(a=1:10, b='xxx')
print(d$a)
print(d$b)
@
\end{document}
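The object$property problem can be reproduced outside the editor. The sketch below is only an illustration: NAIVE_MATH is a simplified stand-in for an inline-maths rule (text between two $ signs), not the actual pattern used in latex.lang, and Python's re module stands in for the highlighter's regex engine.

```python
import re

# Assumption: a simplified inline-maths rule, "everything between two $ signs".
# This is NOT the real latex.lang pattern, only a model of its effect.
NAIVE_MATH = re.compile(r"\$[^$]*\$")

# Body of the R chunk from the example document.
r_chunk = "print(d$a)\nprint(d$b)\n"

# The first $ (in d$a) opens "maths" and the next $ (in d$b) closes it,
# so the span in between would be styled as a formula.
match = NAIVE_MATH.search(r_chunk)
mis_highlighted = match.group(0) if match else ""
print(repr(mis_highlighted))
```

With an odd number of $ signs in a chunk, a rule of this kind would stay "open" and could swallow the rest of the document, which is presumably why the highlighting looks so broken.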
More information. There are two ways to embed R/S in Sweave documents:

1. Inline:

\Sexpr{nrow(fmr)}

2. Block:

<<>>=
d <- data.frame(a=1:10, b='xxx')
print(d$a)
print(d$b)
@

The Sweave tool takes the .Rnw file as input, runs the embedded R/S code, and creates a regular latex .tex file that contains output from the embedded R/S code. The generated .tex file may contain verbatim R/S code inside Scode/Sinput/Soutput blocks that are rendered by latex. Our latex.lang should be able to correctly highlight both the .Rnw and generated .tex files.

Examples are in the attached files test.Rnw and test.tex, test2.Rnw and test2.tex. Running "R CMD Sweave test.Rnw" produces test.tex. Running "pdflatex test" creates test.pdf.

Homepage for Sweave: http://www.stat.uni-muenchen.de/~leisch/Sweave/
Homepage for R: http://www.r-project.org/

R is an open-source implementation of the S language.
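For testing a language definition it can help to know which spans of a file are embedded code. The following is a rough sketch, not part of any patch: find_embedded_code, CHUNK and SEXPR are hypothetical names, and the patterns assume that a chunk header sits alone on its line, that a chunk ends at a line containing only @, and that \Sexpr arguments contain no nested braces.

```python
import re

# Assumption: chunk header <<options>>= alone on a line, body ends at a line
# that contains only "@".
CHUNK = re.compile(
    r"^<<([^\n]*?)>>=[ \t]*\n(.*?)^@[ \t]*$",
    re.MULTILINE | re.DOTALL,
)
# Assumption: no nested braces inside \Sexpr{...}.
SEXPR = re.compile(r"\\Sexpr\{([^}]*)\}")


def find_embedded_code(text):
    """Return (chunks, inline): chunks is a list of (options, body) pairs,
    inline is a list of the expressions found in \\Sexpr{...}."""
    chunks = [(m.group(1), m.group(2)) for m in CHUNK.finditer(text)]
    inline = SEXPR.findall(text)
    return chunks, inline


sample = (
    "There are \\Sexpr{nrow(fmr)} rows.\n"
    "<<>>=\n"
    "d <- data.frame(a=1:10, b='xxx')\n"
    "print(d$a)\n"
    "@\n"
    "\\end{document}\n"
)
chunks, inline = find_embedded_code(sample)
print(chunks)
print(inline)
```

Everything outside the returned spans is ordinary latex, which is the part latex.lang should keep highlighting as before.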
Created attachment 162444: Example Sweave document
Created attachment 162445: Latex file generated by test.Rnw
Created attachment 162446: Second example Sweave document
Created attachment 162447: Latex file generated by test2.Rnw
Created attachment 162449: A potential fix for this bug
A potential fix for this bug. No attempt is made to syntax-highlight the embedded R/S code. These are typically short code fragments and are treated as comments by the patch.
I have not looked at the patch (yet) or read the report carefully, but the suggested approach here is to create a sweave.lang file. Depending on the needs, either sweave.lang imports latex contexts or latex.lang needs to be modified to provide "hooks" overridable by sweave.lang. See for instance how html & php or docbook & xml are done.
Created attachment 162526: A better potential fix for this bug
A second possible fix for this, following the suggested route of creating sweave.lang. I could not find a mime type for Sweave documents and have left the mimetypes property as text/plain.
Review of attachment 162526:

Some comments following.

::: data/language-specs/sweave.lang
@@ +1,1 @@
+<?xml version="1.0"?>

Please add the copyright, we only accept LGPL.

@@ +2,3 @@
+<language id="sweave" _name="Sweave" version="2.0" _section="Markup">
+ <metadata>
+ <property name="mimetypes">text/plain</property>

Remove the mime type as it is not really needed.

@@ +3,3 @@
+ <metadata>
+ <property name="mimetypes">text/plain</property>
+ <property name="globs">*.rnw;*.Rnw;*.snw;*.Snw</property>

You are missing the comment properties.

@@ +14,3 @@
+
+ <definitions>
+ <context id="embedded-R">

Is this context really needed? The sweave context already includes inline-R and R-block. You could just define those contexts outside this one and include them in the sweave context, or is there any reason for this?
Created attachment 162530: A third potential fix for this bug
Updates from review. No comment properties added - are these optional? Comments in Sweave docs are highlighted correctly.
The comment properties are used by the comment plugin.
Created attachment 162541: A fourth potential fix for this bug
Update from review. line-comment-start property added to sweave.lang metadata.
How can I help so that this fix can be merged? I manually applied it and it seems to work fine...
The latex.lang file has changed in the meantime, so there is most probably a conflict if we try to apply the above patch on the master branch. The copyright line is still missing. See the top of c.lang for an example.
Created attachment 267939: "Fourth potential fix" rebased onto master incl. copyright
I have rebased Lawrence's patch onto master and added his copyright (I hope this is ok?).
Review of attachment 267939:

Thanks for updating the patch. A few comments below.

::: data/language-specs/latex.lang
@@ +81,2 @@
 <context id="verbatim-env" style-inside="true" style-ref="verbatim" class-disabled="no-spell-check">
+ <start>(\\begin)\{(verbatim\*?|lstlisting|alltt|Scode|Sinput|Soutput)\}</start>

If I understand correctly, Scode, Sinput etc are LaTeX environments, appearing only in .tex files, not Sweave files. And those environments require a package. If so, add a comment to explain a bit where Scode, Sinput etc are coming from.

@@ +578,3 @@
+ -->
+ <context id="embedded-lang-hook">
+ <start>\\never-match</start>

In html.lang, it is:
<start>\%{def:never-match}</start>

@@ +580,3 @@
+ <start>\\never-match</start>
+ <end></end>
+ </context>

Have you tried without the embedded-lang-hook context?

::: data/language-specs/sweave.lang
@@ +5,3 @@
+
+ Authors: Lawrence Hudson
+ Copyright (C) 2010-2014 Lawrence Hudson <quicklizard@googlemail.com>

The copyright year must be the year when the .lang file is released. Since it was not released in 2010, the year should be only "2014". If we consider that the lang file was released in 2010 (in my opinion it was not the case), then the copyright should be "2010, 2014", with a comma.

@@ +54,3 @@
+ </context>
+
+ <replace id="latex:embedded-lang-hook" ref="R-block"/>

I think you can remove this line. Referencing latex:latex below should be enough to import the latex contexts.
Created attachment 268451: Sweave patch with Sebastien's comments
Thanks for your comments, I have included them all. I excluded the Scode chunks for now as they are not relevant for Sweave files, only for TeX files with the appropriate package loaded. I tried to implement this in the latex.lang file (see patch below), closely following the dev wiki, but without luck. Any hints on why this is not working are much appreciated.
Created attachment 268452: Scode highlighting in TeX documents (not working)
Comment on attachment 268451: Sweave patch with Sebastien's comments
Thanks, I've pushed the commit. I've also made some changes:
https://git.gnome.org/browse/gtksourceview/commit/?id=f3eb99ec87ce120e9add885ad0df54fac6b77094

One question, in:
<start>^<<.*>>=</start>

With the ^, the R block must begin the line, so the R block delimiter cannot be indented. Maybe we should modify it to "^\s*<...", so that spaces are allowed before the delimiter.
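The effect of the leading ^ can be checked quickly with a few sample lines. This is only a sketch: Python's re module stands in for the regex engine used by GtkSourceView, and LENIENT is one assumed way of completing the abbreviated "^\s*<..." suggestion, not necessarily the pattern that ends up in sweave.lang.

```python
import re

# Pattern quoted from the committed sweave.lang <start> element.
STRICT = re.compile(r"^<<.*>>=")
# Assumption: a possible full form of the "^\s*<..." proposal.
LENIENT = re.compile(r"^\s*<<.*>>=")

lines = [
    "<<echo=FALSE>>=",     # delimiter at the start of the line
    "  <<echo=FALSE>>=",   # indented delimiter
    "print(d$a)",          # ordinary R code, must never match
]

strict_hits = [bool(STRICT.search(line)) for line in lines]
lenient_hits = [bool(LENIENT.search(line)) for line in lines]
print(strict_hits)
print(lenient_hits)
```

Only the indented delimiter is treated differently by the two patterns; whether Sweave itself accepts an indented delimiter is a separate question that this sketch does not answer.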
Review of attachment 268452:

::: data/language-specs/latex.lang
@@ +49,3 @@
 <style id="paragraph" _name="Paragraph Heading" map-to="def:heading5"/>
 <style id="subparagraph" _name="SubParagraph Heading" map-to="def:heading6"/>
+ <style id="r-block" _name="r-block" map-to="r:r"/>

The map-to="r:r" is not correct, R.lang contains several styles.

@@ +50,3 @@
 <style id="subparagraph" _name="SubParagraph Heading" map-to="def:heading6"/>
+ <style id="r-block" _name="r-block" map-to="r:r"/>
+

Please avoid trailing spaces.

@@ +94,3 @@
+ <end>(\\end)\{\%{2@start}\}</end>
+ <include>
+ <context ref="r:r"/>

You should apply the correct style to the different components of <start> and <end>.

The "r-block-env" context must also be included in the main context at the end of the file, with a higher priority than (i.e. above) the context used for the generic \begin, \end environment.

@@ +99,1 @@
 <!--using brackets is an experimental feature from the listings package. The

Please add an empty line between two contexts.
Normally the "r-block" style is not needed, the R styles will be applied when referencing R.lang with <context ref="r:r"/>.
(In reply to comment #19)
> (From update of attachment 268451)
> Thanks, I've pushed the commit. I've also done some changes:
> https://git.gnome.org/browse/gtksourceview/commit/?id=f3eb99ec87ce120e9add885ad0df54fac6b77094
>
> One question, in:
> <start>^<<.*>>=</start>
>
> With the ^, the R block must begin the line. So the R block delimiter can not
> be indented. Maybe we should modify as "^\s*<...", so spaces are allowed
> before the delimiter.

I agree!
Created attachment 268499: Embedded R Code in TeX files
Thanks for the help, it works!
Comment on attachment 268499: Embedded R Code in TeX files
Commit pushed. I've made some modifications:
- the r-block style is not used, so it has been removed
- indentation: 2 spaces
- avoid trailing spaces
- enable the no-spell-check class, instead of disabling it. The spell checking must not be applied to the \begin and \end commands.
- reference r:r at the end of the <include>.
- for \begin and \end, apply the common-commands style, not 'command'.
(In reply to comment #22)
> I agree!

Done.