GNOME Bugzilla – Bug 788911
Poor W32-meson compatibility
Last modified: 2018-05-22 13:12:31 UTC
Since 1.26, gtk-doc got a new, shiny Python rewrite, and i hoped that this would fix meson integration[1] when running on Windows. I was wrong.
The crux of the matter is the fact that gtk-doc remains very *nix-y due to the tools that it uses (xsltproc & docbook, vim). For example, it must ensure that XML_CATALOG_FILES is a :-separated list of *nix paths, not a ;-separated list of W32 paths, otherwise xsltproc[2] will fail to find the stylesheets (or whatever it is that it looks for in the filesystem). Vim also requires some state that doesn't seem[3] to survive the MSYS(bash)->W32 (python-meson+python-gtk-doc)->MSYS(vim) trip, and fails to do the highlighting that it is supposed to do (so far i was unable to figure out exactly which environmental variable breaks it).
This means that the W32/MSYS separation line should be drawn between meson and gtk-doc (whereas previously it didn't exist - gtk-doc was invoked by autotools & make, both of which were on MSYS side of things already, same as gtk-doc itself). That might be doable (though i doubt that meson devs will happily accept the necessary patches). Obviously, by remaining *nix-y gtk-doc will [likely] remain unusable for MSVC users (not that i care, you understand).
What is the official Windows compatibility policy for gtk-doc? And for Gnome in general, as far as documentation goes.
[1] meson integration matters because gtk4 generation of the stack already lost autotools support, and i fear that gtk3 generation of the stack is next in line
[2] MSYS-xsltproc is used in favour of MinGW-xsltproc because otherwise various *nix-y paths widely hard-coded into stylesheets don't work
[3] Note that my MSYS2 environment is slightly unusual (also, quite old), so it's possible that official MSYS2 works around this somehow
Hej, I know, but without help changing all that will take some time. The plan is to get rid of the docbook parts totally (if people don't mind losing pdf output). There really is no need to convince me, I am not clinging to something that makes life hard for people not using linux. If you want to help, you are very welcome. We just need to do this carefully and step by step to keep things working.
Well, as i have said, first thing is to clarify the destination point. Once that is defined, it would be possible to figure out how to get there. So, getting rid of docbook...what does that entail? I mean, AFAIU, gtk-doc rips the doc-comments out of the source code, forms some kind of document using some kind of templates, and then feeds that to xsltproc to do...something...That's the best explanation i can come up with, without actually reading through the source code step by step.
https://git.gnome.org/browse/gtk-doc/tree/doc/design-2.x.txt#n79
I added some more details for a first step: https://git.gnome.org/browse/gtk-doc/tree/doc/design-2.x.txt#n98
I've started a tool that can hopefully replace the need for xslt (and the docbook stylesheets) under tools/db2html (once it works it will be moved into gtkdoc-mkhtml). This uses a few python modules (anytree, lxml, jinja2). @LRN, are those usable on Windows? One thing I could use some help with is making it possible to release gtk-doc on https://pypi.python.org/pypi. We need a setup.py, etc.
This would be a nice project to help portability: https://bugzilla.gnome.org/show_bug.cgi?id=792661
So, here's an update on the state of gtk-doc vs MSYS2 vs meson:
1) I've picked up a neat trick from upstream MSYS2, where you prepend a space to the value of an environment variable. This prevents MSYS2 from mangling its contents. This allows XML_CATALOG_FILES to reach MSYS libxml unmolested (actually, i've patched my own msys libxml to accept a special MSYS_XML_CATALOG_FILES variable, which is exactly the same thing as XML_CATALOG_FILES, but with a leading space that the code now throws away; this might not be necessary however, since XML_CATALOG_FILES is naturally space-delimited; i just haven't tested that)
2) A few minor changes to gtkdoc-depscan are needed (use os.pathsep instead of ':', use subprocess.Popen() instead of popen(), strip both '\r' and '\n').
3) fixxref needs some major changes in the way vim is invoked (MUST invoke vim via os.getenv('SHELL')! I've spent a whole evening trying to make it work without that, and failed), and in the temporary files being used for that invocation (can't open NamedTemporaryFile twice on W32, can't unlink it while it's open).
4) I *heartily* approve of the gtkdoc/config.py file. It's easy enough (one sed invocation) to patch it in a post-installation step to ensure that all paths are Windows-compatible and point at the right places. This is a cheap and reasonably well-working solution to the relocatability problem.
The rest of the fixes go to meson (gnome module there is still very much *nix-centric) - correctly finding python scripts (or programs, if using upstream MSYS2), adding LD_LIBRARY_PATH to PATH and, most importantly, replacing '\\' with '/' in multiple places (though again, this is for MSVC Python; upstream MSYS2 with its MinGW Python might not require that). I have patches, if anyone is interested (fixxref being the most non-trivial of them).
That said, using something other than xsltproc and docbook would be welcome (if only due to potential for speed increase). Oh, right, about that.
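Points 2) and 3) above could look roughly like the following. This is only a sketch under stated assumptions, not the actual patches: the function names (split_search_path, run_and_read_lines, with_closed_temp_file, read_text) are made up for illustration, and the real gtkdoc-depscan and fixxref code is structured differently. It only uses standard library calls.

```python
import os
import subprocess
import tempfile


def split_search_path(value):
    """Split a PATH-like string with the platform separator (';' on W32, ':' elsewhere)."""
    return [p for p in value.split(os.pathsep) if p]


def run_and_read_lines(argv):
    """Run a command via subprocess.Popen and return its output lines,
    with both '\r' and '\n' stripped from the end of each line."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE)
    out, _ = proc.communicate()
    lines = [line.rstrip('\r\n') for line in out.decode('utf-8').split('\n')]
    return [line for line in lines if line]


def read_text(path):
    """Example consumer: reopen the file by name and read it back."""
    with open(path, encoding='utf-8') as f:
        return f.read()


def with_closed_temp_file(text, consumer):
    """Write text to a temporary file, close it, and only then hand the
    path to consumer(). The file is never open twice at the same time and
    is unlinked only after it has been closed, which is what W32 needs."""
    tmp = tempfile.NamedTemporaryFile(
        mode='w', encoding='utf-8', suffix='.txt', delete=False)
    try:
        tmp.write(text)
        tmp.close()  # close before another process (e.g. vim) opens it
        return consumer(tmp.name)
    finally:
        tmp.close()  # harmless if already closed
        os.unlink(tmp.name)
```

In a real fixxref the consumer would be whatever invokes the highlighter on the file; read_text is just a stand-in here.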
Pygments and anytree should be easy to install. lxml is a bit trickier, as it seems to be using C libxml, and interfacing between MSVC-Python and libraries built with MinGW is always a pain. I'll try to look into that.
Wow. Let me comment on a few things:
2) gtkdoc-depscan is optional and not run by default. Anyway, I take compatibility patches.
3) In gtkdoc-mkhtml2 (last release and git) I am switching to pygments, this way I don't need to shell out to tools. I'll need to test this a bit more, but will most likely also use this from gtkdoc-fixxref in the next release.
One more question. Would depending on python 3.X be okay for Windows? If so, I'd drop python 2.7 support.
Python3 is OK.
Created attachment 370274 [details] [review]
Multiple mkhtml2 fixes
* Handle comments correctly (etree._Comment has no .tag attribute)
* Grab ids correctly (always use get_id(), not xml.attrib['id'])
* Open output file using utf-8 encoding
* Correctly handle deprecated keywords (no value, split() returns 1-element list)
Miraculously, lxml that i've installed from pip just *worked*, without any CRT-induced incompatibilities. Maybe it's Python3.5 magic. Who knows?
Created attachment 370275 [details] [review]
Use UNIX EOLs
For the purpose of comparing the output of mkhtml and mkhtml2 it would be better for the output to always use the same EOLs even on Windows.
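A minimal sketch of what fixed EOLs and encoding mean in Python 3, assuming the output is written with the built-in open(). The helper name write_html is hypothetical and this is not the attached patch itself.

```python
def write_html(path, text):
    """Write text as UTF-8 with '\n' line endings on every platform.

    newline='\n' turns off the translation of '\n' to os.linesep on
    write, so Windows does not produce '\r\n'.
    """
    with open(path, 'w', encoding='utf-8', newline='\n') as f:
        f.write(text)
```

With this, the files produced on Windows and on Linux should be byte-identical for the same input, which makes diffing mkhtml and mkhtml2 output easier.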
Review of attachment 370274 [details] [review]:
Thanks for the changes, but I'd like to submit them with a matching change description. If you can break out the comment-etree one, that is good and makes sense. Also we can make one commit for the utf8 output encoding and the newlines.
::: gtkdoc/mkhtml2.py
@@ +838,3 @@
+ x = Dummy()
+ x.xml = s
+ s_id = get_id (x)
What are you trying to fix here? A section without 'id'? Do you have an example for it?
@@ +1094,3 @@
+ split.append ('')
+ keywords.append ('{}="{}"'.format (*split))
+ return ' ' + ' '.join(keywords)
Again, do you remember what was wrong here?
On IRC:
> <ensonic> I made some comment on your patches. If you prefer I'll chop them into pieces and use your author name still.
I really don't care. These are mostly trivial patches, you can do what you want with them. Chop them up, or treat them as suggestions.
(In reply to Stefan Sauer (gstreamer, gtkdoc dev) from comment #13)
> Review of attachment 370274 [details] [review]:
>
> ::: gtkdoc/mkhtml2.py
> @@ +838,3 @@
> + x = Dummy()
> + x.xml = s
> + s_id = get_id (x)
>
> what are you trying to fix here? a section without 'id'. Do you have an
> example for it?
Yeah, it crashed on s.attrib['id'], because 's' had no 'id' attribute.
>
> @@ +1094,3 @@
> + split.append ('')
> + keywords.append ('{}="{}"'.format (*split))
> + return ' ' + ' '.join(keywords)
>
> Again do you remember what was wrong here.
This code:
> return ' ' + ' '.join(['%s="%s"' % tuple(c.split(':', 1)) for c in cond])
crashed. Some values of 'c' did not have a ':' inside of them (specifically, one case was c='deprecated'). c.split(':', 1) returns ['deprecated'], one element, and the format requires two.
By the way, i would advise using '...{}...'.format (args) instead of '...%X...' % (args). At the very least, it does not require '%' characters to be escaped, and it seems to be more idiomatic Python in my opinion.
Oh, just in case - i tested all this on glib from git master HEAD (specifically, the docs/reference/glib documentation), using gtk-doc-1.28 with mkhtml2.py (only that file; fixxref.py was from 1.28) from gtk-doc master HEAD. After running the docbook version over it, to ensure that all build files were produced first.
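The failure and the fix described above can be reproduced in isolation. This is a sketch assembled from the quoted patch lines; the function name format_keywords and the sample values are assumptions, and the surrounding mkhtml2.py code is not shown here.

```python
def format_keywords(cond):
    """Turn entries such as 'since:2.0' into ' since="2.0"' attributes.

    An entry without ':' (e.g. 'deprecated') gets an empty value, so the
    two-placeholder format string always receives two arguments.
    """
    keywords = []
    for c in cond:
        split = c.split(':', 1)
        if len(split) < 2:
            split.append('')
        keywords.append('{}="{}"'.format(*split))
    return ' ' + ' '.join(keywords)


def format_keywords_old(cond):
    """The original expression, kept only to show the crash."""
    return ' ' + ' '.join(['%s="%s"' % tuple(c.split(':', 1)) for c in cond])
```

Calling format_keywords_old(['deprecated']) raises TypeError because the format string wants two values and the tuple holds one; format_keywords(['deprecated']) returns an attribute with an empty value instead.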
Comment on attachment 370275 [details] [review]
Use UNIX EOLs
commit 225e37cc0ac6061aefdcdc52b31292a28368555f
Author: LRN <lrn1986@gmail.com>
Date: Fri Mar 30 20:47:19 2018 +0200
    mkhtml2: Specify the line-endings and the encoding
    This is to ensure we get consistent output regardless of the platform.
    See https://bugzilla.gnome.org/show_bug.cgi?id=788911
Comment on attachment 370274 [details] [review]
Multiple mkhtml2 fixes
Thanks! All changes are in and verified building 'glib/docs/reference/gobject'. I handled the missing 'id' for the refsect differently.
commit 0530af38534d990e3b8a433fe6c7acc007d74be7 (HEAD -> master, origin/master, origin/HEAD)
Author: Stefan Sauer <ensonic@users.sf.net>
Date: Fri Mar 30 21:10:28 2018 +0200
    mkhtml2: skip sections without 'id' atts for refentry nav
    It only makes sense to link to them if they have an id.
commit 37526398947a65acb39c4051603672c45afe4c97
Author: LRN <lrn1986@gmail.com>
Date: Fri Mar 30 21:05:59 2018 +0200
    mkhtml2: handle deprecated without description keywords for devhelp2
    If there is no value, split() returns 1-element list.
    See https://bugzilla.gnome.org/show_bug.cgi?id=788911
commit 78fc71abfc444264d4e2b7c2fbfb8ae311c0d64a
Author: LRN <lrn1986@gmail.com>
Date: Fri Mar 30 20:52:03 2018 +0200
    mkhtml: handle comments correctly
    Since etree._Comment has no .tag attribute, just pass the comment through.
You should use the convert_code() handler for 'function' chunks instead of convert_span(). The contents are the same, the only difference is that it puts '<code' instead of '<span'.
You can modify MakeXRef() to return '<GTKDOCLINK HREF="{}">{}</GTKDOCLINK>' for the 'if id in NoLinks' case, instead of just returning text.
You can change the Links dict to hold a tuple of [href, title] instead of just href string. That way you can modify MakeXRef() to make it return '<a class="link" href="{}"{}>{}</a>'.format (href, title_attribute, text), where title_attribute is either 'title="..."' or '', if title value in Links is ''. You'd have to modify parts of mkhtml2.py to also put tuples into that dict, not just fixxref.py. This will require changes in add_id_links(): you'd need to use the "etree.XPath('//*[@id]')" xpath to get the elements with IDs, then grab the id attribute (trivial) of each element and its title child (element.find ('title')), and then get its text, and put both id and text into links.
You can add extra checks to generate_nav_links() and have it insert '<td><img src="up-insensitive.png" width="16" height="16" border="0"></td>' only when the
> 'nav_up' in ctx and ('nav_prev' not in ctx or ctx['nav_prev'] != ctx['nav_up'])
condition fails (otherwise you are never going to get up-insensitive).
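The MakeXRef() and add_id_links() suggestions above might look something like this. It is a rough sketch, not gtk-doc code: make_xref and collect_id_titles are hypothetical names, the links/no_links parameters stand in for the Links dict and NoLinks set, putting the id into HREF is an assumption, the title attribute is emitted with a leading space so it fits the format string, and the standard library ElementTree is used as a stand-in for lxml.

```python
import xml.etree.ElementTree as ET


def make_xref(links, no_links, link_id, text):
    """Return markup for a cross reference.

    links maps id -> (href, title); no_links is a set of ids that must
    not be turned into real links.
    """
    if link_id in no_links:
        # Assumption: the id itself goes into HREF.
        return '<GTKDOCLINK HREF="{}">{}</GTKDOCLINK>'.format(link_id, text)
    entry = links.get(link_id)
    if entry is None:
        return text
    href, title = entry
    title_attribute = ' title="{}"'.format(title) if title else ''
    return '<a class="link" href="{}"{}>{}</a>'.format(
        href, title_attribute, text)


def collect_id_titles(root):
    """Map every element id in the tree to the text of its <title> child.

    Elements without a <title> child map to ''.
    """
    titles = {}
    for element in root.iter():
        element_id = element.get('id')
        if element_id is None:
            continue
        title = element.find('title')
        titles[element_id] = '' if title is None else ''.join(title.itertext())
    return titles
```

How the href part of each tuple is built (file name plus anchor) depends on how mkhtml2.py chunks the output, so that step is left out here.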
commit 13cc0bb914a1dfdd0372242e2e5e2efc673fa4bc
Author: Stefan Sauer <ensonic@users.sf.net>
Date: Fri Mar 30 23:06:49 2018 +0200
    mkhtml2: only have one converter for 'code'
    Replace uses of convert_literal with it. Also use convert_code for <function>.
Will look at the others in the next days. Thanks for the reports and the testing. Regarding "{}".format(), yes it is the way to go for python 3.X. I've been using python 2.X in the past too much for the % expansion to be stuck :/
-- GitLab Migration Automatic Message -- This bug has been migrated to GNOME's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/gtk-doc/issues/38.