GNOME Bugzilla – Bug 712211
giscanner is broken when include paths contain symlinks
Last modified: 2015-02-07 17:01:33 UTC
In my case my jhbuild prefix is /home/carlos/gnome which is a symlink to /home/carlos/gnome-<version>. Now my gir files for glib, gio and gobject contain only constants (because iin that case macro_scan is TRUE in gi_source_scanner_add_symbol()). One of the problems is that in SourceScanner.parse_files we are passing real paths to the scanner.append_filename but not to scanner.lex_filename. But that's not enough to fix it, because we are still getting non real paths from the preprocessor, and process_linemarks is setting the non real path as the current file of the scanner.
Created attachment 259727 [details] [review] giscanner: Make sure we use real paths in more places This patch seems to fix the problem, I hope this covers all places were we need to convert paths.
Created attachment 259819 [details] [review] Updated patch I had forgotten the include paths passed to the scanner with -I command line option.
There's still a case when this fails, it happens when the include path doesn't contain a symlink, but the one created by the compiler does. This is happening in WebKit, let me explain the case. In WebKit, the public API sources are in Source/WebKit2/API/gtk/, but the public headers and installed under $prefix/include/webkitgtk-3.0/webkit2 so that from applications you can do #include <webkit2/webkit2.h>. The public headers that include other public headers also use the form #include <webkit/foo.h>. For this to work when building WebKit itlsef we need a dir webkit2 containing the headers, so we create a symlink in $buildir/DerivedSources/WebKit2/webkit2gtk/include/webkit2 pointing to Source/WebKit2/API/gtk/. This way we can pass -I$buildir/DerivedSources/WebKit2/webkit2gtk/include/ to the compiler. In this case there's no symlink at all in the include path. However, the compiler builds the path using the include dir + #include path, $buildir/DerivedSources/WebKit2/webkit2gtk/include/webkit2/foo.h. This is what the scanner receives in the linemarks generated by the preprocessor. So, I think we could bring back the g_realpath and use it *only* in process_linemarks(). What do you think?
I confirm that using realpath in process_linemarks() on top of my patch fixes the issue.
Created attachment 259827 [details] [review] giscanner: Make sure that the current path set from linemarks is also a realpath
Created attachment 259828 [details] [review] giscanner: Make sure that the current path set from linemarks is also a real path Went too fast, fixed some typos in the commit message, patch is the same, sorry for the noise
Review of attachment 259828 [details] [review]: Ok, so we are partially reverting the patches from bug 710320. But looking at that again, the performance hit was burning CPU on O(N^2) iteration on the file list, not I/O. It's unfortunate though to call realpath() every time we see a linemark =( But...I'm not sure how to do better. I'm OK with this patch then.
(In reply to comment #7) > Review of attachment 259828 [details] [review]: Thanks for the review. > Ok, so we are partially reverting the patches from bug 710320. But looking at > that again, the performance hit was burning CPU on O(N^2) iteration on the file > list, not I/O. > > It's unfortunate though to call realpath() every time we see a linemark =( Yes, I can make some measurements to see the impact of this in the performance. > But...I'm not sure how to do better. I'm OK with this patch then. Note that the other patch is also needed, they solve similar but different problems. First one fixes the problem of include paths containing symlinks, while the second one fixes the problem of include paths generated by the preprocessor containing symlinks (even if the provided include paths are already resolved).
Review of attachment 259819 [details] [review]: Right, OK.
(In reply to comment #8) > Yes, I can make some measurements to see the impact of this in the performance. If there's nothing we can do about it though, then... Hmm. It strikes me that what we're really trying to accomplish here is only scan explicitly included files, right? If the pervasive realpath() usage turns out to be a large performance hit, we could potentially just try to do say: 1) Exclude the file if it comes from a "system" path (i.e. /usr/include) 2) Include the file only if its *basename* matches one of those on the list to be scanned. I.e. rather than looking at the full path of /path/to/source/webkit2/foo.h, we just compare foo.h.
This broke the gnome-shell build: https://build.gnome.org/continuous/buildmaster/builds/20131115.35/build/output.txt
I think the problem might be that gnome-shell is using -I path instead of -Ipath, and the path we are passing to os.path.realpath has a trailing whitespace which returns the wrong result, see: In [2]: os.path.realpath("../../src") Out[2]: '/home/carlos/src/gnome/git/gnome-shell/src' In [3]: os.path.realpath(" ../../src") Out[3]: '/home/carlos/src/gnome/git/gnome-shell/src/st/src'
hmm, if that was the problem I wonder how it worked before, because -I path also confuses previous code in process_cflags_begin that is considering -I and the path as two different command line options. We could try to workaround the problem in gobject-instrospection, but i think gnome-shell should not be using -I path, since it's not valid for gcc either. Could someone confirm that using -Ipath fixes the problem?
I'm wrong -I path is valid in gcc, I'll fix it in gobject-introspection then.
Created attachment 259962 [details] [review] giscanner: Handle the case when there's a space between -I cpp flag and the path Assuming the problem is the -I path used by gnome-shell, this patch should fix the problem. I can't built gnome-shell at the moment, could someone confirm this patch fixes the build?
Review of attachment 259962 [details] [review]: Thanks a lot for the followup fix! The commit message is excellent. I've tested this locally, it works. Will push now.
This needed a followup fix: https://git.gnome.org/browse/gobject-introspection/commit/?id=ff2e8dd92d43e11705fcd2b493737bb58c4f2daa I'd assumed you had just revived the old realpath() code, so I didn't look at it closely. realpath() is a crummy API...
Oops, sorry, since we were using 1025 for the escaped_filename I assumed it was safe to use the same value here.
[Mass-moving gobject-introspection tickets to its own Bugzilla product - see bug 708029. Mass-filter your bugmail for this message: introspection20150207 ]