GNOME Bugzilla – Bug 558436
avoid having scanner load app code
Last modified: 2015-02-07 16:45:37 UTC
<owen> walters: Did you ever figure out a standard method gobject-introspecting a binary for plugin usage? <walters> owen: not yet; right now our story is to build shared libraries owen: which you can then ship in an app-private libdir <owen> walters: Can you? <walters> owen: /usr/lib64/myapp/myapp.so <owen> walters: Using rpath? <walters> owen: wrapper script which sets LD_LIBRARY_PATH this is how mozilla works AIUI it doesn't win any beauty contests but it works <danw> why do the plugins need to introspect the binary at runtime? <owen> danw: they don't danw: the question was how to introspect it at build time without compiling everything twice danw: the gir-* stuff needs to dlopen() the library ... I guess to pull out gobject introspection information <danw> ah. that's even worse then. it loses beauty contests *and* adds to startup time, to avoid extra compile time if gir only needs to dlopen the library/app to pull out the gobject introspection information, we could just make it get it from somewhere else instead in this case oh, wait, no, i see what you mean <walters> hm...i'm not sure what the exact impact on startup time is but it can't be that large <halfline> did you know a binary can be a dso too? <danw> on ELF platforms <halfline> i did that when i wrote an exploit to test an Xorg errata a while back: http://rstrode.fedorapeople.org/quuuuuuuuuuuuuuuuux.c <walters> we're just talking about a one-line shell script <halfline> read about it in one of uli's papers <danw> walters: (a) bash is notoriously slow to start up, (b) i was actually talking about additional ld.so work <owen> walters: If it's one line ... that is, the path is jut hard coded in it ... then when not use -rpath instead? <walters> danw: if invoked as /bin/sh ? <danw> walters: it just feels like there *must* be a better way... but maybe there isn't <walters> owen: i think the fedora build system strips rpaths by default...maybe there is a way to override danw: suggestions happily accepted =) danw: there was a thread on gtk-devel-list a bit ago about it <halfline> honestly, we have a resident expert in house you guys you should ping foo <owen> danw: We discussed having some way to run the app "--introspect-me" that would embed python into the app so the gir-* machinery could run halfline: foo isn't very good at giving portable solutions <danw> owen: yeah, i remember this thread. i just didn't remember seeing the "install a private library" part <owen> danw: I don't think that was discussed at that point <owen> walters: Hmm, what if we had g_object_class_dump_introspection_to_stdout() <walters> owen: not a bad idea <owen> walters: So, --introspect-me just loops over the objects to export and dumps out their introspection in XML or whatever, then gir-* does the heavy lifting from that <owen> (We could put g_object_class_dump_introspection_to_stdout() in libgirepository for now, though it's logically part of GObject)
Executive summary: * Require apps have a --introspect cmdline argument * This calls g_irepository_dump_types, which accepts a list of symbols on stdin, and outputs an XML-formatted representation of them on stdout * Scanner is modified to have a --binary argument; if this is given instead of --library, the scanner knows to exec this binary with --introspect * glibtransformer.py needs to be updated to know which mode we're in (--binary or --library) Continuing this thought, maybe we can require libraries to build stub binaries so we only have the --binary code path?
(In reply to comment #1) > Executive summary: > > * Require apps have a --introspect cmdline argument > * This calls g_irepository_dump_types, which accepts a list of symbols on > stdin, and outputs an XML-formatted representation of them on stdout > * Scanner is modified to have a --binary argument; if this is given instead of > --library, the scanner knows to exec this binary with --introspect > * glibtransformer.py needs to be updated to know which mode we're in (--binary > or --library) > > Continuing this thought, maybe we can require libraries to build stub binaries > so we only have the --binary code path? That's kinda what gtk-doc does: creating pango-scan.c and linking/running...
Created attachment 122384 [details] [review] add support for --introspect
Created attachment 122385 [details] [review] tweaked patch The previous was an old version, this one works with mutter.
Created attachment 122456 [details] [review] merged to current trunk, bugfixes This version merges to current trunk; I also got rid of the need for --uninstalled-library; by default now we use -L. to search for libraries in the current directory. This combined with the --libtool argument to use libtool --mode=link before the linker command replaces the need for the argument.
Generally looks very reasonable; I forsee some portability heartache for the "build a stub binary" approach when someone wants to port to MSVC or similar, but I guess there is always the option of building a stub program yourself and using --program. Here's what I saw reading through the patch; it's pretty minor stuff. Once you get gir-repository building and clean up things (especially debug printfs), I don't see any reason that this can't go in. Though it probably would be good to get a sign-off on the idea from someone with a less casual connection to gobject-introspection than me :-) * The --uninstalled-introspection-srcdir/builddir options really clutter p the gobject-introspection. Maybe good to put them into the definition of SCANNER/G_IR_SCANNER to at least just have once for makefile. * It's pretty ugly to have --uninstalled-introspection options in the --help for g-ir-scanner which is a public program. There doesn't seem to be any obvious way to hide them with OptParse ... would using envvars make sense? * gdump.c would be a lot shorter and easier if you make output_printf() use g_markup_vprintf() ... g_markup_vprintf() was really added exactly for this type of usage. * trivial: dump_interface_type looks like it will dump parent="GInterface"? * debug g_printerr() in gdump.c:dump_type() ? * g_ir_repository_dump() doc comment looks cut-and-pasted and not fully fixed up. * g_irepostiory_dump() has trailing ' on the g_criticals * I think if gdump.c fails to open input/output it should exit(1) so the make false and doesn't mysteriously build a bad .gir. * trivial: loop in g_irepository_dump() leaks 'line' on duplicate types. * I found it confusing how a tuple of ([args], [files to delete]) was passed to set_introspection_binary() and passed around as 'binary'. * Might make things cleaner to have a process global list of temporary files that you clean up in a finally: in main? * In _execute_binary() and _compile_binary() I think it's a little inelegant to call subprocess.check_call() and let it stacktrace with CalledProcessError() the process fails. Makes diagnosing problems in invocation harder. * When you call check_call() you don't need to do stdout=sys.stdout, stderr=sys.stderr, that's the default. * When calling pkg-config I would definitely check the return value, or it's going to be even more confusing what is going in. * Should respect a PKG_CONFIG envvar. * There's a lot of leftover 'print args' and print "%r' % args when executing programs in the scanner. Needs cleaning up. (No real opinion about whether it's good to output the commands being executed to stderr - make and libtool do do that...) * Didn't really follow what was going on GLibTransformer._adjust_transfer() is that some sort of unrelated change? * 'libtool_infection = not not options.libtool' props for the name, but why 'not not'? Is this a leftover from before you caught the store_true issue for the --libtool argument?
(In reply to comment #6) > Generally looks very reasonable; I forsee some portability heartache > for the "build a stub binary" approach when someone wants to port to > MSVC or similar, but I guess there is always the option of building > a stub program yourself and using --program. Yeah; actually the compilation approach will only be needed for MacOS X, I believe. On Windows and on ELF platforms there's no difference between shared libraries and loadable modules, so we could just ship the tmp-introspect binary (well, call it g-ir-dumper or something). > Here's what I saw reading through the patch; it's pretty minor stuff. > Once you get gir-repository building and clean up things (especially > debug printfs), I don't see any reason that this can't go in. Though > it probably would be good to get a sign-off on the idea from someone > with a less casual connection to gobject-introspection than me :-) Yeah, it is a big patch; will ask jdahlin or tko. > * The --uninstalled-introspection-srcdir/builddir options really > clutter p the gobject-introspection. Maybe good to put them into > the definition of SCANNER/G_IR_SCANNER to at least just have once > for makefile. Done. > * It's pretty ugly to have --uninstalled-introspection options in > the --help for g-ir-scanner which is a public program. > There doesn't seem to be any obvious way to hide them with > OptParse ... would using envvars make sense? Done. > * gdump.c would be a lot shorter and easier if you make output_printf() > use g_markup_vprintf() ... g_markup_vprintf() was really added > exactly for this type of usage. Wow, that function is made of awesome. Done. > * trivial: dump_interface_type looks like it will dump > parent="GInterface"? Fixed. > * debug g_printerr() in gdump.c:dump_type() ? I'd like to keep the debug prints for just a little bit after the patch lands; once it's solidified we can trim up the output. > * g_ir_repository_dump() doc comment looks cut-and-pasted and > not fully fixed up. Done. > * g_irepostiory_dump() has trailing ' on the g_criticals Fixed. > * I think if gdump.c fails to open input/output it should exit(1) > so the make false and doesn't mysteriously build a bad .gir. Hmm. > * trivial: loop in g_irepository_dump() leaks 'line' on duplicate > types. Fixed. > * I found it confusing how a tuple of ([args], [files to delete]) was > passed to set_introspection_binary() and passed around as 'binary'. Yeah...the usage sort of grew and tuples were easy to hand; probably needs a real object now....fixed. > * Might make things cleaner to have a process global list of temporary > files that you clean up in a finally: in main? I think it's good to keep the temporaries around if the process fails. At least for now. > * In _execute_binary() and _compile_binary() I think it's a little > inelegant to call subprocess.check_call() and let it stacktrace > with CalledProcessError() the process fails. Makes diagnosing > problems in invocation harder. What do you suggest instead? > * When you call check_call() you don't need to do stdout=sys.stdout, > stderr=sys.stderr, that's the default. Ah, right. Fixed. > * When calling pkg-config I would definitely check the return value, > or it's going to be even more confusing what is going in. The packages are ones that introspection itself relies on. > * Should respect a PKG_CONFIG envvar. Ok, done (though not sure why one would want that). > * There's a lot of leftover 'print args' and print "%r' % args when > executing programs in the scanner. Needs cleaning up. (No real > opinion about whether it's good to output the commands being executed > to stderr - make and libtool do do that...) Don't they print to stdout? Like I said above I'd like to keep the prints for now but am fine with tweaking or removing them later. > * Didn't really follow what was going on GLibTransformer._adjust_transfer() > is that some sort of unrelated change? The deal here is that function had a hack to hold on to the in-process GType while introspecting signals so that it could peek at it later to see which parameters were G_TYPE_OBJECT. Now that we use out of process data I had to replace that with code to look at the introspection tree (which is actually cleaner I think). > * 'libtool_infection = not not options.libtool' props for the name, but > why 'not not'? Is this a leftover from before you caught the store_true > issue for the --libtool argument? Removed now.
Created attachment 122518 [details] [review] merged to trunk, rework for Owen's comments This version should fix for Owen's comments, and merges to trunk.
I haven't fully read and understood the patch, but two comments: 1) why "--introspect in,out" rather than using stding and stdout? 2) in theory, GOptionContext could implement --introspect itself internally, by g_module_open()ing a module containing g_irepository_dump()...
1) Far less fragile - this lets me use g_critical and such without worrying about it getting mixed into the XML output 2) Good idea, I'll take a look.
> > * Might make things cleaner to have a process global list of temporary > > files that you clean up in a finally: in main? > > I think it's good to keep the temporaries around if the process fails. At > least for now. The suggestion was more to do a centralized cleanup. try: run_main() except SystemExit, e: if e.code == 0: remove_temporaries() But anyways, it was just a idea I wanted to throw out rather than something concrete to fix. > > * In _execute_binary() and _compile_binary() I think it's a little > > inelegant to call subprocess.check_call() and let it stacktrace > > with CalledProcessError() the process fails. Makes diagnosing > > problems in invocation harder. > > What do you suggest instead? try: subprocess.check_call() except CalledProcessError: printf "Compiling introspection binary failed" sys.exit(1) (Could have a utility function to print the invoked command, execute it, and then check the result and exit cleanly on failure) > > * When calling pkg-config I would definitely check the return value, > > or it's going to be even more confusing what is going in. > > The packages are ones that introspection itself relies on. I think a couple of extra lines of code is better than blind assurance that PKG_CONFIG_PATH is set correctly, etc. > > * Should respect a PKG_CONFIG envvar. > > Ok, done (though not sure why one would want that). We support it everywhere else, that's the main reason. > > * There's a lot of leftover 'print args' and print "%r' % args when > > executing programs in the scanner. Needs cleaning up. (No real > > opinion about whether it's good to output the commands being executed > > to stderr - make and libtool do do that...) > > Don't they print to stdout? Like I said above I'd like to keep the prints for > now but am fine with tweaking or removing them later. Yeah, I guess 'make | less' works :-). I don't have a particular problem with leaving the prints long-term, but it would be better to have them looking like a command rather than (['arg1',], ['arg2']....)
Comment on attachment 122518 [details] [review] merged to trunk, rework for Owen's comments Looks good, thanks for cleaning this up. As I mentioned on IRC, there are other advantages to this approach, we could reconsider putting the typelib inside the shared libraries now, as we can now avoid building everything twice. A couple comments: - Missing a simple test on how this would work in the case of a program - It would make sense to move the Compiling/executing of the binary dump to as separate python module - Can girparser and ast nodes be reused for parsing of the introspected data? - Can distutils be reused for compilation? Distutils knows how to compile python extension on all platforms we care about. - The tests should ideally not change, why is there a need to change them? - Will this break gir-repository and the other users of g-ir-scanner? - The man page for g-ir-scanner should be updated - --program/--program-args -> --introspection-command? There shouldn't be a need for 2 arguments imho. - Is there a reason to not use libtool by default? Can the scanner be smart enough and always use libtool if it's installed/available? - temporary files should be created in /tmp - what about creating the relevant GParamFlags constants so we can avoid magic numbers? - I didn't really look at the C code, but I think Owen did a good job on that already
(In reply to comment #12) > (From update of attachment 122518 [details] [review] [edit]) > Looks good, thanks for cleaning this up. > As I mentioned on IRC, there are other advantages to this approach, we could > reconsider putting > the typelib inside the shared libraries now, as we can now avoid building > everything twice. > > A couple comments: > - Missing a simple test on how this would work in the case of a program Good point, fixed. > - It would make sense to move the Compiling/executing of the binary dump to as > separate python module Fixed. > - Can girparser and ast nodes be reused for parsing of the introspected data? We've been moving girparser/ast to the point where they require something that looks exactly like GIR. If we changed the "gdump XML" to look like GIR it would have a lot of empty stubs; c:type, no methods, etc. There's also no recursive types in gdump which makes parsing simple. The other downside is that when we change the GIR you'd have *another* place to modify (gdump.c), whereas right now the gdump XML is very lightweight (no namespaces, etc.) and targeted just for its use case. The code for parsing it is actually very very small; we just use elementtree's xpath-like stuff. > - Can distutils be reused for compilation? Distutils knows how to compile > python > extension on all platforms we care about. Hm, interesting idea but I don't think we can; the problem I see with that is that we need to support using libtool, and distutils does not that I see. (A sensible choice ;) But sadly the automake people bought into libtool and 95% of GNOME uses automake) > - The tests should ideally not change, why is there a need to change them? I'll factor that change out, it's not required. It sort of fell out when I was trying to rework the transfer default function. > - Will this break gir-repository and the other users of g-ir-scanner? I'll attach the gir-repository patch; it's quite trivial. > - The man page for g-ir-scanner should be updated Done. > - --program/--program-args -> --introspection-command? There shouldn't be a > need for 2 arguments imho. Well, this actually looks like: --program=./myapp --program-arg=--plugin=foo --program-arg=--bar If we wanted to make it: --introspection-command="myapp --plugin=foo --bar" we'd have to get into the business of parsing shell. The above pattern looks ugly I know, but it has precedent in compilation tools (think -Wl for gcc, which passes args to the linker). Also keep in mind that I think 90% of people will only need --program. I only added it for gnome-shell where we need to load our plugin =/ > - Is there a reason to not use libtool by default? Can the scanner be smart > enough and always use > libtool if it's installed/available? By default if it's installed? Maybe, but we should still support operation without. It's pure overhead if you aren't using libtool (and I'm not in one of my projects), plus I just don't want to encourage the libtool agenda ;) ...after thinking about this some more - like on how to detect automake/libtool usage I gave up, now there is just a --no-libtool option for those few, proud uninfected projects that serve as a beacon of hope for the future, against the forces of darkness and shell script. > - temporary files should be created in /tmp I originally tried this actually, the problem is that libtool will then create /tmp/.libs, which is insecure, won't work with parallel compilation etc. Putting them in the cwd makes sense I think as that's how other compilation works. > - what about creating the relevant GParamFlags constants so we can avoid magic > numbers? Done.
Working on splitting up the patch for final commit.
Committed r912
[Mass-moving gobject-introspection tickets to its own Bugzilla product - see bug 708029. Mass-filter your bugmail for this message: introspection20150207 ]