Bug 606044 – Add support for dtrace/systemtap static markers

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 606044 - Add support for dtrace/systemtap static markers


Summary:	Add support for dtrace/systemtap static markers


Status:	RESOLVED FIXED

Product:	glib
Classification:	Platform
Component:	general
Version:	unspecified
Hardware:	Other Linux

Importance:	Normal normal
Target Milestone:	---
Assigned To:	gtkdev
QA Contact:	gtkdev

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2010-01-04 19:12 UTC by Alexander Larsson
Modified:	2010-11-26 12:35 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
Initial support for dtrace and systemtap (13.16 KB, patch) 2010-01-04 19:13 UTC, Alexander Larsson	none	Details \| Review
Add dtrace and systemtap support for gobject (13.92 KB, patch) 2010-01-04 19:13 UTC, Alexander Larsson	none	Details \| Review
Initial support for dtrace and systemtap (13.79 KB, patch) 2010-03-30 13:15 UTC, Colin Walters	accepted-commit_now	Details \| Review
Add dtrace and systemtap support for gobject (13.99 KB, patch) 2010-03-30 13:15 UTC, Colin Walters	reviewed	Details \| Review
Initial support for dtrace and systemtap (13.83 KB, patch) 2010-03-31 14:31 UTC, Colin Walters	none	Details \| Review
Add dtrace and systemtap support for gobject (14.90 KB, patch) 2010-03-31 14:31 UTC, Colin Walters	none	Details \| Review
allow naming GSource, which could be used in a tapset (4.20 KB, patch) 2010-04-20 21:57 UTC, Havoc Pennington	committed	Details \| Review
Initial support for dtrace and systemtap (16.64 KB, patch) 2010-05-11 19:59 UTC, Colin Walters	none	Details \| Review
Add dtrace and systemtap support for gobject (15.08 KB, patch) 2010-05-11 20:00 UTC, Colin Walters	none	Details \| Review
disassembly and gobject-performance benchmark with dtrace disabled (3.71 KB, text/plain) 2010-05-11 20:01 UTC, Colin Walters		Details
disassembly and gobject-performance benchmark with dtrace enabled (5.26 KB, text/plain) 2010-05-11 20:01 UTC, Colin Walters		Details
disassembly and gobject-performance benchmark with dtrace enabled (but no semaphores) (3.71 KB, text/plain) 2010-05-11 21:50 UTC, Colin Walters		Details
gcc -S disassembly and gobject-performance benchmark with dtrace disabled (2.04 KB, text/plain) 2010-05-27 00:01 UTC, Colin Walters		Details
disassembly and gobject-performance benchmark with dtrace enabled (but no semaphores) (1.76 KB, text/plain) 2010-05-27 00:02 UTC, Colin Walters		Details
Initial support for dtrace and systemtap (18.43 KB, patch) 2010-05-27 01:09 UTC, Colin Walters	committed	Details \| Review
Add dtrace and systemtap support for gobject (18.11 KB, patch) 2010-05-27 01:10 UTC, Colin Walters	committed	Details \| Review
gcc -S disassembly and gobject-performance benchmark with dtrace disabled (2.10 KB, text/plain) 2010-05-27 15:09 UTC, Colin Walters		Details
gcc -S disassembly and gobject-performance benchmark with dtrace enabled (no semaphores) (2.55 KB, text/plain) 2010-05-27 15:09 UTC, Colin Walters		Details

Description Alexander Larsson 2010-01-04 19:12:48 UTC

Having some support for static markers for tracing is very useful when debugging apps or system behaviour. I have some initial probes that I will attach.

Comment 1 Alexander Larsson 2010-01-04 19:13:21 UTC

Created attachment 150787 [details] [review]
Initial support for dtrace and systemtap

This adds static markers for dtrace, which are also usable
by systemtap. Additionally it adds a tapset for systemtap
that makes it easier to use the static markers.

--enable-dtrace will enable just the markers and --enable-systemtap
will enable the markers and the systemtap tapsets.

This initial set of probes is rather limited:

* allocation and free using g_malloc & co
* allocation and free using g_slice
* gquark name tracking (useful for converting quarks to strings in probes)

Notes on naming:

Its traditional with dtrace to use probe names with dashes as
delimiter (slice-alloc). Since dashes are not usable in identifiers
the C code uses double underscores (slice__alloc) which is converted
to dashes in the UI. We follow this for the shared lowlevel probe
names.

Additionally dtrace supports putting a "provider" part in the probe
names which is essentially a namespacing thing. On systemtap this
field is currently ignored (but may be implemented in the future), but
this is not really a problem since in systemtap the probes are
specified by combining the solib file and the marker name, so there
can't really be name conflicts.

For the systemtap tapset highlevel probes we instead use names that
are systemtapish with single dashes as separators.

Comment 2 Alexander Larsson 2010-01-04 19:13:31 UTC

Created attachment 150788 [details] [review]
Add dtrace and systemtap support for gobject

This adds static markers and systemtap tapsets for:

* type creation
* object lifetimes (creation, ref, unref, dispose, finalize)
* signal creation and emission

Signal emissions and finalization marker have a corresponding
*_end (or *-end in dtrace) version that is when the corresponding
operation is finished.

Comment 3 Alexander Larsson 2010-01-04 20:02:41 UTC

See also: http://blogs.gnome.org/alexl/2010/01/04/tracing-glib/

Comment 4 Colin Walters 2010-03-30 13:15:39 UTC

Created attachment 157480 [details] [review]
Initial support for dtrace and systemtap

This adds static markers for dtrace, which are also usable
by systemtap. Additionally it adds a tapset for systemtap
that makes it easier to use the static markers.

--enable-dtrace will enable just the markers and --enable-systemtap
will enable the markers and the systemtap tapsets.

This initial set of probes is rather limited:

* allocation and free using g_malloc & co
* allocation and free using g_slice
* gquark name tracking (useful for converting quarks to strings in probes)

Notes on naming:

Its traditional with dtrace to use probe names with dashes as
delimiter (slice-alloc). Since dashes are not usable in identifiers
the C code uses double underscores (slice__alloc) which is converted
to dashes in the UI. We follow this for the shared lowlevel probe
names.

Additionally dtrace supports putting a "provider" part in the probe
names which is essentially a namespacing thing. On systemtap this
field is currently ignored (but may be implemented in the future), but
this is not really a problem since in systemtap the probes are
specified by combining the solib file and the marker name, so there
can't really be name conflicts.

For the systemtap tapset highlevel probes we instead use names that
are systemtapish with single dashes as separators.

Comment 5 Colin Walters 2010-03-30 13:15:45 UTC

Created attachment 157481 [details] [review]
Add dtrace and systemtap support for gobject

This adds static markers and systemtap tapsets for:

* type creation
* object lifetimes (creation, ref, unref, dispose, finalize)
* signal creation and emission

Signal emissions and finalization marker have a corresponding
*_end (or *-end in dtrace) version that is when the corresponding
operation is finished.

Comment 6 Colin Walters 2010-03-30 13:22:36 UTC

I rebased these two patches to git master, with some fixes:

* Also generate the systemtap .o asm probe objects, since they appear to be required now; the generated headers reference a semaphore object per probe point.
* Add a copyright header to gtrace.h

Comment 7 Colin Walters 2010-03-30 13:24:35 UTC

Also note that the bug Alex referenced ( http://sourceware.org/bugzilla/show_bug.cgi?id=11124 ) on the subject of systemtap inefficiency appears to have been fixed, so we should double check whether we can enable these probes on Fedora 14.

Comment 8 Colin Walters 2010-03-30 14:15:40 UTC

Review of attachment 157480 [details] [review]:

One comment, but otherwise going to mark this ACN.

::: glib/glib_probes.d
@@ +1,2 @@
+	probe mem__alloc(void*, unsigned int, unsigned int, unsigned int);
+provider glib {

We should probably document these somehow; is the right place here or in the .stp.in file?  I don't think it's obvious to someone looking at say "glib.mem_alloc" that "failable" means "this is a try_malloc".

Comment 9 Colin Walters 2010-03-31 14:18:52 UTC

Review of attachment 157481 [details] [review]:

::: gobject/gtrace.h
@@ +1,2 @@
+#define __GLIBTRACE_H__
+#ifndef __GLIBTRACE_H__

Should probably rename this file to be clear it's for gobject; maybe call it gobject_trace.h?

Also need a copyright.

Comment 10 Colin Walters 2010-03-31 14:31:33 UTC

Created attachment 157590 [details] [review]
Initial support for dtrace and systemtap

Comment 11 Colin Walters 2010-03-31 14:31:50 UTC

Created attachment 157591 [details] [review]
Add dtrace and systemtap support for gobject

Comment 12 Colin Walters 2010-03-31 14:38:59 UTC

I should mention this is still off by default, but it'd help if someone with x86 asm knowledge took a look at the updated SystemTap in Fedora 13 and checked whether the perf hit was acceptable.

Comment 13 Matthias Clasen 2010-03-31 14:43:39 UTC

I don't have principal objections to this. 

Here are some files that need to be changed:

docs/reference/glib/building.sgml - include documentation about configure options

docs/reference/glib/running.sgml - should probably add a section about profiling that should include a) a list of the current markers and b) some information on how to make use of this

Comment 14 Colin Walters 2010-03-31 15:13:59 UTC

(In reply to comment #13)
> I don't have principal objections to this. 
> 
> Here are some files that need to be changed:
> 
> docs/reference/glib/building.sgml - include documentation about configure
> options

Done.  Though I realized, maybe we should wait on the performance analysis before putting this in, because I documented it as off-by-default, but that's not where we want to be ultimately.

> docs/reference/glib/running.sgml - should probably add a section about
> profiling that should include a) a list of the current markers and b) some
> information on how to make use of this

Documenting the tapset in running.sgml makes sense; but how about a live.gnome.org wiki page for example scripts?  Or maybe we could have one or two in the docs, and then point to the wiki page for up-to-date user contributed examples?

Comment 15 Alexander Larsson 2010-04-06 09:49:37 UTC

I didn't do a thorough performance analysis, i just looked at the code and it added lots of unnecessary instructions (it should need very little). Verifying that the systemtap "fix" for this works is quite easy, just build with and without dtrace enabled and compare some of the affected functions in the gcc -S output.

Comment 16 Frank Ch. Eigler 2010-04-06 16:59:41 UTC

Regarding the sdt.h-related extra instructions, actual measurements of
the incremental cost of dormant (inactive) instrumentation would be
very welcome.  We expect these to be small because (a) most of it is
to be bypassed under normal circumstances due to a semaphore variable
that is normally 0, and (b) the bypassed part is labeled with gcc
branch prediction directives that ought normally put it out of line
of the hot instruction path.

Comment 17 Havoc Pennington 2010-04-20 21:57:36 UTC

Created attachment 159206 [details] [review]
allow naming GSource, which could be used in a tapset

A really useful tapset would replicate bug 588139 - namely, timing each GSource dispatch, and trying to show which source was dispatched and which took too long. Or similarly, doing a timeline of dispatches would be nice.

I think part of such a tapset would be this attached patch to add g_source_set_name() or something.

The tracing in gmain.c dispatch though should also give the name of the source's callback, as in bug 588139. This would be useful instead of or in addition to the manually-set source name. I don't know if systemtap can already find the name of a function pointer or if a bfd hack as in bug 588139 would be required.

something like:
TRACE (GLIB_BEFORE_SOURCE_DISPATCH(source, callback, source_name));
TRACE (GLIB_AFTER_SOURCE_DISPATCH(source, callback, source_name));

where callback is either the raw function pointer if systemtap etc. can use that, or a callback name somehow otherwise, I don't know.

The point of this would be to find interactivity problems in apps.

Comment 18 Colin Walters 2010-05-11 19:59:56 UTC

Created attachment 160852 [details] [review]
Initial support for dtrace and systemtap

This is a simple rebase to the current git master.

Comment 19 Colin Walters 2010-05-11 20:00:07 UTC

Created attachment 160853 [details] [review]
Add dtrace and systemtap support for gobject

Rebased to git master.

Comment 20 Colin Walters 2010-05-11 20:01:20 UTC

Created attachment 160854 [details]
disassembly and gobject-performance benchmark with dtrace disabled

Comment 21 Colin Walters 2010-05-11 20:01:43 UTC

Created attachment 160855 [details]
disassembly and gobject-performance benchmark with dtrace enabled

Comment 22 Colin Walters 2010-05-11 20:03:04 UTC

As you can see from comparing the benchmark numbers, there's a ~3% slowdown in the gobject-performance benchmark.  My instinct is that for real-world applications, this is going to end up being noise, basically, because creating and destroying GObjects isn't where most apps spend a lot of time.

Comment 23 Colin Walters 2010-05-11 20:05:10 UTC

But Alex - could you take a look at the attached disassembly?

A little bit of system information.  The benchmarks were run on a Fedora 13 virtual machine.

walters@localhost glib]$ rpm -q gcc
gcc-4.4.4-2.fc13.x86_64
[walters@localhost glib]$ uname -a
Linux localhost.localdomain 2.6.33.3-85.fc13.x86_64 #1 SMP Thu May 6 18:09:49 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux
[walters@localhost glib]$ cat /etc/system-release
Fedora release 13 (Goddard)
[walters@localhost glib]$

Comment 24 Colin Walters 2010-05-11 20:13:14 UTC

Review of attachment 159206 [details] [review]:

This patch looks fine to me.

Comment 25 Frank Ch. Eigler 2010-05-11 21:29:03 UTC

Colin, would you mind running another build/benchmark, by adding
#undef STAP_HAS_SEMAPHORES into the dtrace-generated .h file,
replacing the #define STAP_HAS_SEMAPHORES one?  It looks like
the semaphore logic may be counter-productive in the simple
case of the gobject allocate function.

Comment 26 Colin Walters 2010-05-11 21:50:01 UTC

Created attachment 160865 [details]
disassembly and gobject-performance benchmark with dtrace enabled (but no semaphores)

Ok, this one should be without semaphores.

Comment 27 Alexander Larsson 2010-05-12 10:27:12 UTC

The dissassembly in comment 26 without HAS_SEMAPTHORES is identical to the non-dtrace version, which is exactly what we want here (and it should be easily possible for such a trivial marker).

However, the semaphores one does an extra semaphore check:
    db8c:       48 8b 05 7d 8d 23 00    mov    0x238d7d(%rip),%rax        
    db93:       66 83 38 00             cmpw   $0x0,(%rax)
    db97:       75 50                   jne    dbe9 <g_object_ref+0xc9>

plus it spills stuff more on the stack and has some extra out of line code for the actual marker code.

For the common case this amounts to larger code cache footprint, one memory read that may be uncached and/or frequently read on other cpus, and a few extra instructions executed. This is not horrible, but its kinda irritating when you know nothing of it is needed.

So, I don't really agree with the decision to make every probe use semaphores, In fact I deliberately designed the marker points so that minimal or no extra data would need to be calculated for the marker (which is why e.g. the g_object_ref probe returns the old value and the gobject.object_ref probe has to manually add +1 to get the new one). Semaphores are interesting when the marker has to do a lot of work calculating otherwise unused values, but they are pure overhead when its not needed.

I would prefer if you had to enable semaphore use on a per-marker level.

Also, colin, if you want to look at assembly output, don't disassemble like that, instead look at the assembly produced by gcc when you build with -S (and optionally -fverbose-asm might be interesting)

Comment 28 Frank Ch. Eigler 2010-05-12 13:08:22 UTC

Alexander, I agree.  In stap land we'll improve sdt.h to parametrize
and/or automate the semaphore logic.

I am a bit suspicious of Colin's last version though, since there 
ought to be at least a nop inserted into the instruction stream.
I wonder if the sdt.h stuff was entirely disabled by accident in
that build.

http://sourceware.org/bugzilla/show_bug.cgi?id=11592

Comment 29 Nick Schermer 2010-05-25 19:54:44 UTC

Review of attachment 159206 [details] [review]:

Shouldn't the name be freed in g_source_unref_internal when the source is destroyed?

Comment 30 Colin Walters 2010-05-25 20:55:30 UTC

(In reply to comment #29)
> Review of attachment 159206 [details] [review]:
> 
> Shouldn't the name be freed in g_source_unref_internal when the source is
> destroyed?

Good catch Nick!  I've fixed this in my local commit, along with a few cleanups to the docs, and I've annotated some of the glib mainloop test just to get some coverage of the function.

Comment 31 Colin Walters 2010-05-27 00:01:40 UTC

Created attachment 162059 [details]
gcc -S disassembly and gobject-performance benchmark with dtrace disabled 

This time with gcc -S

Comment 32 Colin Walters 2010-05-27 00:02:05 UTC

Created attachment 162060 [details]
disassembly and gobject-performance benchmark with dtrace enabled (but no semaphores)

this time with gcc -S

Comment 33 Colin Walters 2010-05-27 00:04:59 UTC

So, it looks to me like the difference between the two is pure noise, i.e. within 1%.

But I need another analysis of the disassembly.  I have a new patch prepped which enables this by default, in anticipation of a positive outcome.

Comment 34 Colin Walters 2010-05-27 01:09:48 UTC

Created attachment 162062 [details] [review]
Initial support for dtrace and systemtap

* Enabled by default
* Add documentation to running.sgml
* Document each probe point using the SystemTap doc style

Comment 35 Colin Walters 2010-05-27 01:10:02 UTC

Created attachment 162063 [details] [review]
Add dtrace and systemtap support for gobject

* Document each probe point

Comment 36 Alexander Larsson 2010-05-27 08:47:01 UTC

The attachement in comment #32 doesn't have the whole function in it, it only shows the first part of the marker (the probe section stuff). You need to go on until:

	.size	IA__g_object_ref, .-IA__g_object_ref
	.p2align 4,,15

or something like that.

Comment 37 Colin Walters 2010-05-27 15:09:20 UTC

Created attachment 162111 [details]
gcc -S disassembly and gobject-performance benchmark with dtrace disabled

Comment 38 Colin Walters 2010-05-27 15:09:43 UTC

Created attachment 162112 [details]
gcc -S disassembly and gobject-performance benchmark with dtrace enabled (no semaphores)

Comment 39 Colin Walters 2010-05-27 15:10:55 UTC

Note the new code is generated with:

[walters@lightbox gobject (systemtap)]$ rpm -q gcc
gcc-4.4.4-5.el6.x86_64
[walters@lightbox gobject (systemtap)]$ cat /etc/redhat-release 
Red Hat Enterprise Linux Workstation release 6.0 Beta (Santiago)
[walters@lightbox gobject (systemtap)]$ cat /proc/cpuinfo 
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 15
model name	: Intel(R) Core(TM)2 Duo CPU     E6850  @ 3.00GHz
stepping	: 11
cpu MHz		: 2000.000
cache size	: 4096 KB

Comment 40 Alexander Larsson 2010-05-27 18:36:19 UTC

Yes, I like this one. 

It adds one extra register being saved and restored on the stack (%rbx), adding 2 instructions on entry and 2 on exit.

The extra instructions it uses for the actual marker is:

 movl	%eax, %ebp          # Move old refcount to %ebp
 movq	(%rbx), %rax        # Store class pointer in %rax
 nop /* %rbx (%rax) %ebp */ # NOP Instruction that will be breakpoint

So, a total of 7 instructions, including 1 memory read that may not be cached (wheras the 2 stack memory accesses are certainly cached).

I very much doubt that anyone cares about this effect on performance. Some other probes are a bit more complex, but I doubt any slowdown will be measurable.

Comment 41 Colin Walters 2010-05-27 19:07:08 UTC

Attachment 162062 [details] pushed as bef9efd - Initial support for dtrace and systemtap
Attachment 162063 [details] pushed as 8e41be1 - Add dtrace and systemtap support for gobject

Comment 42 Colin Walters 2010-05-27 19:11:44 UTC

Ok, let's go with this then.  The pushed patch is on by default.  Awesome work alex!  It's already helped me a lot with GNOME Shell work.

Comment 43 Alexander Larsson 2010-05-27 19:31:58 UTC

colin: Please publicize this a bit to get more people to use systemtap, its kinda unknown atm, which is sad, given how very useful it is.

Maybe we can start a wiki page with example stap scripts for various kind of debugging problems.

Comment 44 Frank Ch. Eigler 2010-05-27 19:37:48 UTC

We'd welcome such scripts in the systemtap distribution proper,
and/or http://sourceware.org/systemtap/wiki/WarStories

Comment 45 Stefan Sauer (gstreamer, gtkdoc dev) 2010-11-26 12:35:26 UTC

Should the files be actually installed somewhere as part of "make install"?

Also having something more useful in
http://library.gnome.org/devel/glib/stable/glib-running.html
under "SystemTap" headline would help people to get started.

What about a gnome-journal article about it?