GNOME Bugzilla – Bug 545714
GST_READ_UINT_* and GST_WRITE_UINT_* are not always safe to use
Last modified: 2008-10-07 11:59:27 UTC
+++ This bug was initially created as a clone of Bug #348114 +++ See the above bug for details. These macros should be safe to use no matter what. Just documenting the problem isn't really an option. Possible solutions that come to mind: (a) always use the slow (read+shift bytes individually) variants (ugh), maybe add _UNSAFE variants so those who know what they're doing can use those (b) gcc seems to have a may-alias attribute, which looks like it might be useful (can we make it ignore the casts for aliasing purposes that way, in combination with a few typedefs?) (if yes, what to do about other compilers) (c) use something along the lines of: static inline guint64 __gst_gimme64 (const guint8 * mem) { guint64 n; memcpy (&n, mem, 8); return n; } #if G_ENDIANNESS == G_LITTLE_ENDIAN #define GST_READ_UINT64_LE(mem) __gst_gimme64(mem) #else ... #endif and trust that the compiler will be clever enough to optimize away the memcpy (which recent gcc seems to do with >= -O1, no idea about other compilers). This isn't entirely pretty, but at least has the nice side-effect that we don't need to special-case based on GST_HAVE_UNALIGNED_ACCESS any longer (right?). Opinions? Anyone?
IMO, let's just use the unaligned access code all the time. Yeah, it's slower, but it's unlikely to be in a critical path.
The *only* place I've ever used the READ/WRITE macros is in parsing code, where speed definitely matters.
How many times does it get used per buffer? 10? 100? A few dozen extra clock cycles isn't very interesting. Yes, "every clock cycle counts", but these don't count very much.
It's a good point. A quick grep says the macros are mostly used in demuxers and typefinders, so that'd be the logical place to check. It's not a hard test either, if someone can be bothered to build the entire stack twice and compare the speed of typefinding and demuxing a bunch of files.
I'd imagine that if anything then the various mpeg video parsers would be most affected (although it's unlikely to matter much compared to what it takes to decode the audio/video, assuming that's done on the cpu).
I don't think you could measure any difference if the inline memcpy() variants are used as gcc at least compiles this to single instructions on x86/amd64. Well, on x86 the 64 variants are of course compiled to shift&or... What would be that advantage of the shift&or variants over the memcpy() ones? IIRC gcc optimizes them to single instructions too if possible.
2008-10-07 Sebastian Dröge <sebastian.droege@collabora.co.uk> * docs/gst/gstreamer-sections.txt: * gst/gstutils.h: Always use the unaligned variants of GST_READ_UINT* and GST_WRITE_UINT* as it's too easy to break the ISO C strict aliasing rules with simple casts to the corresponding type and this would introduce hard to debug bugs. Fixes bug #545714. API: Add GST_READ_UINT24_(LE|BE) and GST_WRITE_UINT24_(LE|BE).