GNOME Bugzilla – Bug 772426
Serialization of int8/uint8 arrays is slow
Last modified: 2017-06-07 16:25:05 UTC
The compiler generates code that treats every element as a separate value.
Created attachment 336944 [details] [review] Optimize (de)serialization of (u)int8 arrays
Bump
Review of attachment 336944 [details] [review]: The patch appears to use g_variant_get_data () (https://developer.gnome.org/glib/stable/glib-GVariant.html#g-variant-get-data) to serialize and g_variant_new_from_data () (https://developer.gnome.org/glib/stable/glib-GVariant.html#g-variant-new-from-data) to de-serialize. I'm confused as to why the serialize function g_variant_get_data is used in deserialize_array() in /codegen/valagvariantmodule.vala. Shouldn't this be serialize_array() ? Vice versa for g_variant_new_from_data () used in serialize_array (). For maintaining readability in the code base it may be better to factor out the code into its own serialize_buffer_array() and deserialize_buffer_array() functions. How much faster is this? I'm presuming the main use case is sending large amounts of binary data over D-Bus? This needs some test cases to guard against regression. A D-Bus server/client test, similar to https://bugzilla.gnome.org/show_bug.cgi?id=744595 may be relevant.
Created attachment 347947 [details] [review] gdbus: Optimize (de)serialization of (u)int8 arrays Incorporated feedback from Al Thomas (thanks!). If anyone wants to pitch in with a benchmark test-case that would be awesome – I don't have the bandwidth to take on that right now.
(In reply to Al Thomas from comment #3) > Review of attachment 336944 [details] [review] [review]: > > The patch appears to use g_variant_get_data () > (https://developer.gnome.org/glib/stable/glib-GVariant.html#g-variant-get- > data) to serialize and g_variant_new_from_data () > (https://developer.gnome.org/glib/stable/glib-GVariant.html#g-variant-new- > from-data) to de-serialize. I'm confused as to why the serialize function > g_variant_get_data is used in deserialize_array() in > /codegen/valagvariantmodule.vala. Shouldn't this be serialize_array() ? Vice > versa for g_variant_new_from_data () used in serialize_array (). The g_variant_get_data() happens in the deserialization code-path. Maybe the diff-viewer made it confusing? > For maintaining readability in the code base it may be better to factor out > the code into its own serialize_buffer_array() and > deserialize_buffer_array() functions. Totally agree. Patch updated. Thanks! > How much faster is this? I don't remember the numbers, though I remember it was a massive speed-up when coupled with the corresponding optimizations already in GLib. > I'm presuming the main use case is sending large amounts of binary data over D-Bus? Correct :-) We're using it in www.frida.re to transfer an optional binary payload along with a string containing JSON. > This needs some test cases to guard against regression. A D-Bus > server/client test, similar to > https://bugzilla.gnome.org/show_bug.cgi?id=744595 may be relevant. Oh that's a nice-looking test, I'll try adapting it. Thanks!
(In reply to Ole André Vadla Ravnås from comment #5) > (In reply to Al Thomas from comment #3) > > This needs some test cases to guard against regression. A D-Bus > > server/client test, similar to > > https://bugzilla.gnome.org/show_bug.cgi?id=744595 may be relevant. > > Oh that's a nice-looking test, I'll try adapting it. Thanks! I'll leave this for later, got a bit too much on my plate right now.
Created attachment 352495 [details] [review] gdbus: Optimize (de)serialization of (u)int8 arrays Same code but with a performance test added. It exercises all code-paths involved, and measures the time spent relative to method calls and signal emissions of an (u)int8 array compared to the same kind of operation without any arguments. Here are some empirical measurements to give you an idea of the performance impact: *** without optimizations *** call ratio int8: 1886.70 call ratio uint8: 1880.16 emit ratio int8: 19440.91 emit ratio uint8: 17467.62 *** with optimizations *** call ratio int8: 33.06 call ratio uint8: 30.79 emit ratio int8: 69.53 emit ratio uint8: 69.27 Measured by repeating each operation 10 times, each of which carries a 4 MiB payload. The ratio will obviously depend on the payload size.
Testing performance is nice, but checking the data integrity (which is usually done in tests) is even more important. Adding gchar and guchar seems easy and appropriate, but would it be possible to widen this to all simpletype-arrays?
Created attachment 353326 [details] [review] gvariant: Optimize (de)serialization of arrays with type-signature "ay"
Attachment 353326 [details] pushed as 7a28221 - gvariant: Optimize (de)serialization of arrays with type-signature "ay"