GNOME Bugzilla – Bug 784211
Should the many Glib::ustring::compose() and ::format() templates be replaced with single variadic templates?
Last modified: 2017-09-05 18:45:35 UTC
Currently there are separate template functions for each number of arguments to Glib::ustring::compose(). Presumably, this should be replaced with a single variadic template function. Is that correct?
Yes, please.
Thanks! I'll resurrect my patch from a couple of weeks ago I think that we should also replace compose_argc with compose_ilist. That is: Instead of passing around a pointer to an array and its size so that we do not have to template on the size of the array, we can use std::initializer_list, which pretty much exists for this purpose, as it uses compiler magic to abstract a variable sized array behind iterators. This is a bit more concise and considerably more idiomatic. If there are no objections, I'll make that part of the same patch set.
compose_argv(), that is. I also wonder about that helper class Stringify: * I haven't got a proof-of-concept yet, but instinctively I feel like it could be replaced with methods that simply return ustrings by value (subject to copy elision), instead of what seems like a convoluted mechanism of stashing the [copied or format()ted] string in itself, just so that it can give a pointer out to compose_argv. It should be possible to construct the various overloads in place in the initializer_list and then just have compose_ilist() loop over those. * Are all the comments about overloads that are required for Visual Studio still current? The horrible ambiguity between different variations of char continues to mystify me, and I don't have VS to test, but I feel like all the added noise of these shouldn't be required, at least not anymore. I guess I'll try to catch one of the Windows folk on IRC some time!
Created attachment 355051 [details] [review] ustring: Remove unused template arguments
Created attachment 355052 [details] [review] ustring: #include <cstddef> unconditionally It may not be needed for std::ptrdiff_t, as that is only used conditionally, but we use std::size_t either way in some Stringify specialisations, and while that is almost certain to be pulled in from <string> or some other standard header, we might as well be explicit.
Created attachment 355053 [details] [review] ustring: Use std::initializer_list instead of pointer + size Acrobatics are needed so that the main compose function does not have to be templated on the number of strings due to the bloat that would cause. Since C++11, instead of passing a pointer to an array & its size, we can std::initializer_list, which pretty much exists for this purpose, as it uses compiler magic to abstract a variable sized array behind iterators. The only drawback is that initializer_list does not have an operator[], but its elements are guaranteed to be allocated in the order defined, so we can use (*(begin() + index)) – a bit ugly, but we only need it once!
Created attachment 355054 [details] [review] ustring: Replace 9×compose() with 1 variadic template No more gratuitous repetition, hooray for modern C++!
Created attachment 355055 [details] [review] ustring: Replace 8×format() with 1 variadic template as per the previous commit for Glib::ustring. Now, users can pass format() as many arguments as their compiler or stack will allow, rather than being limited to a maximum of 8.
I should point out that make check, including its tests for ustring::compose and format, is still 100% happy after all this. :)
(In reply to Daniel Boles from comment #9) > I should point out that make check, including its tests for ustring::compose > and format, is still 100% happy after all this. :) I just looked at those tests, and they don't actually check for any expected output, just that the functions don't throw... Maybe I'll work on those another time. A weird difference between glibmm-2-52 and master, even before my changes, is that master format()s the pointer address of &i in the test with commas, despite it being a hex address, for example: 0x7ff,fb2,8d6,b1c Any ideas why the two branches differ here? A quick search suggests it might be something to do with locales, if anything changed there between them.
Review of attachment 355054 [details] [review]: I've since renamed compose_ilist() to compose_private(), and removed the comment about perfect forwarding, since it hardly matters and conflicts with Stringify (ambiguous overloads)
Ping for a review? I figured a WIP branch might be nicer to read, so I put one up: https://git.gnome.org/browse/glibmm/log/?h=wip/dboles/ustring There are no functional differences from the patches here.
It looks good to me. Please go ahead and push it to master. Thanks for taking care of this.
Thanks for the review! I've merged it. The one thing I still wonder about is the oddity in Comment 10, but since it already occurred before this, I guess we can deal with it separately, if needed.
When I build glibmm after configuring with --enable-warnings=fatal, I get ustring.cc: In static member function ‘static Glib::ustring Glib::ustring::compose_private(const Glib::ustring&, std::initializer_list<const Glib::ustring*>)’: ustring.cc:1317:17: error: comparison of unsigned expression >= 0 is always true [-Werror=type-limits] if (index >= 0 && index < ilist.size()) ~~~~~~^~~~ This is because of const std::size_t index = Ascii::digit_value(stop[1]) - 1; if (index >= 0 && index < ilist.size()) Ascii::digit_value() can return 0 or -1. Ascii::digit_value(stop[1]) - 1 can be negative. A possible fix is const int index = Ascii::digit_value(stop[1]) - 1; if (index >= 0 && static_cast<std::size_t>(index) < ilist.size()) Or can be rely on the fact that a negative value usually becomes a huge positive value when a signed int is converted to an unsigned int? const std::size_t index = Ascii::digit_value(stop[1]) - 1; if (index < ilist.size()) I prefer the fix with a signed index.
(In reply to Daniel Boles from comment #10) > A weird difference between glibmm-2-52 and master, even before my changes, > is that master format()s the pointer address of &i in the test with commas, > despite it being a hex address, for example: > > 0x7ff,fb2,8d6,b1c > > Any ideas why the two branches differ here? A quick search suggests it might > be something to do with locales, if anything changed there between them. Yes, something changed there. See bug 661588. The git master of the gtkmm tutorial (gtkmm-documentation) contains gtkmm-4.0 is used in combination with glibmm-2.54, which sets the global locale for your program. The older glibmm-2.4 does not do that, and gtkmm-3.0 does it only to some extent. What this means is briefly that if your gtkmm-3.0 program contains a call to std::locale::global(std::locale("")), you can probably remove it. If you don't want glibmm or gtkmm to set the global locale for you, you should add a call to Glib::set_init_to_users_preferred_locale(false) before any call to Glib::init() or Gtk::Application::create(). glibmm-2.54 should be changed to glibmm-2.56. Surprising that thousands separators are used in hexadecimal numbers. When I run glibmm_ustring_compose with locale sv_SE.UTF-8, it's written 0x7ff c37 5ed 0b4
Whoops, I'm usually really paranoid about warnings, so I don't know how that slipped through; apologies. (In reply to Kjell Ahlstedt from comment #15) > A possible fix is > > const int index = Ascii::digit_value(stop[1]) - 1; > if (index >= 0 && static_cast<std::size_t>(index) < ilist.size()) I guess it doesn't matter here, since we only accept placeholders 1~9, but personally I don't like using plain ints for indices, as a general rule, because it seems wrong (you know, in case of that really likely, practical scenario of a container with > 2.5 million items... :P) > Or can be rely on the fact that a negative value usually becomes a huge > positive > value when a signed int is converted to an unsigned int? > > const std::size_t index = Ascii::digit_value(stop[1]) - 1; > if (index < ilist.size()) Perhaps weirdly, I like this more, instinctively. However, we would have to cast the result of digit_value() to std::size_t first, then subtract - as while underflow is defined for unsigned values, it is undefined behaviour for integers, AFAIK. Anything that mixes integers of different signages and widths makes me flinch, and usually is doomed to code full of compromises. So, while I prefer the latter, I guess all we really need is something that works; please do whatever you prefer, or whatever is most in keeping with how you normally handle such situations in glibmm.
Ignore the bit about underflow; I have no idea why I thought that was relevant... in my defence, I was thinking about 2 other things at the same time!
Created attachment 359096 [details] [review] ustring: Avoid warning due to type confusion error: comparison of unsigned expression >= 0 is always true [-Werror=type-limits] if (index >= 0 && index < ilist.size()) As ilist.size() is static_assert()ed in compose() to be in range of int, we can safely convert to int for use as the bound. For clarity, convert to a new int, rather than comparing to static_cast<int>(ilist.size()). Using an int is clearer than e.g. relying on std::size_t underflowing to its max if digit_value() returns < 0 and therefore being < ilist.size(). -- This seems like the most readable and self-documenting way to do it, but I might be wrong.
(In reply to Daniel Boles from comment #19) > its max if digit_value() returns < 0 and therefore being < ilist.size(). s/therefore/hence not/, already updated locally
Your patch in comment 19 looks great. It's okay with me.
Attachment 359096 [details] pushed as 27309b1 - ustring: Avoid warning due to type confusion