GNOME Bugzilla – Bug 769532
Python3, ListStores and 64bit integers
Last modified: 2016-08-14 15:26:06 UTC
Created attachment 332753
Shows the difference between append() and insert_with_valuesv()

I've found several strange issues with Python 3 and ListStores holding 64-bit integers; see the attached example.

Some outstanding problems:
* Gtk.ListStore(int).append([10**10]) fails because 10**10 is bigger than 2^31, even though `int` is the only integer type in Python 3
* Gtk.ListStore(GObject.TYPE_GUINT64).append([10**10]) works correctly
* Gtk.ListStore(GObject.TYPE_GUINT64).insert_with_valuesv(-1, [0], [10**10]) succeeds, but truncates the inserted value to the int32 range

The last example is not an academic test: on Python 2 it works perfectly (with longs) and it's 8x faster than append(). To recover it in Python 3 I could do an explicit GObject cast:

addr = GObject.Value(GObject.TYPE_UINT64, addr)

and insert that, but the cast loses all the speed that insert_with_valuesv() gains.

Some armchair debugging: Python int is mapped to C int32 regardless of py2/3.

I'd be very happy even with a workaround, if a fix isn't feasible. Thanks.

OS: Debian sid
Version: python-gi 3.20.1-1
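For readers following along, the range problem behind the three cases above can be checked in plain Python before anything is inserted. This is only a sketch: `int_range` and `fits` are hypothetical helpers, not part of PyGObject, and it assumes the column declared as `int` is backed by a 32-bit signed C int, as the report suggests.

```python
def int_range(bits, signed):
    """Return the inclusive (min, max) of a C integer with the given width."""
    if signed:
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


def fits(value, bits, signed):
    """True if a Python int can be stored in such a C integer without loss."""
    low, high = int_range(bits, signed)
    return low <= value <= high


big = 10**10
print(fits(big, 32, signed=True))    # False: too large for a 32-bit signed column
print(fits(big, 64, signed=False))   # True: fits in a 64-bit unsigned column
```

A check like this could guard calls to insert_with_valuesv(), so that an out-of-range value raises in the caller instead of being stored truncated.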
Some details on what's going on here:

> Gtk.ListStore(int).append([10**10])

PyGObject has some default mappings of Python types to GTypes. As you suspected, Python int gets mapped to GObject.TYPE_INT (long to TYPE_LONG). The append() function in this case is a Python override which takes the column type, creates the right GValue and then passes it along.

> Gtk.ListStore(GObject.TYPE_GUINT64).append([10**10])

Same as above, but instead of TYPE_INT you use TYPE_UINT64.

> Gtk.ListStore(GObject.TYPE_GUINT64).insert_with_valuesv(-1, [0], [10**10])

If you use insert_with_valuesv() directly, the 10**10 is first converted to a GValue and, because it is an int, defaults to TYPE_INT. Only after that is it passed to insert_with_valuesv(). Arguably this should raise and not silently truncate.

If you want to use insert_with_valuesv() here you have to set up the GValue manually. This will still be faster than append(), as you remove some branches and function calls and can reuse one GObject.Value for inserting multiple rows:

l = Gtk.ListStore(GObject.TYPE_UINT64)
value = GObject.Value()
value.init(GObject.TYPE_UINT64)
for i in range(100):
    value.set_uint64(i)
    l.insert_with_valuesv(-1, [0], [value])

----

If you want to work with arbitrary Python ints you can also use TYPE_PYOBJECT, which just stores the Python object without converting it to a C integer. When using a TreeView you'd then have to use a custom cell renderer func which converts the Python object to text and applies it to the cell renderer.

l = Gtk.ListStore(GObject.TYPE_PYOBJECT)
value = GObject.Value()
value.init(GObject.TYPE_PYOBJECT)
for i in range(100):
    value.set_boxed(i)
    l.insert_with_valuesv(-1, [0], [value])
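The custom cell renderer func mentioned above might look roughly like the following. It is a minimal sketch, assuming the Python object sits in column 0 and should be displayed via str(); the function name is made up for illustration, and the wiring shown in the trailing comment assumes an existing Gtk.TreeViewColumn and Gtk.CellRendererText.

```python
def render_pyobject_as_text(column, cell, model, tree_iter, data=None):
    """Cell data func: show the Python object stored in column 0 as text."""
    # With a TYPE_PYOBJECT column, get_value() hands back the stored object.
    value = model.get_value(tree_iter, 0)
    cell.set_property("text", str(value))


# Hypothetical wiring, assuming `column` is a Gtk.TreeViewColumn and
# `renderer` is a Gtk.CellRendererText already packed into it:
#     column.set_cell_data_func(renderer, render_pyobject_as_text)
```

The `data=None` default is there so the function works whether or not extra user data is passed along to the callback.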
A huge thank you to Christoph Reiter for explaining to me what was going on.

Honestly, the fact that the native py2 long behaviour (and its speed) is unobtainable in py3 felt like a huge letdown. Have you thought about providing a dummy `fakelong` for py3 which is an `int` on the python side (subclassed from int) but an `int64` on the C side?

I used a workaround similar to the one CR proposed, but I re-instantiated the GValue every time, with:

addr = GObject.Value(GObject.TYPE_UINT64, int(addr, 16))

If I instantiate it only once, I get a large speed boost (thanks a lot!), but it's still slightly slower than py2.

append2:       370 ms
append3:       330 ms
valuesv2:       55 ms
valuesv3:       70 ms
old valuesv3:  150 ms

Any idea what overhead is still left in py3 (apart from the extra GValue.set_uint64() call)?

Relevant code portion (test is done with ~4k lines, string manipulation is irrelevant to timing):

start_time = timeit.default_timer()
if misc.PY3K:
    addr = GObject.Value(GObject.TYPE_UINT64)
    off = GObject.Value(GObject.TYPE_UINT64)
for line in lines:
    line = str(u(line))
    (mid, line) = line.split(']', 1)
    mid = int(mid.strip(' []'))
    (addr_str, off_str, rt, val, t) = list(map(str.strip, line.split(',')[:5]))
    t = t.strip(' []')
    if t == 'unknown':
        continue
    # `insert_with_valuesv` does the same job as `append`, but it's 7x faster
    # PY3 has problems with ints, so we need a forced guint64 conversion
    # Still 3x faster even with the extra baggage
    if misc.PY3K:
        addr.set_uint64(int(addr_str, 16))
        off.set_uint64(int(off_str.split('+')[1], 16))
    else:
        addr = long(addr_str, 16)
        off = long(off_str.split('+')[1], 16)
    self.scanresult_liststore.insert_with_valuesv(-1, [0, 1, 2, 3, 4, 5, 6], [addr, val, t, True, off, rt, mid])
    #~ self.scanresult_liststore.append([addr, val, t, True, off, rt, mid])
print((timeit.default_timer() - start_time)*1000, 'ms')
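To make the `fakelong` idea from this comment concrete, here is a rough sketch of what the Python side of such a type could look like. It is purely hypothetical: `UInt64` is an invented name, PyGObject does not recognise it, and the C-side mapping to a 64-bit GType would still have to be implemented in PyGObject itself.

```python
class UInt64(int):
    """Hypothetical marker type: an int known to fit in 64 unsigned bits."""

    MAX = (1 << 64) - 1

    def __new__(cls, value=0, *args):
        # Extra positional args (e.g. a base) are forwarded to int().
        number = super().__new__(cls, value, *args)
        if not 0 <= number <= cls.MAX:
            raise OverflowError(
                "%d does not fit in an unsigned 64-bit integer" % number)
        return number


addr = UInt64("7ffe0000", 16)
print(isinstance(addr, int))   # True
print(type(addr + 1) is int)   # True: arithmetic falls back to plain int
```

As the last line shows, arithmetic on such a subclass returns a plain `int`, so a value would have to be wrapped again right before it is handed to the list store.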
I guess the difference is due to Python code vs C code.

I use the following pattern everywhere (classes derived from object are automatically converted to TYPE_PYOBJECT):

from gi.repository import Gtk, GObject

class MyEntry(object):
    def __init__(self, addr, off):
        self.addr = addr
        self.off = off

l = Gtk.ListStore(GObject.TYPE_PYOBJECT)
for i in range(100):
    l.insert_with_valuesv(-1, [0], [MyEntry(i, -i)])

for row in l:
    print(row[0].addr, row[0].off)
I may try this last suggestion, but for the sake of the poor guy who'll inherit this code I'll likely stick with my latest approach. Thank you, C.R.

Is there somewhere I should write about the silent truncation of Gtk.ListStore(GObject.TYPE_GUINT64).insert_with_valuesv(-1, [0], [10**10]) ?
(In reply to andreastacchiotti from comment #4)
> Is there somewhere I should write about the silent truncation of
> Gtk.ListStore(GObject.TYPE_GUINT64).insert_with_valuesv(-1, [0], [10**10]) ?

I've opened bug 769789 for this.

Is there anything else you think should be addressed or can we close this issue?
I wonder if this suggestion:

> Have you thought about providing a dummy `fakelong` for py3 which is an `int` on the python side (subclassed from int) but an `int64` on the C side?

is feasible, and where I should post it in that case.

Other than that, this can be closed, thanks for your help.
(In reply to andreastacchiotti from comment #6)
> I wonder if this suggestion:
>
> > Have you thought about providing a dummy `fakelong` for py3 which is an `int` on the python side (subclassed from int) but an `int64` on the C side?
>
> is feasible and where should I post it in that case.

I remember some discussion on this, but can't seem to find it right now :/ Feel free to open a new bug for this.

IMO we should prioritize documenting the current behavior first before trying to make it easier. I've started something here: https://pygobject.readthedocs.io/en/latest/gobject.html but my motivation is currently lacking :)

> Other than that, this can be closed, thanks for your help.

OK.