Bug 679438 – Python 3




Summary:	Python 3


Status:	RESOLVED FIXED

Product:	gobject-introspection
Classification:	Platform
Component:	general
Version:	2.33.x
Hardware:	Other Linux

Importance:	Normal normal
Target Milestone:	---
Assigned To:	gobject-introspection Maintainer(s)
QA Contact:	gobject-introspection Maintainer(s)

URL:
Whiteboard:

Duplicates:	669866 691241 728079
Depends on:
Blocks:	684103 python3

Reported:	2012-07-05 09:59 UTC by Christoph Höger
Modified:	2015-09-30 13:51 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
giscanner: Use Python 3 compatible octal literal syntax (1.07 KB, patch) 2014-05-02 04:35 UTC, Simon Feltman	committed	Details \| Review
giscanner: Use modern exception handling for Python 3 compatibility (893 bytes, patch) 2014-05-02 04:35 UTC, Simon Feltman	committed	Details \| Review
giscanner: Use range instead of xrange for Python 3 compatibility (1018 bytes, patch) 2014-05-02 04:35 UTC, Simon Feltman	committed	Details \| Review
giscanner: Use binary files for comparison utility (890 bytes, patch) 2014-05-02 04:35 UTC, Simon Feltman	none	Details \| Review
giscanner: Convert map() results to list (3.31 KB, patch) 2014-05-02 04:35 UTC, Simon Feltman	none	Details \| Review
giscanner: Use items() instead of iteritems() (10.75 KB, patch) 2014-05-02 04:35 UTC, Simon Feltman	none	Details \| Review
giscanner: Use absolute_import for all Python files (11.76 KB, patch) 2014-05-02 04:35 UTC, Simon Feltman	none	Details \| Review
giscanner: Enable "true division" for all Python files (12.30 KB, patch) 2014-05-02 04:35 UTC, Simon Feltman	none	Details \| Review
giscanner: Use print as a function for Python 3 compatibility (20.90 KB, patch) 2014-05-02 04:35 UTC, Simon Feltman	none	Details \| Review
giscanner: Replace repr format usage with string formatter (19.33 KB, patch) 2014-05-02 04:35 UTC, Simon Feltman	none	Details \| Review
giscanner: Use unicode literals in all Python files (26.16 KB, patch) 2014-05-02 04:35 UTC, Simon Feltman	none	Details \| Review
giscanner: Port scanner extension module to work with Python 3 (6.84 KB, patch) 2014-05-02 04:36 UTC, Simon Feltman	none	Details \| Review
giscanner: Use pickle when cPickle is not available (2.16 KB, patch) 2014-05-02 04:36 UTC, Simon Feltman	none	Details \| Review
giscanner: Use builtins module in Python 3 (3.72 KB, patch) 2014-05-02 04:36 UTC, Simon Feltman	none	Details \| Review
giscanner: Use StringIO instead of cStringIO in Python 2 (5.57 KB, patch) 2014-05-02 04:36 UTC, Simon Feltman	none	Details \| Review
giscanner: Decode command output for Python 3 compatibility (2.48 KB, patch) 2014-05-02 04:36 UTC, Simon Feltman	none	Details \| Review
giscanner: Encode data passed to subprocess.stdin.write (1.58 KB, patch) 2014-05-02 04:36 UTC, Simon Feltman	none	Details \| Review
giscanner: Encode sha1 input for Python 3 compatibility (1.48 KB, patch) 2014-05-02 04:36 UTC, Simon Feltman	none	Details \| Review
giscanner: Update namespace sort for Python 3 compatibility (1.91 KB, patch) 2014-05-02 04:36 UTC, Simon Feltman	none	Details \| Review
docwriter: Update for Python 3 compatibility (3.65 KB, patch) 2014-05-02 04:36 UTC, Simon Feltman	none	Details \| Review
giscanner: Use rich comparison methods for Python 3 compatibility (10.78 KB, patch) 2014-05-02 04:36 UTC, Simon Feltman	none	Details \| Review
giscanner: Sort unknown parameters in error message (1.29 KB, patch) 2014-05-02 04:36 UTC, Simon Feltman	none	Details \| Review
configure.ac: Add --with-python configure flag (1.38 KB, patch) 2014-05-02 04:36 UTC, Simon Feltman	none	Details \| Review
Change update-glib-annotations to use Python 3 (711 bytes, patch) 2014-05-02 04:36 UTC, Simon Feltman	none	Details \| Review

Description Christoph Höger 2012-07-05 09:59:38 UTC

Hi all,

I just (mostly) finished a gobject-introspection port to windows [1]. This includes:

- Building for Python 3.2 (especially g-ir-scanner has been ported to Python 3)

- Building with cmake (including a port of introspection.m4)

I basically needed this to create the GObject-mainloop from python to write a dbus service (what a long tail in the end ...). Nevertheless, I did a certain amount of work and it would be a shame if it was all lost.

I understand that most of you will probably not want to switch from autotools to cmake, but just in case, feel free to do so.

Anyway, I guess the Python patches should be usable in an autotools world, too ;). There are plenty of small fixes (and probably not all issues have been discovered yet). I also rewrote parts of the C extensions to work with Python 3.2. In general I tried to stay backwards compatible, but might have failed.

I know that those patches over there are not in the best pull-request shape, that is mostly because I needed to get the work done. I don't have that much time, but feel free to ask for certain changes to the repo if that helps upstreaming the work.

best regards

Christoph

[1] https://github.com/choeger/gobject-introspection/tree/windows_python3_cmake

Comment 1 André Klapper 2012-07-09 10:38:28 UTC

Yeah, it requires somebody to break this into smaller, consumable patches. I'm afraid that nobody will have time to find and extract the actual improvements.

Comment 2 Colin Walters 2012-07-10 21:24:18 UTC

(In reply to comment #1)
> Yeah, it requires somebody to break this into smaller, consumable patches. I'm
> afraid that nobody will have time to find and extract the actual improvements.

Yeah, or even just split up python 3 from cmake.

Comment 3 Matthias Clasen 2013-03-07 13:07:47 UTC

not going to complete this for 3.8, unless somebody shows up and does the work.

Comment 4 Emmanuele Bassi (:ebassi) 2013-08-26 17:25:46 UTC

*** Bug 691241 has been marked as a duplicate of this bug. ***

Comment 5 Emmanuele Bassi (:ebassi) 2013-08-27 10:41:04 UTC

*** Bug 669866 has been marked as a duplicate of this bug. ***

Comment 6 Matthias Clasen 2013-09-02 16:59:01 UTC

Dropping off the blocker list.

Comment 7 Simon Feltman 2014-01-07 00:24:02 UTC

One thing that could help here is the usage of "six":
http://pythonhosted.org/six/

This provides a compatibility shim that can be shipped with GI if we need to support multiple versions of Python.

Comment 8 Simon Feltman 2014-05-02 04:14:10 UTC

*** Bug 728079 has been marked as a duplicate of this bug. ***

Comment 9 Simon Feltman 2014-05-02 04:35:19 UTC

Created attachment 275597 [details] [review]
giscanner: Use Python 3 compatible octal literal syntax
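
A minimal illustration of the syntax change, not taken from the patch itself: the `0o` prefix is accepted by Python 2.6+ and Python 3, while the bare leading-zero form is a syntax error in Python 3. The permission value is only an example.

```python
import stat

# Python 2 only: mode = 0755
# Python 2.6+ and Python 3:
mode = 0o755

# The same value spelled out with the stat constants.
expected = (stat.S_IRWXU | stat.S_IRGRP | stat.S_IXGRP |
            stat.S_IROTH | stat.S_IXOTH)
```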

Comment 10 Simon Feltman 2014-05-02 04:35:23 UTC

Created attachment 275598 [details] [review]
giscanner: Use modern exception handling for Python 3 compatibility
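
For illustration only (the function name `parse_int` is hypothetical and not part of giscanner), the "modern" form is `except ... as ...`, which works on Python 2.6+ and Python 3:

```python
def parse_int(text):
    """Return int(text), or None if text is not a valid integer."""
    try:
        return int(text)
    # Python 2 only form was: except ValueError, error:
    except ValueError as error:
        # The exception object is bound to "error" only inside this block.
        assert isinstance(error, ValueError)
        return None


good = parse_int("42")
bad = parse_int("forty-two")
```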

Comment 11 Simon Feltman 2014-05-02 04:35:26 UTC

Created attachment 275599 [details] [review]
giscanner: Use range instead of xrange for Python 3 compatibility

Comment 12 Simon Feltman 2014-05-02 04:35:29 UTC

Created attachment 275600 [details] [review]
giscanner: Use binary files for comparison utility

Explicitly open files for comparison in utils.files_are_identical()
in binary mode for reading (rb).
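
A rough sketch of what a byte-for-byte comparison looks like; `files_are_identical_sketch` and `write_bytes` are hypothetical helpers, and the real utils.files_are_identical() may be implemented differently:

```python
import os
import tempfile


def files_are_identical_sketch(path1, path2):
    """Compare two files byte for byte (hypothetical helper)."""
    # "rb" avoids newline translation and any decoding step.
    with open(path1, "rb") as f1, open(path2, "rb") as f2:
        return f1.read() == f2.read()


def write_bytes(data):
    """Write data to a new temporary file and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    return path


a = write_bytes(b"<repository/>\n")
b = write_bytes(b"<repository/>\n")
c = write_bytes(b"<repository/>\r\n")
same = files_are_identical_sketch(a, b)
different = files_are_identical_sketch(a, c)
for path in (a, b, c):
    os.remove(path)
```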

Comment 13 Simon Feltman 2014-05-02 04:35:36 UTC

Created attachment 275601 [details] [review]
giscanner: Convert map() results to list

Convert the results of map() calls to a list for Python 3 compatibility.
In Python 3, map() returns a lazy "map object" (an iterator) which does not
allow indexing and can only be iterated over once.
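
A small demonstration of the behaviour described above, assuming Python 3 (the names are made up for the example):

```python
names = ["glib", "gobject", "gio"]

lazy = map(len, names)
first_pass = list(lazy)
second_pass = list(lazy)   # Python 3: the iterator is already exhausted

# Wrapping in list() restores indexing and repeated iteration.
lengths = list(map(len, names))
longest = lengths[1]
```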

Comment 14 Simon Feltman 2014-05-02 04:35:40 UTC

Created attachment 275602 [details] [review]
giscanner: Use items() instead of iteritems()

Replace usage of iteritems() and itervalues() with items() and values()
respectively.
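
As a sketch with made-up data: items() and values() exist on both versions. On Python 2 they build a full list, on Python 3 they return a view, so the same loop runs on either.

```python
symbols = {"g_object_ref": "GObject", "g_file_new_for_path": "Gio"}

by_namespace = {}
# Python 2 only: symbols.iteritems()
for symbol, namespace in symbols.items():
    by_namespace.setdefault(namespace, []).append(symbol)

# Python 2 only: symbols.itervalues()
namespaces = sorted(symbols.values())
```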

Comment 15 Simon Feltman 2014-05-02 04:35:43 UTC

Created attachment 275603 [details] [review]
giscanner: Use absolute_import for all Python files

Use absolute_import to ensure Python 3 compatibility of the code base.

Comment 16 Simon Feltman 2014-05-02 04:35:47 UTC

Created attachment 275604 [details] [review]
giscanner: Enable "true division" for all Python files

Import Python 3 compatible "true division" from the future (PEP 238).
This changes the Python 2 classic division, which uses floor division
on integers, to true division. Verified that we don't actually use the
division operator anywhere in the code base, so this is a safety measure
for supporting both Python 2 and 3.
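
To make the difference concrete (values chosen arbitrarily; the results shown in the comments are for Python 3, or Python 2 with the future import):

```python
# Under Python 2 the module would start with:
#   from __future__ import division
half = 7 / 2     # true division: 3.5 (classic Python 2 division gives 3)
floor = 7 // 2   # floor division is explicit and identical on both: 3
```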

Comment 17 Simon Feltman 2014-05-02 04:35:51 UTC

Created attachment 275605 [details] [review]
giscanner: Use print as a function for Python 3 compatibility

Use future import "print_function" and update relevant uses of print
as a function call. See: PEP 3105
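
A short sketch of the function form, written for Python 3 (the message text is made up). With the future import, the same calls are valid on Python 2.6+.

```python
import io
import sys

# Under Python 2 the module would start with:
#   from __future__ import print_function
buf = io.StringIO()
print("GISCAN", "GLib-2.0.gir", sep="   ", file=buf)

# Python 2 statement form was: print >>sys.stderr, "warning: ..."
print("warning: something looks odd", file=sys.stderr)

output = buf.getvalue()
```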

Comment 18 Simon Feltman 2014-05-02 04:35:55 UTC

Created attachment 275606 [details] [review]
giscanner: Replace repr format usage with string formatter

Replace occurrences of "%r" (repr) in format strings where the intended
behaviour is to output a quoted string "'foo'" with explicit usage
of "'%s'". This is needed to move the codebase to unicode literals
in order to upgrade to Python 3. Python 2 unicode strings are expanded
with repr formatting prefixed with a "u" as in "u'foo'" which causes
failures for various text formatting scenarios.
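
Roughly, the change looks like this (the message text is invented for the example). On Python 3 both forms produce the same text, while on Python 2 with unicode_literals the "%r" form would produce u'foo'.

```python
name = "foo"

with_repr = "invalid annotation %r" % (name,)
with_quotes = "invalid annotation '%s'" % (name,)
```

One difference to keep in mind: "%r" also escapes quotes and non-printable characters inside the value, which the explicit "'%s'" form does not.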

Comment 19 Simon Feltman 2014-05-02 04:35:59 UTC

Created attachment 275607 [details] [review]
giscanner: Use unicode literals in all Python files

Add unicode_literals future import which turns any string literal
into a unicode string. Return unicode strings from the Python C extension
module. Force writing of annotations (g-ir-annotation-tool) to output utf8
encoded data to stdout.
This is an initial pass at following the "unicode sandwich"
model of programming (http://nedbatchelder.com/text/unipain.html)
needed for supporting Python 3.

Comment 20 Simon Feltman 2014-05-02 04:36:04 UTC

Created attachment 275608 [details] [review]
giscanner: Port scanner extension module to work with Python 3

Define portable macros for use between Python 2 and 3.
Replace usage of PyString related functions with PyBytes.
Update pygi_source_scanner_parse_macros to support both
PyBytes and PyUnicode.

Comment 21 Simon Feltman 2014-05-02 04:36:08 UTC

Created attachment 275609 [details] [review]
giscanner: Use pickle when cPickle is not available

This adds compatibility with Python 3 which removed the
cPickle module.
Explicitly use binary files for reading and writing the cache.
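
A minimal sketch of the fallback import plus binary-mode file handling; `save_cache` and `load_cache` are hypothetical names, not the functions in the patch:

```python
import os
import tempfile

try:
    import cPickle as pickle   # Python 2: faster C implementation
except ImportError:
    import pickle              # Python 3: cPickle no longer exists


def save_cache(path, obj):
    """Pickle obj to path; pickle data is bytes, hence "wb"."""
    with open(path, "wb") as f:
        pickle.dump(obj, f, pickle.HIGHEST_PROTOCOL)


def load_cache(path):
    """Load a pickled object from path, reading bytes."""
    with open(path, "rb") as f:
        return pickle.load(f)


fd, cache_path = tempfile.mkstemp()
os.close(fd)
save_cache(cache_path, {"GLib": [2, 0]})
restored = load_cache(cache_path)
os.remove(cache_path)
```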

Comment 22 Simon Feltman 2014-05-02 04:36:12 UTC

Created attachment 275610 [details] [review]
giscanner: Use builtins module in Python 3

Add conditional import for Python 3's renamed builtins module.
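
The conditional import can be sketched as follows (the patch may order the branches differently):

```python
try:
    import builtins                  # Python 3 name
except ImportError:
    import __builtin__ as builtins   # Python 2 name

# Either way, "builtins" now refers to the module holding len, open, etc.
has_len = builtins.len is len
```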

Comment 23 Simon Feltman 2014-05-02 04:36:16 UTC

Created attachment 275611 [details] [review]
giscanner: Use StringIO instead of cStringIO in Python 2

Replace usage of the Python 2 cStringIO module with StringIO and
conditionally use io.StringIO for Python 3. This is needed to build
up a unicode version of the XML since cStringIO does not support
unicode. Add XMLWriter.get_encoded_xml() which returns a utf-8 encoded
bytes object of the XML data.
Open files for reading/writing in binary mode since we explicitly
encode and decode as utf-8.
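
A simplified sketch of the idea, assuming Python 3: `XMLWriterSketch` is a hypothetical stand-in for the real XMLWriter, and for brevity it uses io.StringIO unconditionally instead of the conditional import the patch describes.

```python
import io


class XMLWriterSketch(object):
    """Accumulate XML as unicode text; encode only at the boundary."""

    def __init__(self):
        self._data = io.StringIO()
        self._data.write(u'<?xml version="1.0"?>\n')

    def write_line(self, line):
        self._data.write(line + u"\n")

    def get_xml(self):
        """Return the document as a unicode string."""
        return self._data.getvalue()

    def get_encoded_xml(self):
        """Return the document as UTF-8 encoded bytes."""
        return self.get_xml().encode("utf-8")


writer = XMLWriterSketch()
writer.write_line(u'<namespace name="Caf\u00e9"/>')
encoded = writer.get_encoded_xml()
```

Since the result is bytes, a caller would write it with a file opened in "wb" mode.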

Comment 24 Simon Feltman 2014-05-02 04:36:19 UTC

Created attachment 275612 [details] [review]
giscanner: Decode command output for Python 3 compatibility

Decode the output of various subprocess calls assuming ascii
encoding.
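
For illustration, with a child Python process standing in for pkg-config (an assumption made so the example runs anywhere): subprocess output is bytes on Python 3 and has to be decoded before it can be treated as text.

```python
import subprocess
import sys

# The child process here only imitates a tool printing a linker flag.
raw = subprocess.check_output(
    [sys.executable, "-c", "print('-lglib-2.0')"])

# raw is bytes; decoding as ASCII fails loudly on unexpected input.
flags = raw.decode("ascii").split()
```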

Comment 25 Simon Feltman 2014-05-02 04:36:23 UTC

Created attachment 275613 [details] [review]
giscanner: Encode data passed to subprocess.stdin.write

ASCII encode bytes sent to subprocess.stdin.write to ensure
Python 2 and 3 compatibility.
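
A sketch of the same idea in the other direction. A child Python process stands in for gcc here, and communicate() is used instead of a direct stdin.write() call, so this is not the code from the patch.

```python
import subprocess
import sys

# Stand-in child: echoes its stdin back in upper case.
child_code = "import sys; sys.stdout.write(sys.stdin.read().upper())"
proc = subprocess.Popen([sys.executable, "-c", child_code],
                        stdin=subprocess.PIPE,
                        stdout=subprocess.PIPE)

source = u"#include <glib.h>\n"
# Pipes carry bytes, so text must be encoded before it is sent.
out, _ = proc.communicate(source.encode("ascii"))
result = out.decode("ascii")
```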

Comment 26 Simon Feltman 2014-05-02 04:36:26 UTC

Created attachment 275614 [details] [review]
giscanner: Encode sha1 input for Python 3 compatibility
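
As an illustration (`cache_name_sketch` is a hypothetical helper, and the choice of UTF-8 is an assumption): hashlib only accepts bytes on Python 3, so text has to be encoded before hashing.

```python
import hashlib


def cache_name_sketch(filename):
    """Derive a hex digest from a text file name (hypothetical helper)."""
    # hashlib.sha1(u"...") raises TypeError on Python 3; encode first.
    return hashlib.sha1(filename.encode("utf-8")).hexdigest()


digest = cache_name_sketch(u"GLib-2.0.gir")
```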

Comment 27 Simon Feltman 2014-05-02 04:36:29 UTC

Created attachment 275615 [details] [review]
giscanner: Update namespace sort for Python 3 compatibility

Use key function instead of cmp for list.sort which is compatible
with both Python 2 and 3. Make sure a list is returned from split
function. Don't use identity comparison "is" on strings.
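
A sketch of both points with made-up data (this is not the namespace sort from the patch): list.sort() lost its cmp argument in Python 3, and strings should be compared with "==" rather than "is".

```python
entries = [("Gio", "File"), ("GLib", "Variant"), ("GLib", "Bytes")]

# Python 2 only: entries.sort(cmp=lambda a, b: cmp(a, b))
# A key function works on both Python 2 and 3:
entries.sort(key=lambda entry: (entry[0], entry[1]))

namespace = "".join(["GL", "ib"])   # equal to "GLib", built at runtime
# "is" tests object identity, which is not guaranteed for equal strings.
is_glib = namespace == "GLib"
```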

Comment 28 Simon Feltman 2014-05-02 04:36:32 UTC

Created attachment 275616 [details] [review]
docwriter: Update for Python 3 compatibility

Convert the results of various filter() calls to lists. This is
needed because filter() returns a lazy iterator in Python 3 and len()
checks are used on the results (which doesn't work on an iterator).
Explicitly open resulting files for output in binary mode.
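
The len() problem can be reproduced with a few lines, assuming Python 3 (the section names are invented):

```python
sections = ["Methods", "", "Signals", ""]

try:
    len(filter(None, sections))
    lazy_has_len = True          # Python 2: filter() returned a list
except TypeError:
    lazy_has_len = False         # Python 3: filter objects have no len()

non_empty = list(filter(None, sections))
count = len(non_empty)
```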

Comment 29 Simon Feltman 2014-05-02 04:36:35 UTC

Created attachment 275617 [details] [review]
giscanner: Use rich comparison methods for Python 3 compatibility

Add lt, le, gt, ge, eq, ne, and hash dunder methods to all classes that
implement custom comparisons with __cmp__. This is needed to support Python 3
compatible sorting of instances of these classes.
Avoid using @functools.total_ordering which does not work for some of these
classes and also is not available in Python 2.6.
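
A sketch of the pattern on a hypothetical class (`VersionedName` is not a giscanner class): every comparison is spelled out by hand around a shared key, without functools.total_ordering.

```python
class VersionedName(object):
    """Hypothetical stand-in for a class that used to define __cmp__."""

    def __init__(self, name, version):
        self.name = name
        self.version = version

    def _key(self):
        return (self.name, self.version)

    def __eq__(self, other):
        return self._key() == other._key()

    def __ne__(self, other):
        return not self == other

    def __lt__(self, other):
        return self._key() < other._key()

    def __le__(self, other):
        return self._key() <= other._key()

    def __gt__(self, other):
        return self._key() > other._key()

    def __ge__(self, other):
        return self._key() >= other._key()

    def __hash__(self):
        # Python 3 drops the default __hash__ once __eq__ is defined.
        return hash(self._key())


glib = VersionedName("GLib", "2.0")
gio = VersionedName("Gio", "2.0")
ordered = sorted([gio, glib])
unique = set([glib, VersionedName("GLib", "2.0")])
```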

Comment 30 Simon Feltman 2014-05-02 04:36:39 UTC

Created attachment 275618 [details] [review]
giscanner: Sort unknown parameters in error message

Sort the parameters displayed for the "unknown parameters"
error message. The parameter names are stored in a set which
returns a different ordering between Python 2 and 3
(set/dict ordering should not be relied upon anyhow).
This fixes test failures in warning tests.
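
In sketch form (the message wording and parameter names are made up), sorting before formatting makes the text independent of set ordering:

```python
unknown = set(["zeta", "alpha", "mid"])

# Joining the set directly would depend on hash ordering.
message = "unknown parameters: %s" % ", ".join(sorted(unknown))
```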

Comment 31 Simon Feltman 2014-05-02 04:36:43 UTC

Created attachment 275619 [details] [review]
configure.ac: Add --with-python configure flag

Add --with-python flag which overrides the $PYTHON environment
variable when used.

Comment 32 Simon Feltman 2014-05-02 04:36:47 UTC

Created attachment 275620 [details] [review]
Change update-glib-annotations to use Python 3

Comment 33 Simon Feltman 2014-05-02 04:45:17 UTC

Pushing trivial fixes.

Attachment 275597 [details] pushed as 7af0443 - giscanner: Use Python 3 compatible octal literal syntax
Attachment 275598 [details] pushed as c83efc8 - giscanner: Use modern exception handling for Python 3 compatibility
Attachment 275599 [details] pushed as 1c1039b - giscanner: Use range instead of xrange for Python 3 compatibility

Comment 34 Simon Feltman 2014-05-02 05:07:21 UTC

Review of attachment 275612 [details] [review]:

I was uncertain about the codec to use when reading the subprocess's output with this patch. We could use the system's default, but I think it should probably be based on what the commands being run actually output, which needs some research. ASCII has the benefit of giving us an early error during testing, as opposed to making a guess or using the system encoding, which may fail later on...

We interact with pkg-config and gcc via stdin and stdout, which I think need to be double-checked for what they can encode/decode. Or can we rely on some default?

Comment 35 Simon Feltman 2014-05-02 05:13:29 UTC

Review of attachment 275613 [details] [review]:

Similar to the previous comment, I am unsure about the encoding to use. This is sending data to gcc via stdin. gcc supposedly defaults to decoding UTF-8 for input files, but I am not sure this is the case when using stdin. I just found [1], which states that textual output from the preprocessor using -E is in fact UTF-8, so that at least answers part of my question in the previous patch.


[1] http://gcc.gnu.org/onlinedocs/cpp/Character-sets.html

Comment 36 André Klapper 2015-02-07 17:16:40 UTC

[Mass-moving gobject-introspection tickets to its own Bugzilla product - see bug 708029. Mass-filter your bugmail for this message: introspection20150207 ]

Comment 37 Marvin Schmidt 2015-04-26 19:58:43 UTC

What is holding up getting these patches into master? I'd really like to see Python 3 support and it looks like all the bits are there, no?

Comment 38 Emmanuele Bassi (:ebassi) 2015-04-26 20:33:14 UTC

(In reply to Marvin Schmidt from comment #37)
> What is holding up getting these patches into master?

A proper review.

Comment 39 Thomas Caswell 2015-09-29 15:27:32 UTC

Review of attachment 275600 [details] [review]:

This is correct. If binary vs. text mode matters, this will do the right thing; if it does not, it is a no-op.

See https://docs.python.org/3/library/functions.html#open

Comment 40 Thomas Caswell 2015-09-29 15:30:52 UTC

Review of attachment 275601 [details] [review]:

This is correct.

Comment 41 Thomas Caswell 2015-09-29 15:33:27 UTC

Review of attachment 275602 [details] [review]:

This looks correct and should work on both 2 and 3, but it may be better to pick up a dependency on six (or just vendor the relevant bits) to use the `iteritems` function, e.g. `dd.iteritems()` -> `six.iteritems(dd)`, which will give a generator in all cases.

Comment 42 Thomas Caswell 2015-09-29 15:36:34 UTC

Review of attachment 275603 [details] [review]:

Looks fine.

Only comment would be to merge all of the `__future__` imports into a single line.  I assume there will be a `from __future__ import print_function` coming someplace else down the patch list.

Comment 43 Thomas Caswell 2015-09-29 15:40:59 UTC

Review of attachment 275604 [details] [review]:

+1

Comment 44 Thomas Caswell 2015-09-29 15:42:19 UTC

Review of attachment 275605 [details] [review]:

+1.

Same comment as 2 patches above, might be worth merging all of the `__future__` imports into one line.

Comment 45 Thomas Caswell 2015-09-29 15:57:17 UTC

Review of attachment 275606 [details] [review]:

L538 in annotationparser.py seems to have a bug, but it was pre-existing.

This seems reasonable.

Comment 46 Thomas Caswell 2015-09-29 16:04:53 UTC

Review of attachment 275607 [details] [review]:

In matplotlib we do use `from __future__ import unicode_literals` and it has caused a good deal of headache, but I don't know whether not using it would cause any more or less.

Comment 47 Thomas Caswell 2015-09-29 16:05:11 UTC

Review of attachment 275607 [details] [review]:

Otherwise, this looks good.

Comment 48 Thomas Caswell 2015-09-29 16:07:02 UTC

Review of attachment 275609 [details] [review]:

+1, looks correct.

Comment 49 Thomas Caswell 2015-09-29 16:08:07 UTC

Review of attachment 275610 [details] [review]:

+1, looks correct.

Comment 50 Thomas Caswell 2015-09-29 16:10:45 UTC

Review of attachment 275611 [details] [review]:

Looks correct.

Comment 51 Thomas Caswell 2015-09-29 16:13:19 UTC

Review of attachment 275614 [details] [review]:

+1 looks fine

Comment 52 Thomas Caswell 2015-09-29 16:16:50 UTC

Review of attachment 275615 [details] [review]:

I don't see where "Make sure a list is returned from split function. " is done.

Otherwise looks good.

Comment 53 Thomas Caswell 2015-09-29 16:21:28 UTC

Review of attachment 275615 [details] [review]:

This also fixes an inconsistency in the existing implementation as `self._sort_order(a, b) == self._sort_order(b, a)` if they are both in the namespace.  The current code should fall back to checking if both are in the namespace and sorting on `a[2]`. This patch fixes this behavior.

Comment 54 Thomas Caswell 2015-09-29 16:25:08 UTC

Review of attachment 275616 [details] [review]:

L375 adds a sorted call which I think is a change in behavior.

I assume this is needed for deterministic output (as in py3.4+ the iteration order of dictionaries is random process-to-process (as the hash seed is randomized per-process)).

Other than a minor change in behavior, which is unavoidable due to upstream python changes, +1

Comment 55 Colin Walters 2015-09-29 16:30:36 UTC

Thomas, are you doing any testing of this against git master?

If so, I would try to apply the easy ones early, rather than reviewing the whole batch at once then waiting for someone else to apply.

Comment 56 Thomas Caswell 2015-09-29 16:35:11 UTC

Review of attachment 275617 [details] [review]:

A year and a half later, it is probably safe to drop 2.6 support, but that is above my pay-grade on this project.

Bunch of duplicated code, but probably not worth making a mix-in or some such for this.

Comment 57 Thomas Caswell 2015-09-29 16:40:44 UTC

Review of attachment 275618 [details] [review]:

in 3.3+ (by default, and user configurable earlier) the ordering depends on a hash seed.

+1

Comment 58 Thomas Caswell 2015-09-29 16:45:03 UTC

Review of attachment 275616 [details] [review]:

^off by one version, 3.3 is when dict iteration order is randomized by default.

Comment 59 Thomas Caswell 2015-09-29 16:45:48 UTC

Review of attachment 275612 [details] [review]:

Maybe consult `locale` here?  Isn't this sort of thing very system dependent?
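
One way this suggestion could look, as a sketch only (whether the locale encoding is the right choice for pkg-config or gcc output is exactly the open question; the byte string is a made-up example):

```python
import locale

# Ask for the user's preferred encoding without changing the locale.
encoding = locale.getpreferredencoding(False)

raw = b"-I/usr/include/glib-2.0"
# "replace" avoids an exception on undecodable bytes, at the cost of
# silently substituting them.
decoded = raw.decode(encoding, "replace")
```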

Comment 60 Thomas Caswell 2015-09-29 19:30:39 UTC

This stack of patches re-based onto current master is at https://github.com/GNOME/gobject-introspection/pull/4 

Please advise on g-o's process for this sort of thing?

Comment 61 Colin Walters 2015-09-30 03:05:41 UTC

(In reply to Thomas Caswell from comment #60)
> This stack of patches re-based onto current master is at
> https://github.com/GNOME/gobject-introspection/pull/4 
> 
> Please advise on g-o's process for this sort of thing?

So...I *personally* think bugzilla/splinter is for single patches but it breaks down once you go past 3.

If you reference github (or external git repositories) in general that's OK by me.

Comment 62 Colin Walters 2015-09-30 03:08:37 UTC

I played with this a bit and hit:

ab9cb31951ea199b326eb543fac8516eb3ccc08e is the first bad commit

  GISCAN   GLib-2.0.gir
Caught exception: <type 'exceptions.TypeError'> TypeError("'output_dir' must be a string or None",)
> /usr/lib64/python2.7/distutils/ccompiler.py(329)_setup_compile()
-> raise TypeError, "'output_dir' must be a string or None"
(Pdb) q


Before I try to debug this I'm going to go ahead and merge the patches up before that commit, to make some progress here.

Comment 63 Colin Walters 2015-09-30 03:09:55 UTC

(In reply to Thomas Caswell from comment #41)
> Review of attachment 275602 [details] [review] [review]:
> 
> This looks correct and should work on both 2 and 3, but it may be better to
> pick up a dependency on six (or just vendor the relevant bits) to use the
> `iteritems` function, ex `dd.iteritems()` -> `six.iteritems(dd)` which will
> give a generator in all cases.

Yeah...it's annoying.  But g-i isn't really in a performance fast path now, and I'm not sure I want to bring six into the picture yet...so we can just take the overhead of duplicating lists?

Comment 64 Colin Walters 2015-09-30 03:23:46 UTC

Fixed on Python 2 with:

diff --git a/giscanner/ccompiler.py b/giscanner/ccompiler.py
index a401cfd..481b1e4 100644
--- a/giscanner/ccompiler.py
+++ b/giscanner/ccompiler.py
@@ -236,7 +236,7 @@ class CCompiler(object):
                                      include_dirs=includes,
                                      extra_postargs=extra_postargs,
                                      output_dir=source_str[tmpdir_idx + 1:
-                                                           source_str.rfind(os.sep)])
+                                                           source_str.rfind(os.sep)].encode('UTF-8'))
 
     def link(self, output, objects, lib_args):
         # Note: This is used for non-libtool builds only!

Looking at more issues...

SyntaxError: (unicode error) 'rawunicodeescape' codec can't decode bytes in position 17-18: truncated \uXXXX

Comment 65 Colin Walters 2015-09-30 03:36:28 UTC

Okay, fixed that.  Pushed all of this to git master.

Thomas (and anyone else) - let's do followup bugs/PRs etc. for issues?  OK to close this?

Comment 66 Colin Walters 2015-09-30 03:37:18 UTC

Also thanks Simon, you are awesome as usual!  Sorry about the long delay...I'm going to try to spend a bit more time on g-i again FWIW.

Comment 67 Thomas Caswell 2015-09-30 04:01:48 UTC

Awesome!  Thanks for the quick response to my trouble making :)