GNOME Bugzilla – Bug 52811
Need encoding conversion for GIOChannel
Last modified: 2011-02-18 15:47:37 UTC
(May be a duplicate, can't find the original though) Conversion from external encodings to UTF-8 will be a fundamental operation with GTK+-2.0, and we need to make it convenient http://mail.gnome.org/archives/gtk-devel-list/2001-March/msg00399.html
Created attachment 471 [details] [review] conbined patch for set_encoding, error checks, buffering, blocking vs non blocking
created an attachment as a proposed patch, giochannel.h has a set of new g_io_channel functions to address those issues discussed on the gtk-devel-list, containg: * charset code conversion at read/write, read/write by chars * buffering * blocking vs non-blocking (controled by *set_flags/get_flags) * Reading a file a line at a time, Reading a whole file at once * error handling with more error feedbacks Both API design and implementation are to be carefully reviewed. We really want someone of win32 to review and test the giowin32 backend specifically, look at the errors returned by g_io_win32_msg_[read,write] and give some sort of comments on what they should be. Also, please advise me if I should post this patch on the mailing list to attract more people to review it.
Created attachment 480 [details] testcase
Created attachment 481 [details] makefile for gio_test.c testcase
I'm adding a small diff relative to the patch above which: 1. Cleans up the handling of the backward compatibility of G_IO_ERROR a little. 2. Makes the character buffer arguments of the various write functions of type const gchar* instead of gchar*, fixing bug 52907.
Created attachment 504 [details] [review] patch mentioned above
This is a new version of the patch with some fairly extensive changes. It now has it using dynamic buffering with GString, so channel->buf_size ends up being more of a recomendation than a fixed size. The buffer still won't get longer than channel->buf_size + (size of the longest line in the file) unless you have a lot of read/write errors. The dynamic buffering fixes several errors/issues. Some of the improved functionality is: 1) It is possible in the new version to change the encoding type or buffer size in the middle of reading/writing. 2) It is also possible in the new version to have a file open for both reading and writing, while still using buffering. 3) Buffering is now safe for binary data (as long as you only use g_io_channel_read_chars, g_io_channel_write_chars, and g_io_channel_read_to_end). I also added a function g_io_channel_seek_position to do buffer-aware seeking. If anyone can think of a better name for this function, I'd I'd be happy to hear it. One bit of functionality that I removed was the ability to do unbuffered reads/writes which are encoded with g_convert. Due to the possible presence of partial characters in the input, I can't see a pratical way of doing encodng on unbuffered data and still handling the errors sanely. Fortunately, unbuffered read/write is probably most useful for binary data, which doesn't need to be encoded anyway. Setting the encoding to G_IO_CHANNEL_ENCODE_RAW does raw read/write which is unbuffered, with two exceptions: 1) If you switch to G_IO_CHANNEL_ENCODE_RAW from another encoding in midstream, any data already in the buffer is handled before raw read/write to the channel begins. 2) The function g_io_channel_read_to_end still uses the buffer to read in the file contents. If an error occurs, any data already read in is left in the buffer. Also, g_io_channel_read_line is unusable when the encoding is G_IO_CHANNEL_ENCODE_RAW (as you would expect for an encoding intended for binary data). The main work that needs to be done on this functionality is still in giowin32.c. The new code in that file has not even been compiled, much less tested. The function stubs and whatnot have been set up in such a way that, with luck, everything will still compile on the first pass. The things that need to be looked at are: 1) The conversion of the read, write, and seek functions from GIOError to GError. If there aren't any typos in the changes, the old API g_io_channel_[read,write,seek] should still work. The issue is making sure that the GError error message are sufficiently varied before the new API is set in stone. Since g_io_win32_fd_[read,write,seek] use errno, they are probably fine. Similarly, g_io_win32_no_seek should obviously return G_CHANNEL_ERROR_SPIPE (seeking not allowed on this kind of channel). For sockets, g_io_win32_sock_[read,write] look like they have errors that map 1 to 1 into errno, but it would be good to get confirmation on this from someone who knows something about windows. Something like g_channel_error_from_errno for WSAError (or whatever these are called) and use of the appropriate function analogus to strerror would also be nice. The real issue is with g_io_msg_[read,write]. The literal GIOError to GChannelError conversion I made is probably not really correct, so it would be good to find out what these errors really are. 2) Implementation of the backend for g_io_channel_new_from_file and g_io_channel_[set,get]_flags. At this point, they're compilation stubs. Any feedback would be greatly appreciated.
Created attachment 539 [details] [review] patch with dynamic buffers
The next patch fixes up an ugly hack in g_io_channel_fill_buffer. It depends on the GString patch in bug 54903.
Created attachment 545 [details] [review] g_io_channel_fill_buffer messy hack patch
The next patch attempts to add some thread safety to the buffering of GIOChannel. Attempting to read/write the same channel from different threads will still probably give you garbage, but at least now you shouldn't hit any of the g_assert() calls in the code. I don't really know anything about threads, so I would appreciate it if someone could look this over. The idea is to keep the internal state of the buffer sane.
Created attachment 546 [details] [review] thread safety patch
This next patch fixes a few bugs and does some cleanups to the online documentation.
Created attachment 567 [details] [review] bug fix patch
This next patch edits files in glib/docs/reference/glib, specifically glib-sections.txt and tmpl/iochannels.sgml, to do the non-inline part of the documentation. This includes marking the old functions g_io_channel_[read,write,seek] depricated, as well as the documentation for g_io_channel_new_file(), which is difficult to place inline since it is defined in both giounix.c and giowin32.c. All the other new functions are documented inline.
Created attachment 568 [details] [review] documentation patch
This is a new combined patch which has everything above except for the g_io_channel_fill_buffer() patch which depends on changes to GString. It also includes the new functions g_io_channel_get_encoding() and g_io_channel_get_use_fallback(), to match the corresponding set functions.
Created attachment 624 [details] [review] new combined patch
The next patch contains the changes to g_io_channel_fill_buffer() which depend on the changes to GString in bug 54903. This is fortunately a small patch, and is a code cleanup instead of a bug fix.
Created attachment 625 [details] [review] fill buffer patch
*** Bug 52907 has been marked as a duplicate of this bug. ***
This is a replacement for the previous combined patch, which reversed some unrelated patches due to confusion in my source tree.
Created attachment 636 [details] [review] corrected combined patch
This is a new version of the patch which addresses the issues raised in http://mail.gnome.org/archives/gtk-devel-list/2001-April/msg00360.html The major issue remaining is the win32 implementation, which is still in the same state (completely unimplemented).
Created attachment 689 [details] [review] new version of patch
This is an updated version of the test code for the newest version of the patch.
Created attachment 690 [details] test case source
*** Bug 54903 has been marked as a duplicate of this bug. ***
The latest patch is malformed - seems to be missing large chunks :-( - not sure if this is a CVS problem or a problem uploading the patch. If the patch you have locally matches the patch available for download, you might want to try using 'diff -r' between two trees rather than cvs diff. I'll try to check over the patch as well as I can with what is in the patch.
The patch got mangled on upload, here's another try.
Created attachment 692 [details] [review] (hopefully) unmangled patch
The patch is still messed up. I'll have to find some other way to get it out.
There is now a copy of the patch at http://www.elfhame.net/~rsteinke/giochannel-june-17.2.diff
This patch is now in CVS. There are still a couple of API issues to fix up and a lot of non-API issues that could use a bit of improvement: http://mail.gnome.org/archives/gtk-devel-list/2001-June/msg00522.html Probably individual bugs should be filed to track the more important non-API issues once we get the API stuff settled.
This is a small patch to fix up the API and (hopefully) get win32 compiling.
Created attachment 715 [details] [review] small API patch
Another patch against CVS. This one includes Owen's requested fixes to the API, and much better error handling. The test code now uses an input file internal to the tree (glib/tests/iochannel-test-infile). I'm posting a sample input file separately from the patch, but any appropriate file included at the right position in the tree will work.
Created attachment 743 [details] [review] finalize API and error handling patch
Bugzilla messed up the patch again (does it have some sort of problem with longer files?) You can find the patch at http://www.elfhame.net/~rsteinke/giochannel-july-15.diff
Created attachment 744 [details] sample iochannel-test-infile
Most remaining issues listed as new bugs. See bug 57689, bug 57690, bug 57691, bug 57692, bug 57694, and bug 57695.
Additional new bugs, bug 57771 and bug 57773.
Patch is now in, remaining issues are in separate bugs.