GNOME Bugzilla – Bug 311350
Loops forever with DE locale on whois search with invalid address
Last modified: 2007-03-16 21:14:39 UTC
Version details: 1.2.0-1
Distribution/Version: Debian/sid

Hi,

When gnome-nettool is started with a DE locale, i.e. LANG=de_DE@euro gnome-nettool, it loops forever if you whois an invalid hostname such as "google.de#". This is specific to the DE locale (it doesn't happen in C or FR) and to invalid hostnames (it still works for google.de).

This is Debian bug http://bugs.debian.org/319610, reported by Ulrich Schenck.

Bye,
When I try google.de# I get the following answer: "Hierfür ist kein Whois-Server bekannt." I tested on Ubuntu Dapper with both the CVS HEAD version and 2.14.1. This bug is probably obsolete. Could you try it with a newer version and confirm?
This bug is not obsolete, I still get it. Presumably, you tried with a UTF-8 locale instead of de_DE@euro.

gnome-nettool doesn't set the encoding of the output I/O channel for the processes it spawns, and this defaults to UTF-8. When the output of a command (which runs under the same locale as gnome-nettool) is invalid UTF-8, this bug occurs (presumably because the read() calls fail).

I'll attach a patch which sets the encoding of the channel to the current locale, but it now outputs text in the wrong encoding in the result widget. I didn't go any further because obviously the gnome-nettool maintainers have to decide:
- what encoding they use internally in the program (all strings UTF-8, all strings in the current locale, some strings in UTF-8 etc.)
- what locale they use to spawn processes

I suppose the safest would be to always convert all input to UTF-8 and all output to the current locale.
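The failure mode described above can be illustrated outside of gnome-nettool. The sketch below is in Python rather than the project's C/GLib code, and it assumes that de_DE@euro corresponds to ISO-8859-15; the helper name `decode_strict_utf8` is hypothetical. It only shows why a locale-encoded message is rejected by a strict UTF-8 decoder, not how GLib reports the problem.

```python
# Illustration only: gnome-nettool is C/GLib; this mimics the decoding step.
message = "Hierfür ist kein Whois-Server bekannt."

# Bytes as a child process under an ISO-8859-15 locale (assumed for
# de_DE@euro) would write them to its standard output.
raw = message.encode("iso-8859-15")


def decode_strict_utf8(data):
    """Return (text, error); error is None when data is valid UTF-8."""
    try:
        return data.decode("utf-8"), None
    except UnicodeDecodeError as exc:
        return None, exc


text, error = decode_strict_utf8(raw)
print("decoded as UTF-8:", text)
print("error:", error)
print("decoded as ISO-8859-15:", raw.decode("iso-8859-15"))
```

The byte for "ü" is not a valid UTF-8 sequence on its own, so a reader that insists on UTF-8 fails on the first translated message containing a non-ASCII character.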
Created attachment 63644 Set the I/O channel's encoding to the current locale with some error handling
Widgets only accept UTF-8, so the output should be UTF-8. If the g_io_channel is set to the locale charset, then for every program the code where a conversion takes place (whois_foreach, *_foreach), where each line of output is converted from the locale to UTF-8, would need to be commented out. That conversion wouldn't be necessary anymore, because the text (the parameter named 'line') would already be UTF-8.
*** Bug 164579 has been marked as a duplicate of this bug. ***
Germán, I don't fully understand your suggestion, but if widgets expect UTF-8, then it's mandatory that their input is UTF-8; it's not mandatory that gnome-nettool uses UTF-8 internally. IMO, the cleanest approach would be to set the channel to the current locale and immediately convert the string to UTF-8. If you want to go even further and handle the case where installed programs do not always output text in the current locale, you might want to set the channel's encoding to 8-bit / raw and try various encodings in a specific order, as gedit does: first the locale's encoding (which is the expected encoding), then UTF-8, finally ISO-8859-1 or something similar.
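The "try various encodings in a specific order" idea can be sketched as follows. This is a Python illustration, not gedit's or gnome-nettool's actual code; the function name `decode_with_fallbacks` and its default candidate list are assumptions made for the example.

```python
import locale


def decode_with_fallbacks(data, encodings=None):
    """Decode raw bytes by trying candidate encodings in order.

    Returns (text, encoding_used). The default order follows the comment
    above: locale encoding first, then UTF-8, then ISO-8859-1.
    """
    if encodings is None:
        encodings = [locale.getpreferredencoding(False), "utf-8", "iso-8859-1"]
    for enc in encodings:
        try:
            return data.decode(enc), enc
        except (UnicodeDecodeError, LookupError):
            continue  # invalid bytes or unknown codec name: try the next one
    # ISO-8859-1 maps every byte to a character, so it cannot fail.
    return data.decode("iso-8859-1"), "iso-8859-1"


sample = "Más información (More information):"
print(decode_with_fallbacks(sample.encode("utf-8"), ["ascii", "utf-8", "iso-8859-1"]))
print(decode_with_fallbacks(sample.encode("iso-8859-1"), ["ascii", "utf-8", "iso-8859-1"]))
```

One caveat of this design: if the first candidate is an 8-bit encoding such as ISO-8859-15, it accepts any byte sequence, so the later candidates are never tried and UTF-8 output would be shown garbled.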
I agree with your first statement. With your patch, the channel works as expected in all cases, but the output of every command is then converted to UTF-8 a second time. If you look at whois_foreach, each line is converted from the locale to UTF-8 as it is processed. To get your patch in shape, it would be necessary to stop using g_locale_to_utf8, because the output is already in the right encoding. Otherwise, the output will be displayed badly. For instance, if you use your patch to view the whois information for ubiobio.cl, you will see strange characters.
Ah, now I see what you meant: not only are you processing regular program messages from the whois program (such as "No whois server is known for this kind of object.") and their translations (such as "Hierfür ist kein Whois-Server bekannt."), you also want to process the output of whois queries in some charset, such as "Más información (More information):" in the result of whois ubiobio.cl.

Clearly, using a real encoding to parse output in various encodings looks like a bad idea. The real problem is the usage of a command-line program instead of a library, and the whois program is not particularly helpful in identifying its output:

bee% whois aaaaaaaaaa && echo true
Aucun serveur whois n'est connu pour ce type d'objet.
true
bee% whois non-existent && echo true
No match found for non-existent.
# ARIN WHOIS database, last updated 2006-04-17 19:10
# Enter ? for additional hints on searching ARIN's WHOIS database.
true
bee% whois non-existent.fr && echo true
%%
%% This is the AFNIC Whois server.
%%
%% Rights restricted by copyright.
%% See http://www.afnic.fr/afnic/web/legal
%%
%% Use '-h' option to obtain more information about this service.
%%
%% [YOUR REQUEST] >> -V Md4.7 non-existent.fr
%%
%% No entries found in the AFNIC Database.
true
etc.
=> lots of different responses, and no way to distinguish an interesting result from a server failure or a client failure (no exit code)

My opinion is the following: the whois protocol carries no encoding information (see 4/ in RFC3912) and the encoding of the information cannot be inferred by gnome-nettool, but it is assumed that some US-ASCII text will be visible; the whois program may output whois data, or errors, but probably not both. So I'd recommend the following approach short-term:
- set the encoding to binary on the channel to retrieve the raw data
- convert from the current locale to UTF-8 for display; garbled decoding should be skipped (replaced by '?'), and the output should still be readable, as it would be for an English speaker on an LC_ALL=C terminal; this would work for both errors and whois data, but some whois data would be garbled
- if you're brave, you can add encoding selection as a widget in gnome-nettool

Mid-term, the solution is to fix whois to distinguish its output (I suppose only errors) from whois data, at least returning an error code when there's an error.

Long-term, the solution is to use some sort of libwhois.

Very long-term, set an encoding in whois queries. :)
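The short-term approach (read raw bytes, convert from the locale encoding for display, replace what cannot be decoded with '?') can be sketched like this. It is a Python illustration under stated assumptions, not the committed fix: the function name `to_display_text` is hypothetical, and the locale encoding is passed in explicitly so the example does not depend on the environment.

```python
def to_display_text(raw, locale_encoding):
    """Convert raw process output to displayable text.

    Bytes that are invalid in locale_encoding do not abort the
    conversion; each undecodable sequence becomes a '?'.
    """
    # errors="replace" substitutes U+FFFD for undecodable input ...
    decoded = raw.decode(locale_encoding, errors="replace")
    # ... which is mapped to '?' here, as suggested in the comment above.
    return decoded.replace("\ufffd", "?")


# whois data in ISO-8859-1 read under a UTF-8 locale: garbled but readable.
whois_data = "Más información (More information):".encode("iso-8859-1")
print(to_display_text(whois_data, "utf-8"))

# A translated error message in the locale's own encoding: shown intact.
error_message = "Hierfür ist kein Whois-Server bekannt.".encode("iso-8859-15")
print(to_display_text(error_message, "iso-8859-15"))
```

The point of the design is that the read never fails: the worst case is a few '?' characters in otherwise readable US-ASCII text.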
I would also add, in the short term:
- Catch the error using GError for g_io_channel_read_line and similar calls, to avoid giving the impression of a hang. If you press the "Stop" button when gnome-nettool looks hung, it complains that there is no process to stop. If you handle the error, it gives you the information needed to react appropriately.
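The idea of handling the read error instead of retrying can be sketched as a loop that stops at the first failure and keeps the reason. This is a Python analogy, not GLib code: the function name `read_lines` is hypothetical, and a `UnicodeDecodeError` stands in for the GError that a failed channel read would report.

```python
import io


def read_lines(stream, encoding):
    """Read lines from a binary stream, decoding each one.

    Returns (lines, error). On the first undecodable line the loop stops
    and error describes the failure, so the caller can report it rather
    than polling a channel that will never deliver the line.
    """
    lines, error = [], None
    for raw_line in stream:
        try:
            lines.append(raw_line.decode(encoding).rstrip("\n"))
        except UnicodeDecodeError as exc:
            error = "could not decode output as %s: %s" % (encoding, exc.reason)
            break
    return lines, error


# Simulated process output: one ASCII line, then an ISO-8859-15 line.
output = io.BytesIO(b"Querying...\n" + "Hierfür ist kein Whois-Server bekannt.\n".encode("iso-8859-15"))
lines, error = read_lines(output, "utf-8")
print(lines)
print(error)
```

With the error in hand, the user interface can show a message and reset its state, rather than appearing to run a process that has already exited.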
May I commit this patch? (It's in the "reviewed" state, not "accepted-commit_now")
It seems fine to me. It is only missing a ChangeLog entry. Add an entry to ChangeLog and commit it in trunk. Anyway, if you have any proposal for comment #8, it will be welcome. Thanks Loïc.
(I don't have any additional proposal for comment #8; I'll stick to the short term solution)

Committed in trunk as r591.

2007-03-16  Loïc Minier  <lminier@svn.gnome.org>

	* src/nettool.c: (netinfo_text_buffer_insert): Set the I/O channel's
	encoding to the current locale and add some error handling; fixes
	bug #311350.
OK. One thing I forgot: it would be nice to update the NEWS file as well. Thanks.
I've committed this as r592 in branches/gnome-2-18. I'm sorry I didn't add a NEWS entry; from the instructions I received with the SVN account, I thought this was something to do when preparing a new release. I'm not too comfortable with starting a new NEWS header, especially for trunk, as I don't know what number the next release will be; could you add the NEWS snippet?
I added an entry in trunk. It was:

gnome-nettool 2.19.1, 2007-
--------------------------------
- #311350: Fixed locale problem in whois (Loïc Minier)

It's better to update it as changes happen, because that makes it easier to write a release summary. Not everybody does this in their projects, so it is not a rule, but it simplifies the release work (even though we have made very few changes in Nettool in a while).