GNOME Bugzilla – Bug 317461
GKeyFile doesn't allow for case-insensitive matching and doesn't use proper UTF-8 string matching functions
Last modified: 2013-10-29 16:49:41 UTC
Please describe the problem:
GKeyFile does not allow for case-insensitive matching. If you have a key file that contains the section

    [Playlist]

and you look for that section using e.g.

    g_key_file_get_value (keyfile, "playlist", "some_key", &error);

you get no result, because case-insensitive matching is not implemented. Case-insensitive matching is useful when it is unclear whether a section name or key name uses the correct casing, or for key files whose format does not define the casing.

Steps to reproduce:
1. Look up a key or section with a given name, but a different casing
2. You get no result

Actual results:
No result for lookups with the same textual content but different casing.

Expected results:
Given an option to make GKeyFile match case-insensitively, return the given section/key/value independent of case.

Does this happen every time?
Yes.

Other information:
Created attachment 52789
Proposed fix for this bug, and also adds proper UTF-8 matching instead of strcmp()

This patch adds an additional GKeyFileFlag: G_KEY_FILE_MATCH_INSENSITIVE. Depending on this flag, the comparison function is set to either g_utf8_collate() or a custom-written function, _g_utf8_casecollate() (see code). It also replaces the improper use of strcmp() with the UTF-8 comparison functions described above.
Hello guys. Could a glib developer review this patch? I second the request for case-insensitive matching of section names. I need it for Vinagre, which reads .ini-like files that are often generated by Windows apps. Thanks.
(In reply to comment #1)
> Created an attachment (id=52789)
> Proposed fix for this bug, and also adds proper UTF-8 matching instead of
> strcmp()
>
> This patch adds an additional GKeyFileFlag: G_KEY_FILE_MATCH_INSENSITIVE.
> Dependent upon this flag, a comparison function is being defined which is
> either g_utf8_collate(), or a custom written function _g_utf8_casecollate()
> (see code).
>
> It also replaces the improper use of strcmp() and uses (as described above...)
> UTF-8 comparison functions.

How is the strcmp use 'improper'? This patch adds an unacceptable behaviour change by using g_utf8_collate() instead of strcmp() for the key names in the default case. Why do you need Unicode-aware key name comparisons anyway? Key and group names are usually ASCII-only (and the use case presented in comment 0 only deals with ASCII-only ones). So strcmp() is fine for the case-sensitive case, and g_ascii_strcasecmp() should be good enough for the case-insensitive case.
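To make the ASCII-only suggestion concrete, here is a minimal sketch in plain C of a locale-independent, case-insensitive comparison in the spirit of g_ascii_strcasecmp(), together with a caller-side lookup over a list of names. The helper names ascii_lower(), ascii_casecmp() and find_name_ci() are hypothetical and are not part of GLib or of either attached patch.

```c
#include <assert.h>
#include <stddef.h>

/* Lowercase only ASCII 'A'..'Z'; every other byte is returned unchanged,
 * so the result never depends on the current locale. */
static unsigned char ascii_lower(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? (unsigned char)(c - 'A' + 'a') : c;
}

/* Hypothetical helper: ASCII-only case-insensitive comparison with
 * strcmp()-style return values (<0, 0, >0). */
static int ascii_casecmp(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    while (*a != '\0' && *b != '\0') {
        int diff = (int)ascii_lower((unsigned char)*a)
                 - (int)ascii_lower((unsigned char)*b);
        if (diff != 0)
            return diff;
        a++;
        b++;
    }
    /* At least one string ended; the shorter one sorts first. */
    return (int)ascii_lower((unsigned char)*a)
         - (int)ascii_lower((unsigned char)*b);
}

/* Hypothetical helper: return the stored spelling of `wanted` from a
 * NULL-terminated array of names, or NULL when nothing matches. */
static const char *find_name_ci(char *const *names, const char *wanted)
{
    assert(names != NULL && wanted != NULL);
    for (size_t i = 0; names[i] != NULL; i++) {
        if (ascii_casecmp(names[i], wanted) == 0)
            return names[i];
    }
    return NULL;
}
```

A caller could, for example, pass the NULL-terminated array returned by g_key_file_get_groups() to find_name_ci() and then hand the stored spelling to g_key_file_get_value(), freeing the array with g_strfreev() afterwards. This keeps GKeyFile itself case-sensitive and only helps when the names are ASCII.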
Created attachment 122733
Proposed patch

What about this one? I have tested it and it worked fine.
GKeyFile parses the file format that is specified at http://standards.freedesktop.org/desktop-entry-spec/latest, which clearly states that "Case is significant everywhere in the file." I'd be more sympathetic to requests like this if there was a clear specification of the win ini format somewhere that would let us implement a 'windows ini' mode. Adding random toggles like this is not the way to go, imo. What if the ini file you want to parse contains keys before the first group? Will you ask for another toggle for that? Wrt the patch, g_ascii_strcasecmp is not suitable, since "keys and group names in key files are not restricted to ASCII characters."
I don't find that last quote anywhere in the spec. In fact, in sect. "Basic format of the file", subsect. "Group headers", it says

    Group names may contain all ASCII characters except for [ and ] and control characters.

and under "Entries"

    Only the characters A-Za-z0-9- may be used in key names.

(admittedly right below it says "As the case is significant, the keys Name and NAME are not equivalent." :)
The quote is from the GKeyFile docs, not from the desktop entry spec.
My 2 cents here: Case folding is unacceptable since it breaks under, say, a Turkish locale, where 'i' doesn't uppercase to 'I'. If keys are ASCII-only, then adding an option to use g_ascii_strcasecmp() is fine. If keys are NOT ASCII-only, though, UTF-8 collation must be used to avoid mismatches caused by normalization issues. So, I agree with the requester that there's something to change here. Not sure in which direction.
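The normalization concern can be illustrated with a small sketch in plain C. The two constants below are made-up key names that render identically ("café") but are encoded differently in UTF-8, so a byte-wise comparison of the kind strcmp() performs treats them as different keys. The names KEY_PRECOMPOSED, KEY_DECOMPOSED and keys_equal_bytewise() are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <string.h>

/* "café" with the precomposed character U+00E9 (UTF-8 bytes C3 A9). */
static const char KEY_PRECOMPOSED[] = "caf\xC3\xA9";

/* "café" as plain 'e' followed by the combining acute accent U+0301
 * (UTF-8 bytes CC 81). */
static const char KEY_DECOMPOSED[] = "cafe\xCC\x81";

/* Byte-wise equality, as strcmp()-based key matching would see it. */
static int keys_equal_bytewise(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strcmp(a, b) == 0;
}
```

Here keys_equal_bytewise(KEY_PRECOMPOSED, KEY_DECOMPOSED) returns 0 (not equal), even though a reader would consider the two names the same. One possible remedy, assuming non-ASCII keys matter at all, would be to bring both strings to the same normalization form (GLib offers g_utf8_normalize() for this) before comparing them.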
Keyfiles have case-sensitive keys. I don't think we want to support this feature, ASCII or otherwise.