GNOME Bugzilla – Bug 61726
gtk_text_iter_starts/ends_sentence() work wrong with text containing dot followed by uppercase char
Last modified: 2011-02-04 16:09:49 UTC
According to the Unicode Standard v3.0 (see section 5.15) and the comment on bug #61560, text like "One.Two" is two sentences, not one. The functions named in the summary handle this case incorrectly.

Test case:

  #include <gtk/gtk.h>
  #include <stdio.h>
  #include <string.h>

  /* Print whether a sentence starts at character offset pos of str. */
  static void
  test (gchar* str, int pos)
  {
    GtkTextIter iter;
    GtkTextBuffer* buffer;
    gboolean returns;

    buffer = gtk_text_buffer_new (NULL);
    gtk_text_buffer_set_text (buffer, str, strlen (str));
    gtk_text_buffer_get_iter_at_offset (buffer, &iter, pos);
    returns = gtk_text_iter_starts_sentence (&iter);
    printf ("returns = %d\n", returns);
    g_object_unref (buffer);
  }

  int
  main (int argc, char** argv)
  {
    gtk_init (&argc, &argv);
    test ("One.Two", 4);
    test ("One.two", 4);
    test ("One!Two", 4);
    test ("One?Two", 4);
    test ("One?!Two", 5);
    return 0;
  }

Output:

  returns = 0
  returns = 0
  returns = 1
  returns = 1
  returns = 1

I assume that gtk_text_iter_starts_sentence() should return TRUE in the first case. All other cases are handled correctly. The same applies to gtk_text_iter_ends_sentence().
Move open bugs from milestones 2.0.[012] -> 2.0.3, since 2.0.2 is already out.
Move GtkTextView 2.0.4 bugs to 2.0.5
Looking a bit at the code in pango/break.c and at http://www.unicode.org/unicode/reports/tr29/, I find that your expectations don't quite match tr29, since it has the rule:

  Don't break after ambiguous terminators like period if the first following letter is lowercase, or if the preceding word contains an uppercase letter. For example, a period may be an abbreviation or numeric period, and not mark the end of a sentence.

So Pango actually reports the expected results, but only accidentally: Pango doesn't consider periods to be candidates for sentence ends unless they're followed by either closing punctuation or space.

Should this bug be moved to Pango?
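To make the behaviour described above concrete, here is a rough ASCII-only model of it. This is a sketch and not the code in pango/break.c: the names starts_sentence_model() and is_closing_or_space() are made up for illustration, the set of "closing punctuation" characters is an assumption, and treating '!' and '?' as unconditional terminators is inferred only from the output in the original report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: the characters this model treats as
 * "closing punctuation or space". The exact set is an assumption. */
static bool
is_closing_or_space (char c)
{
  return c == ' ' || c == '\t' || c == '\n' ||
         c == ')' || c == ']' || c == '"' || c == '\'';
}

/* Simplified model (not Pango's implementation): does a sentence
 * start at byte offset pos of the ASCII string s? */
static bool
starts_sentence_model (const char *s, size_t pos)
{
  size_t len, i;
  bool saw_trailer = false;

  assert (s != NULL);
  len = strlen (s);

  if (pos == 0)
    return len > 0;             /* start of non-empty text */
  if (pos >= len || is_closing_or_space (s[pos]))
    return false;               /* nothing starts here */

  /* Walk back over closing punctuation and spaces before pos. */
  i = pos;
  while (i > 0 && is_closing_or_space (s[i - 1]))
    {
      i--;
      saw_trailer = true;
    }
  if (i == 0)
    return true;                /* only spaces/closers before pos */

  /* '!' and '?' always end a sentence in this model. */
  if (s[i - 1] == '!' || s[i - 1] == '?')
    return true;

  /* A period only counts if closing punctuation or space followed it. */
  if (s[i - 1] == '.')
    return saw_trailer;

  return false;
}
```

Called with the five inputs from the test case in the report, this model gives 0, 0, 1, 1, 1, matching the reported output. For "One. Two" at offset 5 it returns true, since the period is followed by a space.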
Moving bugs from older 2.0.x milestones to 2.0.10.
Is this a pango bug, Owen?
*** This bug has been marked as a duplicate of 97545 ***