Bug 131576 - The spell checker should not break words on contractions
Status: RESOLVED FIXED
Product: gspell
Classification: Other
Component: general
Version: unspecified
Hardware: Other Linux
Importance: Normal normal
Assigned To: gspell maintainers
Duplicates: 304534 335626 596486 621810 750336
Depends on: 97545
Reported: 2004-01-15 17:07 UTC by Maxim Dziumanenko
Modified: 2016-03-05 21:14 UTC
Description Maxim Dziumanenko 2004-01-15 17:07:35 UTC
Description of Problem:
The gedit spell plugin marks valid words that contain an apostrophe as misspelled.

Steps to reproduce the problem:
1. Run gedit
2. Activate the spell plugin
3. Select the spell-checking language English (American)
4. Turn on automatic spell checking
5. Enter the word "couldn't" in the text area

Actual Results:
The word is underlined in red.

Expected Results:
The word is not underlined.

How often does this happen? 
Always

Additional Information:
The same happens in other languages. For example, the valid Ukrainian word
"пам'ять" and many others are marked as misspelled.
I think the apostrophe is treated as a word separator in this context.

Exceptions to this rule are words like "can't" and "it's", because their left
and right parts are both valid words (from aspell's point of view).
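
The behaviour described above can be illustrated with a minimal sketch in plain C. It assumes ASCII text and a word splitter that treats every non-letter, including the apostrophe, as a separator. The helper name `next_word_naive` is hypothetical and is not taken from gedit or aspell.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, ASCII only: finds the next word in `text` at or after
 * offset `from`, treating every non-letter (including '\'') as a separator.
 * Stores the word's bounds in *start and *end (end is exclusive) and
 * returns 1, or returns 0 when no word is left. */
static int next_word_naive(const char *text, size_t from,
                           size_t *start, size_t *end)
{
    size_t len, i;

    assert(text != NULL && start != NULL && end != NULL);
    len = strlen(text);
    i = from < len ? from : len;

    /* Skip separators in front of the next word. */
    while (i < len && !isalpha((unsigned char)text[i]))
        i++;
    if (i == len)
        return 0;

    *start = i;
    /* Consume the run of letters; an apostrophe ends the word here. */
    while (i < len && isalpha((unsigned char)text[i]))
        i++;
    *end = i;
    return 1;
}
```

Called repeatedly on "couldn't", this splitter yields the fragment at offsets 0 to 6 ("couldn") and then the fragment at offsets 7 to 8 ("t"). Each fragment would be looked up on its own, which matches the observation that "can't" passes only because both of its halves are accepted.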
Comment 1 Paolo Maggi 2004-01-18 09:52:24 UTC
This is a duplicate of bug #97545.

Evan Martin suggested a workaround in bug #97861:

==============

This, at the top of gedit-automatic-spell-checker.c, should do it:

static gboolean
gtkspell_text_iter_forward_word_end(GtkTextIter *i) {
    GtkTextIter iter;

/* heuristic:
 * if we're on a single quote/apostrophe and
 * if the next character is alphabetic,
 * this is an apostrophe. */

    if (!gtk_text_iter_forward_word_end(i))
        return FALSE;

    if (gtk_text_iter_get_char(i) != '\'')
        return TRUE;
    
    iter = *i;
    if (gtk_text_iter_forward_char(&iter)) {
        if (g_unichar_isalpha(gtk_text_iter_get_char(&iter))) {
            return (gtk_text_iter_forward_word_end(i));
        }
    }

    return TRUE;
}

static gboolean
gtkspell_text_iter_backward_word_start(GtkTextIter *i) {
    GtkTextIter iter;

    if (!gtk_text_iter_backward_word_start(i))
        return FALSE;

    iter = *i;
    if (gtk_text_iter_backward_char(&iter)) {
        if (gtk_text_iter_get_char(&iter) == '\'') {
            if (gtk_text_iter_backward_char(&iter)) {
                if (g_unichar_isalpha(gtk_text_iter_get_char(&iter))) {
                    return (gtk_text_iter_backward_word_start(i));
                }
            }
        }
    }

    return TRUE;
}

#define gtk_text_iter_backward_word_start gtkspell_text_iter_backward_word_start
#define gtk_text_iter_forward_word_end gtkspell_text_iter_forward_word_end


It is a hack, but not that bad as far as hacks go.

===============
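
The forward half of the heuristic quoted above can be sketched without GTK on a plain C string. This is an illustration under stated assumptions, not the patch itself: it handles ASCII only, works on byte offsets instead of GtkTextIter, and the names `plain_word_end` and `apostrophe_word_end` are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Plain word end: index just past the run of letters starting at `pos`.
 * The terminating '\0' is not a letter, so the loop stops there. */
static size_t plain_word_end(const char *text, size_t pos)
{
    while (isalpha((unsigned char)text[pos]))
        pos++;
    return pos;
}

/* Apostrophe-aware word end, following the same heuristic as the patch:
 * if the word stops on '\'' and a letter follows, continue to the end of
 * the next run of letters. Assumes `pos` is a valid offset into `text`. */
static size_t apostrophe_word_end(const char *text, size_t pos)
{
    size_t end;

    assert(text != NULL && pos <= strlen(text));
    end = plain_word_end(text, pos);
    /* text[end + 1] is at worst the terminator, so this read is in bounds. */
    if (text[end] == '\'' && isalpha((unsigned char)text[end + 1]))
        end = plain_word_end(text, end + 1);
    return end;
}
```

With the text "couldn't go", `plain_word_end` stops at offset 6 while `apostrophe_word_end` stops at offset 8, so the whole contraction is handed to the checker. Like the patch, the sketch joins at most one apostrophe per word, and a closing single quote that is not followed by a letter still ends the word.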

If bug #97861 is not closed before 2.6, I will reconsider applying
the above hack.

I'm not closing this bug as a duplicate of bug #97861 since I want to
have a reminder in the gedit list of bugs.

Setting severity to major since it is a pretty annoying bug.
Changing summary too.
Comment 2 Paolo Maggi 2004-02-12 11:26:04 UTC
Is anyone willing to test Evan Martin's patch, which I attached to
this bug in my previous comment?
I really have no time to do it.
Comment 3 Maxim Dziumanenko 2004-02-12 11:43:55 UTC
This patch works pretty well for Ukrainian and English (at least).
I think this patch should be committed if there is no better way to
fix this bug.
Comment 4 Luis Villa 2004-02-14 03:50:55 UTC
Paolo, do you still want to get it in? If you still feel uncomfortable
with the level of testing, maybe you could email d-d-l or gnome-love
and ask people to try it?
Comment 5 Paolo Maggi 2004-02-21 17:09:25 UTC
Sorry, Evan, for adding you to the CC list of this bug.
In bug #97851 you proposed the workaround you have implemented in
gtkspell.
I'm not convinced it really works.
Probably gtk_text_iter_inside_word, gtk_text_iter_starts_word and
co. should be wrapped as well.
What happens when the current text is:
"don"
and you paste "'t know"?
Am I on crack?
Maxim: could you test it too?
Comment 6 Paolo Maggi 2004-02-21 17:22:19 UTC
You can see the problem I'm talking about as follows:

0. Apply the patch proposed by Evan
1. Create an empty document
2. Activate auto spell check (be sure to use English as the language)
3. Write "do't know" -> gedit marks "do't" as an error
4. Add "n" before "'t", press END
5. gedit will mark "'t" as an error -> this is wrong

I think the only solution is to wait for bug #97545.
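
One plausible reading of these steps, offered as an assumption and not as a diagnosis of gedit's code, is that after the edit only part of the word is re-checked, so "'t" is judged on its own. The sketch below shows the idea of widening an edited offset to the whole apostrophe-joined word before re-checking. It is ASCII only, and `is_word_char` and `recheck_range` are hypothetical names.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, ASCII only: true when text[i] belongs to a word,
 * where an apostrophe counts only if it has a letter on both sides. */
static int is_word_char(const char *text, size_t len, size_t i)
{
    if (isalpha((unsigned char)text[i]))
        return 1;
    return text[i] == '\'' && i > 0 && i + 1 < len &&
           isalpha((unsigned char)text[i - 1]) &&
           isalpha((unsigned char)text[i + 1]);
}

/* Widens the edited offset `pos` to the bounds [*start, *end) of the whole
 * apostrophe-joined word around it, i.e. the range to re-check. */
static void recheck_range(const char *text, size_t pos,
                          size_t *start, size_t *end)
{
    size_t len, s, e;

    assert(text != NULL && start != NULL && end != NULL);
    len = strlen(text);
    assert(pos < len);

    /* Walk left while the previous character still belongs to the word. */
    s = pos;
    while (s > 0 && is_word_char(text, len, s - 1))
        s--;
    /* Walk right while the current character still belongs to the word. */
    e = pos;
    while (e < len && is_word_char(text, len, e))
        e++;

    *start = s;
    *end = e;
}
```

For the text "don't know" with the inserted "n" at offset 2, the range comes out as offsets 0 to 5, which covers "don't" as a whole instead of "don" and "'t" separately. This is the kind of consistency that wrapping only the forward and backward word movement does not guarantee.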

The patch probably does not work with languages like Italian, where the
apostrophe is used in a different way.
For example, "un'altra" is a contraction of "una altra" and should be
considered as two words (as in English).
But "un'altro" is wrong and should be considered as a single word. The
correct form in this case is "un altro".


Comment 7 Nathan Fredrickson 2004-02-23 03:53:10 UTC
I can confirm that this problem also exists in the current version
of gtkspell (2.0.5).  The newest gtkspell checks words at a different
time (when the cursor exits the word), but still contains the
apostrophe hack.

I agree, Paolo, the only real solution is to fix bug 97545
(http://bugzilla.gnome.org/show_bug.cgi?id=97545)
in Pango.  The apostrophe hack we're using in gtkspell really just
covers one English-centric case: typing a contraction directly in a
language that has apostrophe rules like English.
Comment 8 Paolo Maggi 2004-03-25 14:56:28 UTC
Update summary
Comment 9 Paolo Borelli 2005-05-17 16:22:13 UTC
*** Bug 304534 has been marked as a duplicate of this bug. ***
Comment 10 Daniel Holbach 2006-08-02 10:02:14 UTC
Mentioned in https://launchpad.net/distros/ubuntu/+source/gedit/+bug/36227 as well.
Comment 11 Steve Frécinaux 2007-01-02 18:03:44 UTC
Is this still relevant, or has the switch to Enchant solved it?
Comment 12 Paolo Maggi 2007-01-02 18:05:12 UTC
It is still relevant.
Comment 13 John Baptist 2010-04-01 11:14:26 UTC
Hello,

Are there currently any plans to fix this, or has work really not advanced since 2004? It's a fairly visible and embarrassing bug.
Comment 14 dsdutkiewicz 2010-10-31 11:17:57 UTC
Hello,

This is still an issue in gedit 2.30.3, for example with "didn't" and "couldn't".
Comment 15 Paolo Borelli 2011-12-03 16:56:08 UTC
*** Bug 596486 has been marked as a duplicate of this bug. ***
Comment 16 Matěj Cepl 2013-10-08 16:57:51 UTC
*** Bug 335626 has been marked as a duplicate of this bug. ***
Comment 17 John Baptist 2014-03-29 23:26:13 UTC
Still a problem in gedit 3.10.
Comment 18 Paolo Borelli 2014-08-07 20:19:30 UTC
*** Bug 621810 has been marked as a duplicate of this bug. ***
Comment 19 Ray Griffin 2015-01-13 23:11:45 UTC
(In reply to comment #17)
> Still a problem in gedit 3.10.

Still is in 3.14.2 and in a day this bug reaches the 11 years open mark.
Comment 20 Paolo Borelli 2015-01-14 08:55:14 UTC
(In reply to comment #19)
> this bug reaches the 11 years open mark.


Well, this bug is still open because it is a valid complaint and because it makes it easier to mark duplicates against it from time to time.

But honestly I do not see this being fixed at the gedit level in the foreseeable future, unless someone shows up and puts in the work.

This should really be addressed at a lower level in the stack, either in Pango or in some spell-checking library.
Comment 21 Leonardo Ferreira Fontenelle 2015-01-14 12:21:19 UTC
Aspell and IIRC enchant can check hyphenated words, if the dictionaries define hyphens as part of words.

Maybe you mean to mark this as "won't fix" and point to the 8-year-old bug report #383706, about incorporating spell checking into GTK+?
Comment 22 aziza 2015-04-08 11:21:21 UTC
It is still relevant.
Comment 23 Sébastien Wilmet 2015-04-14 11:37:02 UTC
I think GtkSpell has fixed this bug.
Comment 24 Leonardo Ferreira Fontenelle 2015-04-15 01:01:00 UTC
I tried GtkSpell 2.0.16 through Poedit 1.7.5 on Arch Linux, and didn't notice any improvement on this.
Comment 25 Sébastien Wilmet 2015-04-15 08:33:44 UTC
Of course you need a more recent version of GtkSpell. See the ChangeLog for the 3.0.5 and 3.0.6 versions:
http://gtkspell.sourceforge.net/ChangeLog
Comment 26 Leonardo Ferreira Fontenelle 2015-04-21 15:06:49 UTC
Sorry for the confusion -- I blame package names :)

Anyway, I tried again, with Evolution 3.16.1 and GtkSpell 3.0.7. "d'água" (a contraction of "da" and "água", meaning "of water") is flagged as incorrect, even though it was explicitly included in the dictionary.

On the other hand, Enchant correctly accepts it as a valid word:

$ echo "d'água" | enchant -a
@(#) International Ispell Version 3.1.20 (but really Enchant 1.6.0)
*

$ echo "dd'água" | enchant -a
@(#) International Ispell Version 3.1.20 (but really Enchant 1.6.0)
& dd'água 5 0: d'água, D'Água, d'Água, N'Água, n'Água
Comment 27 André Klapper 2015-06-03 17:27:16 UTC
*** Bug 750336 has been marked as a duplicate of this bug. ***
Comment 28 Sébastien Wilmet 2015-11-20 14:43:50 UTC
Re-assign to gspell.
Comment 29 Sébastien Wilmet 2016-03-05 21:14:03 UTC
Done. The implementation is probably not perfect, but it's a temporary solution. Once Pango bug #97545 is fixed, it will be possible to simplify the code in gspell and have a better implementation.