After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 749938 - Can't combine search specifiers
Can't combine search specifiers
Status: RESOLVED FIXED
Product: geary
Classification: Other
Component: search
0.10.x
Other Linux
: Normal normal
: 0.11.0
Assigned To: Geary Maintainers
Geary Maintainers
Depends on:
Blocks:
 
 
Reported: 2015-05-27 00:50 UTC by Matthew Wong
Modified: 2016-04-27 05:14 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
relevant debug messages, combining search specifiers (1.99 KB, text/plain)
2015-05-27 00:50 UTC, Matthew Wong
  Details
fix to allow combining search specifiers (2.70 KB, patch)
2015-05-30 02:59 UTC, Matthew Wong
none Details | Review
fix to allow combining search specifiers (4.05 KB, patch)
2015-06-02 07:20 UTC, Matthew Wong
none Details | Review
fix to allow combine search specifiers (3.07 KB, patch)
2015-06-11 01:40 UTC, Matthew Wong
none Details | Review
fix to allow combine search specifiers, git format-patch (3.44 KB, patch)
2016-04-21 03:16 UTC, Matthew Wong
accepted-commit_now Details | Review
database disk image is malformed error (2.95 KB, text/plain)
2016-04-24 02:18 UTC, Michael Gratton
  Details

Description Matthew Wong 2015-05-27 00:50:59 UTC
Created attachment 304045 [details]
relevant debug messages, combining search specifiers

For example, the queries 'hello from:matthew' or 'to:matthew from:matthew' do not work correctly. I have attached the relevant debug messages, the sqlite error is: 'unable to use function MATCH in the requested context.' I don't think having multiple MATCH operators like that using AND is valid http://sqlfiddle.com/#!7/074ac/9
Comment 1 Robert Schroll 2015-05-27 05:34:27 UTC
I can reproduce this.  (Rather easily too -- I'm surprised I hadn't run across this already.)

Jim, do you know how to fix this, or should I start boning up on SQL?
Comment 2 Matthew Wong 2015-05-30 02:59:35 UTC
Created attachment 304298 [details] [review]
fix to allow combining search specifiers

Here is a fix attempt, please have a look. Basically it uses this query syntax, the ":" character to specify columns http://www.sqlite.org/fts3.html#section_3 . I am not sure if this approach is compatible with how ":" characters in queries are currently handled but it behaves fine for me.
Comment 3 Matthew Wong 2015-06-02 06:13:54 UTC
Comment on attachment 304298 [details] [review]
fix to allow combining search specifiers

diff --git a/src/engine/imap-db/imap-db-account.vala b/src/engine/imap-db/imap-db-account.vala
index 9e29cfa..efe8ff2 100644
--- a/src/engine/imap-db/imap-db-account.vala
+++ b/src/engine/imap-db/imap-db-account.vala
@@ -959,6 +959,7 @@ private class Geary.ImapDB.Account : BaseObject {
             Gee.List<SearchTerm>? terms = query.get_search_terms(field);
             if (terms == null || terms.size == 0)
                 continue;
+            field = field ?? "";
             
             // Each SearchTerm is an AND but the SQL text within in are OR ... this allows for
             // each user term to be AND but the variants of each term are or.  So, if terms are
@@ -977,43 +978,45 @@ private class Geary.ImapDB.Account : BaseObject {
             //
             // party* OR parti* eventful* OR event*
             StringBuilder builder = new StringBuilder();
+            bool is_first_exact = true;
             foreach (SearchTerm term in terms) {
                 if (term.sql.size == 0)
                     continue;
                 
                 if (term.is_exact) {
+                    if (is_first_exact)
+                        builder.append_printf("%s:", field);
                     builder.append_printf("%s ", term.parsed);
+                    is_first_exact = false;
                 } else {
                     bool is_first_sql = true;
                     foreach (string sql in term.sql) {
                         if (!is_first_sql)
                             builder.append(" OR ");
                         
-                        builder.append_printf("%s ", sql);
+                        builder.append_printf("%s:%s ", field, sql);
                         is_first_sql = false;
                     }
+                    is_first_exact = true;
                 }
             }
             
-            phrases.set(field ?? "MessageSearchTable", builder.str);
+            phrases.set(field ?? "", builder.str);
         }
         
         return phrases;
     }
     
-    private void sql_add_query_phrases(StringBuilder sql, Gee.HashMap<string, string> query_phrases) {
-        foreach (string field in query_phrases.keys)
-            sql.append(" AND %s MATCH ?".printf(field));
-    }
-    
     private int sql_bind_query_phrases(Db.Statement stmt, int start_index,
         Gee.HashMap<string, string> query_phrases) throws Geary.DatabaseError {
         int i = start_index;
+        StringBuilder query_expression = new StringBuilder();
         // This relies on the keys being returned in the same order every time
         // from the same map.  It might not be guaranteed, but I feel pretty
         // confident it'll work unless you change the map in between.
         foreach (string field in query_phrases.keys)
-            stmt.bind_string(i++, query_phrases.get(field));
+            query_expression.append_printf("%s ", query_phrases.get(field));
+        stmt.bind_string(i++, query_expression.str);
         return i - start_index;
     }
     
@@ -1105,10 +1108,10 @@ private class Geary.ImapDB.Account : BaseObject {
                 WHERE id IN (
                     SELECT docid
                     FROM MessageSearchTable
-                    WHERE 1=1
+                    WHERE MessageSearchTable
+                    MATCH ?
+                )
             """);
-            sql_add_query_phrases(sql, query_phrases);
-            sql.append(")");
             
             if (blacklisted_ids_sql != "")
                 sql.append(" AND id NOT IN (%s)".printf(blacklisted_ids_sql));
@@ -1809,8 +1812,11 @@ private class Geary.ImapDB.Account : BaseObject {
             WHERE docid IN (
         """);
         sql_append_ids(sql, id_map.keys);
-        sql.append(")");
-        sql_add_query_phrases(sql, query_phrases);
+        sql.append("""
+            )
+            AND MessageSearchTable
+            MATCH ?
+        """);
         
         Db.Statement stmt = cx.prepare(sql.str);
         sql_bind_query_phrases(stmt, 0, query_phrases);
Comment 4 Matthew Wong 2015-06-02 07:20:12 UTC
Created attachment 304412 [details] [review]
fix to allow combining search specifiers

Sorry, first time using bugzilla, I didn't manage to add my comment or attach the new patch... There are minor issues, e.g. some comments are no longer relevant and get_query_phrases doesn't need to return a HashMap, but it should actually work now.
Comment 5 Matthew Wong 2015-06-08 03:40:49 UTC
Comment on attachment 304412 [details] [review]
fix to allow combining search specifiers

Well, I spoke too soon again. Unfortunately the fix isn't as simple as I'd hoped, it doesn't work for searching columns for phrases, e.g. 'body:"hello there"'.
Comment 6 Robert Schroll 2015-06-08 16:20:40 UTC
Sorry I haven't been able to look at your patch yet.  For now, I'll let you continue your work, but if you'd like me to look at a WIP, just say so.  Thanks!
Comment 7 Matthew Wong 2015-06-11 01:40:48 UTC
Created attachment 305024 [details] [review]
fix to allow combine search specifiers

No problem, they haven't worked fully... Can you look at this one?
Comment 8 Robert Schroll 2015-06-14 23:25:06 UTC
In a little bit of testing, it seems to work.  I'm going to run this for a few days to make sure there are no surprises.

I think this partially invalidates the comment in the get_query_phrases about how the search system works?  Would you be willing to update that text?
Comment 9 Matthew Wong 2015-06-15 01:19:57 UTC
The comment is still correct, it's referring to the query syntax for each individual MATCH, not the way different MATCHes are combined like I initially thought.
Comment 10 Michael Gratton 2016-04-12 06:09:48 UTC
Hey Matthew,

I'm keen to land this patch, and will have a look at it later in the week. Are you still happy with them as-is? Also, if you have them in git, getting them formatted with `git format-patch` would be great.

Thanks!
Comment 11 Matthew Wong 2016-04-21 03:16:36 UTC
Created attachment 326466 [details] [review]
fix to allow combine search specifiers, git format-patch

Hi Michael. Yes I am still happy with it.
Comment 12 Michael Gratton 2016-04-24 02:18:31 UTC
Created attachment 326614 [details]
database disk image is malformed error

Hey Matthew, I have been running this patch for a while and seems to work fine, in that queries like "from:me hello" do indeed work. Thanks!

However looking at the debug log, a different error in a similar context as the ones you reported in the attachment above is being logged - debug output is attached. This particular output occurred running the search "from:me hello", where "hello" was actually found in message.

I'm pretty sure that contrary to the message the DB isn't corrupted, since I can reproduce this with either of two newly added accounts - unless Geary or the FTS implementation is doing something wrong.

Also, it doesn't seem affect the search results - rather just the highlighting.

So, do you also see this message with this patch applied when running a similar search? Since it affects just the highlighting, is there an additional query for that also needs to be fixed that the patch misses?
Comment 13 Michael Gratton 2016-04-25 06:16:37 UTC
Ah, nevermind, I was able to reproduce the issue in c:12 on master (see Bug 765515).
Comment 14 Michael Gratton 2016-04-25 06:47:00 UTC
Review of attachment 326466 [details] [review]:

So, finally to review the actual patch now... :)

::: src/engine/imap-db/imap-db-account.vala
@@ +1009,3 @@
+                sql.append_printf("""
+                    %s
+                    SELECT %s

Here, a series of additional SELECT sub-statements are generated for each additonal search term after the first.

What's the reason why you took that approach, rather than using the boolean AND/OR/NOT connectives within the MATCH argument string itself, per the FTS docs in section 3.1 or 3.2? <https://www.sqlite.org/fts3.html#section_3_1>? It strikes me that the query might more more efficient that way.

E.g. Doing:

> "MATCH 'term1' AND 'term2'"

Instead of:

> "MATCH 'term1' AND SELECT ... MATCH 'term2'"

I note the enhanced version (§3.1) is a compile-time dependency, but I wonder if we can get away with the standard version (§3.2) fine - use AND in search_async and no operator (an implicit OR) in do_get_search_matches should work?

Even if we have to use the enhanced version both Ubuntu and Fedora have have already enabled it. Arch and other distros that haven't would need a bug filed against them, but I'd be happy to do that.
Comment 15 Matthew Wong 2016-04-25 21:42:17 UTC
Yeah, it does look excessive, having to INTERSECT the (almost identical) SELECTs. It takes advantage of that "MATCH x AND y" approach when searching within one column but when it comes to searching different columns (using search specifiers) there are no examples shown. Basically I want to combine WHERE x MATCH y with WHERE a MATCH b. In the original code it tries to do 'WHERE field1 MATCH ? AND field2 MATCH ?' which unfortunately as I discovered doesn't work.
Comment 16 Michael Gratton 2016-04-26 01:31:32 UTC
Review of attachment 326466 [details] [review]:

Ah, I see now that the standard connectives are being used in ::get_query_phrases, which I guess means the additional selects are possibly the best we can do for the moment.

It doesn't seem to be a performance issue on my accounts with 50k+ messages, but that's on a SSD drive. Let's get this committed and see how it tests out for everyone else.
Comment 17 Michael Gratton 2016-04-26 01:32:46 UTC
Adam: Please go ahead and commit this when you have a moment.
Comment 18 Adam Dingle 2016-04-27 01:40:17 UTC
I've pushed this to master.  Mike, I'll leave it to you to close this bug if appropriate.
Comment 19 Michael Gratton 2016-04-27 05:14:23 UTC
Yep, this is good to resolve as fixed. Thanks all!