After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 750392 - Avoid NULL pointer dereference when EWS autodiscover fails
Avoid NULL pointer dereference when EWS autodiscover fails
Status: RESOLVED FIXED
Product: gnome-online-accounts
Classification: Core
Component: Exchange
3.16.x
Other All
: Normal normal
: ---
Assigned To: GNOME Online Accounts maintainer(s)
GNOME Online Accounts maintainer(s)
Depends on:
Blocks:
 
 
Reported: 2015-06-04 11:29 UTC by Erik van Pienbroek
Modified: 2015-06-09 10:37 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
patch (887 bytes, patch)
2015-06-04 11:29 UTC, Erik van Pienbroek
none Details | Review
Save result of ews_client_autodiscover_response_cb when the first attempt succeeds (1.24 KB, patch)
2015-06-04 13:04 UTC, Erik van Pienbroek
none Details | Review
attempt 3 (2.26 KB, patch)
2015-06-05 15:01 UTC, Erik van Pienbroek
committed Details | Review
ewsclient: Only return FALSE when an error occurred (2.81 KB, patch)
2015-06-08 11:57 UTC, Debarshi Ray
committed Details | Review

Description Erik van Pienbroek 2015-06-04 11:29:01 UTC
Created attachment 304574 [details] [review]
patch

After adding a new Exchange account using gnome-control-center I noticed that goa-daemon segfaulted:

Program received signal SIGSEGV, Segmentation fault.

Thread 140266107299584 (LWP 13163)

  • #0 ensure_credentials_sync
    at goaexchangeprovider.c line 334
  • #1 ensure_credentials_in_thread_func
    at goaprovider.c line 545
  • #2 run_in_thread
  • #3 io_job_thread
  • #4 g_task_thread_pool_thread
  • #5 g_thread_pool_thread_proxy
  • #6 g_thread_proxy
  • #7 start_thread
  • #8 clone
329	                                          error);
330	  if (!ret)
331	    {
332	      if (error != NULL)
333	        {
334	          g_prefix_error (error,
335	                          /* Translators: the first %s is the username
336	                           * (eg., debarshi.ray@gmail.com or rishi), and the
337	                           * (%s, %d) is the error domain and code.
338	                           */
(gdb) 
339	                          _("Invalid password with username ‘%s’ (%s, %d): "),
340	                          username,
341	                          g_quark_to_string ((*error)->domain),
342	                          (*error)->code);
343	          (*error)->domain = GOA_ERROR;
344	          (*error)->code = GOA_ERROR_NOT_AUTHORIZED;
345	        }
346	      goto out;
347	    }
348	
(gdb) print error
$1 = (GError **) 0x7f923f7fdb50
(gdb) print (*error)->domain
Cannot access memory at address 0x0
(gdb) print (*error)
$3 = (GError *) 0x0


The attached patches resolves the segfault for me (in combination with gnome-online-accounts 3.16.2).


Unfortunately adding a new Exchange account with this patch applied still doesn't fully work yet as I now see the following error messages:

(goa-daemon:27973): GLib-GIO-CRITICAL **: g_simple_async_result_take_error: assertion 'error != NULL' failed

I'll continue looking into that issue and create a separate bug report if necessary
Comment 1 Debarshi Ray 2015-06-04 12:02:41 UTC
Review of attachment 304574 [details] [review]:

Thanks for taking the time to report the bug, and trying to track it down, Erik.

::: src/goabackend/goaexchangeprovider.c
@@ +330,3 @@
   if (!ret)
     {
+      if (error != NULL && *error != NULL)

The question is: why is error not getting set when autodiscovery fails. goa_ews_client_autodiscover is supposed to guarantee that if the return value is FALSE, then error will be set if a &error was passed. For some reason that guarantee is being violated, and we need to find out why. You could start looking at ews_client_autodiscover_response_cb

Papering over this, will only lead to things like (as you have already seen):
(goa-daemon:27973): GLib-GIO-CRITICAL **: g_simple_async_result_take_error: assertion 'error != NULL' failed

We need to fix the original cause of the bug. ie. ensure error gets set if autodiscovery fails.
Comment 2 Erik van Pienbroek 2015-06-04 12:29:34 UTC
I just did some more debugging.

While stepping through ews_client_autodiscover_response_cb I noticed that op_res gets set to 0 (FALSE) initially (line 205). After the ews_client_autodiscover_parse_protocol call the op_res gets set to 1 (TRUE) (line 295) which looks as expected to me.

After this the soup_session_cancel_message (line 319) gets called and data->pending gets decremented (2 -> 1) (line 339). Given that data->pending is still 1 the result of the op_res value doesn't get saved.

The second time the ews_client_autodiscover_response_cb get's called it is noticed that msg->status_code == SOUP_STATUS_CANCELLED and it continues directly to the out: section. Given that the original op_res (from the first attempt) isn't saved the value of op_res currently is FALSE which is also the return value to the original caller.

My theory is that op_res value should be saved from the first attempt where ews_client_autodiscover_response_cb was executed when a successful autodiscover result was found.
Comment 3 Erik van Pienbroek 2015-06-04 13:04:39 UTC
Created attachment 304581 [details] [review]
Save result of ews_client_autodiscover_response_cb when the first attempt succeeds

I made a small change so that the op_res from the first attempt gets saved (when it was successful) and used in the second attempt.

This resolves the g_simple_async_result_take_error assertion failures.

However, evolution still doesn't like the Exchange account yet. It does see the account, but it tries to connect to the wrong server ('foo.com' instead of 'mail.foo.com'). The AutoDiscover response (as found out by GOA) does have the correct server name:

<Autodiscover xmlns="http://schemas.microsoft.com/exchange/autodiscover/responseschema/2006">
  <Response xmlns="http://schemas.microsoft.com/exchange/autodiscover/outlook/responseschema/2006a">
<<snip>>
    <Account>
<<snip>>
      <Protocol>
<<snip>>
        <ASUrl>https://mail.foo.com/EWS/Exchange.asmx</ASUrl>
        <EwsUrl>https://mail.foo.com/EWS/Exchange.asmx</EwsUrl>
        <EcpUrl>https://owa.foo.com/ecp/</EcpUrl>
<<snip>>
      </Protocol>
    </Account>
  </Response>
</Autodiscover>


In gnome-control-center I had to use the servername 'foo.com' otherwise GOA wasn't able to figure out the autodiscover URL.

Is it GOA's responsibility to tell evolution what servername it should use or should evolution find out the servername by itself (using autodiscover) ?
Comment 4 Debarshi Ray 2015-06-04 16:33:41 UTC
(In reply to Erik van Pienbroek from comment #2)
> I just did some more debugging.

Your analysis is right. This is a regression caused by:

commit 6c3e3c2d2d9f688157513c0f0403645c4c196754
Author: Debarshi Ray <debarshir@gnome.org>
Date:   Fri Jan 9 15:59:28 2015 +0100

    ewsclient, httpclient: Stop using deprecated SoupSessionAsync
    
    As a result the minimum libsoup version was bumped to 2.42.
    
    Fixes: https://bugzilla.gnome.org/742651

That is when the behaviour of soup_session_cancel_message changed. Damn it. I knew back then that this could have side-effects, but I still missed it. I guess my test setup was always causing the first message to fail. I feel stupid.

Thanks.
Comment 5 Debarshi Ray 2015-06-04 16:40:58 UTC
Review of attachment 304581 [details] [review]:

Thanks for taking the time to write a new patch, Erik.

It would be great if you could add "Fall out from 6c3e3c2d2d9f688157513c0f0403645c4c196754" in the message's body followed by the URL to this bug in the commit message.

::: src/goabackend/goaewsclient.c
@@ +223,3 @@
   if (status == SOUP_STATUS_CANCELLED)
+    {
+      op_res = g_simple_async_result_get_op_res_gboolean(data->res);

Nit: please add a space between the function name and the opening parenthesis.

@@ +321,3 @@
            */
           soup_session_cancel_message (data->session, data->msgs[idx], SOUP_STATUS_CANCELLED);
+          g_simple_async_result_set_op_res_gboolean (data->res, op_res);

How about putting it just above the for loop? Would be nice to add a comment or g_assert to make it clear that op_res has to be TRUE at this point. Although it is obvious now, a future refactoring / change could introduce a regression just like this one. This function is fragile enough that a few extra comments won't hurt.

I think you can also remove the g_simple_async_result_set_op_res_gboolean call towards the end of the function.
Comment 6 Debarshi Ray 2015-06-04 16:53:13 UTC
(In reply to Erik van Pienbroek from comment #3)
> However, evolution still doesn't like the Exchange account yet. It does see
> the account, but it tries to connect to the wrong server ('foo.com' instead
> of 'mail.foo.com'). The AutoDiscover response (as found out by GOA) does
> have the correct server name:
> 
> <Autodiscover
> xmlns="http://schemas.microsoft.com/exchange/autodiscover/responseschema/
> 2006">
>   <Response
> xmlns="http://schemas.microsoft.com/exchange/autodiscover/outlook/
> responseschema/2006a">
> <<snip>>
>     <Account>
> <<snip>>
>       <Protocol>
> <<snip>>
>         <ASUrl>https://mail.foo.com/EWS/Exchange.asmx</ASUrl>
>         <EwsUrl>https://mail.foo.com/EWS/Exchange.asmx</EwsUrl>
>         <EcpUrl>https://owa.foo.com/ecp/</EcpUrl>
> <<snip>>
>       </Protocol>
>     </Account>
>   </Response>
> </Autodiscover>
> 
> 
> In gnome-control-center I had to use the servername 'foo.com' otherwise GOA
> wasn't able to figure out the autodiscover URL.
> 
> Is it GOA's responsibility to tell evolution what servername it should use
> or should evolution find out the servername by itself (using autodiscover) ?

Even though GOA performs the autodiscovery, it doesn't expose the outcome of that step to the applications. Instead it only exposes what the user had to interactively enter to add the account [*]. This means the email, password, user name and server. Everything except the password is stored in ~/.config/goa-1.0/accounts.conf. This is what GOA exposes to evolution.

[*] This is a general design philosophy of GOA to avoid an explosion of the app-facing API caused by trying to have properties for every possible detail that an application might be interested in.
Comment 7 Erik van Pienbroek 2015-06-04 22:12:27 UTC
Okay, I'll prepare a new patch based on your review comments and I guess I should file a separate bug report against evolution as well.
Comment 8 Erik van Pienbroek 2015-06-04 22:35:28 UTC
evolution bug filed at https://bugzilla.gnome.org/show_bug.cgi?id=737610
Comment 9 Erik van Pienbroek 2015-06-04 22:37:01 UTC
Ignore that previous comment, the evolution bug is filed at https://bugzilla.gnome.org/show_bug.cgi?id=750427
Comment 10 Erik van Pienbroek 2015-06-05 15:01:54 UTC
Created attachment 304654 [details] [review]
attempt 3
Comment 11 Debarshi Ray 2015-06-08 11:56:47 UTC
Review of attachment 304654 [details] [review]:

Thanks, Erik. I tweaked the commit message to include a link to this bug and the commit that introduced the regression.

::: src/goabackend/goaewsclient.c
@@ +225,3 @@
+      /* If a previous autodiscover attempt for the same GSimpleAsyncResult
+       * was succesful then no additional attempts need to be done any more
+       * and we can use the result from the earlier attempt */

Bonus points for adding the comment.
Comment 12 Debarshi Ray 2015-06-08 11:57:23 UTC
Created attachment 304767 [details] [review]
ewsclient: Only return FALSE when an error occurred
Comment 13 Debarshi Ray 2015-06-08 11:59:42 UTC
Looks solid in my testing, but my test setup almost always returns the messages in a particular order - failure first, success second, so please do give this a solid hammering.

Once again, thanks for reporting the bug and fixing it!
Comment 14 Erik van Pienbroek 2015-06-08 12:01:53 UTC
Thanks for accepting the patch.

I think it would still be a good idea to apply my initial patch (avoid null pointer dereference) as well. Even though it should not get triggered any more now that we've found the root cause I think it still is good to have an additional guard in place, just in case of future rebases/regressions. Perhaps a g_return_if_fail should do the trick as well to make it more obvious.

Additionally there's also another possible null pointer dereference in the same ensure_credentials_sync function, just a couple of lines above the area which my initial patch touched.
Comment 15 Debarshi Ray 2015-06-08 18:12:20 UTC
(In reply to Erik van Pienbroek from comment #14)
> Thanks for accepting the patch.
> 
> I think it would still be a good idea to apply my initial patch (avoid null
> pointer dereference) as well. Even though it should not get triggered any
> more now that we've found the root cause I think it still is good to have an
> additional guard in place, just in case of future rebases/regressions.
> Perhaps a g_return_if_fail should do the trick as well to make it more
> obvious.

This particular guarantee of setting an error when the return value represents a failure is quite widely assumed throughout the code base. Trying to paper them over will only lead to other subtly strange behaviour that will be harder to debug.