Bug 631525 - libsoup 2.32.0 sometimes fail to load pages
Status: RESOLVED FIXED
Product: libsoup
Classification: Core
Component: Misc
Version: 2.32.x
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: libsoup-maint@gnome.bugs
QA Contact: libsoup-maint@gnome.bugs
Depends on:
Blocks:
 
 
Reported: 2010-10-06 14:08 UTC by Ivan Bulatovic
Modified: 2010-11-28 15:02 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Patch that could fix the bug (974 bytes, patch)
2010-11-10 19:12 UTC, Sergio Villar
none Details | Review
soup-message-io: fix retry-after-unexpected-connection-close (5.81 KB, patch)
2010-11-10 21:58 UTC, Dan Winship
committed Details | Review
"libtool --mode=execute strace -f -s 2048 -o strace.txt ./misc-test -d -d" output (94.55 KB, text/plain)
2010-11-18 06:09 UTC, Yuri Sedunov
  Details
misc-test: fix up the new persistent connection timeout test (1.34 KB, patch)
2010-11-18 15:18 UTC, Dan Winship
none Details | Review
"libtool --mode=execute strace -f -s 2048 -o strace.txt ./misc-test -d -d" output (1.01 KB, text/plain)
2010-11-18 16:31 UTC, Yuri Sedunov
  Details
"libtool --mode=execute strace -f -s 2048 -o strace.txt ./misc-test -d -d" output (94.26 KB, text/plain)
2010-11-18 17:01 UTC, Yuri Sedunov
  Details
misc-test: fix up the new persistent connection timeout test (3.12 KB, patch)
2010-11-18 19:20 UTC, Dan Winship
committed Details | Review

Description Ivan Bulatovic 2010-10-06 14:08:51 UTC
With libsoup 2.32.0 and libwebkit 1.2.4, some pages fail to load properly and render only a blank page. Hitting Enter in the address bar or refreshing the page makes them load.

I'm not behind a proxy, and this is not HTTP vs. HTTPS related; it happens on both. The problem usually appears after PHP search scripts (e.g., on forums) or on bug trackers.

I've downgraded only libsoup from 2.32 to 2.32.2 and the problem was gone.
Comment 1 Ivan Bulatovic 2010-10-06 14:10:33 UTC
Sorry, I meant that I downgraded from 2.32 to 2.30.2.
Comment 2 Dan Winship 2010-10-06 14:27:24 UTC
I need either a somewhat-reliable test case, more detailed information about what's going wrong (e.g., tcpdumps), or a bisection down to a specific commit.
Comment 3 Ivan Bulatovic 2010-10-06 17:06:45 UTC
Bisect came down to this:

dc6395ccdb50e930bf71cd789bcb06e9b25aec44 is the first bad commit
commit dc6395ccdb50e930bf71cd789bcb06e9b25aec44
Author: Dan Winship <danw@gnome.org>
Date:   Sat May 29 16:37:34 2010 +0200

    Add SoupMessageQueueItemState, remove SoupMessageIOStatus
    
    SoupMessageIOStatus was always really more about the session than the
    message. (SoupServer I/O didn't use it at all.) Replace it with a new
    SoupMessageQueueItemState, on the queue item rather than the message.

I can't give you a reliable test case because the failure is intermittent and I can't reproduce it 100% of the time (sometimes it works, sometimes it doesn't).

For example:

Clicking "View new posts" requests this:
http://www.ubuntu-rs.org/forum/search.php?action=getnew

Then it waits a little while and tries to open this:
http://www.ubuntu-rs.org/forum/search.php?action=results&sid=1148e250d65d704164ca04582eb08859

That leaves me with a blank page; if I hit Enter in the address bar, the page loads.

Or this on the Arch Forums
https://bbs.archlinux.org/search.php?action=show_new
Comment 4 vcap 2010-10-11 02:12:57 UTC
Caveat: i did not try to downgrade or anything like it, so it's possible that this is a different problem.

Here is what i wrote in the archlinux thread at https://bbs.archlinux.org/viewtopic.php?pid=838666#p838666 (slightly edited):

Since i upgraded to libsoup 2.32.0, i also get blank pages that i need to refresh manually with midori.

I did catch one instance with tcpdump and it turns out that what happened, in this specific case, seems to be this:
1/ libsoup sends a GET request through a connection that has seen no traffic for 4 minutes or so (but is still theoretically alive, i.e. no "Connection: close" from either end or any other indication that the connection is dead).
2/ the remote end does not like this and answers with a tcp reset (it SHOULD NOT, i guess, but there are reasons/excuses for this behaviour).
3/ then, libsoup gives up instead of doing what it MUST[1], that is: automatically try to open a new connection to resend the failed request.

In that case the remote end was 66.249.92.104, which is www.google.com as resolved by the local isp. And, indeed, i see a lot of blank pages with my google searches, lately.

I still have the tcpdump capture btw, but i would have to check/clean it a bit before putting it on a public place. 


[1] actually, i am not absolutely certain whose responsibility it is to retry the request on a new connection (libsoup/webkit/midori?). But i am pretty sure it's not that of the human operator.
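
The sequence captured above can be reproduced in miniature. The following is only a sketch in Python, not libsoup code: the helper names (serve, get, demo) are made up for illustration, and the local server stands in for a remote end that drops an idle keep-alive connection without sending "Connection: close". The client retries once on a new connection, and only for GET, which is an assumption of this sketch.

```python
import http.client
import socket
import threading


def serve(listener, connections):
    """Answer one request per connection, then close the socket without
    announcing it (no 'Connection: close'), like a server-side idle timeout."""
    for _ in range(connections):
        conn, _addr = listener.accept()
        with conn:
            data = b""
            while b"\r\n\r\n" not in data:
                chunk = conn.recv(4096)
                if not chunk:
                    break
                data += chunk
            conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")


def get(client, path):
    """GET over a possibly stale keep-alive connection; retry once on failure.

    Returns (body, retried)."""
    try:
        client.request("GET", path)
        return client.getresponse().read(), False
    except (ConnectionError, http.client.BadStatusLine):
        # The reused connection was dead: drop it and resend the request.
        # The next request() call opens a new socket automatically.
        client.close()
        client.request("GET", path)
        return client.getresponse().read(), True


def demo():
    listener = socket.create_server(("127.0.0.1", 0))
    port = listener.getsockname()[1]
    threading.Thread(target=serve, args=(listener, 2), daemon=True).start()
    client = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    first = get(client, "/first")    # fresh connection
    second = get(client, "/second")  # reused connection, already closed by the server
    client.close()
    listener.close()
    return first, second
```

Calling demo() sends two requests through one client object; the second one hits the closed connection and only succeeds because of the retry, which is the step that was missing in the capture described above.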
Comment 5 Sergio Villar 2010-11-10 19:12:38 UTC
Created attachment 174212 [details] [review]
Patch that could fix the bug

As discussed on IRC, this bug is probably caused by the fact that messages in STARTING state that get a socket EOF (most likely because the server closes a reused active connection) are not restarted anymore since the asynchronous session is implemented using a state machine.

This patch restores the previous behaviour, i.e., those messages are restarted.
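
To make the described behaviour concrete, here is a minimal Python sketch of such a state transition. It is not taken from the patch or from libsoup: ItemState, QueueItem and on_socket_eof are hypothetical names, and the rule shown (restart only when the message is still in its starting state on a reused connection) is just a reading of the explanation above.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class ItemState(Enum):
    STARTING = auto()    # request handed to a connection, no response bytes yet
    RUNNING = auto()     # response data has started to arrive
    RESTARTING = auto()  # must be sent again on another connection
    FINISHED = auto()


@dataclass
class QueueItem:
    state: ItemState
    conn_reused: bool
    error: Optional[str] = None


def on_socket_eof(item):
    """Decide what happens to a queued message when its socket hits EOF."""
    if item.state is ItemState.STARTING and item.conn_reused:
        # The server most likely closed an idle persistent connection:
        # nothing of the response was seen, so the message can be restarted.
        item.state = ItemState.RESTARTING
    else:
        # Either a fresh connection failed or the response was already
        # under way: report the error instead of silently resending.
        item.error = "connection terminated unexpectedly"
        item.state = ItemState.FINISHED
    return item.state
```

With this rule, an item in STARTING on a reused connection goes back to be resent, while an EOF in any other situation ends the message with an error.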
Comment 6 vcap 2010-11-10 21:13:37 UTC
I applied Sergio's patch on top of the 2.32.0 tarball (building with the Arch Linux PKGBUILD). It seems to fix the problem i had with my google searches often returning a blank page in midori, as well as a similar problem with google reader (though it didn't manifest as blank pages, there).

tcpdump confirms that when reusing an old connection and getting a tcp reset, libsoup will now reissue the request on a new connection.
Comment 7 Dan Winship 2010-11-10 21:58:33 UTC
Thanks for figuring out the source of the problem, Sergio. The fix
was even simpler than what you had.
Comment 8 Dan Winship 2010-11-10 21:58:36 UTC
Created attachment 174215 [details] [review]
soup-message-io: fix retry-after-unexpected-connection-close

When sending a request on a previously-used connection, we have to
deal with the possibility of the server deciding to time out the
connection right as we start sending data (which sounds like a crazy
race condition, but is in fact pretty much standard behavior). This
got broken in the connection/session reorg earlier in the year. Fix
it.

Also, add a test to misc-test for this.

Based on patch from Sergio Villar.
Comment 9 Ivan Bulatovic 2010-11-11 00:50:43 UTC
Just tested the patch, works great, thanks.
Comment 10 Sergio Villar 2010-11-11 08:56:53 UTC
(In reply to comment #7)
> Thanks for figuring out the source of the problem, Sergio. The fix
> was even simpler than what you had.

Oh yeah, all the conditions of that if seemed to be saying "the problem is here, the problem is here" :)
Comment 11 Yuri Sedunov 2010-11-16 06:35:31 UTC
However, misc-test now fails sometimes (or often).
Run the test several times and it will eventually fail.

tests]$ ./misc-test -d
Host handling

Callback unref handling

SoupMessage reuse
  First message
  Redirect message
  Auth message
  Last message

OPTIONS *
  Testing with no handler
  Testing with handler

Abort with pending connection

Invalid Content-Length framing tests
  Content-Length larger than message body length
  Server claims 'Connection: close' but doesn't

Automatic Accept-Language processing
  LANGUAGE=C
  LANGUAGE=fr_FR
  LANGUAGE=fr_FR:de:en_US

Unexpected timing out of persistent connections
  Async session
    First message
    Second message
      Message was not retried after disconnect
  Sync session
    First message
    Second message
      Message was not retried after disconnect

misc-test: 2 error(s).
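
Since the failure is intermittent, running the test in a loop and counting failures gives a rough idea of how often it happens. A small Python sketch, assuming the test program exits with a non-zero status when it reports errors (count_failures is a made-up helper name, not part of the libsoup test suite):

```python
import subprocess


def count_failures(cmd, runs):
    """Run cmd the given number of times and count non-zero exit statuses."""
    failures = 0
    for _ in range(runs):
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,  # discard the test's own output
            stderr=subprocess.DEVNULL,
        )
        if result.returncode != 0:
            failures += 1
    return failures
```

For example, count_failures(["./misc-test", "-d"], 100) run from the tests directory would return how many of 100 runs failed.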
Comment 12 Dan Winship 2010-11-16 13:20:36 UTC
Hm... weird. That particular set of errors just indicates a bug in misc-test itself though; for some reason it's not closing the connection between the two messages, and so it's not managing to test the retry-after-connection-is-closed fix. So libsoup itself is still (presumably) fixed, there's just a bug in my test case for it. I'll try to figure out what's going on. (It works correctly for me.)

Running:

    libtool --mode=execute strace -f -s 2048 -o strace.txt ./misc-test -d -d

and then attaching the "strace.txt" file might help me figure it out.
Comment 13 Yuri Sedunov 2010-11-18 06:09:38 UTC
Created attachment 174753 [details]
"libtool --mode=execute strace -f -s 2048 -o strace.txt ./misc-test -d -d" output
Comment 14 Dan Winship 2010-11-18 15:18:36 UTC
Created attachment 174775 [details] [review]
misc-test: fix up the new persistent connection timeout test

Can you try this patch?
Comment 15 Yuri Sedunov 2010-11-18 16:31:40 UTC
Created attachment 174778 [details]
"libtool --mode=execute strace -f -s 2048 -o strace.txt ./misc-test -d -d" output

The patched misc-test also fails sometimes.
Comment 16 Dan Winship 2010-11-18 16:47:42 UTC
There's no strace output in that attachment.
Comment 17 Yuri Sedunov 2010-11-18 17:01:40 UTC
Created attachment 174779 [details]
"libtool --mode=execute strace -f -s 2048 -o strace.txt ./misc-test -d -d" output

Sorry, see the new attachment.
Comment 18 Dan Winship 2010-11-18 19:20:57 UTC
Created attachment 174799 [details] [review]
misc-test: fix up the new persistent connection timeout test

Take 3
Comment 19 Yuri Sedunov 2010-11-18 19:34:33 UTC
Yes, misc-test passed 100 times.
Comment 20 Dan Winship 2010-11-18 19:54:54 UTC
Comment on attachment 174799 [details] [review]
misc-test: fix up the new persistent connection timeout test

woo. thanks

Attachment 174799 [details] pushed as 81d7447 - misc-test: fix up the new persistent connection timeout test