GNOME Bugzilla – Bug 547020
HEAD request not allowed/working breaks query_info (e.g. Amazon S3, IMDB)
Last modified: 2015-03-01 17:12:34 UTC
+++ This bug was initially created as a clone of Bug #545000 +++

Amazon S3 allows generation of links with an expiration date. An example of such a link is:

https://d134w4tst3t.s3.amazonaws.com:443/a?Signature=6VJ9%2BAdPVZ4Z7NnPShRvtDsLofc%3D&Expires=1249330377&AWSAccessKeyId=0EYZF4DV8A7WM0H73602

gvfs is not able to access this link:

gvfs-info "https://d134w4tst3t.s3.amazonaws.com:443/a?Signature=6VJ9%2BAdPVZ4Z7NnPShRvtDsLofc%3D&Expires=1249330377&AWSAccessKeyId=0EYZF4DV8A7WM0H73602"
Error getting info: HTTP Client Error: Forbidden

Identical results when doing the same through (py)gio:

import gio
f = gio.File("https://d134w4tst3t.s3.amazonaws.com:443/a?Signature=6VJ9%2BAdPVZ4Z7NnPShRvtDsLofc%3D&Expires=1249330377&AWSAccessKeyId=0EYZF4DV8A7WM0H73602")
f.query_info("standard::*")
<class 'gio.Error'>: HTTP Client Error: Forbidden

The link works fine when clicked in Firefox.
amazon.com does not support HEAD requests.
Has anyone looked at this?
The tests/get program in libsoup seems to work against this URL (though it would be easier to check if the file contained some data), so this is probably a gvfs issue.
A bit of debug:

$ GVFS_HTTP_DEBUG=all /usr/libexec/gvfsd-http uri="https://d134w4tst3t.s3.amazonaws.com/a?Signature=6VJ9%2BAdPVZ4Z7NnPShRvtDsLofc%3D&Expires=1249330377&AWSAccessKeyId=0EYZF4DV8A7WM0H73602"
setting 'uri' to 'https://d134w4tst3t.s3.amazonaws.com/a?Signature=6VJ9%2BAdPVZ4Z7NnPShRvtDsLofc%3D&Expires=1249330377&AWSAccessKeyId=0EYZF4DV8A7WM0H73602'
Added new job source 0x1ad9020 (GVfsBackendHttp)
Queued new job 0x1ad90a0 (GVfsJobMount)
+ try_mount: https://d134w4tst3t.s3.amazonaws.com/a?Signature=6VJ9%2BAdPVZ4Z7NnPShRvtDsLofc%3D&Expires=1249330377&AWSAccessKeyId=0EYZF4DV8A7WM0H73602
send_reply, failed: 0
register_mount_callback, mount_reply: 0x1ad1500, error: (nil)
backend_dbus_handler org.gtk.vfs.Mount:QueryInfo
Queued new job 0x1adc8d0 (GVfsJobQueryInfo)
> HEAD /a?Signature=6VJ9%2BAdPVZ4Z7NnPShRvtDsLofc%3D&Expires=1249330377&AWSAccessKeyId=0EYZF4DV8A7WM0H73602 HTTP/1.1
> Soup-Debug-Timestamp: 1219406742
> Soup-Debug: SoupSessionAsync 1 (0x1adc830), SoupMessage 1 (0x1adc970), SoupSocket 1 (0x1add0b0)
> Host: d134w4tst3t.s3.amazonaws.com
> User-Agent: gvfs/0.2.5
< HTTP/1.1 403 Forbidden
< Soup-Debug-Timestamp: 1219406743
< Soup-Debug: SoupMessage 1 (0x1adc970)
< x-amz-request-id: 30B1C4D3A24E62A9
< x-amz-id-2: JTRKS1c8jZtxFJ0NmqV/YHWbc+54AmHWuxDWTWebv/4LI8A+7RJR46saCwTYNWaF
< Content-Type: application/xml
< Transfer-Encoding: chunked
< Date: Fri, 22 Aug 2008 12:05:42 GMT
< Server: AmazonS3
send_reply(0x1adc8d0), failed=1 (HTTP Client Error: Forbidden)

As Alex mentioned, you can't do a gvfs-info on it, as S3 doesn't seem to support HEAD here.
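For anyone who wants to poke at this outside of gvfs, here is a minimal Python 3 sketch (standard library only) that sends a HEAD and a GET for the same signed URL and prints the status codes. HOST and PATH are placeholders; the links above have long since expired, so substitute a freshly generated one:

import http.client

HOST = "d134w4tst3t.s3.amazonaws.com"
PATH = "/a?Signature=...&Expires=...&AWSAccessKeyId=..."   # placeholder query string

def status_of(method):
    # One connection per request, so a 403 on HEAD can't affect the GET.
    conn = http.client.HTTPSConnection(HOST)
    conn.request(method, PATH)
    resp = conn.getresponse()
    body = resp.read()          # empty for HEAD, object data (or error XML) for GET
    conn.close()
    return resp.status, resp.reason, len(body)

print("HEAD:", status_of("HEAD"))   # 403 Forbidden in the trace above
print("GET: ", status_of("GET"))    # 200 OK when the expiring link is valid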
(In reply to comment #3)
> The tests/get in libsoup seems to work (though it'd be easier to check if the
> file contained some data), so this would probably be a gvfs issue.

Here's a link to a file which contains data (4 bytes, 'asd\n'):

https://d134w4tst3t.s3.amazonaws.com:443/b?Signature=ttakwvULWoghvDf5WfaeaJmOHlw%3D&Expires=1250516229&AWSAccessKeyId=0EYZF4DV8A7WM0H73602
(In reply to comment #1)
> amazon.com does not support HEAD requests.

Doesn't appear to be true: http://docs.amazonwebservices.com/AmazonS3/2006-03-01/index.html?RESTObjectHEAD.html

Googling for a random s3.amazonaws.com URL turns up http://s3.amazonaws.com/apache.3cdn.net/3e5b3bfa1c1718d07f_6rm6bhyc4.pdf which responds to HEAD just fine.

However, the documentation for expiring URLs (http://docs.amazonwebservices.com/AmazonS3/2006-03-01/index.html?RESTAuthentication.html) says that they are "suitable only for simple object GET requests"... So I'm guessing that this is a bug in S3, and they're forgetting to allow HEAD as well. Or, more likely, they're checking the Signature and finding that it doesn't match, because the StringToSign used "GET" as the method while the actual request used "HEAD".

If you create a new signed URL using "HEAD" instead of "GET" in the StringToSign, does that make it possible to do HEAD but not GET? (You can do a HEAD request to a URL by using "curl -I URL".)
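For reference, here is a rough sketch of how the 2006-03-01 query-string authentication seems to build the signature, going by the REST docs linked above (the access key, secret and bucket/object names are placeholders, and this is illustrative rather than a tested client). The HTTP verb is the first component of StringToSign, which would explain why a URL signed for GET comes back Forbidden on HEAD:

import base64, hashlib, hmac, time
from urllib.parse import quote_plus

ACCESS_KEY = "AKIA-PLACEHOLDER"
SECRET_KEY = "secret-placeholder"
BUCKET, OBJECT = "my-bucket", "a"

def signed_url(method, expires_in=3600):
    expires = int(time.time()) + expires_in
    # The HTTP verb is the first line of StringToSign, so a URL signed with
    # "GET" should fail signature verification when used for a HEAD request.
    string_to_sign = "%s\n\n\n%d\n/%s/%s" % (method, expires, BUCKET, OBJECT)
    signature = base64.b64encode(
        hmac.new(SECRET_KEY.encode(), string_to_sign.encode(), hashlib.sha1).digest()
    ).decode()
    return ("https://%s.s3.amazonaws.com/%s?Signature=%s&Expires=%d&AWSAccessKeyId=%s"
            % (BUCKET, OBJECT, quote_plus(signature), expires, ACCESS_KEY))

print(signed_url("GET"))    # the kind of link in the original report
print(signed_url("HEAD"))   # does this one answer "curl -I" but not plain curl?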
> If you create a new signed URL using "HEAD" instead
> of "GET" in the StringToSign, does that make it possible to do HEAD but not
> GET? (You can do a HEAD request to a URL by using "curl -I URL".)

I have written an e-mail to Amazon describing the problem. I do not have time at the moment to construct the request manually.
Is there any chance that this might be fixed before GNOME 2.24? I would like to be able to support Amazon S3 in Conduit this cycle.
John Stowers: You might want to ping the #nautilus irc channel too.
BTW, I have posted this issue to AWS Developer Connection[1] -- unfortunately no reaction from an AWS engineer so far!

[1] http://developer.amazonwebservices.com/connect/thread.jspa?threadID=24304
(In reply to comment #8)
> Is there any chance that this might be fixed before GNOME 2.24, I would like to
> be able to support Amazon S3 in Conduit this cycle.

There's not a lot gvfs can do here. You're trying to do a query_info, and at the moment, S3 returns nonsensical information in response to query_info. The only way this could be "fixed" at the gvfs level would be to have it check the URI to see if it looked like an S3 expiring URI, and then do a GET instead of a HEAD in that case, but throw away the response body. Which would suck.

AFAICT, the bug is on Amazon's side, and the right fix is for them to fix it, at which point gvfs will automatically start working correctly without needing any changes.

(Note that other operations, eg "gvfs-cat", work fine; it's only query_info, and anything that depends on it, that's broken.)
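To spell out the kind of fallback described above, here is a client-side Python 3 sketch (this is not what gvfs actually does, and the URL is a placeholder): try HEAD first, and if the server rejects it, fall back to a GET but read only the headers and drop the body:

import http.client
from urllib.parse import urlsplit

def fetch_headers(url):
    parts = urlsplit(url)
    path = parts.path + ("?" + parts.query if parts.query else "")
    for method in ("HEAD", "GET"):
        conn = http.client.HTTPSConnection(parts.netloc)
        conn.request(method, path)
        resp = conn.getresponse()
        if resp.status == 200:
            headers = dict(resp.getheaders())
            conn.close()            # for GET: close before downloading the body
            return headers
        conn.close()                # HEAD rejected (e.g. 403), try the next method
    raise IOError("neither HEAD nor GET returned 200 for %s" % url)

headers = fetch_headers("https://example-bucket.s3.amazonaws.com/a?Signature=...")  # placeholder
print(headers.get("Content-Length"), headers.get("Last-Modified"))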
> (Note that other operations, eg "gvfs-cat", work fine; it's only query_info,
> and anything that depends on it, that's broken.)

Interesting. So I could theoretically work around this by using python urllib to get the size and mtime, and just using g_file_copy to copy the file?
(In reply to comment #12)
> Interesting. So I could theoretically work around this by using python urllib
> to get the size and mtime, and just using g_file_copy to copy the file?

Yes, it looks like that would work; call urlopen() to open the URL, then call info() on the returned file, check the size and mtime, and if you don't want to download the rest of the message body, you can close the file. You can't do this asynchronously though...

Another possibility would be to just not check the mtime in this case; in the S3 backend, if g_file_query_info() returns "forbidden", then just re-download the file unconditionally.
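Roughly, that urllib workaround would look like this; the sketch below uses Python 3's urllib.request and a placeholder URL:

from urllib.request import urlopen
from email.utils import parsedate_to_datetime

url = "https://example-bucket.s3.amazonaws.com/b?Signature=...&Expires=...&AWSAccessKeyId=..."  # placeholder

resp = urlopen(url)                 # issues a GET, but we only look at the headers
info = resp.info()
size = int(info["Content-Length"])
mtime = parsedate_to_datetime(info["Last-Modified"])
resp.close()                        # close before the body is downloaded

print(size, mtime)
# If size/mtime show the cached copy is stale, hand the URL to gio's copy
# (g_file_copy) to do the actual transfer, as suggested in comment #12.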
Bug 596615 shows another example of a file server allowing GET but not HEAD. Bug 598505 has a patch to let you do g_file_input_stream_query_info() after starting a g_file_read(), although that wouldn't really be great here since you don't want to start a read until after seeing the info...
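For what it's worth, here is roughly what that stream-based approach looks like from Python, assuming a gvfs http backend with the bug 598505 patch applied (this sketch uses the introspected gi.repository bindings rather than the old pygio module, the URL is a placeholder, and which attributes the backend actually fills in on the stream info isn't guaranteed):

from gi.repository import Gio

f = Gio.File.new_for_uri("https://example-bucket.s3.amazonaws.com/a?Signature=...")  # placeholder
stream = f.read(None)                      # g_file_read(): the backend sends a GET
info = stream.query_info("standard::size,time::modified", None)
print(info.get_size())

# The catch mentioned above: the GET is already in flight at this point, so the
# info can't be used to decide whether to start the transfer in the first place.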
There is another report of a broken server sending 404 on HEAD but working fine on GET: bug #547020
Argh, I mean bug 601776.
*** Bug 634335 has been marked as a duplicate of this bug. ***
From the analysis of bug 634335: IMDB (http://www.imdb.com) is another instance of a server that responds with NOT_ALLOWED to HEAD but works perfectly with GET. (The other thing mentioned in that report, i.e. that gvfs-open doesn't work because of this bug, seems to work here.)
*** Bug 601776 has been marked as a duplicate of this bug. ***
If HEAD is not allowed, then query_info is not going to work...