Bug 577630 – libsoup should try to fix up broken Content-Type headers

Summary:	libsoup should try to fix up broken Content-Type headers


Status:	RESOLVED FIXED

Product:	libsoup
Classification:	Core
Component:	HTTP Transport
Version:	2.25.x
Hardware:	Other Linux

Importance:	Normal normal
Target Milestone:	---
Assigned To:	libsoup-maint@gnome.bugs
QA Contact:	libsoup-maint@gnome.bugs

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2009-04-01 17:44 UTC by Gustavo Noronha (kov)
Modified:	2009-04-03 00:46 UTC

See Also:
GNOME target:	---
GNOME version:	---

Description Gustavo Noronha (kov) 2009-04-01 17:44:28 UTC

We hit a problem in WebKit when loading the following page:

http://www.gnome.org/~shaunm/pulse/web/

This site sends a broken Content-Type header:

kov@abacate ~> wget -S -O /dev/null http://www.gnome.org/~shaunm/pulse/web/
--2009-04-01 14:42:59--  http://www.gnome.org/~shaunm/pulse/web/
Resolving www.gnome.org... 209.132.176.176
Connecting to www.gnome.org|209.132.176.176|:80... connected.
HTTP request sent, awaiting response... 
  HTTP/1.1 200 OK
  Date: Wed, 01 Apr 2009 17:43:19 GMT
  Server: Apache/2.2.3 (Red Hat)
  Connection: close
  Content-Type: content-type: text/html; charset=utf-8
Length: unspecified [content-type: text/html]
Saving to: `/dev/null'

    [  <=>                                  ] 6,457       20.5K/s   in 0.3s    

2009-04-01 14:43:02 (20.5 KB/s) - `/dev/null' saved [6457]

The problem is quite simple to work around, but it would be good to have the fix in one central place, in libsoup.

Comment 1 Gustavo Noronha (kov) 2009-04-01 17:45:23 UTC

I forgot to say that this causes WebKit to try to download the page instead of displaying it. We have a bug report on WebKit to track this: https://bugs.webkit.org/show_bug.cgi?id=24843.

Comment 2 Dan Winship 2009-04-02 14:17:07 UTC

Probably soup_message_headers_get_content_type() should ignore the header if it's syntactically incorrect like this. But why isn't this handled already by your existing content-type-sniffing code?
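
For illustration only, a syntax check of this kind could look roughly like the sketch below. It is not libsoup's code: content_type_looks_valid() and its helpers are hypothetical names, and the sketch assumes "syntactically correct" means the value starts with token "/" token (using the RFC 7230 token characters) followed by either the end of the string or a ";" that introduces parameters. The parameters themselves are not validated.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True for characters allowed in an HTTP token (RFC 7230 "tchar"). */
static bool is_token_char(unsigned char c)
{
    if (c >= '0' && c <= '9') return true;
    if (c >= 'a' && c <= 'z') return true;
    if (c >= 'A' && c <= 'Z') return true;
    /* strchr() also matches the terminator, so exclude '\0' explicitly. */
    return c != '\0' && strchr("!#$%&'*+-.^_`|~", c) != NULL;
}

/* Advance past one token; return NULL if no token characters were found. */
static const char *skip_token(const char *p)
{
    const char *start = p;
    while (is_token_char((unsigned char)*p))
        p++;
    return p > start ? p : NULL;
}

/* Hypothetical helper, not a libsoup function: returns true if value
 * begins with type "/" subtype and is followed only by optional
 * whitespace and then end of string or ';'. */
bool content_type_looks_valid(const char *value)
{
    if (value == NULL)
        return false;

    const char *p = skip_token(value);      /* type */
    if (p == NULL || *p != '/')
        return false;

    p = skip_token(p + 1);                  /* subtype */
    if (p == NULL)
        return false;

    while (*p == ' ' || *p == '\t')         /* optional whitespace */
        p++;

    return *p == '\0' || *p == ';';
}
```

With this check, the value from the report, "content-type: text/html; charset=utf-8", is rejected because the first token is followed by ":" rather than "/", while "text/html; charset=utf-8" is accepted.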

Comment 3 Gustavo Noronha (kov) 2009-04-02 14:22:21 UTC

It's not sniffed because the content type is not empty, and an empty content type is the only case our current content-sniffing code handles. The code currently living in WebKitGTK+ for content sniffing is really just a simple workaround until we get a more complete implementation into libsoup.

Comment 4 Dan Winship 2009-04-03 00:46:27 UTC

fixed in trunk; soup_message_headers_get_content_type() will now return NULL if the header is syntactically incorrect, which will then cause you to try to sniff it and win
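
On the caller's side, the effect of this change can be sketched as below. This is an assumption-laden illustration rather than WebKitGTK+ code: choose_content_type() is a hypothetical helper, "declared" stands for whatever soup_message_headers_get_content_type() returned (NULL when the header is missing or rejected as malformed), and "sniffed" stands for the result of the caller's own content sniffing.

```c
#include <stddef.h>

/* Hypothetical caller-side helper, not part of libsoup or WebKit.
 * Prefer the declared type; fall back to the sniffed one when the
 * declared type is NULL (missing or malformed header) or empty. */
const char *choose_content_type(const char *declared, const char *sniffed)
{
    if (declared != NULL && declared[0] != '\0')
        return declared;
    return sniffed;
}
```

Because a malformed header now looks the same to the caller as a missing one, the existing "sniff when there is no content type" path covers the broken header from the report without extra special cases.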