After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 597004 - Provide API to set a list of languages for the accept-language header
Provide API to set a list of languages for the accept-language header
Status: RESOLVED FIXED
Product: libsoup
Classification: Core
Component: API
2.27.x
Other All
: Normal normal
: ---
Assigned To: libsoup-maint@gnome.bugs
libsoup-maint@gnome.bugs
Depends on:
Blocks: 166795
 
 
Reported: 2009-10-01 15:16 UTC by Mario Sánchez Prada
Modified: 2009-12-17 17:24 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Patch proposal (4.15 KB, patch)
2009-10-02 10:35 UTC, Mario Sánchez Prada
none Details | Review
Patch proposal (4.15 KB, patch)
2009-10-02 10:47 UTC, Mario Sánchez Prada
needs-work Details | Review
Provide a new 'accept-language' property for SoupSession (3.71 KB, patch)
2009-12-16 17:11 UTC, Mario Sánchez Prada
committed Details | Review
New accept-language-auto property for SoupSession (7.91 KB, patch)
2009-12-16 17:12 UTC, Mario Sánchez Prada
committed Details | Review

Description Mario Sánchez Prada 2009-10-01 15:16:34 UTC
It seems it's not possible at this moment to specify in libsoup the list of languages to be passed along with the "accept-language", to be used in outgoings HTTP requests.

This would be a very nice thing to have, which would also help fixing bug 528893 in epiphany-webkit, which is now just ignoring the list of languages specified in the preferences dialog.

So, having an API in libsoup so every request included the accept-language header properly set would be really great. Any idea of where to start from?
Comment 1 Mario Sánchez Prada 2009-10-02 10:35:19 UTC
Created attachment 144577 [details] [review]
Patch proposal

Not sure whether this is or not the way to go, but attaching this patch here for some feedback over it, so don't consider it as something ready to be pushed atm :-)
Comment 2 Mario Sánchez Prada 2009-10-02 10:47:36 UTC
Created attachment 144579 [details] [review]
Patch proposal

New patch fixing a wrong usage of gchar* (instead of char*)
Comment 3 Dan Winship 2009-10-02 11:24:20 UTC
Comment on attachment 144579 [details] [review]
Patch proposal

Hey, this mostly looks good. One nit and one substantive comment:

>+		if (!accept_language) {
>+			priv->accept_language = NULL;
>+		} else {
>+			priv->accept_language = g_strdup (accept_language);
>+		}

no {}s there please

>+	if (priv->accept_language) {
>+		soup_message_headers_replace (item->msg->request_headers,
>+					      "Accept-Language",
>+					      priv->accept_language);
>+	}

So, that's the way User-Agent works, but I think there may be times when you want to change Accept-Language for a single message, so it would be better if it did:

    if (priv->accept_language &&
        !soup_message_headers_get_list (item->msg->request_headers, "Accept-Language")) {

(and then just soup_message_headers_append rather than _replace), so if the app sets a different Accept-Language string on the message, we just use that rather than overriding it.


Two additional bits that I would like to see if you wanted to do a little more hacking on this:

First, it would be nice to have an easy way to say "set Accept-Language automatically from the return value of g_get_language_names()". Maybe a boolean-valued property on SoupSession? ("accept-language-auto" or something). Setting "accept-language" would turn off "accept-language-auto" and enabling "accept-language-auto" would clear "accept-language".

Second, it would be nice to have a method for generating valid Accept-Language strings; the syntax is slightly different from the syntax used by POSIX and g_get_language_names() ("en-us" instead of "en_US"), and also, technically you need to assign "quality values" to each item if you actually want them to be treated as sorted (that is, "es, en" technically means that English and Spanish are equally preferred; you need to say "es, en;q=0.9" if you want Spanish to be ranked higher than English).

It would be useful to check what strings other browsers generate for given language selections; in particular, do they do "*;q=0.1" or something at the end, to indicate "or whatever else you've got"? (The spec says that if none of the indicated languages are available, and there's no "*" term, then the server should return an error, although I suspect most of them probably don't actually.)
Comment 4 Xan Lopez 2009-10-02 12:17:04 UTC
(In reply to comment #3)
> (From update of attachment 144579 [details] [review])
> Hey, this mostly looks good. One nit and one substantive comment:
> 
> >+		if (!accept_language) {
> >+			priv->accept_language = NULL;
> >+		} else {
> >+			priv->accept_language = g_strdup (accept_language);
> >+		}
> 
> no {}s there please
> 

FWIW, g_strdup is NULL-safe, so you can get rid of the whole check.
Comment 5 Mario Sánchez Prada 2009-10-02 14:32:20 UTC
(In reply to comment #3)
> (From update of attachment 144579 [details] [review])
> Hey, this mostly looks good. One nit and one substantive comment:
> 
> >+		if (!accept_language) {
> >+			priv->accept_language = NULL;
> >+		} else {
> >+			priv->accept_language = g_strdup (accept_language);
> >+		}
> 
> no {}s there please

Yep. Also, Xan's recommendation looks good to me as well in this case.

> >+	if (priv->accept_language) {
> >+		soup_message_headers_replace (item->msg->request_headers,
> >+					      "Accept-Language",
> >+					      priv->accept_language);
> >+	}
> 
> So, that's the way User-Agent works, but I think there may be times when you
> want to change Accept-Language for a single message, so it would be better if
> it did:
> 
>     if (priv->accept_language &&
>         !soup_message_headers_get_list (item->msg->request_headers,
> "Accept-Language")) {
> 
> (and then just soup_message_headers_append rather than _replace), so if the app
> sets a different Accept-Language string on the message, we just use that rather
> than overriding it.

The idea I had behind this preliminar patch was to make things as simple as possible, leaving responsability of appending and removing languages to the list to the client of libsoup library, which just would pass a RFC2616-compliant "Accept-Language" string, the same way it's done (and stated in the documentation) for the "user-agent" property of SoupSession.

That way, new API in libsoup would remain limited just to a new property, btw.

However, if you think it's better to add support for adding and removing languages without having to set the full string I have to say I agree with that :-)... but then perhaps it would be better to keep a private GSList in SoupSession with all the languages instead of a single char*. That way we could easily add and remove languages to the list, which would be added with soup_message_headers_append().

Does it make sense to you?

Btw, I'm also thinking we could even also keep a "cached" char* version of such a list already ready to use along with soup_message_headers_replace(), which would get updated whenever a language was added or removed. That way, adding and removing languages would keep being so simple (because of the GSList), while access to the "Accept-Language" string would still be very fast (the cached gchar*).

> Two additional bits that I would like to see if you wanted to do a little more
> hacking on this:
> 
> First, it would be nice to have an easy way to say "set Accept-Language
> automatically from the return value of g_get_language_names()". Maybe a
> boolean-valued property on SoupSession? ("accept-language-auto" or something).
> Setting "accept-language" would turn off "accept-language-auto" and enabling
> "accept-language-auto" would clear "accept-language".

Perhaps we could do something like what it's done in epiphany, in ephy-lang.c:ephy_langs_append_languages(): 

void
ephy_langs_append_languages (GArray *array)
{
	const char * const * languages;
	char *lang;
	int i;

	languages = g_get_language_names ();
	g_return_if_fail (languages != NULL);

	/* FIXME: maybe just use the first, instead of all of them? */
	for (i = 0; languages[i] != NULL; i++)
	{

		if (strstr (languages[i], ".") == 0 &&
		    strstr (languages[i], "@") == 0 &&
		    strcmp (languages[i], "C") != 0)
		{
			/* change to lowercase and '_' to '-' */
			lang = g_strdelimit (g_ascii_strdown
						(languages[i], -1), "_", '-');

			g_array_append_val (array, lang);
		}
	}

	/* Fallback: add "en" if list is empty */
	if (array->len == 0)
	{
		lang = g_strdup ("en");
		g_array_append_val (array, lang);
	}
}

> Second, it would be nice to have a method for generating valid Accept-Language
> strings; the syntax is slightly different from the syntax used by POSIX and
> g_get_language_names() ("en-us" instead of "en_US"), and also, technically you
> need to assign "quality values" to each item if you actually want them to be
> treated as sorted (that is, "es, en" technically means that English and Spanish
> are equally preferred; you need to say "es, en;q=0.9" if you want Spanish to be
> ranked higher than English).

Agree. Then perhaps it could be good to add some public functions to easily add a language and optionally the "quality value", according to the specs:

       Accept-Language = "Accept-Language" ":"
                         1#( language-range [ ";" "q" "=" qvalue ] )
       language-range  = ( ( 1*8ALPHA *( "-" 1*8ALPHA ) ) | "*" )

I guess something like soup_session_add_accept_language (SoupSession *session, const char *lang_code, float qvalue) could work nice.

That way we could always check any 'lang_code' value whenever this function got called, to make sure is a valid string for such a header.
 
> It would be useful to check what strings other browsers generate for given
> language selections; in particular, do they do "*;q=0.1" or something at the
> end, to indicate "or whatever else you've got"? (The spec says that if none of
> the indicated languages are available, and there's no "*" term, then the server
> should return an error, although I suspect most of them probably don't
> actually.)

Good point. I'll check that to see if adding a default "*; q=0.1" at the end would make sense for this case as well (well, I guess q=0.001 would be a safer value, btw).

Thanks a lot for your feedback, let's see if I can come up with something better soon.
Comment 6 Dan Winship 2009-11-21 22:46:48 UTC
(In reply to comment #5)
> Thanks a lot for your feedback, let's see if I can come up with something
> better soon.

Are you still working on this? (I'm mostly ignoring the bug for now, on the assumption that you'll post a new patch at some point. But if you're not likely to get back to it then I may work on it myself at some point.)
Comment 7 Mario Sánchez Prada 2009-11-22 22:41:12 UTC
Sorry, last weeks had been a hell of work (and other kind of responsibilities) and I could not devote time to this yet. If it looks good to you, I'll try to work on this at some points during the next 2 weeks, and if I find I could not devote time either this time I'd notify you so we could get rid of this issue anyway soon.
Comment 8 Mario Sánchez Prada 2009-12-16 17:10:50 UTC
Finally managed to grab some time (thanks to the WebKitGtk hackfest, basically) to work on this and I guess I have a functional and valid version of this patch. Wrapping up:

  - Added a new "accept-language" property to SoupSession, which must be a RFC2616 compliant string ready to be used in the Accept-Language header.

  - Added a new "accept-language-auto" property (boolean) to SoupSession, which would make it so a list of languages will be automatically retrieved by SoupSession based on the result of calling g_get_language_names().

  - Added some internal functions to (so far used when accept-language-auto is TRUE):
     + Convert POSIX language strings to RFC2616 compliant ones
     + Add a quality value ("qv") to strings in the format specified by the same RFC

So, as you can see, at the end I forgot about the GSList/char** thing and the API to add/remove languages to the Accept-Language header, since it proved to be far more messy and unnecessary at the same time... and basically the result of misunderstanding Dan's proposalat a first glance :-)

Now attaching the two patches...
Comment 9 Mario Sánchez Prada 2009-12-16 17:11:46 UTC
Created attachment 149854 [details] [review]
Provide a new 'accept-language' property for SoupSession

This would allow to externally set a list of languages for the HTTP  "Accept-Language" header, compliant to rfc2616.
Comment 10 Mario Sánchez Prada 2009-12-16 17:12:30 UTC
Created attachment 149855 [details] [review]
New accept-language-auto property for SoupSession

Provide a new way for SoupSession to assume an 'Accept-Language' string automatically from the return value of g_get_language_names(), properly formatted according to RFC2616.
Comment 11 Dan Winship 2009-12-17 10:52:07 UTC
pushed along with a little bit of cleanup. Also, I decided to remove the
code that removes duplicate "es" values, etc, because that really ought to
be fixed in g_get_language_names() (which I filed a bug for, bug 604815).

Thanks for the patch.

Attachment 149854 [details] pushed as d15f6c4 - Provide a new 'accept-language' property for SoupSession
Attachment 149855 [details] pushed as b3c60a3 - New accept-language-auto property for SoupSession