GNOME Bugzilla – Bug 678941
/contenttype/guess test case failure
Last modified: 2012-06-28 13:58:12 UTC
When building glib 2.33.3, on some architectures (happens on the Debian and Ubuntu builders, but I can reproduce on my Panda board) I get a test case failure (after fixing bug 674452): TEST: contenttype... (pid=11058) /contenttype/guess: FAIL This particularly fails this check, which I amended with the g_printf(): res = g_content_type_guess ("test.pot", (guchar *)"ABC abc", -1, &uncertain); expected = g_content_type_from_mime_type ("application/vnd.ms-powerpoint"); g_printf("expected: %s res: %s uncertain: %i\n", expected, res, uncertain); g_assert (g_content_type_equals (expected, res)); Running the test manually I get martin@panda:~/glib-2.33.3/gio/tests$ ./contenttype /contenttype/guess: expected: application/vnd.ms-powerpoint res: text/x-gettext-translation-template uncertain: 0 ** ERROR:contenttype.c:61:test_guess: assertion failed: (g_content_type_equals (expected, res)) This does not happen on my x86_64 machine. I'm not sure whether it is really architecture specific or just by accident. In either case, "ABC abc" is not a valid line in a PO template, but PowerPoint templates look very binary to me as well, so I guess by that content data neither guess is particularly correct.
Ah, it actually happens on x86_64 as well now. The fix for bug 674452 stops reading beyond the size of "data"; that usually is some random binary junk (whatever is in memory), and thus suspiciously looks like the binary stuff PowerPoint templates have :) Stepping through with a debugger: 925 n_name_mimetypes = xdg_mime_get_mime_types_from_file_name (basename, name_mimetypes, 10); (gdb) p n_name_mimetypes $1 = 2 (gdb) p name_mimetypes $2 = {0x7ffff7fb5cc4 "application/vnd.ms-powerpoint", 0x7ffff7fb5a34 "text/x-gettext-translation-template", 0x0, [...] It correctly identifies both PowerPoint and PO template as possible types. 939 sniffed_mimetype = xdg_mime_get_mime_type_for_data (data, data_size, &sniffed_prio); (gdb) p sniffed_mimetype $1 = 0x7ffff7601d30 "application/octet-stream" (gdb) p sniffed_prio $4 = 0 So it indeed cannot make much out of "ABC abc". (gdb) call looks_like_text (data, data_size) $3 = 1 [...] 943 sniffed_mimetype = "text/plain"; This also seems correct. 978 if ( xdg_mime_mime_type_subclass (name_mimetypes[i], sniffed_mimetype)) This fails for name_mimetypes[0] (powerpoint), but succeeds for "text/x-gettext-translation-template". I guess this is okay, as .pot is a plain text format. The test already has g_assert (uncertain); so I propose to check that either powerpoint or PO template as a result is okay, and to add two more tests which uniquely identify either.
Created attachment 217348 [details] [review] improve contenttype tests I propose to adjust the tests like in the attached patch. The first one now uses res = g_content_type_guess ("test.pot", (guchar *)"msgid \"", 7, &uncertain); which ought to be unanimously a PO template. This works fine with current glib. The third uses the old test res = g_content_type_guess ("test.pot", (guchar *)"ABC abc", 7, &uncertain); which should be uncertain (as current master already asserts), so I modified it to accept either result. Current glib does not recognize it as uncertain, though, the test fails with ERROR:contenttype.c:79:test_guess: assertion failed: (uncertain) The second test gives a binary magic (which I've got from some example PowerPoint template that I googled): res = g_content_type_guess ("test.pot", (guchar *)"\xCF\xD0\xE0\x11", 4, &uncertain); I think this also ought to be unanimous, as this is not valid ASCII or UTF-8. res is indeed PO template, but this time uncertain is unexpectedly True: ERROR:contenttype.c:69:test_guess: assertion failed: (!uncertain)
(In reply to comment #2) > The third uses the old test > > res = g_content_type_guess ("test.pot", (guchar *)"ABC abc", 7, &uncertain); > > which should be uncertain (as current master already asserts), so I modified it > to accept either result. Current glib does not recognize it as uncertain, > though, the test fails with > > ERROR:contenttype.c:79:test_guess: assertion failed: (uncertain) It's not uncertain because it's plain-text-y, and so it can't be powerpoint. It was only apparently uncertain before because it was looking at garbage past the end of the string. > The second test gives a binary magic (which I've got from some example > PowerPoint template that I googled): > > res = g_content_type_guess ("test.pot", (guchar *)"\xCF\xD0\xE0\x11", 4, > &uncertain); > > I think this also ought to be unanimous, as this is not valid ASCII or UTF-8. > res is indeed PO template, but this time uncertain is unexpectedly True: I guess you'd have to trace through the code to see why...
Created attachment 217400 [details] [review] improve contenttype tests > It's not uncertain because it's plain-text-y, and so it can't be powerpoint. Fair enough. I adjusted the test cases accordingly. Now the only thing that fails is the g_assert (!uncertain); for the "\xCF... binary case, which looks wrong. Stepping through now.. So this patch replaces my previous one for improving the test cases.
danw | pitti: looks good. thanks Pushed, thanks for the review!
Comment on attachment 217400 [details] [review] improve contenttype tests Sorry, wrong bug.
Created attachment 217411 [details] [review] Fix /contenttype/guess test As per the previous comments, the actual logic is working as well as it can. This patch updates the test cases to what is really expected, so that the whole test suite now succeeds again.
For the record: The four bytes from my previous patch (for the "binary junk" case) delivered: 947 sniffed_mimetype = xdg_mime_get_mime_type_for_data (data, data_size, &sniffed_prio); (gdb) p sniffed_mimetype $2 = 0x7ffff7601db0 "application/octet-stream" /usr/share/mime/magic does not actually have a magic for powerpoint, so that can't help us, and octet-stream is the best it can do. So I propose we just remove the "uncertain" assertion for this case; right now it is rightfully uncertain (but we do check that the returned value is powerpoint), and if some day we grow a magic for powerpoint it should be "certain" again.
Comment on attachment 217411 [details] [review] Fix /contenttype/guess test looks good
Thanks for the review! Pushed.