Bug 592751 – nautilus 100% spin in g_nearest_pow()




Summary:	nautilus 100% spin in g_nearest_pow()


Status:	RESOLVED DUPLICATE of bug 588446

Product:	glib
Classification:	Platform
Component:	general
Version:	unspecified
Hardware:	Other Linux

Importance:	Normal normal
Target Milestone:	---
Assigned To:	gtkdev
QA Contact:	gtkdev

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2009-08-22 21:30 UTC by Martin Olsson
Modified:	2009-08-23 01:29 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
gdb "bt full" showing where CPU spin was stuck (4.89 KB, text/plain) 2009-08-22 21:30 UTC, Martin Olsson	Details

Description Martin Olsson 2009-08-22 21:30:06 UTC

(gdb) bt full

+ Trace 217078

#0 g_nearest_pow
at /build/buildd/glib2.0-2.21.4/glib/garray.c line 397
#1 g_array_maybe_expand
at /build/buildd/glib2.0-2.21.4/glib/garray.c line 411
#2 IA__g_array_set_size
at /build/buildd/glib2.0-2.21.4/glib/garray.c line 270
#3 IA__g_byte_array_set_size
at /build/buildd/glib2.0-2.21.4/glib/garray.c line 899



If I type "finish" in gdb it never exits g_nearest_pow(). What happens is that for some values of num it will enter the while loop, but the shifting will make n become SMALLER instead of bigger.

If you take the wanted_alloc value from the gdb "bt full" you can actually repro the CPU spin with this small sample program:


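#include <stdio.h>

/* Stand-alone copy of the loop, for the repro only. */
static int
g_nearest_pow (int num)
{
        int n = 1;
        /* For num > 2^30, n overflows before it can reach num (undefined
           behaviour in C; here it wraps) and the loop never exits. */
        while (n < num)
                n <<= 1;
        return n;
}

int main(void)
{
        /* wanted_alloc value taken from the gdb "bt full" */
        g_nearest_pow(1073750016);
        return 0;
}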


Note that 2^30=1073741824 which is _smaller_ than wanted_alloc so it really needs to return 2^31 which is not representable using a 32-bit signed int.
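To make the wrap visible without actually hanging, here is a minimal sketch that runs the same doubling loop but gives up after a fixed number of steps. The helper name count_doublings and the max_steps cap are my own inventions for illustration, not anything in glib, and the cast to int32_t assumes the usual two's complement wraparound.

```c
#include <stdint.h>

/* Hypothetical helper (not part of glib): repeats the doubling loop from
 * g_nearest_pow() on a 32-bit value. The shift is done on an unsigned
 * value so it is well defined; the comparison is done as signed 32-bit,
 * like the original. Converting a value above INT32_MAX to int32_t is
 * implementation-defined; this assumes it wraps (two's complement).
 *
 * Returns the number of doublings performed, or -1 if the loop is still
 * running after max_steps doublings. */
static int count_doublings(uint32_t num, int max_steps)
{
    uint32_t n = 1;
    int steps = 0;

    while ((int32_t)n < (int32_t)num) {
        if (steps == max_steps)
            return -1;          /* would spin forever in the real code */
        n <<= 1;                /* 2^30 -> 2^31 (negative as signed) -> 0 */
        steps++;
    }
    return steps;
}
```

With these assumptions, count_doublings(1073741824, 64) stops after 30 doublings, while count_doublings(1073750016, 64) hits the cap and returns -1: n goes from 2^30 to a negative value and then to 0, and 0 shifted left stays 0.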

Now, why is it trying to alloc something that big? Not sure. But the machine has 8GB of RAM and it's just trying to allocate 1GB in a byte array, right? Anyway, if you look carefully at the length parameter in stack frames #2 and #3 -- something weird clearly happened between those two. I looked at the code, though, and it looks like it's just passing one guint=16 into a guint parameter and suddenly it's that huge? Maybe it's just a gdb bug though?

The processor is 64-bit but the OS is 32-bit, which is why the functions have the IA__ prefix, right? Could it be that the integer is accidentally mangled because it's an IA__ function?

Comment 1 Martin Olsson 2009-08-22 21:30:44 UTC

Created attachment 141455 [details]
gdb "bt full" showing where CPU spin was stuck

Comment 2 Martin Olsson 2009-08-22 21:35:00 UTC

I was surprised by this, but it seems I can repro this bug. I have this 7.5 GB .mkv file in a folder on the desktop (this folder is also shared on the network using Samba, fwiw). If I double-click it to start it in totem and then close it again, that nautilus instance gets stuck in that same CPU spin (same stack etc).

So, given that I seem to be able to repro this... is there anything you want me to check using gdb/valgrind, some environment variable setup, or so?

Comment 3 Martin Olsson 2009-08-22 21:36:33 UTC

Adding a SIGABRT whenever "n" is about to grow too big inside g_nearest_pow is probably a good idea, because a crash is better than a CPU hang (crashes will get submitted by apport and other crash analysis tools).
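Roughly what I have in mind, as a sketch only and not a proposed patch: check whether the next shift would overflow and call abort() before it does. The names nearest_pow_or_abort and doubling_would_overflow are made up for this example, and it uses plain int rather than gint.

```c
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 if doubling n would no longer fit in a
 * signed int, 0 otherwise. */
static int doubling_would_overflow(int n)
{
    return n > INT_MAX / 2;
}

/* Hypothetical variant of the g_nearest_pow() loop: same result for
 * values that fit, but raises SIGABRT via abort() instead of spinning
 * when the next power of two is not representable. */
static int nearest_pow_or_abort(int num)
{
    int n = 1;

    while (n < num) {
        if (doubling_would_overflow(n))
            abort();            /* crash gets reported instead of a hang */
        n <<= 1;
    }
    return n;
}
```

With 32-bit int this still returns 2^30 for num = 1073741824, but for the wanted_alloc value from the trace (1073750016) it aborts instead of looping.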

Comment 4 A. Walton 2009-08-23 01:28:40 UTC

Instead of trying to treat a symptom of failing to resize a dynamically allocated one gigabyte array, why don't we try to treat the underlying problem of figuring out why Nautilus is opening a 7.5GB movie file and trying to read its entire contents to begin with, and why it apparently only happens to some people and not others?

Hard to debug since it's most likely being asynchronously opened and read, but it is still doable, especially so since you can reproduce it (I haven't managed to, but I don't have all that many multiple GB ogg or mkv files lying around either). Try breaking at g_file_read_async()/g_file_load_(partial_)contents_async()/etc. and look for the call that's actually opening the file in question.

Comment 5 A. Walton 2009-08-23 01:29:44 UTC


*** This bug has been marked as a duplicate of bug 588446 ***