Bug 675181 - segfault when creating thumbnails of video files
Status: RESOLVED FIXED
Product: gvfs
Classification: Core
Component: smb backend
Version: 1.12.x
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: gvfs-maint
QA Contact: gvfs-maint
Duplicates: 662959
Depends on:
Blocks:
Reported: 2012-04-30 19:55 UTC by whoop
Modified: 2014-04-20 07:46 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
tests: Add Smb seek test (not ready to commit) (2.55 KB, patch), 2013-03-11 10:25 UTC, Martin Pitt, status: none
crasher log (11.72 KB, text/plain), 2013-03-14 23:28 UTC, Bastien Nocera
Fix daemon crash when cancelling channel operations (8.79 KB, patch), 2013-04-04 08:30 UTC, Alexander Larsson, status: none
Fix daemon crash when cancelling channel operations (9.41 KB, patch), 2013-04-04 17:27 UTC, Alexander Larsson, status: committed
channel: Unqueue cancelled requests (5.65 KB, patch), 2013-04-04 17:27 UTC, Alexander Larsson, status: committed
GVfsChannels: Verify that replies are for the right serial (3.93 KB, patch), 2013-04-04 17:27 UTC, Alexander Larsson, status: committed

Description whoop 2012-04-30 19:55:48 UTC
Description:
gvfsd-smb segfaults when a remote directory is opened for the first time while nautilus is generating thumbnails for video files.
As a result, nautilus closes and the Samba connection is dropped completely.


Additional info:
* package version(s):
gvfs-smb-1.10.1-3
* config and/or log files etc:
Apr 4 20:36:50 localhost kernel: [ 7575.980428] gvfsd-smb[7275]: segfault at 28 ip 0000000000411754 sp 00007fff68b42ce0 error 4 in gvfsd-smb[400000+26000]


Steps to reproduce:
Open a remote directory (via Samba in nautilus) that contains video files, with thumbnails for video files enabled. The crash does not always happen.
Comment 1 Tomas Bzatek 2012-05-02 08:57:29 UTC
Could you please try grabbing a stacktrace so we can see where it crashes? Do you have anything special in that folder? Could you elaborate on which video it crashes on?
Comment 2 Bastien Nocera 2012-05-02 09:37:45 UTC
Tomas, I think you can reproduce the bug by removing this work-around in totem, and just seeking around:
http://git.gnome.org/browse/totem/tree/src/backend/bacon-video-widget-gst-0.10.c?h=gnome-3-4#n3437

Put path = NULL there, and it will fall back to using a URI, meaning that GStreamer's GIO plugin is used. If you access the video through FUSE, it works fine.
Comment 3 whoop 2012-05-02 20:40:15 UTC
If you tell me how to grab the stacktrace I will try...
I can reproduce this quite easily by opening a remote directory containing a large amount of video files (like wmv, mpeg, etc)...
Comment 4 Tomas Bzatek 2012-05-03 09:47:59 UTC
(In reply to comment #3)
> If you tell me how to grab the stacktrace I will try...
> I can reproduce this quite easily by opening a remote directory containing a
> large amount of video files (like wmv, mpeg, etc)...

You could for example follow the steps described on this page: http://fedoraproject.org/wiki/StackTraces#gdb
Comment 5 Akhil Laddha 2012-06-18 16:27:25 UTC
whoop, were you able to collect a stacktrace?
Comment 6 Bastien Nocera 2012-07-09 12:42:40 UTC
This is definitely reproducible.

#0 g_vfs_job_run at gvfsjob.c line 198
#1 job_handler_callback at gvfsdaemon.c line 144
#2 g_thread_pool_thread_proxy at gthreadpool.c line 309
#3 g_thread_proxy at gthread.c line 801
#4 start_thread from /lib64/libpthread.so.0
#5 clone from /lib64/libc.so.6
$1 = (GVfsJobClass *) 0x6ffa10
(gdb) p class->run
$2 = (void (*)(GVfsJob *)) 0xaaaaaaaaaaaaaaaa
Comment 7 Bastien Nocera 2012-07-09 14:25:07 UTC
==808== Syscall param write(buf) points to unaddressable byte(s)
==808==    at 0x3ABF00E61D: ??? (in /usr/lib64/libpthread-2.15.90.so)
==808==    by 0x58CC1ED: g_unix_output_stream_write (gunixoutputstream.c:379)
==808==    by 0x588FA6C: g_pollable_output_stream_default_write_nonblocking (gpollableoutputstream.c:158)
==808==    by 0x588E346: write_async_pollable (goutputstream.c:1482)
==808==    by 0x588E5BA: g_output_stream_real_write_async (goutputstream.c:1534)
==808==    by 0x588CC58: g_output_stream_write_async (goutputstream.c:818)
==808==    by 0x411E1E: send_reply_cb (gvfschannel.c:598)
==808==    by 0x588C795: async_ready_callback_wrapper (goutputstream.c:642)
==808==    by 0x5897663: g_simple_async_result_complete (gsimpleasyncresult.c:767)
==808==    by 0x58976AF: complete_in_idle_cb (gsimpleasyncresult.c:779)
==808==    by 0x607A684: g_idle_dispatch (gmain.c:4657)
==808==    by 0x6077F2C: g_main_dispatch (gmain.c:2539)
==808==  Address 0x10a36e10 is 0 bytes inside a block of size 65,536 free'd
==808==    at 0x4A079AE: free (vg_replace_malloc.c:427)
==808==    by 0x60800EA: standard_free (gmem.c:98)
==808==    by 0x60802AD: g_free (gmem.c:252)
==808==    by 0x41739B: g_vfs_job_read_finalize (gvfsjobread.c:50)
==808==    by 0x5BC7797: g_object_unref (gobject.c:3023)
==808==    by 0x4131C7: g_vfs_job_run (gvfsjob.c:200)
==808==    by 0x40D05C: job_handler_callback (gvfsdaemon.c:144)
==808==    by 0x60A44CD: g_thread_pool_thread_proxy (gthreadpool.c:309)
==808==    by 0x60A3F07: g_thread_proxy (gthread.c:801)
==808==    by 0x3ABF007EF4: start_thread (in /usr/lib64/libpthread-2.15.90.so)
==808==    by 0x3ABECF4EAC: clone (in /usr/lib64/libc-2.15.90.so)
==808== 
==808== Invalid read of size 1
==808==    at 0x413522: g_vfs_job_emit_finished (gvfsjob.c:323)
==808==    by 0x411E5F: send_reply_cb (gvfschannel.c:613)
==808==    by 0x588C795: async_ready_callback_wrapper (goutputstream.c:642)
==808==    by 0x5897663: g_simple_async_result_complete (gsimpleasyncresult.c:767)
==808==    by 0x58976AF: complete_in_idle_cb (gsimpleasyncresult.c:779)
==808==    by 0x607A684: g_idle_dispatch (gmain.c:4657)
==808==    by 0x6077F2C: g_main_dispatch (gmain.c:2539)
==808==    by 0x6078BF1: g_main_context_dispatch (gmain.c:3075)
==808==    by 0x6078DD4: g_main_context_iterate (gmain.c:3146)
==808==    by 0x6079204: g_main_loop_run (gmain.c:3340)
==808==    by 0x40CDCF: daemon_main (daemon-main.c:300)
==808==    by 0x40CE2D: main (daemon-main-generic.c:39)
==808==  Address 0x28 is not stack'd, malloc'd or (recently) free'd
==808== 
==808== 
==808== Process terminating with default action of signal 11 (SIGSEGV)
==808==  Access not within mapped region at address 0x28
==808==    at 0x413522: g_vfs_job_emit_finished (gvfsjob.c:323)
==808==    by 0x411E5F: send_reply_cb (gvfschannel.c:613)
==808==    by 0x588C795: async_ready_callback_wrapper (goutputstream.c:642)
==808==    by 0x5897663: g_simple_async_result_complete (gsimpleasyncresult.c:767)
==808==    by 0x58976AF: complete_in_idle_cb (gsimpleasyncresult.c:779)
==808==    by 0x607A684: g_idle_dispatch (gmain.c:4657)
==808==    by 0x6077F2C: g_main_dispatch (gmain.c:2539)
==808==    by 0x6078BF1: g_main_context_dispatch (gmain.c:3075)
==808==    by 0x6078DD4: g_main_context_iterate (gmain.c:3146)
==808==    by 0x6079204: g_main_loop_run (gmain.c:3340)
==808==    by 0x40CDCF: daemon_main (daemon-main.c:300)
==808==    by 0x40CE2D: main (daemon-main-generic.c:39)
==808==  If you believe this happened as a result of a stack
==808==  overflow in your program's main thread (unlikely but
==808==  possible), you can try to increase the size of the
==808==  main thread stack using the --main-stacksize= flag.
==808==  The main thread stack size used in this run was 8388608.
==808== Thread 3:
==808== Invalid free() / delete / delete[] / realloc()
==808==    at 0x4A079AE: free (vg_replace_malloc.c:427)
==808==    by 0x3ABED6470B: __libc_freeres (in /usr/lib64/libc-2.15.90.so)
==808==    by 0x480269C: _vgnU_freeres (vg_preloaded.c:61)
==808==  Address 0x3abefb8390 is 0 bytes inside data symbol "noai6ai_cached"
Comment 8 Bastien Nocera 2013-03-08 11:37:04 UTC
You can easily reproduce this with totem-video-thumbnailer from GNOME 3.6 or newer:

$ ./totem-video-thumbnailer -g 30 smb://path/to/video/file foo.png

GStreamer's giosrc will create enough seeks that you will see the problem. If you were to use the fuse path to the file instead, GStreamer would use fuse/filesrc and the problem wouldn't occur.

Program received signal SIGSEGV, Segmentation fault.
g_vfs_job_emit_finished (job=0x0) at gvfsjob.c:322
322	  g_assert (!job->finished);
(gdb) bt
#0 g_vfs_job_emit_finished at gvfsjob.c line 322
#1 send_reply_cb at gvfschannel.c line 613
#2 async_ready_callback_wrapper from /lib64/libgio-2.0.so.0
#3 g_simple_async_result_complete from /lib64/libgio-2.0.so.0
#4 complete_in_idle_cb from /lib64/libgio-2.0.so.0
#5 g_main_context_dispatch from /lib64/libglib-2.0.so.0
#6 g_main_context_iterate.isra.24 from /lib64/libglib-2.0.so.0
#7 g_main_loop_run from /lib64/libglib-2.0.so.0
#8 daemon_main at daemon-main.c line 396
#9 main at daemon-main-generic.c line 39

Comment 9 Martin Pitt 2013-03-11 10:25:18 UTC
Created attachment 238568
tests: Add Smb seek test (not ready to commit)

I ran "totem-video-thumbnailer -v -l -g 30 smb://donald/public/test.ogv /tmp/foo.png" on current master on the ogg video from http://archive.org/details/AlanOakleysmalltestvideo_0 . I don't get this crash, but it does hang indefinitely after a couple of iterations.

Bastien asked me to try and write a test case for this. This does 500 random seeks on a file through (unauthenticated) smb://. Again I don't get a crash of gvfsd-smb (nor any other process), but I do get an indefinite hang after some iterations (on the order of 200). Does that test happen to reproduce the crash for you?

Please note that this shouldn't be committed as-is. The print() is too ugly for running this regularly. Also, 500 iterations take quite long, so that number should be reduced before committing. But in its current form it might be more useful for debugging.
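
For readers who want the shape of such a seek stress test without an SMB share, here is a minimal local-file analogue in C. It is not the attached test (which goes through smb://); the helper names `pattern_at`, `fill_pattern` and `random_seek_check`, the 251-byte pattern and the seed are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Byte expected at a given offset in the synthetic test file
 * (hypothetical pattern, chosen so that offsets are distinguishable). */
static unsigned char pattern_at(long off)
{
    return (unsigned char)(off % 251);
}

/* Fill the stream with `size` pattern bytes. Returns 0 on success. */
static int fill_pattern(FILE *f, long size)
{
    assert(f != NULL && size > 0);
    for (long i = 0; i < size; i++) {
        if (fputc(pattern_at(i), f) == EOF)
            return -1;
    }
    return fflush(f) == 0 ? 0 : -1;
}

/* Do `iterations` seek+read pairs at pseudo-random offsets.
 * Returns the number of bytes that did not match the pattern,
 * or -1 on an I/O error. */
static int random_seek_check(FILE *f, long size, int iterations, unsigned seed)
{
    int mismatches = 0;

    srand(seed);
    for (int i = 0; i < iterations; i++) {
        long off = rand() % size;

        if (fseek(f, off, SEEK_SET) != 0)
            return -1;
        int c = fgetc(f);
        if (c == EOF)
            return -1;
        if ((unsigned char)c != pattern_at(off))
            mismatches++;
    }
    return mismatches;
}
```

Calling `fill_pattern(f, 4096)` on a `tmpfile()` stream and then `random_seek_check(f, 4096, 500, 1u)` mirrors the 500 iterations mentioned above; lowering the iteration count is the obvious knob for a regularly run test.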
Comment 10 Bastien Nocera 2013-03-14 23:28:08 UTC
Created attachment 238946
crasher log

The giosrc logs show that the seeking/read combinations are slightly different from what you're trying to reproduce in the test.

I also only tested with avi ("DivX") files and MP4 files.
Comment 11 Alexander Larsson 2013-04-04 08:30:01 UTC
Created attachment 240575
Fix daemon crash when cancelling channel operations

The error handling in gvfschannel.c:start_queued_request() when
there was an error creating the job or when the request was cancelled
caused problems. It didn't set current_job, yet it called
g_vfs_channel_send_error() which eventually resulted in a
call to send_reply_cb which crashed as it assumed current_job
was set.

Also, not returning TRUE for started_job when we sent an error
is problematic as we then could start the next job which caused
us to have two outstanding jobs on the same channel mixing things up
badly.
Comment 12 Alexander Larsson 2013-04-04 08:30:42 UTC
This fixes the daemon crash, but now I get a hang on the client side when cancelling instead. Will look into it further.
Comment 13 Alexander Larsson 2013-04-04 17:27:06 UTC
Created attachment 240642
Fix daemon crash when cancelling channel operations

The error handling in gvfschannel.c:start_queued_request() when
there was an error creating the job or when the request was cancelled
caused problems. It didn't set current_job, yet it called
g_vfs_channel_send_error() which eventually resulted in a
call to send_reply_cb which crashed as it assumed current_job
was set.

Also, not returning TRUE for started_job when we sent an error
is problematic as we then could start the next job which caused
us to have two outstanding jobs on the same channel mixing things up
badly.
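
The failure mode described in this commit message can be modelled in a few lines of C. This is a simplified sketch, not the committed patch: only the names `current_job`, `start_queued_request` and `send_reply_cb` come from the text above, while the `Job`, `Request` and `Channel` structs, the `sketch_` functions and their signatures are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real gvfs types. */
typedef struct {
    bool finished;
} Job;

typedef struct {
    bool cancelled;
    Job *job;            /* NULL models "creating the job failed" */
} Request;

typedef struct {
    Job *current_job;    /* NULL while only an error reply is in flight */
    bool reply_pending;  /* a reply (result or error) is being written */
    int  replies_sent;
} Channel;

/* Models send_reply_cb: runs once the reply has been written.
 * It must tolerate current_job == NULL, which is the state on the
 * error/cancel path. */
static void sketch_send_reply_done(Channel *ch)
{
    ch->replies_sent++;
    ch->reply_pending = false;
    if (ch->current_job != NULL) {
        assert(!ch->current_job->finished);
        ch->current_job->finished = true;
        ch->current_job = NULL;
    }
}

/* Models start_queued_request. Returns true whenever the channel is now
 * busy, including when only an error reply was queued, so the caller
 * does not start a second request on the same channel. */
static bool sketch_start_queued_request(Channel *ch, const Request *req)
{
    if (req == NULL)
        return false;            /* nothing queued, channel stays idle */

    if (req->cancelled || req->job == NULL) {
        ch->current_job = NULL;  /* no job, but an error reply is sent */
        ch->reply_pending = true;
        return true;             /* still counts as "started" */
    }

    ch->current_job = req->job;
    ch->reply_pending = true;
    return true;
}
```

Without the NULL check in the reply callback, the cancelled path dereferences a null job, which matches the `job=0x0` frame in the backtrace from comment 8.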
Comment 14 Alexander Larsson 2013-04-04 17:27:12 UTC
Created attachment 240643
channel: Unqueue cancelled requests

We put a channel request on the output buffer and start writing, but
if the write is cancelled on the first call (i.e. no partial writes)
we abort immediately without ever writing the request.

However, if we do this we also need to unqueue the request from the output
buffer, as otherwise this will be sent with the next operation. This
can be problematic for seeks as the seek generation is then not in sync.
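
As an illustration of the unqueue step, the sketch below models the output buffer as a fixed byte array. It is an assumption-laden toy rather than the patch itself: `OutBuf`, `outbuf_queue`, `outbuf_unqueue` and the 256-byte capacity are hypothetical names and values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define OUT_CAP 256   /* hypothetical capacity for the sketch */

typedef struct {
    unsigned char data[OUT_CAP];
    size_t len;       /* bytes currently queued */
    size_t written;   /* queued bytes already handed to the socket */
} OutBuf;

/* Append a request to the buffer. Returns the offset at which it
 * starts, or (size_t)-1 if it does not fit. */
static size_t outbuf_queue(OutBuf *b, const void *req, size_t n)
{
    assert(b != NULL && req != NULL);
    if (n > OUT_CAP - b->len)
        return (size_t)-1;

    size_t start = b->len;
    memcpy(b->data + start, req, n);
    b->len += n;
    return start;
}

/* Drop a request that was cancelled before any of its bytes were
 * written. Returns false if a partial write already happened, in which
 * case the request has to be sent in full. */
static bool outbuf_unqueue(OutBuf *b, size_t start)
{
    assert(b != NULL);
    if (start > b->len || b->written > start)
        return false;

    b->len = start;   /* forget the cancelled request's bytes */
    return true;
}
```

If the cancelled request were left in place, its bytes would go out ahead of the next operation, which is how the seek generation can fall out of sync.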
Comment 15 Alexander Larsson 2013-04-04 17:27:17 UTC
Created attachment 240644
GVfsChannels: Verify that replies are for the right serial

We might be getting replies for old cancelled operations which
we need to ignore.
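
The serial check can be sketched as follows. The `Reply` layout and the `find_reply_for` helper are hypothetical and only illustrate the idea of skipping replies whose serial does not match the request currently being waited on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reply record; the real wire format differs. */
typedef struct {
    uint32_t seq_nr;   /* serial of the request this reply answers */
    int32_t  result;
} Reply;

/* Walk the replies in arrival order and return the first one whose
 * serial matches `expected`. Replies for older, cancelled operations
 * are counted in *skipped and otherwise ignored. Returns NULL if the
 * matching reply has not arrived yet. */
static const Reply *find_reply_for(const Reply *replies, size_t n,
                                   uint32_t expected, size_t *skipped)
{
    size_t stale = 0;
    const Reply *match = NULL;

    assert(replies != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (replies[i].seq_nr == expected) {
            match = &replies[i];
            break;
        }
        stale++;       /* reply to an earlier, cancelled request */
    }
    if (skipped != NULL)
        *skipped = stale;
    return match;
}
```

With replies for serials 7, 8 and 9 queued and serial 9 expected, the first two are skipped and the third is returned.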
Comment 16 Alexander Larsson 2013-04-04 17:28:30 UTC
Attachment 240642 pushed as 46f9c79 - Fix daemon crash when cancelling channel operations
Attachment 240643 pushed as f25b407 - channel: Unqueue cancelled requests
Attachment 240644 pushed as 5dcde30 - GVfsChannels: Verify that replies are for the right serial
Comment 17 Ross Lagerwall 2014-04-20 07:46:23 UTC
*** Bug 662959 has been marked as a duplicate of this bug. ***