Bug 144726 - silent abort when copying to vfat filesystem
Status: RESOLVED FIXED
Product: nautilus
Classification: Core
Component: File and Folder Operations
Version: 2.16.x
Hardware: Other All
Importance: Urgent critical
Target Milestone: 2.16.x
Assigned To: Christian Neumair
QA Contact: Nautilus Maintainers
Duplicates: 347457
Depends on:
Blocks:
 
 
Reported: 2004-06-21 02:38 UTC by Luke Hutchison
Modified: 2007-01-18 18:39 UTC
See Also:
GNOME target: 2.16.x
GNOME version: 2.15/2.16


Attachments
Original dir hierarchy (56.34 KB, text/plain)
2004-06-21 02:39 UTC, Luke Hutchison
Hierarchy after copy operation (2.69 KB, text/plain)
2004-06-21 02:40 UTC, Luke Hutchison
Proposed patch (58.15 KB, patch)
2006-01-10 20:49 UTC, Christian Neumair
rejected
The file the copy fails on, using Nautilus copy (30.01 KB, text/plain)
2006-01-11 19:07 UTC, Luke Hutchison
The config directory that causes the Nautilus failure (478.91 KB, application/x-gzip)
2006-01-11 19:34 UTC, Luke Hutchison

Description Luke Hutchison 2004-06-21 02:38:37 UTC
I have a set of directories that contain digital photo images, which I tried to
copy to a different partition using Nautilus.  The copy seemed to work, but was
way too quick, so I looked in the directory, and it turns out that only about 5%
of the files actually copied across.  The copying aborted partway through,
without giving any warning or error, and for no visible reason.

Attached is a list of files (generated from 'find') from the original directory,
then the target directory.
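
The comparison of the two 'find' listings can also be automated. The following is a minimal sketch, not part of the original report: it walks both trees and lists source files that have no counterpart in the target. The function names are hypothetical, and paths are compared case-insensitively on the assumption that the target filesystem may change the case of names.

```python
import os
import tempfile


def relative_files(root):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.add(os.path.relpath(full, root))
    return found


def missing_after_copy(source, target):
    """List source files with no counterpart in target, ignoring case."""
    copied = {path.lower() for path in relative_files(target)}
    return sorted(p for p in relative_files(source) if p.lower() not in copied)


if __name__ == "__main__":
    # Tiny demo on temporary directories: only one of three files "arrived".
    with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
        for name in ("a.jpg", "b.jpg", "c.jpg"):
            open(os.path.join(src, name), "w").close()
        open(os.path.join(dst, "A.JPG"), "w").close()  # case changed on the target
        print(missing_after_copy(src, dst))
```

Because the match ignores case, two source files whose names differ only in case are treated as one; a check like this is only a first pass before deleting originals.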
Comment 1 Luke Hutchison 2004-06-21 02:39:20 UTC
Created attachment 28891 [details]
Original dir hierarchy
Comment 2 Luke Hutchison 2004-06-21 02:40:00 UTC
Created attachment 28892 [details]
Hierarchy after copy operation
Comment 3 Luke Hutchison 2004-06-21 02:40:37 UTC
I marked this top priority, because of the potential for file loss.  I was
making backups, and was about to delete the originals when this happened.
Comment 4 Elijah Newren 2004-06-21 15:56:46 UTC
Could you try copying them manually in a terminal to see if that is successful?
 Also, it appears you are copying to a vfat filesystem, could you check for
symlinks or something similar that may have caused the copy to fail?  Also,
could you clarify the version?  It's marked as 2.6.x, but the Gnome Version
listed is 2.7/2.8.

This looks like it might be a duplicate of bug 138491.
Comment 5 Luke Hutchison 2004-06-22 06:48:31 UTC
"cp -R" copies the whole hierarchy fine, so it can't be a symlink problem, since
-R doesn't follow links.

I listed it as 2.7/2.8 since it probably should be fixed by the release of 2.8
:-)  It is nautilus-2.6.0-4, the FC2 default RPM.

I am copying to a vfat filesystem.  I can confirm that the problem does not
exist when copying to an ext3 filesystem.  I can also copy the individual files
around the point where it stops (IMG_0041.JPG/img_0399.jpg) without a problem,
so it seems to be a cumulative problem caused by copying many files in a
directory.  A race condition?  Memory leak?  File handle leak?  A problem with
Nautilus not expecting the case of files to change (vfat is all lower case)?

It could certainly be a GNOME-VFS problem -- it seems that a GNOME-VFS xfer is
used to actually do the transfer.

It dies silently when the progress indicator is at about 10%.  How can I see
what it is doing when it dies?  I tried attaching gdb, but no signal is emitted
when the copy dies.

Bug 138491 looks like a different problem to me, at least if I'm interpreting it
properly.  I think he's saying that only the dir/file that you actually drag is
copied during multi-selection drags.
Comment 6 Luke Hutchison 2004-06-22 06:49:54 UTC
The bug also survives a reboot.
Comment 7 Elijah Newren 2004-08-10 17:18:41 UTC
The version numbers are used to mark which versions the bug exists in, not when
it should be fixed (the target milestones are for that, but are meant for the
maintainer to set).  The reason we have both a 'Version' field and a 'GNOME
Version' is because not all products have version numbers that match the Gnome
ones and we want to make it easy for the release team to 'query all bugs that
still affect version 2.7'.

Anyway, I'm setting Gnome version to 2.5/2.6 since this was nautilus 2.6.x.  I'm
also shortening the summary slightly.
Comment 8 Matthew Gatto 2004-10-18 13:42:35 UTC
You could try using gnomevfs-copy from the command line and see if that works or
not.
Comment 9 Kjartan Maraas 2005-01-11 23:26:24 UTC
Did you get a chance to try gnomevfs-copy?
Comment 10 Martin Lolov 2005-04-01 11:26:42 UTC
I have a problem with a vfat filesystem too.
Mount in fstab:
/dev/hdb5       /media/DiskD    vfat    umask=0         0       0
Nautilus opens the location and shows the directories, but can't open any of
them. It also can't create a new directory or copy files. In a terminal, mkdir
and cp work fine. I checked the partition with Windows; no errors.
Comment 11 mallchin 2005-05-02 14:10:26 UTC
Same here.

I recently moved 250,000 dir/files from an ext3 to a vfat partition and had to
copy them in small chunks, or else it would silently abort (bit by bit worked).

I too nearly lost data, and this is highly likely for those not paying
attention to the copy dialogue!

I see the bug has been about for nearly 9 months without a resolution. Do you need
someone to do some testing? If so please specify and I will happily narrow the
problem down for you.

I must use vfat partitions and would like to continue to use nautilus to copy
data to and from them.
Comment 12 Luke Hutchison 2005-05-02 16:52:18 UTC
M.Allchin: If you could run some tests, that would be great -- I just haven't
had time to really investigate this deeply.  I haven't tried gnomevfs-copy for
example.  I have given up on vfat/nautilus for now, so it's not even in my
workflow anymore.
Comment 13 Christian Neumair 2005-05-27 18:38:13 UTC
Trying gnomevfs-copy and testing whether it works would be very appreciated.
Comment 14 Alexander Larsson 2005-08-31 14:08:11 UTC
Another nice thing would be if someone could come up with a minimal test case
for this. For instance, if you can repeat this for a folder, can you make a copy
of that folder and remove all the files that weren't copied except the first one
that didn't get copied. Then does it still reproduce? If it does, does removing
another file make it possible to copy all files? (I.e., are we limited by the
number of files, or is it something specific to the file that made it break?)

I wasn't able to reproduce this bug by just copying lots of files to vfat, so it
seems there is something special about the files it's failing on.
Comment 15 Luke Hutchison 2005-08-31 16:03:10 UTC
Investigating this bug is on my list of things to do in the near future, I'm
just moving across the other side of the country right now, so I'm super busy
:-)  The other guys who have experienced this may be able to provide info more
quickly.  

However, to answer your question, Alexander: as far as I recall, the copy
stopped at different points each time, so it was non-deterministic.  This
usually indicates either a race-condition or the use of uninitialized memory. 
Given that you can't duplicate it on your own system, I would suggest the
former, unless there is nothing in the copy operation that is multithreaded.
Comment 16 Alexander Larsson 2005-09-01 11:10:31 UTC
gnome-vfs operations are threaded, such that each asynchronous operation is
handled by its own thread. However, a whole gnome_vfs_xfer operation is handled
by one such thread, so I don't think threads are the problem.
Comment 17 Christian Neumair 2005-10-18 17:14:05 UTC
Luke?
Comment 18 Niklas Lindblad 2005-11-02 12:39:37 UTC
Is this bug still not fixed in the 2.12-release?
Comment 19 Mathijs Vogelzang 2006-01-08 12:21:12 UTC
In the 2.10 release (Debian etch), the bug is still there. Is there any chance it gets solved soon? I almost lost half my music collection because of this bug!
Comment 20 Mathijs Vogelzang 2006-01-08 12:49:21 UTC
gnomevfs-copy works fine BTW, only the graphical drag-drop copy is affected.
Comment 21 Christian Neumair 2006-01-08 15:40:37 UTC
Mathijs: Odd. Are you definitely sure that *only* the drag-drop copy is affected, and that neither gnomevfs-copy nor the usual copy/paste procedure causes this issue?
DnD copying and copy/paste copying both seem to involve fm_directory_view_move_copy_items, so there shouldn't be a difference.

Updating version, milestoning to 2.14.
Comment 22 Mathijs Vogelzang 2006-01-08 21:59:23 UTC
Oh, sorry, I meant:

graphical: both DnD & Copy-paste DON'T work
terminal: both cp -R & gnomevfs-copy work
Comment 23 Christian Neumair 2006-01-09 07:38:54 UTC
Does the bug occur randomly or for particular directories?

I'll come up with some file transfer debugging code, which will help us find out what's going on.
Comment 24 Luke Hutchison 2006-01-09 10:15:29 UTC
For me it was random (it didn't always stop on the same file), although the copy always had to be going for a while before it failed.  I just re-tested (sorry to be unresponsive on this bug until now), and I can't duplicate it currently, although I don't have the same large directory of digital photos on here anymore that it used to fail with.
Comment 25 Luke Hutchison 2006-01-09 10:23:29 UTC
Mathijs: is this problem only occurring for you when copying to a vfat partition?  I never had this problem ext3->ext3, only ext3->vfat.
Comment 26 Mathijs Vogelzang 2006-01-09 11:14:39 UTC
I experienced it on multiple occasions: I first discovered it while copying from a vfat harddisk partition to a vfat USB device (debian etch, gnome 2.10). But then (after I had copied the whole thing with cp -R), on another computer (ubuntu 5.10, gnome 2.12) the same thing happened when I tried to copy from the vfat USB device to the ext3 harddisk. 
So vfat (hd) -> vfat (usb), vfat (usb) -> ext3 (hd), (and copying back failed too, so also) ext3 (hd) -> vfat (usb)
So could it be the USB device? But then still, it shouldn't happen that nautilus can't copy something cp and gnomevfs-copy can.

Same here, it only happens when copying big directories. The weirdest thing is that I had a directory "music" containing 10 subdirectories just named 1 through 10, each with 100s of files. Copying the whole music directory didn't fail, but copying a single underlying directory did. So can it be something in the time calculation code involving the number of direct children of the copied directory?
Comment 27 Luke Hutchison 2006-01-09 11:22:53 UTC
I observed this problem copying from an ext3 partition on my laptop to a vfat partition on the same physical disk, so no, this is not a USB problem.

That is weird that copying single directories failed while the whole thing did not.  This is probably a race condition, although that's a weird inside-out manifestation of such a problem (usually race conditions become more likely the longer you run a system, all other things being equal)
Comment 28 Luke Hutchison 2006-01-09 11:52:08 UTC
Actually, just a thought Mathijs, your problem may or may not be related, I had a similar problem with an external USB drive on early 2.6 kernels -- the USB subsystem didn't throttle IO properly and as a result had a tendency to overrun and fail in the middle of a copy: https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=130326

However having said that your symptoms do indeed sound the same as what I experienced when I originally reported this bug, and as I stated above I was going from one partition to another on the same drive, no USB involved.

AFAICR, the USB throttling bug made the drive unusable until it was unmounted and remounted after a problem occurred.  It doesn't sound like that was a problem for you, but I just wanted to throw out that possibility.
Comment 29 Christian Neumair 2006-01-10 20:49:53 UTC
Created attachment 57122 [details] [review]
Proposed patch

The issue seems to be that Nautilus does all its error handling inside the async callback, which is not reliably called. http://blogs.gnome.org/edit/cneumair/drafts/2006/01/10 has some details.
Comment 30 Christian Neumair 2006-01-10 20:50:56 UTC
Sorry, http://blogs.gnome.org/view/cneumair/2006/01/10 is the correct URI.
Comment 31 Alexander Larsson 2006-01-11 15:26:06 UTC
That blog page is wrong, so the patch is not right.
Comment 32 Federico Mena Quintero 2006-01-11 17:22:00 UTC
How does one replicate this bug?  Can a simple test case be constructed that exposes the bug?  Does it happen reliably if we sprinkle a few usleep()s in the VFS code?
Comment 33 Luke Hutchison 2006-01-11 18:32:24 UTC
I don't know how to duplicate it reliably -- the problem doesn't always occur on the same file.  I just tried though and I can duplicate the problem still, which I didn't think I could do... Just go "Show Hidden Files" then "Select All" in your home directory, and drag everything to a vfat partition.  If you have enough files, it should die at some point.  (Took about 40 seconds on a very slow laptop, at about 2 files/sec.)
Comment 34 Luke Hutchison 2006-01-11 18:50:40 UTC
OK, I finally got my act together and ran tests of gnomevfs-copy.  Sorry for the delay.

I was wrong about Nautilus not stopping on the same file every time.  At least for the test case I described in the last comment, it always stops on the same file.  Unfortunately I don't know what file that is that it fails on, because I can't tell what order Nautilus is copying the files in.  (It's not the order returned by 'find' or by 'ls -R'.)  Two runs copied 1105 and 1108 of 11828 files respectively, then failed.  The difference of 3 is accounted for by the fact that three files were thumbnailed after copying.

The third test was to use gnomevfs-copy.  It copied 3831 files total then failed with the error message:
  Failed to copy to <dir>
  Reason: Invalid URI

Given that it did not stop at the same place, it may be failing for a different reason.

I also noticed another difference between the copy methods -- Nautilus stops 3 times and says it can't copy a file (presumably because there are no read permissions on the file).  I hit "Skip" each time.  gnomevfs-copy doesn't ask any questions like this.  I assume it is skipping the files by default?  Anyway it is possible that it is the user interaction that eventually triggers the race condition?
Comment 35 Luke Hutchison 2006-01-11 18:57:31 UTC
Weird, I investigated the files for which Nautilus asks for user interaction while copying, and the result doesn't make sense to me.

I get:

  Error: "Operation not permitted" while copying "/home/luke/...o/biblio.dbf"

$ find ~ -name biblio.dbf -exec ls -l \{\} \;
-rw-r--r--  1 luke luke 58661 Nov  9  2004 /home/luke/.rhopenoffice1.1/user/database/biblio/biblio.dbf
-rw-r--r--  1 luke luke 113095 Apr 12  2005 /home/luke/.openoffice.org2.0/user/database/biblio/biblio.dbf

I don't see any reason that there should be an operation not permitted on these files... The perms are normal, they are normal files (not links etc.).
Comment 36 Luke Hutchison 2006-01-11 19:06:10 UTC
OK, I watched the copy operation closely until it failed, and was able to see the following file flash up for a split second before the window closed silently:

/home/luke/.openoffice.org2.0/user/config/classico.sog

I will attach the file.  It doesn't look like it's anything special.  It is definitely in the source and not in the destination, so it did not copy over.
Comment 37 Luke Hutchison 2006-01-11 19:07:24 UTC
Created attachment 57172 [details]
The file the copy fails on, using Nautilus copy
Comment 38 Luke Hutchison 2006-01-11 19:12:54 UTC
Ah, and gnomevfs-copy fails on the following: ~/.mozilla/firefox/wx7qym8c.default/lock

It's some weird soft link to an IP/port, it shows up as broken (red bg) in ls:

-rw-rw-r--  1 luke luke   9417 Jan 11 13:33 formhistory.dat
-rw-rw-r--  1 luke luke 330068 Jan 12 08:11 history.dat
-rw-r--r--  1 luke luke    719 Nov 28  2004 install.log
-rw-------  1 luke luke  16384 Jan 11 13:33 key3.db
-rw-r--r--  1 luke luke   9596 Jan 12 08:03 localstore.rdf
lrwxrwxrwx  1 luke luke     15 Jan 11 15:23 lock -> 127.0.0.1:+3917
-rw-r--r--  1 luke luke   7887 Nov 29 13:04 mimeTypes.rdf
-rw-rw-r--  1 luke luke      0 Jan 11 15:23 .parentlock
-rwxr-xr-x  1 luke luke   5369 Jan 11 13:33 prefs.js
-rw-r--r--  1 luke luke    752 Nov 27  2004 search.rdf

$ gnomevfs-copy ~/.mozilla/firefox/wx7qym8c.default/lock /tmp
Failed to copy /home/luke/.mozilla/firefox/wx7qym8c.default/lock to /tmp
Reason: Invalid URI
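
A symlink like "lock" above, whose target does not exist as a file, can be found ahead of a copy. This is a minimal sketch under the assumption that such dangling links are what trips gnomevfs-copy here; the function name is hypothetical and nothing in it comes from GnomeVFS.

```python
import os


def dangling_symlinks(root):
    """Yield (link_path, target) for symlinks under root whose target is missing."""
    for dirpath, dirnames, filenames in os.walk(root):
        # os.walk() does not follow links by default; a link to a directory is
        # listed in dirnames, any other link (including a dangling one) in filenames.
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # os.path.exists() follows the link, so it is False for a dangling one.
            if os.path.islink(path) and not os.path.exists(path):
                yield path, os.readlink(path)
```

Running `list(dangling_symlinks(os.path.expanduser("~/.mozilla")))` on a profile like the one listed above should report the "lock" entry together with its target string.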
Comment 39 Luke Hutchison 2006-01-11 19:32:37 UTC
I now have a minimum test case for the Nautilus failure.  (I think there are two separate failures in this bug, Nautilus and gnomevfs-copy).

The attachment contains a directory, "config".  Nautilus fails after copying "config/modern_rus.sog", which successfully makes it to the destination.  The file it fails on is "config/classico.sog".  Interestingly there is also a "config/Classico.sog" which has successfully copied.

So it appears that the fact that there are two files with the same name but different case in the same dir is tripping up Nautilus when copying to vfat.
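
To check a tree for this situation before copying, something like the following could be used. It is a minimal sketch with hypothetical function names; it approximates vfat's case-insensitive matching with Python's `str.casefold()`, which is an assumption and not how the kernel driver compares names.

```python
import os
from collections import defaultdict


def case_collisions(names):
    """Group names that are identical once case is ignored."""
    groups = defaultdict(list)
    for name in names:
        groups[name.casefold()].append(name)
    return [sorted(group) for group in groups.values() if len(group) > 1]


def tree_case_collisions(root):
    """Map each directory under root to its groups of colliding entry names."""
    report = {}
    for dirpath, dirnames, filenames in os.walk(root):
        clashes = case_collisions(dirnames + filenames)
        if clashes:
            report[dirpath] = clashes
    return report


if __name__ == "__main__":
    # The three names mentioned in this comment; the first two collide.
    print(case_collisions(["Classico.sog", "classico.sog", "modern_rus.sog"]))
```

Calling `tree_case_collisions()` on the unpacked "config" directory should list the Classico.sog/classico.sog pair.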

Comment 40 Luke Hutchison 2006-01-11 19:34:26 UTC
Created attachment 57174 [details]
The config directory that causes the Nautilus failure

Comment 41 Luke Hutchison 2006-01-11 19:43:48 UTC
I also investigated the gnomevfs-copy failure, and it is definitely the "lock" softlink/socket that is causing the problem.  Nautilus actually gives the user the option of skipping the file, although it reports the wrong filename (it says "Invalid URI" for the file before the failure, ".parentlock", although it is definitely "lock" that is at fault, as can be demonstrated by moving "lock" out of the directory and trying again.)  This is another bug (that the wrong filename is reported)...  I'm filing it here rather than elsewhere at this point to keep everything together, let me know if/when you want me to file another bug report.  Also Nautilus does not die suddenly in this Invalid URI case, which was the problem in the original bug report.  (gnomevfs-copy does die, albeit not silently, but perhaps that is its behavior when any error occurs?)
Comment 42 Luke Hutchison 2006-01-11 20:04:22 UTC
Sorry for the comment spam, but hopefully some of this is useful!

I can confirm that an even more minimalistic test case for the Nautilus problem is to just create two files with the same name but different case, and copy them along with other files.  Nautilus copies in a weird order (sometimes it appears to be the same as the inode order returned by 'find', sometimes different), so you might have to select quite a few files to ensure that some files selected will be copied after the ones you created.

Another strange thing is that if you create two files with the same name but a different case, that don't have extensions (e.g. "temp" and "Temp"), Nautilus happily copies both over *without* failing.  Both appear in the vfat window, then if you hit refresh one disappears.  For some reason it is only files that have different cases and have extensions that fail.  I don't have time to investigate right now but probably the difference in behavior between 8.3-format DOS filenames and long filenames should also be investigated, as it is possible there is a difference there too.
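
The minimal test case described here can be generated with a short script. This is a sketch only: the function name, the filler file names and the filler count are made up for illustration, and the source directory has to live on a case-sensitive filesystem (such as ext3), otherwise the two clashing names collapse into one file.

```python
import os


def build_case_clash_dir(root, pair=("classico.sog", "Classico.sog"), filler=20):
    """Create root with two names differing only in case, plus filler files.

    Returns the requested names in creation order.
    """
    os.makedirs(root, exist_ok=True)
    names = list(pair) + ["filler_%03d.txt" % i for i in range(filler)]
    for name in names:
        # Each file just contains its own name, so copies are easy to identify.
        with open(os.path.join(root, name), "w") as handle:
            handle.write(name + "\n")
    return names
```

Selecting everything in the generated directory and copying it to a vfat partition with Nautilus should then exercise the failure, provided some filler files are copied after the clashing pair.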
Comment 43 Christian Neumair 2006-01-11 21:03:36 UTC
Thanks for all your insightful comments Luke!

> For some reason it is only files that have different cases and have extensions that fail. (...)
> I don't have time to investigate right now but probably the difference in behavior between
> 8.3-format DOS filenames and long filenames should also be investigated, as it
> is possible there is a difference there too.

Exactly! :)
We don't deal with GNOME_VFS_ERROR_NAME_TOO_LONG in handle_transfer_duplicate from nautilus-file-operations.c, like we do in new_file_transfer_callback and new_folder_transfer_callback (those are used for the "New File"/"New Folder" feature). However, I can't find any code that generates GNOME_VFS_ERROR_NAME_TOO_LONG in GnomeVFS ATM.
Comment 44 Christian Neumair 2006-01-11 21:08:05 UTC
If you have some time, you can also grab my patch to gnomevfs-copy [1] which will print very verbose output if invoked with the "-vv" option.

[1] http://mail.gnome.org/archives/gnome-vfs-list/2006-January/msg00015.html
Comment 45 Mathijs Vogelzang 2006-01-11 22:50:12 UTC
Hey, but my problem was copying from vfat to vfat, so there can't be any files with the same name with only the case differing! So there must be another problem too.
Comment 46 Luke Hutchison 2006-03-06 13:35:20 UTC
Did this get looked at for GNOME 2.14?  Is it hard to fix?  It's a bad one because it can cause data loss.

gnome-vfs2 probably just needs to send Nautilus a warning, and it should open a dialog asking if the user wants to overwrite the first file with the second file that has different case.  It would seem that a lot of code could be reused for this (the code that asks the user if they want to overwrite a file if it already exists).
Comment 47 Kjartan Maraas 2006-03-14 21:15:51 UTC
Adding GNOME Target
Comment 48 Maimon Mons 2006-07-14 12:30:46 UTC
I'm not sure if bug 347457 is a dupe of this or if it's just a "simplest" test case.
Comment 49 Uri David Akavia 2006-09-13 13:11:38 UTC
Not sure if bug 342437 is correlated somehow to this bug, or that it is a completely unrelated bug.
Comment 50 Luke Hutchison 2006-09-13 13:45:24 UTC
Re. comment #49: no, I think that bug is to do with hitting the 32-bit unsigned integer limit.
Comment 51 Christian Neumair 2006-09-13 21:25:12 UTC
Christian Kellner, Alex Larsson: Do you think it is a good idea to (ab)use GNOME_VFS_XFER_PROGRESS_STATUS_DUPLICATE together with GNOME_VFS_ERROR_INVALID_FILENAME to let the application provide a new, DOS-compliant filename, as we already do with GNOME_VFS_ERROR_NAME_TOO_LONG for long filenames?
Comment 52 Jimmy Angelakos 2006-09-19 19:49:52 UTC
I confirm this, on Ubuntu 6.06 with all updates, and Nautilus 2.14.3-0ubuntu1.

When I try to copy/move files that include e.g. "file.txt" and "FILE.TXT" to a VFAT filesystem the operation is aborted silently and I am none the wiser. I have even deleted files accidentally that I thought had been copied over :/
Comment 53 Luke Hutchison 2006-09-19 20:08:23 UTC
Should the target field be updated?

The problem will likely only become more of an issue with increasing use of FAT-formatted flash devices...

Comment 54 Sebastien Bacher 2006-09-27 18:49:39 UTC
GNOME 2.16.0 still has that issue. Ubuntu bug about that: https://launchpad.net/products/nautilus/+bug/52348. Updating settings and target since 2.16 is the new stable
Comment 55 Alexander Larsson 2006-11-06 10:35:12 UTC
So, I experimented a bit with "file.txt" and "FILE.TXT", and I got this:

[alex@greebo fat_test]$ ls -l
total 0
?--------- ? ? ? ?            ? file.txt
[alex@greebo fat_test]$ rm file.txt 
rm: cannot remove `file.txt': No such file or directory
[alex@greebo fat_test]$ touch file.txt
touch: cannot touch `file.txt': File exists

WTH?
Comment 56 Alexander Larsson 2006-11-06 12:35:23 UTC
*** Bug 347457 has been marked as a duplicate of this bug. ***
Comment 57 Alexander Larsson 2006-11-06 12:38:49 UTC
Fixed in CVS:

2006-11-06  Alexander Larsson  <alexl@redhat.com>

        * libgnomevfs/gnome-vfs-xfer.c: (copy_items):
        Don't always cancel on EFILEEXISTS unless we're trying to
        generate unique filenames. This fixes a silent abort
        when copying "file.txt" and "FILE.TXT" to a case insensitive
        filesystem (like FAT). (#144726)
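
For illustration only, the idea in this ChangeLog entry (keep going on a "file exists" error and surface it, rather than cancelling the whole transfer) might look like the following in Python. This is not the gnome-vfs patch: the function below is a hypothetical stand-in that only shares the name `copy_items`, and it assumes the destination reports a case clash as a "file exists" error on exclusive create.

```python
import os
import shutil


def copy_items(sources, dest_dir):
    """Copy each source file into dest_dir; return (copied, conflicts).

    A name that already exists in dest_dir is recorded as a conflict and the
    loop continues, instead of the whole transfer being cancelled.
    """
    copied, conflicts = [], []
    for src in sources:
        dst = os.path.join(dest_dir, os.path.basename(src))
        try:
            # Mode "xb" creates the file exclusively and raises
            # FileExistsError (errno EEXIST) if the name is already taken.
            with open(src, "rb") as fin, open(dst, "xb") as fout:
                shutil.copyfileobj(fin, fout)
        except FileExistsError:
            conflicts.append(src)
            continue
        copied.append(src)
    return copied, conflicts
```

A caller would then show the `conflicts` list to the user, for example through an overwrite prompt like the one proposed in comment 46, so that no file is skipped without notice.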
Comment 58 Luke Hutchison 2006-11-06 14:32:57 UTC
Thanks Alexander!