Bug 719253 - Duplicate detection should detect previous versions of an image (md5 of image content only)
Status: RESOLVED OBSOLETE
Product: shotwell
Classification: Other
Component: import
Version: unspecified
Hardware: Other All
Importance: Normal normal
Target Milestone: ---
Assigned To: Shotwell Maintainers
QA Contact: Shotwell Maintainers
Duplicates: 719233 738221 738222 (view as bug list)
Depends on:
Blocks:
 
 
Reported: 2013-05-19 09:27 UTC by Shotwell Maintainers
Modified: 2021-05-19 13:56 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Charles Lindsay 2013-11-25 22:11:04 UTC


---- Reported by shotwell-maint@gnome.bugs 2013-05-19 14:27:00 -0700 ----

Original Redmine bug id: 6970
Original URL: http://redmine.yorba.org/issues/6970
Searchable id: yorba-bug-6970
Original author: Aaron W
Original description:

I am running Shotwell 0.14.1 on Ubuntu Raring (13.04).

If I do the following:

1) change preferences to write changes to tags/metadata to the file;

2) export a copy of a picture as "picture_notags.jpg";

3) add a tag to the picture of "test1";

4) export a copy of the picture as "picture_tag_test1.jpg";

5) add a tag to the picture of "test2";

6) import the two photos from the folder;

then the "notags" and "test1" versions of the file will be imported, giving
three versions of the file in total.

If one thinks about this, the user has made each of these changes
(adding/removing tags, changing dates etc) chronologically and has therefore
deliberately changed the file from the one that is being imported. A good
starting assumption is that the user does not want an additional previous
version of a changed file.

In my view, the hashes (or however you are currently checking for duplicates)
of previous versions of files should also be kept, so that these can be
identified in addition to current versions.
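
A minimal sketch of the "md5 of image content only" idea from the bug title, assuming JPEG input. It is not Shotwell's code: the function name `jpeg_content_md5` and the choice to skip every APPn and COM segment are illustrative assumptions (a real implementation would need to decide how to treat segments such as JFIF headers or ICC profiles, which this sketch simply ignores).

```python
import hashlib
import struct


def jpeg_content_md5(data: bytes) -> str:
    """MD5 of a JPEG stream with APPn (EXIF/XMP/IPTC) and COM segments skipped."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream (missing SOI marker)")
    digest = hashlib.md5()
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a segment marker")
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte before a marker, step over it
            pos += 1
            continue
        if marker == 0xDA:  # start of scan: everything after is image data
            digest.update(data[pos:])
            break
        # The two-byte length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        is_metadata = 0xE0 <= marker <= 0xEF or marker == 0xFE
        if not is_metadata:
            digest.update(data[pos:pos + 2 + length])
        pos += 2 + length
    return digest.hexdigest()


def _segment(marker: int, payload: bytes) -> bytes:
    """Build one marker segment for the synthetic example below."""
    return b"\xff" + bytes([marker]) + struct.pack(">H", len(payload) + 2) + payload


# Synthetic stand-ins for an untagged export and a tagged export of one picture.
_body = _segment(0xDB, b"\x00" * 4) + _segment(0xDA, b"\x01\x02") + b"scan-data\xff\xd9"
plain = b"\xff\xd8" + _body
tagged = b"\xff\xd8" + _segment(0xE1, b"Exif\x00\x00 tag=test1") + _body
```

With these two byte strings, a whole-file MD5 differs because of the extra metadata segment, while `jpeg_content_md5` returns the same digest for both.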

I accept that there are situations where you may want to re-add files as they
were, rather than as they are. I therefore suggest that, if duplicates to
previous versions are detected, a dialogue along the following lines is
displayed:

"

50 photos were successfully imported.

35 duplicates were not imported.

20 photos are earlier versions of photos that are in your library (e.g. before
you made changes to metadata/tags). Do you wish to import these earlier
versions of photos?"
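
The three counts in the proposed dialogue could come from a classification step like the following sketch. Everything here is hypothetical and not taken from Shotwell: the `LibraryPhoto` record, the `record_change` method, and the placeholder hash strings only illustrate keeping the hashes of previous versions next to the current one.

```python
from dataclasses import dataclass, field


@dataclass
class LibraryPhoto:
    """Hypothetical library record that remembers hashes of older versions."""
    name: str
    current_hash: str
    previous_hashes: set = field(default_factory=set)

    def record_change(self, new_hash: str) -> None:
        # Keep the outgoing hash so an old copy can still be recognised later.
        self.previous_hashes.add(self.current_hash)
        self.current_hash = new_hash


def classify_import(library, candidates):
    """Sort candidate files (name -> hash) into the three groups of the dialogue."""
    current = {photo.current_hash for photo in library}
    earlier = set()
    for photo in library:
        earlier |= photo.previous_hashes
    result = {"new": [], "duplicate": [], "earlier_version": []}
    for name, digest in candidates.items():
        if digest in current:
            result["duplicate"].append(name)
        elif digest in earlier:
            result["earlier_version"].append(name)
        else:
            result["new"].append(name)
    return result


# The reporter's scenario, with placeholder hash strings.
photo = LibraryPhoto("picture.jpg", "hash-notags")
photo.record_change("hash-test1")  # tag "test1" written to the file
photo.record_change("hash-test2")  # tag "test2" written to the file

result = classify_import(
    [photo],
    {
        "picture_notags.jpg": "hash-notags",
        "picture_tag_test1.jpg": "hash-test1",
        "unrelated.jpg": "hash-other",
    },
)
```

In this run the two exported copies land in `earlier_version` instead of being imported silently, which is the point at which the dialogue would ask the user what to do.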

The lack of this feature catches me out a lot, as I am paranoid about my
photos and often re-import my photos from various sources to ensure I don't
lose any. The result is a mess of duplicate photos that don't incorporate
tags/date-fixes.

Related issues:
related to shotwell - Feature #3487: More sophisticated, configurable handling
of duplicates (Open)
related to shotwell - 7027: Tagged pictures are duplicated upon
reimportation (Open)



--- Bug imported by chaz@yorba.org 2013-11-25 22:11 UTC  ---

This bug was previously known as _bug_ 6970 at http://redmine.yorba.org/show_bug.cgi?id=6970

Unknown version " in product shotwell. 
   Setting version to "!unspecified".
Unknown milestone "unknown in product shotwell. 
   Setting to default milestone for this product, "---".
Setting qa contact to the default for this product.
   This bug either had no qa contact or an invalid one.
Resolution set on an open status.
   Dropping resolution 

Comment 1 Jim Nelson 2014-10-09 17:38:09 UTC
*** Bug 738222 has been marked as a duplicate of this bug. ***
Comment 2 Jim Nelson 2014-10-09 18:00:05 UTC
*** Bug 738221 has been marked as a duplicate of this bug. ***
Comment 3 seb 2014-10-12 12:55:33 UTC
This bug has been valid since Shotwell version 0.14 and is still valid.

Another dupe is: https://bugzilla.gnome.org/show_bug.cgi?id=719233
Comment 4 Jim Nelson 2014-10-14 17:28:55 UTC
*** Bug 719233 has been marked as a duplicate of this bug. ***
Comment 5 John Chivall 2015-04-11 16:39:57 UTC
This bug gets me every time I do an import. I have a saved search for untagged files ending in '_1.JPG' and can manually remove them, but this really shouldn't be necessary. Shotwell ought to be smart enough to recognise that the picture is the same even with the addition of tags.

Current version: shotwell-0.20.2-2.fc21.x86_64
Comment 6 realdiskdoc 2015-12-23 00:12:25 UTC
I'm not sure if my comment is redundant, but this is still a problem in Shotwell 0.22.0 when importing from a folder, at least.

Pictures I tagged earlier (writing metadata to files) result in the same pictures being imported again and not detected as duplicates.
Comment 7 Peter 2016-01-23 06:57:01 UTC
This is infuriating, mostly because doing a simple hash of files to compare them should be a no-brainer.

If metadata makes a photo "different" then offer the user both a default duplicate action as well as the option to review each duplicate found to pick which to keep.

This feature should be core to photo organising software like Shotwell.

Do we have to put a bounty or something on this bug to get it fixed? 

I'd pay.
Comment 8 Lubosz Sarnecki 2017-12-12 21:16:23 UTC
Still happens in 0.26.3.
I moved a 40GB library to another drive, and activated "Watch library directory for new files".
Now Shotwell is eating up all my CPU and drive.

@Peter: If you want to pay then join me on Bountysource. I have put 5 bucks on this bug.

https://www.bountysource.com/issues/52635043-duplicate-detection-should-detect-previous-versions-of-an-image-md5-of-image-content-only
Comment 9 Lubosz Sarnecki 2017-12-12 21:19:03 UTC
It is creating additional JPEGs for all raw files.

Original File:
DSC_0644.NEF

Already existed:
DSC_0644_NEF_embedded.jpg

Shotwell now does another one:
DSC_0644_NEF_embedded_1.jpg
Comment 10 Jens Georg 2017-12-12 21:43:05 UTC
What exactly did you move?
Comment 11 Peter 2017-12-13 00:49:33 UTC
(In reply to Lubosz Sarnecki from comment #8)
> Still happens in 0.26.3.
> I moved a 40GB library to another drive, and activated "Watch library
> directory for new files".
> Now Shotwell is eating up all my CPU and drive.
> 
> @Peter: If you want to pay then join me on Bountysource. I have put 5 bucks
> on this bug.
> 
> https://www.bountysource.com/issues/52635043-duplicate-detection-should-
> detect-previous-versions-of-an-image-md5-of-image-content-only

I chipped in $15, let's get this fixed.
Comment 12 Lubosz Sarnecki 2017-12-13 11:33:23 UTC
(In reply to Jens Georg from comment #10)
> What exactly did you move?

I moved the 2017 folder (I just had images from this year) from ~/Pictures to a destination on an external drive and changed the target library folder in Shotwell, applying the auto discover function in the hope that Shotwell would find the new paths.

I didn't touch any dot folders or config files.

Is there another procedure of doing this?

In the end it turned out ok, except for the several gigabytes of JPEG dupes produced and the CPU time required for that.
Comment 13 Jens Georg 2017-12-13 16:49:04 UTC
Yes, you manually need to adapt the paths in the database. What happens is:

Shotwell (currently) needs a JPEG for each RAW. This is called a backing photo (the _shotwell.jpg files). Those are associated with the RAW file in the database.

If you only copy the library and re-import it (by automatic scanning), the connection is lost since the database points to the old file location. Shotwell will then create a new JPEG backing file (_shotwell_1.jpg) because the original name is already taken. The old backing photo will be taken as a new file.
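
The naming collision described above can be illustrated with a short sketch. This is not Shotwell's actual implementation; `unique_path` is a hypothetical helper, and the file name is only modelled on the `_shotwell.jpg` / `_shotwell_1.jpg` pattern mentioned in this comment.

```python
import tempfile
from pathlib import Path


def unique_path(directory: Path, stem: str, suffix: str) -> Path:
    """Return the first free name: stem.suffix, then stem_1.suffix, stem_2.suffix, ..."""
    candidate = directory / f"{stem}{suffix}"
    counter = 1
    while candidate.exists():
        candidate = directory / f"{stem}_{counter}{suffix}"
        counter += 1
    return candidate


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    # First import: the backing JPEG gets the plain name.
    first = unique_path(folder, "DSC_0644_NEF_shotwell", ".jpg")
    first.touch()
    # Re-import without the database link: the name is taken, so a suffix is added.
    second = unique_path(folder, "DSC_0644_NEF_shotwell", ".jpg")
    names = (first.name, second.name)
```

Because the second pass only checks whether the name is free and has no record tying the existing file to the RAW, it produces a second backing file and the first one is left to be picked up as an ordinary photo.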

While comparing the image contents (which isn't exactly trivial) would help in this situation, IMHO it would only be curing the symptoms.

What Shotwell should provide is:
 - Short-term: Proper documentation on how to relocate the library
 - Mid-term: Support users who have to or want to move the library around with some kind of tool
 - Long-term: Get rid of the need for a backing photo for RAWs (bug 718242). There is a trade-off to be made between speed and disk space, so this has to be chosen carefully.
Comment 14 R Mercado 2018-08-23 22:24:27 UTC
Hi,
I am using shotwell 0.28.4 (Fedora 28 system) and I still see this behaviour.
Thanks,
RM
Comment 15 Jens Georg 2018-08-24 08:52:01 UTC
(In reply to R Mercado from comment #14)
> Hi,
> I am using shotwell 0.28.4 (Fedora 28 system) and I still see this behaviour.
> Thanks,
> RM

Which of the many behaviors described in this bug do you see exactly?
Comment 16 GNOME Infrastructure Team 2021-05-19 13:56:39 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to GNOME's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/shotwell/-/issues/4380.