Bug 691971 - User should be able to cancel delete track jobs

Status: RESOLVED FIXED
Product: banshee
Classification: Other
Component: general
Version: git master
Hardware/OS: Other Linux
Importance: Normal enhancement
Target Milestone: ---
Assigned To: Banshee Maintainers
QA Contact: Banshee Maintainers
Depends on:
Blocks:

Reported: 2013-01-17 19:11 UTC by Nicholas Little
Modified: 2013-02-16 23:24 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments

Allows cancellation of PrimarySource track deletions (4.06 KB, patch)
  2013-01-17 19:11 UTC, Nicholas Little (needs-work)
Allows cancellation of PrimarySource track deletions V2 (5.02 KB, patch)
  2013-02-08 21:21 UTC, Nicholas Little (needs-work)
Allows cancellation of PrimarySource track deletions V3 (2.99 KB, patch)
  2013-02-08 21:27 UTC, Nicholas Little (committed)

Description Nicholas Little 2013-01-17 19:11:05 UTC
Created attachment 233687 [details] [review]
Allows cancellation of PrimarySource track deletions

Currently the user is able to cancel track additions, but they are prevented from cancelling track deletions.

The overall operation is covered by a confirmation dialog that prompts the user when they delete more than a certain number of tracks (currently 10, iirc), but the Cancel action is disabled while the job is in progress.

The attached patch enables the user to cancel a running track deletion. The main changes are:

- If we exit the foreach loop because the job is cancelled, we would need to add all the remaining tracks to the skip_deletion list; instead, I turned this process on its head so that we keep track of what has been successfully deleted.

- After that, we take the intersection of the CachedList and the successfully deleted tracks, then we proceed as normal.

Cheers!
Comment 1 Andrés G. Aragoneses (IRC: knocte) 2013-01-25 01:00:24 UTC
Comment on attachment 233687 [details] [review]
Allows cancellation of PrimarySource track deletions

Hey Nicholas, good patch! Here's a review:

> @@ -510,11 +511,16 @@ namespace Banshee.Sources
>          protected virtual void DeleteTrackList (CachedList<DatabaseTrackInfo> list)
>          {
>              is_deleting = true;
> -            DeleteTrackJob.Total += (int) list.Count;
> -            List<DatabaseTrackInfo> skip_deletion = null;
> +            DeleteTrackJob.Total += list.Count;
> +            LinkedList<DatabaseTrackInfo> deleted = new LinkedList<DatabaseTrackInfo> ();

Please try to use var when calling constructors.
Also, why a LinkedList? The methods you're calling on the list would be valid for any IEnumerable<>.
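For illustration only, the declaration following both suggestions could become something like this (a sketch, not code from the patch):

  // 'var' with the constructor, and a plain List<> since the collection is
  // only appended to and then enumerated
  var deleted = new List<DatabaseTrackInfo> ();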


>  
>              // Remove from file system
>              foreach (DatabaseTrackInfo track in list) {
> +                if (DeleteTrackJob.IsCancelRequested) {
> +                    DeleteTrackJob.Finish ();
> +                    break;
> +                }
> +
>                  if (track == null) {
>                      DeleteTrackJob.Completed++;
>                      continue;
> @@ -522,12 +528,9 @@ namespace Banshee.Sources
>  
>                  try {
>                      DeleteTrackJob.Status = String.Format ("{0} - {1}", track.ArtistName, track.TrackTitle);
> -                    if (!DeleteTrack (track)) {
> -                        if (skip_deletion == null) {
> -                            skip_deletion = new List<DatabaseTrackInfo> ();
> -                        }
> -                        skip_deletion.Add (track);
> -                    }
> +                    if (DeleteTrack (track))
> +                        deleted.AddLast (track);

Remember the coding guidelines: we use braces after 'if' statements, even if there is only one line after them.

Using a list of deleted items instead of a skip_deletion list makes sense.
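Applied to the hunk above, the braced form would read:

  if (DeleteTrack (track)) {
      deleted.AddLast (track);
  }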


>                  } catch (Exception e) {
>                      Log.Exception (e);
>                      ErrorSource.AddMessage (e.Message, track.Uri.ToString ());
> @@ -539,21 +542,17 @@ namespace Banshee.Sources
>                  }
>              }
>  
> -            is_deleting = false;
> -
> -            if (DeleteTrackJob.Total == DeleteTrackJob.Completed) {
> +            if (!DeleteTrackJob.IsFinished) {
>                  delete_track_job.Finish ();
> -                delete_track_job = null;
>              }
>  
> -            if (skip_deletion != null) {
> -                list.Remove (skip_deletion);
> -                skip_deletion.Clear ();
> -                skip_deletion = null;
> -            }
> +            is_deleting = false;
> +            delete_track_job = null;
>  
>              // Remove from database
> -            if (list.Count > 0) {
> +            if (deleted.Count > 0) {
> +                var skip_deletion = list.Except (deleted);
> +                list.Remove (skip_deletion);

I think this is the main area of concern in the patch. Take into account how the code works currently (before the patch): there is a bunch of tracks in the CoreCache table with the same ModelId, which are the ones to remove. Then list.Remove(skip_deletion) builds a single SQL statement, so only one query is needed to remove from the cache the elements that were not actually deleted, followed by another query (with 3 statements: INSERT+SELECT+DELETE).

However, with your patch the situation is worse with regard to performance, because:
1) You iterate over all the items of list (via the Except extension method) to find the ones that are not in the deleted list.
2) You send a query to remove the items found in 1.
3) You send a query to delete the tracks (INSERT+SELECT+DELETE).

Both versions are the same except for (1), which is new, so performance decreased. How about simply modifying the SQL query used in (3) so that it receives a list of IDs (like it's done in (2)), added to the query as a comma-separated list? Then you would do:

  INSERT INTO CoreRemovedTracks (DateRemovedStamp, TrackID, Uri)
    SELECT ?, TrackID, {0} FROM CoreTracks WHERE TrackID IN (SELECT ItemID FROM CoreCache WHERE ModelID = ?);
  DELETE FROM CoreTracks WHERE TrackID IN ({1})

This way you would avoid the (1) step added in this patch, and also the (2) step that existed even before this patch, so it would be a performance improvement!
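Just to illustrate the idea, the caller could build that comma-separated list roughly like this (an illustrative sketch with assumed names, not actual Banshee code; it reuses LINQ, which the patch already pulls in for Except):

  // hypothetical: collect the IDs of the tracks that were really deleted,
  // to be spliced into the IN ({1}) placeholder of the modified query above
  string deleted_ids = String.Join (",",
      deleted.Select (t => t.TrackId.ToString ()).ToArray ());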



>                  ServiceManager.DbConnection.Execute (remove_list_command, DateTime.Now, list.CacheId, list.CacheId);
>              }
>  
> @@ -628,7 +627,7 @@ namespace Banshee.Sources
>          {
>              CachedList<DatabaseTrackInfo> list = cached_list as CachedList<DatabaseTrackInfo>;
>              is_adding = true;
> -            AddTrackJob.Total += (int) list.Count;
> +            AddTrackJob.Total += list.Count;

As this change is not related to the patch, I've already committed it as a separate improvement:

http://git.gnome.org/browse/banshee/commit/?id=f32a020c6309dfd6bc4cf93c460f90cd960ba2fb
Comment 2 Nicholas Little 2013-01-25 19:26:35 UTC
(In reply to comment #1)
> (From update of attachment 233687 [details] [review])
> Hey Nicholas, good patch! here's a review:
> 
> > @@ -510,11 +511,16 @@ namespace Banshee.Sources
> >          protected virtual void DeleteTrackList (CachedList<DatabaseTrackInfo> list)
> >          {
> >              is_deleting = true;
> > -            DeleteTrackJob.Total += (int) list.Count;
> > -            List<DatabaseTrackInfo> skip_deletion = null;
> > +            DeleteTrackJob.Total += list.Count;
> > +            LinkedList<DatabaseTrackInfo> deleted = new LinkedList<DatabaseTrackInfo> ();
> 
> Try to use var when using constructors please.
> Also, why a LinkedList? The methods you're using in the list are just valid to
> any Enumerable<>.

A linked list was natural, and I think the best choice of structure: I knew the list was going to grow to at most the size of the original cached list, and we're always iterating through it. At the time I hadn't looked at the CachedList remove code to consider the impact of LINQ on that structure; the next version might iterate through the linked list again, but we don't need random access here. Otherwise it would have made sense to use an array-based structure (such as the plain List<T>) with the capacity set...
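For example, something along these lines (purely an illustrative sketch):

  // hypothetical alternative: an array-backed List<> pre-sized to the cached
  // list, since 'deleted' can never grow beyond it
  var deleted = new List<DatabaseTrackInfo> ((int) list.Count);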

> >  
> >              // Remove from file system
> >              foreach (DatabaseTrackInfo track in list) {
> > +                if (DeleteTrackJob.IsCancelRequested) {
> > +                    DeleteTrackJob.Finish ();
> > +                    break;
> > +                }
> > +
> >                  if (track == null) {
> >                      DeleteTrackJob.Completed++;
> >                      continue;
> > @@ -522,12 +528,9 @@ namespace Banshee.Sources
> >  
> >                  try {
> >                      DeleteTrackJob.Status = String.Format ("{0} - {1}", track.ArtistName, track.TrackTitle);
> > -                    if (!DeleteTrack (track)) {
> > -                        if (skip_deletion == null) {
> > -                            skip_deletion = new List<DatabaseTrackInfo> ();
> > -                        }
> > -                        skip_deletion.Add (track);
> > -                    }
> > +                    if (DeleteTrack (track))
> > +                        deleted.AddLast (track);
> 
> Remember the coding guidelines, we use braces after 'if' statements, even if
> there is only one line after them.
> 
> Using a deleted list of the items instead of a skip_deletion makes sense.

Perhaps, but skip_deletion was quite nice in that it didn't need initialising unless we actually needed it, and it's less likely to grow as large in the majority of cases. Plus, as you show later, it's much cheaper to use CachedList's Remove(IEnumerable<T>) than to apply LINQ to it.

> >                  } catch (Exception e) {
> >                      Log.Exception (e);
> >                      ErrorSource.AddMessage (e.Message, track.Uri.ToString ());
> > @@ -539,21 +542,17 @@ namespace Banshee.Sources
> >                  }
> >              }
> >  
> > -            is_deleting = false;
> > -
> > -            if (DeleteTrackJob.Total == DeleteTrackJob.Completed) {
> > +            if (!DeleteTrackJob.IsFinished) {
> >                  delete_track_job.Finish ();
> > -                delete_track_job = null;
> >              }
> >  
> > -            if (skip_deletion != null) {
> > -                list.Remove (skip_deletion);
> > -                skip_deletion.Clear ();
> > -                skip_deletion = null;
> > -            }
> > +            is_deleting = false;
> > +            delete_track_job = null;
> >  
> >              // Remove from database
> > -            if (list.Count > 0) {
> > +            if (deleted.Count > 0) {
> > +                var skip_deletion = list.Except (deleted);
> > +                list.Remove (skip_deletion);
> 
> I think this the main area of concern in the patch. Take in account how is the
> code currently (before the patch): there's a bunch of tracks in the CoreCache
> table with the same ModelId which are the ones to remove, then
> list.Remove(skip_deletion) constructs a SQL statement which means its only one
> query to the database to remove the elements that were not really removed, from
> the cache, and then a another query (with 3 statements, INSERT+SELECT+DELETE).
> 
> However, with your patch the situation is worse with the regards of
> performance, because:
> 1) You iterate over all items of list (via Except extension method) to find the
> ones that are not in the deleted list.
> 2) You send a query to remove the items found in 1.
> 3) You send a query to delete the tracks (INSERT+SELECT+DELETE).
> 
> They are both the same except for (1), which is new. So performance decreased.
> How about if you simply modified the SQL query used in 3) in order to receive a
> list of IDs (like it's done in 2) to be added to the query as a list of items
> divided by commas? Then you would do: 
> 
>   INSERT INTO CoreRemovedTracks (DateRemovedStamp, TrackID, Uri)
>     SELECT ?, TrackID, {0} FROM CoreTracks WHERE TrackID IN (SELECT ItemID FROM
> CoreCache WHERE ModelID = ?);
>   DELETE FROM CoreTracks WHERE TrackID IN ({1})
> 
> This way you would avoid the added (1) step in this patch, and the (2) step
> which was even before this patch, so a performance improvement!

Ah, yes. I checked the SQLite command for (3) (that's the one executed below, if I'm not mistaken?) and it's exactly as you say. But maybe we can just use the remove_range_sql command in this case. I was about to mention cleaning up the items left in the CoreCache table until I saw CachedList.Dispose()...

> >                  ServiceManager.DbConnection.Execute (remove_list_command, DateTime.Now, list.CacheId, list.CacheId);
> >              }
> >  
> > @@ -628,7 +627,7 @@ namespace Banshee.Sources
> >          {
> >              CachedList<DatabaseTrackInfo> list = cached_list as CachedList<DatabaseTrackInfo>;
> >              is_adding = true;
> > -            AddTrackJob.Total += (int) list.Count;
> > +            AddTrackJob.Total += list.Count;
> 
> As this change is not related to the patch, I've committed already as a
> separate improvement:
> 
> http://git.gnome.org/browse/banshee/commit/?id=f32a020c6309dfd6bc4cf93c460f90cd960ba2fb

Cheers for that, couldn't resist the minor tidy-up...

Hopefully have an updated version for you over the weekend.

Cheers!
Comment 3 Nicholas Little 2013-02-08 21:21:20 UTC
Created attachment 235543 [details] [review]
Allows cancellation of PrimarySource track deletions V2

Hi Andrés,

Sorry for the long interval before updating this one. I've got two versions.

I took into account your performance considerations regarding the database and the operations involved, and ended up with this version. Unless we're cancelled, we keep the old behaviour of deleting the whole CachedList contents; otherwise we build a list of track IDs and remove just that list. As we saw earlier, the CachedList will take care of itself when it's Disposed.
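Roughly, the cancelled branch ends up doing something like this (a simplified sketch based on the description above, not the verbatim attachment; remove_tracks_command stands in for whatever query V2 actually builds):

  if (partial_job) {
      // cancelled part way through: remove only the tracks that were really
      // deleted, using a comma-separated list of their IDs
      string ids = String.Join (",",
          deleted.Select (t => t.TrackId.ToString ()).ToArray ());
      ServiceManager.DbConnection.Execute (
          String.Format (remove_tracks_command, ids), DateTime.Now);
  } else {
      // job ran to completion: old behaviour, remove the whole CachedList
      ServiceManager.DbConnection.Execute (remove_list_command,
          DateTime.Now, list.CacheId, list.CacheId);
  }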

While this reduces the SQL involved to a single operation each way, it feels a bit like micro-optimisation...
Comment 4 Nicholas Little 2013-02-08 21:27:21 UTC
Created attachment 235544 [details] [review]
Allows cancellation of PrimarySource track deletions V3

...Which leads me to this version.

It has a far lower impact on the PrimarySource and uses the original strategy of only recording the failed/cancelled tracks, so it shouldn't perform any worse than the current Banshee implementation.

Let me know what you think. Cheers!
Comment 5 Andrés G. Aragoneses (IRC: knocte) 2013-02-16 14:33:51 UTC
Hey Nicholas, why didn't you implement the ideas I mentioned in comment#1? (Except the usage of LinkedList, which I now think is correct)

What I mean is that in the last two patches you reverted your approach and kept the skip_deletion list, but that doesn't really work if we want to implement cancelling! Let me try to explain with an example:

4 tracks to remove: A, B, C, D

In the middle of the deletion, the user presses Cancel. Track A has already been deleted, and we're in the foreach iteration dealing with track B. What is going to happen is that B will be added to the skip_deletion list, but not C and D!

So then A, C and D will be deleted from the database, while only A will have been deleted from both the database and the filesystem.

You see what I mean?

Let me know if I overlooked anything.
Comment 6 Nicholas Little 2013-02-16 17:57:09 UTC
(In reply to comment #5)
> Hey Nicholas, why didn't you implement the ideas I mentioned in comment#1?
> (Except the usage of LinkedList, which I now think is correct)

Ah, but I did! V2 was such a version. In summary, it makes a list of deleted tracks, and if the job is cancelled part way through, it uses that list to remove the track IDs from the database in a single SQL statement. If the job runs to completion, the current behaviour is executed and the whole cached list is removed.

The reason I'm not a fan of V2 is mostly the SQL string building it has to do to make the track list, i.e. it has to be aware that the CacheEntryId is the key to the item being removed; it doesn't feel right for that to live in the PrimarySource.

Consequently, having two different methods of dealing with the deleted tracks is, I think, asking for maintenance trouble down the road.

> What I mean is, that in the last two patches you reverted your approach and
> kept the skip_deletion list, but that doesn't really work if we want to
> implement cancelling! Let me try to explain with an example:
> 
> 4 tracks to remove: A, B, C, D
> 
> In the middle of the deletion, the user presses Cancel. The track A has already
> been deleted and, we're in the foreach iteration dealing with the track B. What
> is going to happen is that B is going to be added to the skip_deletion list,
> but not C and D!
> 
> So then A, C, D will be deleted from the database, and A will have been deleted
> from both the database and the filesystem.
> 
> You see what I mean?
> 
> Let me know if I overlooked anything.

Yeah, but that's OK! In V3, I think you may have misread the continue in the foreach as a break ;) If the job gets cancelled, we keep iterating through the rest of the tracks (only adding them to the skip_deletion list), use the existing behaviour to remove those from the cached list, and then remove the cached list contents as before. So performance should be the same, but it's a far simpler patch.
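In other words, the shape of the loop in V3 is roughly this (a simplified sketch, not the verbatim patch):

  var skip_deletion = new List<DatabaseTrackInfo> ();
  foreach (DatabaseTrackInfo track in list) {
      if (DeleteTrackJob.IsCancelRequested || !DeleteTrack (track)) {
          // cancelled or failed: record the track and keep iterating, so
          // every remaining track ends up in skip_deletion
          skip_deletion.Add (track);
          continue;
      }
      DeleteTrackJob.Completed++;
  }
  // existing behaviour: drop the skipped tracks from the cached list, then
  // remove what remains from the database as before
  list.Remove (skip_deletion);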

Hope this makes sense, cheers!
Comment 7 Andrés G. Aragoneses (IRC: knocte) 2013-02-16 22:08:30 UTC
Comment on attachment 235543 [details] [review]
Allows cancellation of PrimarySource track deletions V2

Alright yeah sorry for the previous review, I'm now more awake.

First, I take back what I said about LinkedList. We shouldn't use it. Two reasons:
1) It's an unrelated change to this bug. If anything, it should go in a separate commit.
2) According to http://stackoverflow.com/questions/851949/asymptotic-complexity-of-net-collection-classes#answer-851972, the operations we are exercising on the list are not the ones for which LinkedList provides benefits, so let's respect the KISS principle ;)

Also, another issue I've found in this patch: if DeleteTrack() returns false in some iteration but the job is not cancelled, then partial_job is false and you delete all the items! But you should not delete the ones that couldn't actually be deleted.

Anyway, since I'm reviewing V2 and you say V3 is better than V2, don't post a fixed patch yet! I'll review V3 ASAP.
Comment 8 Andrés G. Aragoneses (IRC: knocte) 2013-02-16 23:23:02 UTC
Comment on attachment 235544 [details] [review]
Allows cancellation of PrimarySource track deletions V3

Ok, so I've pushed this here:

http://git.gnome.org/browse/banshee/commit/?id=efd3c38d356fdb4dc0abe3658f618b2a25dcbe20

Only modification is that I haven't used LinkedList<>.

BTW, good idea there using continue instead of break! It simplified the patch a lot, true.
Comment 9 Andrés G. Aragoneses (IRC: knocte) 2013-02-16 23:24:53 UTC
This problem has been fixed in the development version. The fix will be available in the next major software release. Thank you for your bug report.