After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 689018 - Banshee crashes at playing certain mp3-file with Last.fm scrobbling switched on.
Banshee crashes at playing certain mp3-file with Last.fm scrobbling switched on.
Status: RESOLVED FIXED
Product: banshee
Classification: Other
Component: Last.fm
2.6.0
Other Linux
: Normal normal
: ---
Assigned To: Banshee Maintainers
Banshee Maintainers
W
: 689016 693844 694039 694521 698650 (view as bug list)
Depends on:
Blocks:
 
 
Reported: 2012-11-25 10:29 UTC by V_oron
Modified: 2013-04-24 10:06 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Banshee log-file (9.23 KB, text/plain)
2012-11-25 10:29 UTC, V_oron
  Details
Banshee debug log-file (18.77 KB, text/plain)
2012-11-26 18:39 UTC, V_oron
  Details
Scrobbler queue XML-file (16.67 KB, text/xml)
2012-11-26 18:41 UTC, V_oron
  Details
Improved Last.fm response error handling (2.04 KB, patch)
2012-11-28 10:53 UTC, Phil Trimble
reviewed Details | Review
Improved Last.fm response error handling [v2] (1.89 KB, patch)
2012-12-06 05:38 UTC, Phil Trimble
committed Details | Review
Gracefully reduce submission count upon failures (5.49 KB, patch)
2013-01-18 07:17 UTC, Phil Trimble
none Details | Review
Alternative patch (11.75 KB, patch)
2013-01-18 23:14 UTC, Andrés G. Aragoneses (IRC: knocte)
none Details | Review
Alternative patch v2 (11.75 KB, patch)
2013-01-18 23:30 UTC, Andrés G. Aragoneses (IRC: knocte)
none Details | Review
Alternative patch v3 (11.83 KB, patch)
2013-01-18 23:53 UTC, Andrés G. Aragoneses (IRC: knocte)
none Details | Review
Alternative patch v4 (13.94 KB, patch)
2013-02-15 23:02 UTC, Andrés G. Aragoneses (IRC: knocte)
none Details | Review
Alternative patch v5 (14.79 KB, patch)
2013-02-19 18:17 UTC, Andrés G. Aragoneses (IRC: knocte)
none Details | Review
Alternative Patch v6 (15.28 KB, patch)
2013-02-20 10:14 UTC, Phil Trimble
none Details | Review
Alternative Patch v7 (14.80 KB, patch)
2013-02-22 03:21 UTC, Phil Trimble
committed Details | Review

Description V_oron 2012-11-25 10:29:38 UTC
Created attachment 229805 [details]
Banshee log-file

Banshee crashes at serial playing of 2 files: first "02 - Muse - Apocalypse Please.mp3" and second "03 - Muse - Time Is Running Out.mp3". Due to huge size I do not attach them here and paste the URLs to third-party service only. There is no problems in case of Last.fm scrobbling switched off.
Comment 1 V_oron 2012-11-25 10:33:06 UTC
The mp3-files are available here: http://depositfiles.com/files/bcubjh31w
Comment 2 Phil Trimble 2012-11-25 21:54:40 UTC
Thanks for the bug report! The final part of your log seems to be similar to this bug: https://bugzilla.gnome.org/show_bug.cgi?id=689016

However, the other parts of your log are different. I'm not sure if the root causes are different but the exception is thrown when LastFMRequest attempts to deserialize the response from LastFM. We are receiving some character we don't expect and we throw an error and we aren't handling the exception as we probably should.

Could you please try to recreate the error with debug enabled and attach that log file? You can follow the instructions here to obtain a debug log: https://live.gnome.org/Banshee/CommonQuestions/Logs
Comment 3 V_oron 2012-11-26 18:39:25 UTC
Created attachment 229930 [details]
Banshee debug log-file
Comment 4 V_oron 2012-11-26 18:41:46 UTC
Created attachment 229931 [details]
Scrobbler queue XML-file
Comment 5 Phil Trimble 2012-11-26 18:43:28 UTC
Thanks! This is perfect. It looks like this is the same issue as the other bug I linked. I don't know exactly what is causing this yet but I have enough to start digging. I'll let you know what I find.
Comment 6 V_oron 2012-11-26 18:43:51 UTC
Phil, thank you for your reply and your work on open software! :)

>> Could you please try to recreate the error with debug enabled and attach that
log file?
Yes, of course! Please find attached the debug log-file and the scrobbler queue XML-file. As I understand, I definitely get the same exception as another Russian guy have (bug 689016). But this error may be not unique.
Comment 7 Phil Trimble 2012-11-28 10:20:34 UTC
Okay, I am definitely able to reproduce using just your queue XML file. My guess is this: since your songs are in....Russian? Sorry, I'll plead ignorance on what language this is, I'm an uncultured American. I'll assume that I'm right based on what you said. :)

Anyway, since they are in a non-english language we convert them into something that can go over the Post message to Last.fm. When we do that it converts your song info to something like this:

"method=track.scrobble&timestamp[0]=1353834603&track[0]=04%20-%20%D0%9E%D1%82%D0%BA%D1%80%D1%8B%D1%82%D0%B8%D0%B5&artist[0]=%D0%A7%D0%B0%D0%B9%D1%84"

As you can see the string can get very long since it has to do this for all the characters for each song. This is why I think that we are seeing this bug report from two people with non-english song titles in such a short timespan. This is also why I don't personally see the issue with my english song info: I just never hit whatever limit Last.fm has in their Apache setup for Post data.

Unfortunately I don't know of another fast fix other than just dropping the size of the submissions we do to 20 at a time. I did that as a rough test and it solved the crash for both bugs. That's not a great solution, though...what if someone has a bunch of really long song titles? Won't we run into the same issue in the future? But I don't know what else to do since there seems to be some overall Apache limit to what it can accept. I can come up with a more complicated setup where we gradually lower the number of songs to submit upon failures but that seems like a lot of work. I would like some feedback on that from other developers. That's point one.

The second point is how we handle this error in the future. The reason that Banshee hard crashes is that it expects back a Json object but the actual error we get (the 'URI too large' error) is not Json, Last.fm spits back HTML from Apache itself. I also have a fix for that that simply handles the resulting Hyena.Json exception so we don't crash. By 'handles' it just catches the error and prints the messages to the log. It's just a crash-saving measure, it doesn't attempt to recover. 

Both of these fixes work but I would like some input from others before we commit. I'll submit two patches to address both issues sometime tonight or tomorrow to kick off discussion on the right path forward. Again, developer feedback welcome.
Comment 8 Andrés G. Aragoneses (IRC: knocte) 2012-11-28 10:38:21 UTC
(In reply to comment #7)
> ...
> Unfortunately I don't know of another fast fix other than just dropping the
> size of the submissions we do to 20 at a time. I did that as a rough test and
> it solved the crash for both bugs. That's not a great solution, though...what
> if someone has a bunch of really long song titles? Won't we run into the same
> issue in the future? But I don't know what else to do since there seems to be
> some overall Apache limit to what it can accept. I can come up with a more
> complicated setup where we gradually lower the number of songs to submit upon
> failures but that seems like a lot of work. I would like some feedback on that
> from other developers. That's point one.

How about finding out where's the limit? If we figure that out, let's say, the limit is 100; then Banshee could send N songs because we could know that, if it sent N+1 it would surpass the limit (or, to be safe, a certain threshold from the limit, how about 80%? so then, if N+1 > 80, send N).

Does that make sense?

> The second point is how we handle this error in the future. The reason that
> Banshee hard crashes is that it expects back a Json object but the actual error
> we get (the 'URI too large' error) is not Json, Last.fm spits back HTML from
> Apache itself. I also have a fix for that that simply handles the resulting
> Hyena.Json exception so we don't crash. By 'handles' it just catches the error
> and prints the messages to the log. It's just a crash-saving measure, it
> doesn't attempt to recover. 

Sounds good to me.
Comment 9 Andrés G. Aragoneses (IRC: knocte) 2012-11-28 10:50:58 UTC
*** Bug 689016 has been marked as a duplicate of this bug. ***
Comment 10 Phil Trimble 2012-11-28 10:53:36 UTC
Created attachment 230076 [details] [review]
Improved Last.fm response error handling

Hey Andrés! Thanks for helping out, I always appreciate your help. :)

Here is the submission for error handling when processing the Last.fm response. This is straightforward but (as you know) I'm open to any improvement suggestions. It just catches exceptions from the LastFmRequest object and logs a message. This 'solves' the crashes, at least, with both bugs. 

Another patch is coming up after I look into your last comment.
Comment 11 Phil Trimble 2012-11-28 11:30:54 UTC
Here is my concern about finding the overall limit: the limit is by character and not by overall track count. Last.fm says that it can handle up to 50 songs per submission but it seems that Last.fm's Apache config has a limit on the number of characters it can receive per POST request. So we can't say 'only 30 tracks at a time'. That's actually what we do now and it's resulting in this bug.

We could do what you say for the overall character limit but then if Apache/Last.fm ever changes its overall POST character limit we would be in the same boat. We would ultimately be at the mercy of Last.fm if they ever change their limit.

I see two possibilities:

1)  We just set it at 20 tracks per request and see if it is low enough to handle our scenarios. This is a fast change but doesn't address the root cause. If a Banshee user has really long song/artist tags then we might see this error again. This *should* be rare but this proposed fix is not a guarantee.

2) We could write something into the Last.fm connection and queue classes to recognize when there was some kind of error and lower the submission count temporarily. For example, we default to 30 tracks at a time but if we receive an error we lower by 5 each time until it works and then, after a set period, we put it back to the default. This is more complexity in these two classes, though, which is why I suggested simply lowering the number of submissions at a time to 20.

The second solution is more robust overall but I'm not sure it is worth the complexity. Thoughts?

If we have any developer contacts with Last.fm I am also willing to follow up with them. Maybe there is something I am missing...I can't believe that we are the first ones to experience this error. It might be worthwhile to see what they  think.
Comment 12 Andrés G. Aragoneses (IRC: knocte) 2012-11-28 22:23:32 UTC
(In reply to comment #11)
> Here is my concern about finding the overall limit: the limit is by character
> and not by overall track count. 

In my previous comment, I wasn't proposing to find the trackcount limit, but the character limit.


> We could do what you say for the overall character limit but then if
> Apache/Last.fm ever changes its overall POST character limit we would be in 
> the same boat. 
> We would ultimately be at the mercy of Last.fm if they ever change
> their limit.

Yes, but putting the current Lastfm/Apache limit now on Banshee's source code is better, today, than not putting it. Don't let the perfect be the enemy of the working :)

> 
> I see two possibilities:
> 
> 1)  We just set it at 20 tracks per request and see if it is low enough to
> handle our scenarios. This is a fast change but doesn't address the root cause.
> If a Banshee user has really long song/artist tags then we might see this error
> again. This *should* be rare but this proposed fix is not a guarantee.

That's better than now, but it's shooting in the dark. :)


> 2) We could write something into the Last.fm connection and queue classes to
> recognize when there was some kind of error and lower the submission count
> temporarily. For example, we default to 30 tracks at a time but if we receive
> an error we lower by 5 each time until it works and then, after a set period,
> we put it back to the default. This is more complexity in these two classes,
> though, which is why I suggested simply lowering the number of submissions at a
> time to 20.
> 
> The second solution is more robust overall but I'm not sure it is worth the
> complexity. Thoughts?

I think the second is not worth it, I mean, it's the "perfect" solution, but yeah, do you have so much time on your hands? I would advocate, again, to have a middle solution between 1 and 2 by finding out what's the character limit today. We can lower it more in the future if it breaks again (because we'll get bug reports, or maybe automatic reports if I ever finish that feature...).


> If we have any developer contacts with Last.fm I am also willing to follow up
> with them. Maybe there is something I am missing...I can't believe that we are
> the first ones to experience this error. It might be worthwhile to see what
> they  think.

That's a good idea too.

Go and do whatever is a more interesting itch to you ;)
Comment 13 Andrés G. Aragoneses (IRC: knocte) 2012-11-28 22:43:42 UTC
Comment on attachment 230076 [details] [review]
Improved Last.fm response error handling

Patch looks fine, just a couple of questions:

> diff --git a/src/Libraries/Lastfm/Lastfm/AudioscrobblerConnection.cs b/src/Libraries/Lastfm/Lastfm/AudioscrobblerConnection.cs
> index 4599bbf..7f9e25d 100644
> --- a/src/Libraries/Lastfm/Lastfm/AudioscrobblerConnection.cs
> +++ b/src/Libraries/Lastfm/Lastfm/AudioscrobblerConnection.cs
> @@ -248,8 +248,19 @@ namespace Lastfm
>                  return;
>              }
>  
> -            var response = current_scrobble_request.GetResponseObject ();
> -            var error = current_scrobble_request.GetError ();
> +            // The request might have completed but if we receive a weird response
> +            // (for example, some non-json response) then we need to handle it.
> +            JsonObject response = null;
> +            StationError error = StationError.Unknown;
> +            try {
> +                response = current_scrobble_request.GetResponseObject ();
> +                error = current_scrobble_request.GetError ();
> +            } catch (Exception e) {

Are you only dealing with the non-JSON response problem here? If yes, then I guess it happens only in one of those 2 lines (response= or error=), which one is it? And: should you then only put the try-catch around only one?

BTW, what's the exact type of the exception being thrown here? It's a better practice to be specific, in order to not swallow exceptions that we didn't know about.
Comment 14 Phil Trimble 2012-12-06 05:38:23 UTC
Created attachment 230870 [details] [review]
Improved Last.fm response error handling [v2]

Andrés, I have made some changes based on your questions. Only the 'GetResponseObject ()' method needs to be checked in the try/catch. It is impossible for us to get past that first method and then encounter this specific exception in 'GetError ()'. I have modified the try/catch to only reference the first call.

In addition, the exact exception thrown by Hyena.Json is 'ApplicationException'. I've modified the catch to address that specific error so we don't gobble up other exceptions unknowingly.

This only addresses the crash caused by not handling the exception, not the root cause. I will work on a solution to avoid the '414' error altogether and hopefully have an additional patch in the near future.
Comment 15 Andrés G. Aragoneses (IRC: knocte) 2013-01-15 00:19:44 UTC
Comment on attachment 230870 [details] [review]
Improved Last.fm response error handling [v2]

(In reply to comment #14)
> Created an attachment (id=230870) [details] [review]
> Improved Last.fm response error handling [v2]
> 
> Andrés, I have made some changes based on your questions. Only the
> 'GetResponseObject ()' method needs to be checked in the try/catch. It is
> impossible for us to get past that first method and then encounter this
> specific exception in 'GetError ()'. I have modified the try/catch to only
> reference the first call.

Thanks!

> 
> In addition, the exact exception thrown by Hyena.Json is
> 'ApplicationException'. I've modified the catch to address that specific error
> so we don't gobble up other exceptions unknowingly.

I've changed my mind on this and kept what you did in v1, because:
- The exception is not swallowed anyway, gets logged.
- No matter the type of the exception, we know that it failed.
- To be consistent with code a bit above, which also catches Exception
- Because ApplicationException being thrown by Lastfm lib is still not specific enough (to be completely correct we should subclass ApplicationException and throw a specific exception; but this is not done in the whole library so we would be inconsistent with itself).


> This only addresses the crash caused by not handling the exception, not the
> root cause. I will work on a solution to avoid the '414' error altogether and
> hopefully have an additional patch in the near future.

Sounds good! Looking forward to next patches. Sorry for the long delay (and happy new year!)

Committed: http://git.gnome.org/browse/banshee/commit/?id=4ea9b5cacf0b3ff858eb1d4b1830326fa07d571e
Comment 16 Phil Trimble 2013-01-18 07:17:50 UTC
Created attachment 233731 [details] [review]
Gracefully reduce submission count upon failures

This is my first attempt to solve the underlying bug. 

I found that the only reliable way to handle the '414' error was to gracefully reduce the number of songs we submit upon receiving a failure. I tried reducing just the POST length but since we internally organize by song parameters it was painful. It felt much easier to handle things by song than it did by pure POST character count.

I found that it was relatively easy to have the LastFmRequest report a failure when it reached a max POST size and then have the connection class record it as a hard failure. I then updated the LastFM queue class to accept the number of hard failures as a parameter so it could reduce the number of songs it selected to submit.

This feels like a solid, graceful solution to the problem. We can modify the magic numbers as necessary in the future. I tested these changes against the submitted XML queue files and in each case the errors were caught, the number of submissions was reduced, and all songs were eventually submitted successfully to Last.FM. For a regular user it should all be transparent.

Suggestions (and testers) always welcome!
Comment 17 Andrés G. Aragoneses (IRC: knocte) 2013-01-18 23:14:26 UTC
Created attachment 233808 [details] [review]
Alternative patch


Hey Phil!

(In reply to comment #16)
> I found that the only reliable way to handle the '414' error was to gracefully
> reduce the number of songs we submit upon receiving a failure. I tried reducing
> just the POST length but since we internally organize by song parameters it was
> painful. It felt much easier to handle things by song than it did by pure POST
> character count.

By looking at the code, I can understand this statements.


> Suggestions (and testers) always welcome!

However, something in me told me that there should be a better (less hacky) way.

And I came up with an idea, which I wasn't sure if it was implementable. So then I went ahead and I tried to implement it, and this is the result.

Let me know what you think! And if you like it, please, test it, complete it (because if I haven't tested it I guess I must have missed something), and I'll commit it keeping attribution to your name (since you were the first thinking about the problem!).
Comment 18 Andrés G. Aragoneses (IRC: knocte) 2013-01-18 23:30:28 UTC
Created attachment 233811 [details] [review]
Alternative patch v2

Oops, I noticed a typo in previous patch.
Comment 19 Andrés G. Aragoneses (IRC: knocte) 2013-01-18 23:53:53 UTC
Created attachment 233813 [details] [review]
Alternative patch v3

I keep finding issues with the patch, this new version adds a null check.
Comment 20 Phil Trimble 2013-01-22 05:22:50 UTC
Hi Andrés, I applied your patch and tested it against the XML files provided but I am receiving errors when playing tracks. They are 'Invalid Method - No method with that name in this package' errors, which I think come back from Last.FM. I am investigating but don't know the root cause yet. It seems that the POST message we build to send to Last.FM is not being generated correctly with your solution. I think I am close to understanding your train of thought but something is still off and the 'method' of the POST message isn't being set as we expect.

Could you explain your thinking with the current patch? There is quite a lot of code added with your patch and to be honest I don't understand why my patch is sub-optimal. As far as I see my patch keeps each class dependent only on itself and doesn't break any calls between classes (at least any more than yours does). I think we are close to each other's thinking but I don't understand your concerns with my patch versus your patch. 

I'm not really attached to my solution so if you can elaborate on your concerns and convince me I am willing to dig into your solution. In the meantime I will continue looking at your solution and trying to figure out why things aren't meshing up.
Comment 21 Phil Trimble 2013-01-25 06:51:15 UTC
Okay, I think I have figured out the first issue. It seems that originally the 'Post()' method in LastFmRequest added a '?' to the POST message but now you are adding it in the newly-added 'Init()' method. By removing it from 'Post()' I am able to circumvent the error. 'NowPlaying' seems to work again but the original 414 error still occurs upon a scrobble submission with the test XML attached to this bug report. I don't see how your patch addresses the root problem.

I have two questions surrounding your counter-patch:

1)  Based on my reading you have removed the 'max scrobble per request' limit. Last.FM states that they cannot accept more than 50 tracks at a time. It looks to me like this patch just submits however many scrobbles are in the queue at the time. Is this intentional on your part? Won't we receive errors from last.fm if we submit, say, 100 songs at once?

2) I don't understand how this patch addresses the original 414 error. Don't we need to reduce the number of submitted songs on subsequent requests? It seems like this patch just continues to send the entire queue over and over, which would results in 414 errors over and over. I don't see any reduction logic and I don't understand how it circumvents this error scenario.

Can you help me understand why you feel my patch isn't the right path? And if we go with your patch, don't we need to add some kind of graceful reduction in the number of submissions with each failed POST request to eliminate the repeated errors?
Comment 22 Andrés G. Aragoneses (IRC: knocte) 2013-01-27 22:27:48 UTC
(In reply to comment #20)
> Hi Andrés, I applied your patch and tested it against the XML files provided
> but I am receiving errors when playing tracks. They are 'Invalid Method - No
> method with that name in this package' errors, which I think come back from
> Last.FM. I am investigating but don't know the root cause yet. It seems that
> the POST message we build to send to Last.FM is not being generated correctly
> with your solution.

Bummer, well, sorry, I should have added some unit tests and make sure they were green witht both the old version of the class and after my patch is applied.


(In reply to comment #21)
> Okay, I think I have figured out the first issue. It seems that originally the
> 'Post()' method in LastFmRequest added a '?' to the POST message but now you
> are adding it in the newly-added 'Init()' method. By removing it from 'Post()'
> I am able to circumvent the error.

Cool, thanks for looking into this!


> 'NowPlaying' seems to work again but the
> original 414 error still occurs upon a scrobble submission with the test XML
> attached to this bug report. I don't see how your patch addresses the root
> problem.
> 
> I have two questions surrounding your counter-patch:
> 
> 1)  Based on my reading you have removed the 'max scrobble per request' limit.
> Last.FM states that they cannot accept more than 50 tracks at a time. It looks
> to me like this patch just submits however many scrobbles are in the queue at
> the time. Is this intentional on your part? Won't we receive errors from
> last.fm if we submit, say, 100 songs at once?

True, I missed this when cooking my patch, we should add this limit again.


> 
> 2) I don't understand how this patch addresses the original 414 error. Don't we
> need to reduce the number of submitted songs on subsequent requests? It seems
> like this patch just continues to send the entire queue over and over, which
> would results in 414 errors over and over. I don't see any reduction logic and
> I don't understand how it circumvents this error scenario.
> 
> Can you help me understand why you feel my patch isn't the right path? And if
> we go with your patch, don't we need to add some kind of graceful reduction in
> the number of submissions with each failed POST request to eliminate the
> repeated errors?

Well, let me try to explain. Without any of our patches, we have one limit: 50 songs. But then there is also the URL-length limit (which we're trying to fix here). The problem with this limit is that the way LastFmRequest class works, you can only find out if you surpassed the limit when you're finished constructing the LastFmRequest object and you're about to send it: that's why you implemented the retry logic.

But my patch tries to avoid the retry logic being a bit more efficient: we construct the URL (or the post data) on the fly while parameters are being added. Also, being able to specify N parameters in one method call allows us to do in a transaction-way: if the parameters make the request too large, an exception is thrown and none of them are added. This let us still use the LastFmRequest object even if the MaxSizeExceededException has been thrown.

It is more efficient, then, because if the MaxSizeExceededException is thrown we don't need to start again constructing a new LastFmRequest object. Do you see what I mean?

The fact that with my patch the 414 is still thrown means that: a) I missed something, so I'll unit tests to find out, OR b) the MAX_POST_LENGTH you calculated by your testing is too high.

Let me know if you have any questions, or concerns. Sorry for the delay. I'll rework my patch and post it again soon.
Comment 23 Phil Trimble 2013-01-31 04:53:25 UTC
Ah, makes sense now! I think you and I are on the same page at this point. Thanks for explaining.

I think our 'disagreement', if you even want to call it that, is whether the extra LastFmRequest that my solution sends out is too high of a price to pay for a failed request. In my opinion this kind of error happens rarely enough that I am willing to pay the price but I do see and understand your position. Your solution also cuts down on the parameter passing between classes, which is good, but in my view these classes are already joined at the hip so the extra parameter to Queue.cs is a price I am willing to pay.

With all that said, I do support your solution since it is cleaner. I'm not attached enough to my patch to really push for it. :)

I'll hang tight as a tester for your updates. Let me know if you want me to dig around on any particular aspect of this bug.

Thanks Andrés!
Comment 24 Andrés G. Aragoneses (IRC: knocte) 2013-02-15 01:22:37 UTC
*** Bug 693844 has been marked as a duplicate of this bug. ***
Comment 25 Andrés G. Aragoneses (IRC: knocte) 2013-02-15 23:02:12 UTC
Created attachment 236318 [details] [review]
Alternative patch v4

(In reply to comment #23)
> Ah, makes sense now! I think you and I are on the same page at this point.
> Thanks for explaining.
> 
> I think our 'disagreement', if you even want to call it that, is whether the
> extra LastFmRequest that my solution sends out is too high of a price to pay
> for a failed request. In my opinion this kind of error happens rarely enough
> that I am willing to pay the price but I do see and understand your position.

Yeah, from a performance point of view it's not really a super gain, but a micro-optimization; agreed. But it's more towards the readability of the whole thing.


> Your solution also cuts down on the parameter passing between classes, which is
> good, but in my view these classes are already joined at the hip so the extra
> parameter to Queue.cs is a price I am willing to pay.
> 
> With all that said, I do support your solution since it is cleaner. I'm not
> attached enough to my patch to really push for it. :)
> 
> I'll hang tight as a tester for your updates. Let me know if you want me to dig
> around on any particular aspect of this bug.

I added a bunch of unit tests for LastfmRequest, which let me identify the issues with my patch. Now I fixed them and it should be ready to go! Can you test again? Cheers Phil
Comment 26 Phil Trimble 2013-02-19 07:48:38 UTC
Hi Andrés,

Not a problem! I've taken a look and I have some questions. I'll list them out below:

1) It seems that in LastFmRequest#CheckSize method you have the line 'target_length = 9 + 32'. I am fairly certain you meant this to be 'target_length += 9 + 32', which is what you had in previous versions of this patch. As it stands now the check to see if we are over the length limit is never is hit because it is always checking '9 + 32' instead of 'overall_string_length+9+32'. This means that the max size exception is never triggered.

2) In the new 'AddParameters(NameValueCollection params)' method I see a naming conflict. I noticed it when the logging statement in AudioScrobblerConnection never seemed to print out what we were sending to Last.FM. This is caused by the fact that the LastFmRequest class private variable is named 'parameters' while your local method variable is also named 'parameters'. When you call 'parameters.Add (key, parameters [key])' in that method isn't it adding the parameter values to itself instead of the private class-level variable?

This doesn't affect what we send to Last.FM since we are building the request string via the 'incremental_data' StringBuilder object but it does have other consequences, like mistaken logging. Either we need to call 'this.parameters.Add' or we should change variable names in that method to remove the conflict.

3) Once I fixed the above two issues I ran into another larger issue, one that I have not fixed yet. I was seeing the following error when I was running my text XML files:

>>[20 Debug 01:03:26.076] Last.fm scrobbler sending 'track.scrobble
>>	sk=2dee58057a58816132467962596f3682
>>	format=json'
>>[21 Warn  01:03:26.574] Lastfm error InvalidParameters : Cannot submit a batch 
>>of no scrobbles

I see the above error over and over again once it pops up. It took me a while but as far as I can tell the following is happening. Bear with me as I am thinking this out as I am writing it, hopefully it helps us both figure out the root cause:

* In AudioScrobblerConnection a submission is started. The request is built using the new logic you have added. The main incremental_data string is built in LastFmRequest.
* Before the final string is finished, the AudioScrobblerConnection 'next_interval' is up, which causes a new round of the method 'TransmitQueue()' to kick off. This creates a new 'LastFmRequest' object and attempts to go against the Queue class. The 'queue.GetNextTrack()' method tries to return the next song from the queue.
* Since the Queue class has been modified using the 'next' variable you added we run into a problem. The 'next' integer has not been reset yet because 'RemoveRange()' has not been called due to the next_interval interruption. When 'GetNextTrack()' is called it returns null because 'next' is still above the total number of tracks in the queue. This results in a loop where the Queue class never resets the 'next' variable and it never sends back a track to the AudioScrobblerConnection class.

The odd part of all of this is that most of the tracks are sent in my tests but it always stalls out at a certain point no matter which test XML file I use. I don't know why this is. It feels like a timing issue. I can't say for sure yet, this is as far I have been able to get. I can't prove my theory just yet.

I will continue to look at it but I wanted you to see what I have seen so you aren't waiting. I'll let you know when I have more!
Comment 27 Andrés G. Aragoneses (IRC: knocte) 2013-02-19 17:48:10 UTC
(In reply to comment #26)
> ...

Awesome work on catching those issues Phil.


> I will continue to look at it but I wanted you to see what I have seen so you
> aren't waiting. I'll let you know when I have more!

Thanks, it might be I overlooked something, especially regarding the class that got the GetTracks() -> GetNextTrack() change. Maybe I missed some "next = 0" reset?

I'll have a look around too, but can't promise any success!
Comment 28 Andrés G. Aragoneses (IRC: knocte) 2013-02-19 18:17:57 UTC
Created attachment 236812 [details] [review]
Alternative patch v5

For the record, the reviewed patch after addressing Phil's concerns.
Comment 29 Phil Trimble 2013-02-20 10:14:37 UTC
Created attachment 236904 [details] [review]
Alternative Patch v6

Good news! I am pretty sure I figured out the root problem and I have an explanation and solution. I was sort of close with my initial analysis but not quite right. 

The problem is twofold but starts with this snippet of code in Queue#GetNextTrack():

>>if (next > queue.Count) {
>>  return null;
>>}

This will cause an ArgumentOutOfRangeException. It should be 'greater than or equal to' since the queue starts at zero. For example, if the queue.Count is 10 and 'next' is 10 then it should return null since an array of count '10' has elements 0 through 9. The current setup, however, would try to return the element at 10, which doesn't exist so we should see an exception.

Here is the key part, though: in AudioScrobblerConnection we use System.Timers.Timer () to control the tick interval to check the Last.FM Connection state (i.e. transmitting, idle, etc). From what I can see the Timer eats all exceptions thrown within it and doesn't return it to the main app to be processed. So we hit the limit, the exception is eaten, and then we try to build a new request on the next interval. This is because the current 'state' is 'NeedsTransmit'. It doesn't know that we are trying to build a request but have not submitted yet, it simply initiates another request. Now the 'next' variable is over 'queue.Count' and we return 'null' each time. This causes the weird error loop I mentioned previously.

The fix is to simply do 'if (next >= queue.Count)'. I tested it out with your latest patch and things look golden.

In addition, I would like to propose a change to how the current connection State is handled. Right now we have the following states: Idle, NeedsTransmit, Transmitting, WaitingForResponse. I think we either need to move the 'Transmitting' state to earlier in the TransmitQueue() method or create a new State named something like 'BuildingRequest' to denote that we are currently in the process of generating a request to be transmitted. 

If we do it this way then we would wait for the request to be built, transmitted, and the response to be processed before trying to build a new request. This would hopefully eliminate any race conditions like the one we just experienced, where a new request is built before Queue.cs is ready. I have attached an updated patch that is essentially your patch with the two additions I mentioned. Thoughts?

By the way, your name should really be on this patch since this is mostly your code. Unfortunately I am not git-savvy enough to know how to put your name on what I am attaching but please do change it. :)
Comment 30 Andrés G. Aragoneses (IRC: knocte) 2013-02-20 18:33:01 UTC
(In reply to comment #29)
> Created an attachment (id=236904) [details] [review]
> Alternative Patch v6
> 
> Good news! I am pretty sure I figured out the root problem and I have an
> explanation and solution. I was sort of close with my initial analysis but not
> quite right. 
> 
> The problem is twofold but starts with this snippet of code in
> Queue#GetNextTrack():
> 
> >>if (next > queue.Count) {
> >>  return null;
> >>}
> 
> This will cause an ArgumentOutOfRangeException. It should be 'greater than or
> equal to' since the queue starts at zero. For example, if the queue.Count is 10
> and 'next' is 10 then it should return null since an array of count '10' has
> elements 0 through 9. The current setup, however, would try to return the
> element at 10, which doesn't exist so we should see an exception.
> 
> Here is the key part, though: in AudioScrobblerConnection we use
> System.Timers.Timer () to control the tick interval to check the Last.FM
> Connection state (i.e. transmitting, idle, etc). From what I can see the Timer
> eats all exceptions thrown within it and doesn't return it to the main app to
> be processed. So we hit the limit, the exception is eaten, and then we try to
> build a new request on the next interval. This is because the current 'state'
> is 'NeedsTransmit'. It doesn't know that we are trying to build a request but
> have not submitted yet, it simply initiates another request. Now the 'next'
> variable is over 'queue.Count' and we return 'null' each time. This causes the
> weird error loop I mentioned previously.
> 
> The fix is to simply do 'if (next >= queue.Count)'. I tested it out with your
> latest patch and things look golden.

Awesome, thanks for debugging this!


> In addition, I would like to propose a change to how the current connection
> State is handled. Right now we have the following states: Idle, NeedsTransmit,
> Transmitting, WaitingForResponse. I think we either need to move the
> 'Transmitting' state to earlier in the TransmitQueue() method or create a new
> State named something like 'BuildingRequest' to denote that we are currently in
> the process of generating a request to be transmitted. 

Wait a sec, but the time dedicated to transmit the message will always be much longer than the one building the request, so why don't we simply set it to 'Transmitting' just before we start building the request? (Because, effectively, we don't need distinguish between Transmitting and BuildingRequest, so it's an unneeded state.)


> If we do it this way then we would wait for the request to be built,
> transmitted, and the response to be processed before trying to build a new
> request. This would hopefully eliminate any race conditions like the one we
> just experienced, where a new request is built before Queue.cs is ready. I have
> attached an updated patch that is essentially your patch with the two additions
> I mentioned. Thoughts?

Is this race condition present in current banshee (I mean, without this patch applied)? If yes, then we should fix it in a separate commit.


> By the way, your name should really be on this patch since this is mostly your
> code.

Nah, it's fine. It's a collaboration, and if we don't put your name on it you lose the incentive of contributing :)


> Unfortunately I am not git-savvy enough to know how to put your name on
> what I am attaching but please do change it. :)

FYI: https://live.gnome.org/Git/Developers#Committing_on_behalf_of_a_contributor
Comment 31 Phil Trimble 2013-02-22 03:21:36 UTC
Created attachment 237138 [details] [review]
Alternative Patch v7

Here is the newest patch based on your response, Andrés. I ran it against all of my test cases and things look good.

> Wait a sec, but the time dedicated to transmit the message will always be much
> longer than the one building the request, so why don't we simply set it to
> 'Transmitting' just before we start building the request? (Because,
> effectively, we don't need distinguish between Transmitting and
> BuildingRequest, so it's an unneeded state.)

Okay, that's fine. I moved the 'Transmitting' state change to above the building of the request and removed 'BuildingRequest' from the available states. 

This change belongs in this patch and not a separate one (see below). 
  
> Is this race condition present in current banshee (I mean, without this patch
> applied)? If yes, then we should fix it in a separate commit.

This race condition is not present in the current codebase. Previously the Queue.cs class did not keep track of the queue position between calls to it; it would just return a large batch of queue elements in each call. It's only with the addition of the 'next' variable and associated logic that there is even the possibility that we might not completely build and transmit the request before a new request is initiated. Because of this we should include it with this patch.

Lastly, I tested some more and slightly lowered the max POST size limit from 7300 to 7000. Based on my testing 7300 was too close to whatever max they have and sometimes still hit the error. Better to not push things.

Thanks Andrés!
Comment 32 Phil Trimble 2013-02-22 04:05:22 UTC
*** Bug 694039 has been marked as a duplicate of this bug. ***
Comment 33 Andrés G. Aragoneses (IRC: knocte) 2013-02-22 09:16:45 UTC
Comment on attachment 237138 [details] [review]
Alternative Patch v7

ok, pushed
Comment 34 Andrés G. Aragoneses (IRC: knocte) 2013-02-22 09:17:04 UTC
This problem has been fixed in the development version. The fix will be available in the next major software release. Thank you for your bug report.
Comment 35 Andrés G. Aragoneses (IRC: knocte) 2013-02-24 20:11:47 UTC
*** Bug 694521 has been marked as a duplicate of this bug. ***
Comment 36 Andrés G. Aragoneses (IRC: knocte) 2013-04-24 10:06:59 UTC
*** Bug 698650 has been marked as a duplicate of this bug. ***