Bug 525094 – Grab lyrics from darklyrics.com

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 525094 - Grab lyrics from darklyrics.com


Summary:	Grab lyrics from darklyrics.com


Status:	RESOLVED FIXED

Product:	rhythmbox
Classification:	Other
Component:	Plugins (other)
Version:	HEAD
Hardware:	Other All

Importance:	Normal enhancement
Target Milestone:	---
Assigned To:	RhythmBox Maintainers
QA Contact:	RhythmBox Maintainers

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2008-03-30 10:25 UTC by Edgar Luna
Modified:	2010-05-29 06:25 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
Patch to grab lyrics from darklyrics.com (4.54 KB, patch) 2008-03-30 10:28 UTC, Edgar Luna	reviewed	Details \| Review
New DarkLyrics Grabber patch (7.12 KB, patch) 2010-05-29 05:06 UTC, Edgar Luna	none	Details \| Review

Description Edgar Luna 2008-03-30 10:25:57 UTC

In addition to the sites from which rhythmbox grab lyrics would be nice if rhythmbox could grab lyrics from www.darklyrics.com. A site containing high quality metal music lyrics.

I have a patch that I'll put here.

Comment 1 Edgar Luna 2008-03-30 10:28:55 UTC

Created attachment 108262 [details] [review]
Patch to grab lyrics from darklyrics.com

It only adds the engine. dosen't add the item in the plugin configuration dialog, so in order to use it, it is necesary to use gconf to add darklyrics.com to /apps/rhythmbox/plugins/lyrics/engines key.

Comment 2 Jonathan Matthew 2008-04-24 12:53:14 UTC

This looks OK to me, with the usual reservations about html-scraping; do you know if the site owners are generally OK with this sort of usage?

		# Fix artist name first with unicodedata
		#unichr(int(decomposition(u"ÿ").split()[0], 16))

Did this turn out to be unnecessary?

I'm not sure if you were involved in the recent relicensing of rhythmbox to allow GPL-incompatible gstreamer plugins, so I'll ask here just to be sure: do you agree to relicense this code?

Comment 3 Edgar Luna 2009-06-07 17:07:41 UTC

The webmaster of darklyrics.com haven't answered my two mails about the topic (one was sent when you asked me the cuestion, the second one week ago). I don't know how to take that.

The line of codes that are commented are an example of how to handle the international chars changin them to their ascii 127 version like changing ÿ to y. This is needed to the patch to be complete, but I didn't wanted to do it when I upload this patch.

And I agree in that relicensing for this or any contribution that I could make to rhythmbox.

Comment 4 Jonathan Matthew 2009-06-16 07:37:52 UTC

I think I'm OK with adding this without explicit consent as long as it's not enabled by default, especially if it's not even visible in the plugin preferences.
It'd be nice if there was a way for the site owner to identify the source of the requests, but I don't see a good way to do that.

+		pattern = '<a href="../lyrics/%s/(.*).html#([^"]+)" target="_blank"><FONT COLOR="#CCCCCC">%s</FONT></a><br>' \
+		    % (self.wartist, re.escape(self.title.title()))
+		matches = re.findall (pattern, songlist)

Instead of this, it might be better to find all the song links, extract the titles, and use the string_match utility function I added a few days ago.  This will match even when there are punctuation and spelling differences.

+		titleline = '(?ms)<a name=%s><FONT color=#DDDDDD><b>%s. %s</b></font>(.+?)<BR>\r\n<BR>' % \
+		    (self.titlenumber, self.titlenumber, re.escape(self.title.title()))
+		lyricmatch = re.split (titleline, album)

and if you do that, this should use the title that was matched in the previous step, rather than the input title.

Comment 5 Edgar Luna 2010-05-29 05:06:57 UTC

Created attachment 162256 [details] [review]
New DarkLyrics Grabber patch

I did what Jonathan Matthew recommended.

Regards

Comment 6 Jonathan Matthew 2010-05-29 06:25:11 UTC

Pushed as commit 06d9966.  Thanks for working on this.