GNOME Bugzilla – Bug 611841
Wikipedia: Use API
Last modified: 2020-03-17 08:46:30 UTC
The MediaWiki software offers an API to access the data, so why not use it? For example:
- The different languages that are available: http://en.wikipedia.org/w/api.php?action=sitematrix
- We could start a search via the API, like http://en.wikipedia.org/w/api.php?action=opensearch&search=AC%2fDC
- Which leads to the next point: bug #582858 would be solved, because we could URL-encode the query and then open the right page.

More documentation about the API is here: http://en.wikipedia.org/w/api.php
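For what it's worth, here is a minimal sketch of the opensearch call in Python (standard library only); the function name and anything beyond the parameters in the URLs above are illustrative assumptions, not anything Banshee does today:

import json
import urllib.parse
import urllib.request

def opensearch(term, lang="en"):
    # URL-encoding the search term handles titles like "AC/DC" (bug 582858).
    params = urllib.parse.urlencode({
        "action": "opensearch",
        "search": term,
        "format": "json",
    })
    url = "https://%s.wikipedia.org/w/api.php?%s" % (lang, params)
    with urllib.request.urlopen(url) as resp:
        # opensearch returns [query, [titles], [descriptions], [urls]].
        return json.loads(resp.read().decode("utf-8"))

# opensearch("AC/DC")[3] would give candidate article URLs to open.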
The downside of this is that it requires two GET requests, right? One to search via the API, and another to actually fetch the article?
I think you will find that GET requests are common on the Internet.
Or even more (but I don't think that's a real problem: even if there is a picture in the article, it needs a new request, so it would be as if there were one more). The plus: the URLs aren't guessed, although guessing works quite well.

We could also check the article's categories for something like Music, or check whether it falls somewhere under the top-level Music category (German Wikipedia also has such a top-level category, so it should be translatable). I'd guess this would remove about 80% of the ambiguity problems, like "AC DC" meaning the band vs. http://en.wikipedia.org/wiki/AC/DC_%28disambiguation%29, even if in the case of AC/DC the wiki chooses the right article (a rough sketch of this check follows below). I wouldn't integrate it now, but keep it as a thing to do in the nearer future. Of course it's your choice :D but it would increase the quality of the output. The question is whether the quality is a real problem or just something I'm building up in my mind.

The absolute wow would be if we could parse the infoboxes on Wikipedia and get information out of them, removing the cluttered web page and creating an information pane stuffed with information from the wiki (I'm looking forward to a semantic web :D).

So a little bit of answer and a little bit of dreaming,
Samuel

p.s. thx for changing the importance, submitted too fast. :)
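A rough sketch of the category check suggested above, again in Python; the MUSIC_HINTS keywords and the substring matching are assumptions for illustration (real code would need per-language category names):

import json
import urllib.parse
import urllib.request

# Hypothetical keyword list; only for illustration.
MUSIC_HINTS = ("music", "band", "album", "musician")

def looks_music_related(title, lang="en"):
    # action=query with prop=categories lists a page's categories.
    params = urllib.parse.urlencode({
        "action": "query",
        "prop": "categories",
        "titles": title,
        "cllimit": "max",
        "format": "json",
    })
    url = "https://%s.wikipedia.org/w/api.php?%s" % (lang, params)
    with urllib.request.urlopen(url) as resp:
        data = json.loads(resp.read().decode("utf-8"))
    for page in data.get("query", {}).get("pages", {}).values():
        for cat in page.get("categories", []):
            if any(hint in cat["title"].lower() for hint in MUSIC_HINTS):
                return True
    return False

Disambiguation candidates could then be skipped in favor of the first result for which this returns True.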
I agree... if there is a sanctioned API, it should be used instead of URL guessing, HTML scraping, etc. If the number of HTTP requests is a concern (and IMHO it isn't), the connection can (and probably should, regardless) be kept alive, so multiple requests can be made on the same connection. I'd really hope the Wikipedia HTTP server supports this ;-)
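To illustrate the keep-alive point, a small sketch using Python's http.client: persistent connections are the HTTP/1.1 default, so both requests below travel over the same socket. The paths are just the examples from this report:

import http.client

conn = http.client.HTTPSConnection("en.wikipedia.org")

# First request: the opensearch lookup.
conn.request("GET", "/w/api.php?action=opensearch&search=AC%2FDC&format=json")
search = conn.getresponse().read()  # must be fully read before reusing the socket

# Second request, reusing the same connection: fetch the article itself.
conn.request("GET", "/wiki/AC/DC")
article = conn.getresponse().read()

conn.close()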
Banshee is not under active development anymore and had its last code changes more than three years ago. Its codebase has been archived. Closing this report as WONTFIX as part of Bugzilla Housekeeping to reflect reality. Please feel free to reopen this ticket (or rather transfer the project to GNOME Gitlab, as GNOME Bugzilla is being shut down) if anyone takes the responsibility for active development again. See https://gitlab.gnome.org/Infrastructure/Infrastructure/issues/264 for more info.