GNOME Bugzilla – Bug 611841
Wikipedia: Use API
Last modified: 2020-03-17 08:46:30 UTC
The MediaWiki software offers an API to access the data, so why not use it? For example:
- The different languages that are available: http://en.wikipedia.org/w/api.php?action=sitematrix
- We could start a search via the API, like http://en.wikipedia.org/w/api.php?action=opensearch&search=AC%2fDC
- Which leads to the next point: bug #582858 would be solved, because we could URL-encode the query and then open the right page.

More documentation about the API is here: http://en.wikipedia.org/w/api.php
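For what it's worth, here is a minimal sketch of the opensearch call in Python (standard library only); the function name and anything beyond the parameters in the URLs above are illustrative assumptions, not anything Banshee does today:

import json
import urllib.parse
import urllib.request

def opensearch(term, lang="en"):
    # URL-encoding the search term handles titles like "AC/DC" (bug 582858).
    params = urllib.parse.urlencode({
        "action": "opensearch",
        "search": term,
        "format": "json",
    })
    url = "https://%s.wikipedia.org/w/api.php?%s" % (lang, params)
    with urllib.request.urlopen(url) as resp:
        # opensearch returns [query, [titles], [descriptions], [urls]].
        return json.loads(resp.read().decode("utf-8"))

# opensearch("AC/DC")[3] would give candidate article URLs to open.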
The downside of this is that it requires two GET requests, right? One to search via the API, and another to actually fetch the article?
I think you will find that GET requests are common on the Internet.
Or even more (but I don't think that's a real problem: even if there is a picture in the article, it needs a new request, so it would be as if there were one more). The plus: the URLs aren't guessed, although guessing works quite well.

We could also check the article's categories for something like Music, or check whether it falls somewhere under the top-level Music category (German Wikipedia also has such a top-level category, so it should be translatable). I'd guess this would remove about 80% of the ambiguity problems, like "AC DC" meaning the band vs. http://en.wikipedia.org/wiki/AC/DC_%28disambiguation%29, even if in the case of AC/DC the wiki chooses the right article (a rough sketch of this check follows below). I wouldn't integrate it now, but keep it as a thing to do in the nearer future. Of course it's your choice :D but it would increase the quality of the output. The question is whether the quality is a real problem or just something I'm building up in my mind.

The absolute wow would be if we could parse the infoboxes on Wikipedia and get information out of them, removing the cluttered web page and creating an information pane stuffed with information from the wiki (I'm looking forward to a semantic web :D).

So a little bit of answer and a little bit of dreaming,
Samuel

p.s. thx for changing the importance, submitted too fast. :)
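A rough sketch of the category check suggested above, again in Python; the MUSIC_HINTS keywords and the substring matching are assumptions for illustration (real code would need per-language category names):

import json
import urllib.parse
import urllib.request

# Hypothetical keyword list; only for illustration.
MUSIC_HINTS = ("music", "band", "album", "musician")

def looks_music_related(title, lang="en"):
    # action=query with prop=categories lists a page's categories.
    params = urllib.parse.urlencode({
        "action": "query",
        "prop": "categories",
        "titles": title,
        "cllimit": "max",
        "format": "json",
    })
    url = "https://%s.wikipedia.org/w/api.php?%s" % (lang, params)
    with urllib.request.urlopen(url) as resp:
        data = json.loads(resp.read().decode("utf-8"))
    for page in data.get("query", {}).get("pages", {}).values():
        for cat in page.get("categories", []):
            if any(hint in cat["title"].lower() for hint in MUSIC_HINTS):
                return True
    return False

Disambiguation candidates could then be skipped in favor of the first result for which this returns True.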
I agree... if there is a sanctioned API, it should be used instead of URL guessing, HTML scraping, etc. If the number of HTTP requests is a concern (and IMHO it isn't), the connection can (and probably should, regardless) be kept alive, so multiple requests can be made on the same connection. I'd really hope the Wikipedia HTTP server supports this ;-)
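To illustrate the keep-alive point, a small sketch using Python's http.client: persistent connections are the HTTP/1.1 default, so both requests below travel over the same socket. The paths are just the examples from this report:

import http.client

conn = http.client.HTTPSConnection("en.wikipedia.org")

# First request: the opensearch lookup.
conn.request("GET", "/w/api.php?action=opensearch&search=AC%2FDC&format=json")
search = conn.getresponse().read()  # must be fully read before reusing the socket

# Second request, reusing the same connection: fetch the article itself.
conn.request("GET", "/wiki/AC/DC")
article = conn.getresponse().read()

conn.close()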
Banshee is not under active development anymore and had its last code changes more than three years ago. Its codebase has been archived. Closing this report as WONTFIX as part of Bugzilla Housekeeping to reflect reality. Please feel free to reopen this ticket (or rather transfer the project to GNOME Gitlab, as GNOME Bugzilla is being shut down) if anyone takes the responsibility for active development again. See https://gitlab.gnome.org/Infrastructure/Infrastructure/issues/264 for more info.