GNOME Bugzilla – Bug 116523
Old pages should be redirected
Last modified: 2008-03-17 17:12:19 UTC
Hello, I think old pages should still be valid somehow. For example, http://www.gimp.org/the_gimp_screenshots.html is working, but not http://mmmaybe.gimp.org/the_gimp_screenshots.html. URLs shouldn't change over time :)
I just did a quick web search and I found a few hundred links to http://www.gimp.org/download.html, several links to stable_ver.html and devel_ver.html. A few other pages seemed to be well linked: tutorials.html, mailing_list.html, the_gimp_about.html, the screenshots page that you listed and a few documentation pages.
Here is a page from the W3C explaining why URIs shouldn't change: http://www.w3.org/Provider/Style/URI.html
By the way, I had a quick look at the access log on www.gimp.org and I saw something surprising: most of the visitors following a link from an external site do not go to the home page, but go directly to some other page. Here are the stats from today's access log:
* 344855 links from local web pages (including images, etc.)
* 36147 links without usable referer (bookmarks, hidden by UA, etc.)
* 10224 links from external web pages, among which:
  - 3379 links to the home page
  - 6845 links to other pages
We should come up with a good list of important links that need preservation and cause redirects to new pages in Apache land when the time comes.
I agree. I had a look at various logfile parsers, but I could not find one that gave a good list of the most common entry points without consuming too much CPU time or disk space. For example, AWstats does almost what I need, but it is way too heavy for Wilber, IMHO. So I have just started writing a short Perl script that should do the job and help us to know the most important URLs in the current site.
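For readers who want to reproduce this kind of analysis, here is a minimal sketch of the idea in Python. It is not the Perl script mentioned in this report; it assumes the access log uses the Apache "combined" format, and the names `classify`, `LOG_LINE` and `local_hosts` are made up for this illustration.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Assumed Apache "combined" format:
# host ident user [time] "METHOD path PROTO" status bytes "referer" "agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]*\] "(?:\S+) (?P<path>\S+)[^"]*" \d{3} \S+ "(?P<referer>[^"]*)"'
)


def classify(lines, local_hosts=("www.gimp.org", "gimp.org")):
    """Count local, external and referer-less requests.

    Returns (totals, entry_points), where entry_points counts the
    requested paths that were reached from an external referer.
    """
    totals = Counter()
    entry_points = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            totals["unparsed"] += 1
            continue
        try:
            host = urlsplit(match.group("referer")).hostname
        except ValueError:
            host = None  # malformed referer: treat it as unusable
        if host is None:
            totals["no_referer"] += 1  # "-" or empty, e.g. bookmarks
        elif host in local_hosts:
            totals["local"] += 1
        else:
            totals["external"] += 1
            entry_points[match.group("path")] += 1
    return totals, entry_points
```

Calling `entry_points.most_common()` on the result gives the most frequently linked files first. Because only two counters are kept in memory, this approach stays light on CPU and disk, which was the concern with the heavier log analyzers.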
Well, I wrote the script. You can find it here: http://www.gimp.org/~raphael/check-wgo-referers.pl For those who have an account on the web server (wilber), you can also find the script in ~raphael/bin/. I ran the script on the access log and I found that there are more external referers than I thought. There are also many sites linking directly to the images. The script can display the details of the referers (so that one can see how many are from Google, for example) but I have skipped it for the output below:

Number of requests in the log file: 396594 / 396594.

Details for each file linked from an external site:
  3419 /
  1165 /~tml/gimp/win32/
   797 /win32/
   753 /icons/gfx_by_gimp.gif
   576 /~tml/gimp/win32/downloads.html
   553 /icons/the_gimp_corner.gif
   270 /download.html
   224 /the_gimp_screenshots.html
   122 /win32/downloads.html
   106 /~tml/gimp/win32
   100 /docs.html
    88 /icons/gimp_kobus.gif
    76 /icons/frontpage-small.gif
    75 /icons/sit3-shine.7.gif
    71 /win32
    59 /~tml/gimp/win32//
    56 /~tml/gimp/win32/downloads-20030620.html
    51 /icons/gimp_in_action1.jpeg
    50 /icons/art_corner.gif
    46 /tutorials.html
    44 /win32/pappa-seal.gif
    44 /icons/the_gimp_head.gif
    44 /~tml/gimp/win32/gimp-1.2.3-20020310-setup.zip
    39 /~tml/gimp/win32/screenshots.html
    37 /~tml/gimp/win32/pappa-seal.gif
    35 /icons/pad.gif
    34 /icons/tube-top.gif
    33 /icons/tube-button.gif
    33 /~tml/gimp/win32/gimp-1.2.4-20020907.zip
    33 /~tml/gimp/win32//downloads.html
    33 /icons/tube-subbutton.gif
    33 /icons/the_gimp_text.gif
    32 /icons/tube-prev.gif
    32 /icons/tube-next.gif
    32 /icons/tube-up.gif
    32 /icons/download_text.gif
    32 /icons/links_text.gif
    32 /icons/art_text.gif
    32 /icons/tube-elem.gif
    31 /icons/tube-mail.gif
    31 /icons/tube-home.gif
    31 /icons/docs_text.gif
    31 /icons/data_text.gif
    31 /icons/tube-button-marker.gif
    29 /icons/small_tri_right.gif
    28 /win32/screenshots.html
    28 /gallery.html
    27 /icons/the_gimp_about_text.gif
    27 /icons/webmasters_text.gif
    27 /stable_ver.html
    26 /icons/the_gimp_system_reqs_text.gif
    26 /icons/the_gimp_org_about_text.gif
    26 /icons/the_gimp_screenshots_text.gif
    24 /fonts.html
    24 /icons/fish.jpg
    23 /~sjburges/straightline/straightline.html
    21 /~tml/gimp/win32/s5.jpg
    21 /tut-basic.html
    18 /scripts.html
    18 /the_gimp.html
    17 /icons/tube-subbutton-over.gif
    17 /~xach/gimp-in-action-by-jimmac.jpg
    16 /icons/tube-next-over.gif
    16 /icons/tube-button-over.gif
    16 /icons/tube-mail-over.gif
    14 /gtk
    13 /~tml/gimp/win32/gimp-1.2.4-20030213.zip
    13 /gtk/
    12 /links.html
    12 /icons/quartic_gimp.jpg
    11 /tut-patt1.html
    10 /devel_ver.html
    10 /the_gimp_about.html
  (188 files with less than 10 external referers not displayed)

Summary:
  349652 links from local web pages (including images, etc.)
   36460 links without usable referers (bookmarks, blocked by UA, etc.)
   10482 links from external web pages, including:
         - 3419 links to the home page
         - 7063 links to 260 other files
I improved the analysis script and added some command-line options, including a way to get HTML output. The new version can be used on sites other than www.gimp.org, so I renamed it: http://www.gimp.org/~raphael/check-referers.pl I ran it on the current log file and the one from yesterday, and I saved the HTML output: http://www.gimp.org/~raphael/wgo-referers.html However, it may be more useful to look at the list of external links without counting the search engines such as Google, especially the Google cache that links to all images in the page. I have also excluded www.saunalahti.fi, which contains Tor's pages. The result of this analysis is here: http://www.gimp.org/~raphael/wgo-referers-nosearch.html Keep in mind that this analysis covers only a bit more than a day. It would have to be repeated periodically if we want to get a better idea of the pages and images that are frequently linked directly from other sites.

Anyway, after looking at this one-day snapshot, it looks like there are many pages that should be preserved or redirected:
- the whole Windows part: /~tml/gimp/win32/ and /win32/
- /win32/downloads.html
- /download.html
- /the_gimp_screenshots.html
- /docs.html
- /stable_ver.html
- /tutorials.html
- /the_gimp.html (same as the home page)
- and many others: /fonts.html, /tut-basic.html, /gallery.html, ...

But there are also many sites that use images directly from w.g.o.:
- /icons/gfx_by_gimp.gif (button "Graphics by Gimp")
- /icons/the_gimp_corner.gif (Wilber)
- /icons/gimp_kobus.gif (button "GIMP")
- /icons/art_corner.gif (Wilber + paintbrush)

Other pages contain direct links to some GIMP screenshots:
- /icons/frontpage-small.gif (the splash screen)
- /icons/gimp_in_action1.jpeg (large desktop screenshot)
- /icons/quartic_gimp.jpg ("THE GIMP" by Quartic)
- /~tml/gimp/win32/s5.jpg (Windows screenshot)

All of these should be preserved, IMHO.
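Excluding search engines and other known hosts from the referer statistics could be done with a small filter like the one sketched below. This is only an illustration in Python, not how check-referers.pl does it; the function name `is_excluded` and the domain list in the usage note are assumptions.

```python
from urllib.parse import urlsplit


def is_excluded(referer, excluded_domains):
    """Return True if the referer's host is an excluded domain or a subdomain of one."""
    try:
        host = urlsplit(referer).hostname
    except ValueError:
        return False  # malformed referer: nothing to exclude
    if host is None:
        return False  # "-" or empty referer
    return any(
        host == domain or host.endswith("." + domain)
        for domain in excluded_domains
    )
```

For example, `is_excluded(referer, ("google.com", "saunalahti.fi"))` could be checked before a request is counted as an external link. Matching on a leading dot avoids treating an unrelated host such as notgoogle.com as a Google host.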
Any file that gets more than 10 requests per day directly from some other sites is a good candidate for preservation or redirection.
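As a rough sketch, this rule can be written as a filter over the per-file counts of external requests. The function name `preservation_candidates` is made up for this example, and the threshold of 10 is simply the figure suggested above.

```python
def preservation_candidates(external_hits, threshold=10):
    """Return (path, hits) pairs with more than `threshold` external hits.

    `external_hits` maps a requested path to the number of requests per day
    that came from other sites. The result is sorted by hit count, highest first.
    """
    selected = [
        (path, hits) for path, hits in external_hits.items() if hits > threshold
    ]
    return sorted(selected, key=lambda item: (-item[1], item[0]))
```

With the counts from the one-day log above, /download.html (270) and /tut-patt1.html (11) would be selected, while /devel_ver.html (10) would fall just below the cut.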
> But there are also many sites that use images directly from w.g.o.:

In this case it would be incredibly generous of you to maintain these links. I don't think it would be unreasonable to break the direct links to icons. I would recommend replacing direct links to images on the GIMP site with a message politely telling people which images they may copy. Not only is this the polite thing to do (rather than leeching bandwidth), it should also result in these images appearing on their page more quickly in most cases. For direct links to screenshots, a 301 permanent redirect would be ideal for search engines, and would allow people to find what they are looking for but discourage people from embedding (using <img>) instead of mirroring/caching/copying (with permission) large files such as screenshots.
Changes at the request of Dave Neary on the developer mailing list. I am changing many of the bugzilla reports that have not specified a target milestone to Future milestone. Hope that is acceptable.
Basically, all web documents (HTML pages) should have a text saying that they are expired and that the new content may be on page so and so. The problem with redirects is that the referring page may raise some expectation with the visitor that the old page fulfilled and the new one doesn't. By explaining to the visitor what happened, the visitor won't get disappointed as when we redirect to a page with the wrong information. After some time, we could replace those messages with redirects.
One of the main problems is the images, especially the buttons and icons. Although any site that links directly to a file on some other site does it at its own risk (and sometimes against the will of the site owner), we have never discouraged other sites from borrowing the button "Made with GIMP" or the Wilber icon. As a result, there are several dozen sites using these images directly from www.gimp.org and they are referenced several hundred times per day. This could be solved by doing two things:
- Copy a part of the /icons/ directory to the new site (i.e., commit them to CVS so that they are available as real files and not through redirects). This part would include the buttons and maybe the most requested screenshots.
- Add a new section to the site, titled "Linking to us" or "GIMP icons and buttons" (better suggestions are welcome, but not "GIMP images" as this would be confusing). This page would provide the same images in PNG or JPEG format (not GIF - see bug #70221) and encourage people to copy them to their own site.
After a while, we could remove the old image files from CVS. That could be near the end of 2004 or maybe 2005.

P.S.: I created two new HTMLized lists of referers for today:
http://www.gimp.org/~raphael/wgo-referers2.html
http://www.gimp.org/~raphael/wgo-referers-nosearch2.html
I will try to come up with a list of files that should be redirected. In the meantime, here is the latest list of referers for today: http://www.gimp.org/~raphael/wgo-referers3.html http://www.gimp.org/~raphael/wgo-referers-nosearch3.html The second version of the list excludes the search engines.
I have added some of the old images to the web site, together with better versions of these images (in PNG format) and a page describing how other sites can link to us. This fixes a part of the problem. I will soon include a list of redirect statements that should solve the remaining parts of this bug.

2003-09-21  Raphael Quinet  <quinet@gamers.org>

	[...]
	* about/linking.htrw: New page describing how other sites can
	link to us, and providing some buttons and images that can be
	freely copied (see bug #116523 for details).

	* images/wilber_the_gimp.png
	* images/wilber_the_gimp2.png
	* images/wilber_the_gimp_idx.png
	* images/wilber_painter.png
	* images/wilber_painter_idx.png
	* images/wilber_work.png
	* images/wilber_wizard.png: Several PNG images of Wilber (based on
	the Wilber Construction Kit) that can be freely copied and used by
	other sites.

	* images/gfx_by_gimp.png
	* images/gimp_free_button.png: PNG versions of old GIF images,
	suitable for usage by other sites linking to us.

	* images/old/wilber_the_gimp.gif
	* images/old/wilber_painter.gif
	* images/old/frontpage.gif
	* images/old/frontpage.png
	* images/old/frontpage_idx.png
	* images/old/gfx_by_gimp.gif
	* images/old/gimp_free_button.gif: Old GIF images that were
	directly linked by a large number of other sites (bug #116523).

	* images/old/README: Some explanations for those who get there.

	* images/external_404.png: New image to which old images should be
	redirected so that other sites linking to us see immediately that
	some things have changed.
Here is a list of pages and images that should be redirected, based on the most frequent external referers. I hope that the formatting will not be broken because the lines are longer than what the form can take...

# Web pages that have moved (bug #116523)
Redirect permanent /the_gimp.html http://www.gimp.org/
Redirect permanent /docs.html http://www.gimp.org/docs/
Redirect permanent /the_gimp_screenshots.html http://www.gimp.org/screenshots/
Redirect permanent /download.html http://www.gimp.org/downloads/
Redirect seeother /stable_ver.html http://www.gimp.org/downloads/
Redirect seeother /devel_ver.html http://www.gimp.org/downloads/
Redirect seeother /stable_src.html http://www.gimp.org/source/
Redirect seeother /devel_src.html http://www.gimp.org/source/
Redirect permanent /tutorials.html http://www.gimp.org/tutorials/
Redirect permanent /links.html http://www.gimp.org/links/
Redirect permanent /the_gimp_about.html http://www.gimp.org/about/
Redirect seeother /fonts.html http://www.gimp.org/unix/

# Images that have moved (and have been replaced by PNG)
Redirect permanent /icons/gfx_by_gimp.gif http://www.gimp.org/images/old/gfx_by_gimp.gif
Redirect permanent /icons/the_gimp_corner.gif http://www.gimp.org/images/old/wilber_the_gimp.gif
Redirect permanent /icons/art_corner.gif http://www.gimp.org/images/old/wilber_painter.gif
Redirect permanent /icons/gimp_kobus.gif http://www.gimp.org/images/old/gimp_free_button.gif
Redirect permanent /icons/frontpage-small.gif http://www.gimp.org/images/old/frontpage_idx.png

# Images that are gone and should be replaced by a message to the visitor
Redirect seeother /icons/gimp_in_action1.jpeg http://www.gimp.org/images/external_404.png
Redirect seeother /icons/sit3-shine.7.gif http://www.gimp.org/images/external_404.png
Redirect seeother /icons/the_gimp_head.gif http://www.gimp.org/images/external_404.png
Redirect seeother /icons/fish.jpg http://www.gimp.org/images/external_404.png

# The other images in /icons/ should generate a 404 error.
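To sanity-check a list like this before it goes into the server configuration, one could parse the directives and look up what a given old URL would return. The sketch below is an assumption-laden simplification in Python: the helper names `parse_redirects` and `resolve` are made up, only the status keywords `permanent` (301), `temp` (302) and `seeother` (303) are handled, and paths are matched exactly, whereas Apache's `Redirect` directive also matches longer paths that begin with the given path.

```python
# Status keywords accepted by the Redirect directive and their HTTP codes.
STATUS_CODES = {"permanent": 301, "temp": 302, "seeother": 303}


def parse_redirects(config_text):
    """Parse 'Redirect <status> <old-path> <new-url>' lines into a dict."""
    rules = {}
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split()
        if len(parts) != 4 or parts[0] != "Redirect":
            continue  # not a directive this sketch understands
        _, status, old_path, target = parts
        if status in STATUS_CODES:
            rules[old_path] = (STATUS_CODES[status], target)
    return rules


def resolve(rules, path):
    """Return (status, target) for an exact path match, else (404, None)."""
    return rules.get(path, (404, None))
```

Feeding the list above to `parse_redirects` and then calling `resolve(rules, "/download.html")` shows at a glance which old pages get a permanent redirect, which get a "see other" response, and which fall through to a 404.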
Argh! This is completely broken. I have sent the list to Yosh by mail. It should work better that way. Once the list of redirects is added to httpd.conf, this bug can be closed.
I have recently created a page about the UNIX fonts, so the redirection for the fonts page can be updated:

Redirect permanent /fonts.html http://www.gimp.org/unix/fonts.html

Also, it would be better to use temporary redirections instead of "seeother" for the pages about the stable and development versions, because it is possible that such pages would be created later in the new site. That would be nicer for the caches and proxies:

Redirect temp /stable_ver.html http://www.gimp.org/downloads/
Redirect temp /devel_ver.html http://www.gimp.org/downloads/
Redirect temp /stable_src.html http://www.gimp.org/source/
Redirect temp /devel_src.html http://www.gimp.org/source/

There are no other important bugs remaining for the new site, so let's hope that it can go live very soon... Please! ;-)
Created attachment 22717 [details] Updated list of redirects as attachment (easier to use than copying from the comments)
Changing all www.gimp.org bugs from gimp product to the gimp-web product, including old closed/fixed bugs, and reassigning.
Marking as blocker for the new site, as suggested by Shawn on the gimp-web list. I think that the list of redirects provided above in the attachment can be integrated easily into the config file for the virtual host (currently /etc/apache/virtual/apache/gimp/mmmaybe.gimp.org). By the way, I think that the name of the virtual host config file, log files and DocumentRoot should say www.gimp.org instead of mmmaybe, even if the ServerName is still mmmaybe.gimp.org for the moment.
We need to add HTTP redirects to the new site.
Given that the comments in this report are quite old, there should be a new analysis of the logs. Does the current rate of 404 errors justify any effort in redirecting?
this definitely isn't a blocker. as we are missing an updated log analysis here, i am going to set the status to NEEDINFO (also see comment #21), please anybody REOPEN when there are some news on this - raphael, perhaps?
This makes no sense since GIMP has got a new website.