Bug 74288 - Impossible to search mail.gnome.org/archives meaningfully
Status: RESOLVED FIXED
Product: website
Classification: Infrastructure
Component: mail.gnome.org
Version: current
Hardware: Other All
Importance: Normal major
Target Milestone: ---
Assigned To: Ross Golder
QA Contact: GNOME Web maintainers
Duplicates: 330578 340975 466278
Depends on:
Blocks:
 
 
Reported: 2002-03-11 23:46 UTC by Charles Kerr
Modified: 2007-10-15 21:19 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Charles Kerr 2002-03-11 23:46:49 UTC
http://mail.gnome.org/archives/ has a search form at the
top of the page.  Unfortunately when you search for something
the result page notes that the search tables were
"Last modified:  2000-08-06" so the search is useless.

Sorry if www isn't the right component for this bug; there's
not a component for mail.gnome.org.
Comment 1 Jeff Waugh 2002-04-20 19:00:11 UTC
Assigning to myself, I've done a lot of namazu stuff for other groups,
so I'll have a look at ours.
Comment 2 Murray Cumming 2003-07-25 09:34:40 UTC
This is still broken.
Comment 3 Jeff Waugh 2003-07-25 14:37:11 UTC
And the bug is still open. Surprised? :-)
Comment 4 George Karabin 2003-07-25 15:35:48 UTC
Interestingly, the "Last modified:" date has changed to "2003-02-18",
so compared to the original bug report, the search database is about
14 months less broken than it used to be.

2002-3-11 minus 2000-8-06 = ~ 1 year, 7 months
2003-7-25 minus 2003-2-18 = ~ 5 months

How's that for a useless statistic? :)
Comment 5 Charles Kerr 2003-07-25 20:46:08 UTC
So, in 16 months real time, the search engine has added 30 archived
months to search from.  That's an improvement of 1.875 archive months
per real-time month.

At this rate it should converge in September, and we can mark 
this as `FIXED'. :)
Comment 6 Jeff Waugh 2003-07-26 00:30:34 UTC
Charles: ha ha ha. ;-)
Comment 7 Jens Bech Madsen 2004-01-03 09:55:11 UTC
This is a serious problem. It's impossible to find anything. No wonder
people keep asking the same questions over and over again. As an
example, try searching for libcroco. It gives no results at all. I
know libcroco has been mentioned a lot, at least on the garnome
mailing list.

And judging from this bug report, it has been broken for almost two years...

I looked through the cvs repository but couldn't find anything related
to this, so it's hard to make a guess at what is wrong.
Comment 8 Chris Deigan 2004-04-24 12:54:12 UTC
SLUG (http://slug.org.au/) recently switched over to htdig from namazu since it
was really not working any more. We have a cron job every morning that indexes
all the list archives, and it works quite well. Maybe this might work well for
GNOME?
Comment 9 Quim Gil 2005-12-19 22:34:35 UTC
Now "Last modified:  2000-08-06"

The results are more recent though; the most recent are from at least Feb 2004.

It's been a while since this bug was filed...
Comment 10 Quim Gil 2005-12-20 09:11:55 UTC
I have contacted the tech guys behind Mailman to see if we find a solution to this ancient bug.
Comment 11 Quim Gil 2005-12-26 15:26:19 UTC
Moving to the new mail.gnome.org component.
Comment 12 Quim Gil 2006-01-08 23:44:26 UTC
GNOME sysadmins are not answering. Namazu upstream contacted:

http://www.namazu.org/trac-namazu/trac.cgi/ticket/9

I guess this is more a problem of our installation than a Namazu bug but who knows. In any case, maybe they can help.
Comment 13 Quim Gil 2006-01-10 16:50:48 UTC
Just found some comments on this issue in http://live.gnome.org/MailingLists

---------

The namazu scripts are now in sysadmin CVS in the 'namazu' module, but we're currently missing a script to keep the indexes in sync regularly. At time of writing, the indexes were last updated in August 2005.

Plans for improvement

Get namazu hooked up to scan for recent changes (perhaps every 3-4hrs, do 'find' in the archives and feed that into namazu to update the indexes).

----------------

http://live.gnome.org/SysadminToDoList

Archive search indexes not being maintained

It looks like when I set up namazu back in April 2004, I ran the initial indexing script, but never hooked up anything to keep it indexed. I've checked the scripts that I found in ~mailman/namazu into 'namazu' module in sysadmin CVS, but they only seem to handle generating the initial indexes. As yet there doesn't appear to be a script capable of periodically checking for recent content, and indexing just that. I have re-run the index generation (so all content up to 21-Aug-2005 should get indexed), and I have started a script for this, and will check it into namazu module when I've had a chance to test it. Until then, the indexes will likely remain static. 
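
A minimal sketch of the "check for recent content" step described above, assuming the archives are plain files on disk and that the time of the last indexing run is recorded somewhere. The function name `files_modified_since` is hypothetical and is not part of the scripts in sysadmin CVS.

```python
from pathlib import Path


def files_modified_since(root, since):
    """Return files under `root` modified after `since` (epoch seconds), sorted by path."""
    root = Path(root)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.stat().st_mtime > since
    )
```

The resulting list could then be handed to the indexer, so only new or changed messages are processed on each run.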
Comment 14 Ross Golder 2006-01-12 02:21:43 UTC
Sorry, I closed the wrong bug window :(
Comment 15 Quim Gil 2006-01-12 17:00:05 UTC
We got a worksforme upstream...  :(

http://www.namazu.org/trac-namazu/trac.cgi/ticket/9

Ross, maybe you could provide more details? I'll be happy to pursue the Namazu people to get help if we need it - I only need to know what to tell them.  :)
Comment 16 Ross Golder 2006-01-13 02:10:50 UTC
It's been a while since I looked at the problem, but if I recall, I couldn't find an example anywhere of how to use namazu to only index content added/changed since the last run (or how to remove content that had been removed). It's probably not a particularly difficult script to write, I just didn't have time at that point and couldn't find an example anywhere from someone who had done something similar before. Anyway, I haven't got back round to it since.
Comment 17 Quim Gil 2006-01-13 11:46:25 UTC
Opened a new request at Namazu's bugtracker:

http://www.namazu.org/trac-namazu/trac.cgi/ticket/12

I'll report any progress made there in this bug.
Comment 18 Quim Gil 2006-02-10 06:18:07 UTC
*** Bug 330578 has been marked as a duplicate of this bug. ***
Comment 19 Quim Gil 2006-02-10 06:21:28 UTC
Hi Ross, we got an answer from http://www.namazu.org/trac-namazu/trac.cgi/ticket/12



Mon Feb 6 14:55:59 2006: Modified by ot@zoy.org

    * resolution set to worksforme
    * status changed from new to closed

By default, mknmz (the indexing script for namazu) will do just that: add recently added documents to the index, update the ones that have changed, and delete the ones that were removed.

just run:

mknmz /path/to/the/directory-to-index -O /where/the/index/resides/

(update the paths as needed)
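
A rough sketch of how that command could be wrapped for a nightly cron run, assuming the output-directory flag is mknmz's `-O` (`--output-dir`) option. The function names and the `dry_run` switch are hypothetical, not part of genindex.sh or of Namazu itself.

```python
import shlex
import subprocess


def mknmz_command(source_dir, index_dir):
    """Build the mknmz invocation quoted above as an argument list."""
    return ["mknmz", str(source_dir), "-O", str(index_dir)]


def reindex(source_dir, index_dir, dry_run=True):
    """Run mknmz, or with dry_run=True only return the command line as a string."""
    cmd = mknmz_command(source_dir, index_dir)
    if not dry_run:
        # check=True raises CalledProcessError if mknmz exits non-zero,
        # so a failed run is visible to cron instead of passing silently.
        subprocess.run(cmd, check=True)
    return shlex.join(cmd)
```

Calling `reindex(...)` with `dry_run=True` first makes it possible to inspect the exact command before letting cron run it.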
Comment 20 Ross Golder 2006-02-11 05:57:40 UTC
I've added the 'genindex.sh' script to cron, to run daily. I've run it once manually too. It now seems to have content from January 2006, as you can see by searching for 'Jan 2006', but only for a few lists (perhaps it died when I logged off). Not sure this is closed yet. Perhaps tonight's re-indexing will complete.
Comment 21 Quim Gil 2006-02-14 08:31:59 UTC
Apparently the index has improved but still is not 100% functional. Some lists such as gnome-infrastructure seem to be (fully?) indexed. Others like gnome-web-list are clearly not fully indexed.

Maybe every day the script is incorporating some more messages?

Anyway, hopefully the fix of this bug is approaching...
Comment 22 Ross Golder 2006-02-14 08:37:34 UTC
Yep, I spotted a cron mail the other day suggesting the nightly re-indexing is failing. IIRC, it looks like a bug in the script - probably an easy fix, though. I'll look into it shortly.
Comment 23 Quim Gil 2006-03-16 10:37:50 UTC
Just confirming the script is not working properly yet, since there are still many lists not fully indexed.
Comment 24 Teppo Turtiainen 2006-05-08 17:58:47 UTC
*** Bug 340975 has been marked as a duplicate of this bug. ***
Comment 25 morgan read 2006-05-24 03:57:06 UTC
As a simple work-around until this is fixed, perhaps someone could add a brief note to the search page to the effect: "use Google to search the lists with Google's 'site:' operator".  (It's what I did, and I imagine it's what everybody else is doing, given this bug has been open four years and the functionality is essential.)

PS Not working for NetworkManager list.
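
For illustration, a small sketch of how such a 'site:' query could be turned into a link, assuming Google's usual `https://www.google.com/search?q=` URL form. The function name `site_search_url` is hypothetical.

```python
from urllib.parse import urlencode


def site_search_url(terms, site="mail.gnome.org/archives"):
    """Build a Google search URL restricted to `site` with the 'site:' operator."""
    query = f"site:{site} {terms}"
    return "https://www.google.com/search?" + urlencode({"q": query})
```

For example, `site_search_url("libcroco")` produces a link that searches only the list archives for "libcroco".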
Comment 26 Behdad Esfahbod 2006-05-24 04:58:47 UTC
Or just replace it with a Google site-search box?
Comment 27 morgan read 2006-05-24 05:25:56 UTC
Well, I didn't want to be quite that presumptuous - but, now you mention it...:)
Comment 28 Ross Golder 2006-07-08 02:41:30 UTC
I started *another* re-indexing a couple of days ago which is still running. Hopefully that should bring us up to now. However, we still need to find and hook up the script that keeps the indexes up-to-date as new mail is archived.

I'm trying to keep notes on http://live.gnome.org/MailingLists about the setup and any problems so once it's fixed it stays fixed (or is easily fixable).
Comment 29 Quim Gil 2006-10-01 22:07:54 UTC
I've just realized that the search engine is more broken than I thought:

1. Go to http://mail.gnome.org/archives/

2. Type "drupal" in the search string field and click Search!

The result with the highest score, at the top of the list, is http://mail.gnome.org/archives/evolution-patches/2004-December/msg00158.html

It doesn't contain the string "drupal".

I've tried with other words and it is pretty easy to get wrong results.


Seriously: in all these years it has been impossible to search mail.gnome.org/archives meaningfully. In the meantime people needing to do so have been using Google's advanced search features. I think it's time to stop dreaming that one day we will have Namazu working and put an adapted Google search instead. 
Comment 30 Alex Lancaster 2007-10-09 10:42:54 UTC
*** Bug 466278 has been marked as a duplicate of this bug. ***
Comment 31 Frederic Peters 2007-10-15 14:37:12 UTC
bkor, I didn't touch other pages, and I am not totally sure about the hq parameter, so I won't close this bug now.  Could you update the site so it can be tested?

2007-10-15  Frederic Peters  <fpeters@0d.be>

        * css/layout.css:
        * index.html: added Google Site search to search in archives.