At 01:24 PM 09/06/02 -0700, Bill Paxton wrote:
>Are there some pre-done htdig modifications out there?
>I checked contrib but nothing I could find. Is there
>something better than htdig?
I'm one of the developers of swish-e (http://swish-e.org). I've used it for indexing hypermail archives -- there's a Perl script in the swish-e distribution that I have used for parsing the metadata from the hypermail HTML messages.
The downside is that swish doesn't do incremental indexing, so for a very high volume list it might be a problem. On the other hand, swish is so damn fast[1] at indexing that for most applications you don't need incremental indexing. If your messages are not coming in every second or so, then you can typically figure out a way to build an index quickly (e.g. have a master index created once a day, run indexing on just the day's new messages every minute or so, and search both indexes at the same time).
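Here is a rough Python sketch of that master-plus-daily scheme, to make the idea concrete. It is only an illustration: the function names, the `*.html` file pattern, and the index file names are made up for the example, and the swish-e switches (`-i` for input files, `-f` for index files, `-w` for the query) should be checked against the docs for your version. The functions only build the command lines; they don't run anything.

```python
from pathlib import Path


def split_messages(archive_dir, master_built_at):
    """Split message files into (old, new) relative to the master build time.

    master_built_at is a Unix timestamp; files modified after it are "new".
    The *.html pattern is an assumption about how the archive is laid out.
    """
    old, new = [], []
    for path in sorted(Path(archive_dir).glob("*.html")):
        if path.stat().st_mtime > master_built_at:
            new.append(path)
        else:
            old.append(path)
    return old, new


def index_command(index_file, files):
    """Build an argv list that indexes the given files into index_file.

    Assumed switches: -f names the index file, -i lists the input files.
    """
    return ["swish-e", "-f", str(index_file), "-i"] + [str(f) for f in files]


def search_command(query, index_files):
    """Build an argv list that searches several indexes in one call.

    Assumed switches: -w gives the query, -f takes one or more index files.
    """
    return ["swish-e", "-w", query, "-f"] + [str(i) for i in index_files]
```

The idea would be to run `index_command("master.index", old + new)` once a day from cron, run `index_command("today.index", new)` every minute or so, and have the search script call `search_command(query, ["master.index", "today.index"])`. For a big archive you would probably point `-i` at a directory or use a config file rather than listing every file on the command line.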
The swish-e list is a hypermail archive and it's searchable at
http://swish-e.org/Discussion/search/swish.cgi
You could probably come up with a better looking interface.
[1] Fast is subjective, of course. On my Athlon I can index 100,000 2K text files in about three minutes. YMMV.
-- Bill Moseley mailto:moseley_at_hank.org
Received on Fri 20 Sep 2002 12:46:44 AM GMT
This archive was generated by hypermail 2.2.0 : Thu 22 Feb 2007 07:33:54 PM GMT