When hypermail maintenance stopped, most people here switched over to MHonArc, because it was a going concern. Has anyone compared hypermail and MHonArc features? Is one better than the other for particular tasks, e.g. more efficient for large folders, or better suited to large projects?
Personally, I find most mail archives useless without an accompanying search interface. Is there anything that can be done to make that a natural part of the parsing subsystem? (Today, a search engine is typically applied to the generated HTML, with some loss in document filtering, e.g. bad search hits on next/previous headers and footers rather than on the actual message or attachments.)
Since computation is equally well supported on the client or the server side of the browser/HTTP web environment, I'd like to get more powerful access to mail archives with traditional database capabilities. Hypermail already provides preformatted indexes, linked records, and formatted metadata in a fairly static form. Why not allow some dynamic presentation of those data structures, e.g. a folded outline of the threaded index in an applet viewer?
http://java.sun.com/docs/books/tutorial/ui/swing/tree.html
http://java.sun.com/products/jfc/swingdoc-archive/jtable.html
http://java.sun.com/products/javahelp/features.html
>
> Writing it in C gives a few advantages:
> - For a speed-intensive task like this, nothing else
> gives you the control needed (don't get me wrong - Python is
> cool, just not something I'd personally choose to use here)
Performance requirements should address
> - We can mmap() in the mailbox file and speed up parsing a lot.
We can unwind the loops in assembly language :-)
> - Other cool speedups (writev() for writing out the messages,
> for example) that require low-level manipulations.
> - There's probably more, I'm just not thinking of it right now.
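To make the mmap() suggestion concrete, here is a minimal sketch of what I understand by it: map the mailbox read-only and scan it in place for message separators. The function names are hypothetical (nothing here is taken from the hypermail source), and the separator test is simplified to "a line starting with 'From '", ignoring the blank-line rule that stricter mbox readers apply.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Count mbox messages in a memory buffer. Simplifying assumption:
 * every line that begins with "From " starts a new message. */
size_t count_mbox_messages(const char *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    size_t count = 0;
    const char *p = buf;
    const char *end = buf + len;

    while (p < end) {
        if ((size_t)(end - p) >= 5 && memcmp(p, "From ", 5) == 0)
            count++;
        /* Skip to the start of the next line, if there is one. */
        const char *nl = memchr(p, '\n', (size_t)(end - p));
        if (nl == NULL)
            break;
        p = nl + 1;
    }
    return count;
}

/* Map a mailbox file read-only and count its messages.
 * Returns -1 on error. */
long count_messages_in_file(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {          /* mmap() rejects zero-length mappings */
        close(fd);
        return 0;
    }

    size_t len = (size_t)st.st_size;
    char *map = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping stays valid after close() */
    if (map == MAP_FAILED)
        return -1;

    long n = (long)count_mbox_messages(map, len);
    munmap(map, len);
    return n;
}
```

Keeping the scanner separate from the mapping code means the same parser can run over a buffer filled by read() on platforms without mmap(), which matters for the portability point below.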
Some attention should be paid to portability if you want hypermail widely deployed again. I think there should also be a goal of extensibility. The Apache module architecture made it possible for lots of different orthogonal contributions to be made to the basic web server architecture, because the request/reply transaction was decomposed into incremental stages where appropriate value-added capabilities could be plugged in. For example, I already mentioned the desire to contact an "external indexing agent" once a message is fully parsed. It would also be good to allow a per-attachment filter to be configured based on the type of the attachment (why not do the PDF-to-text conversion at archive filtering time?)
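As a rough illustration of the per-attachment filter idea, here is a sketch of a small hook table keyed by MIME type. Everything in it is an assumption for discussion: the names register_filter and run_filters, the callback signature, and the fixed-size table are hypothetical and do not come from hypermail or Apache. The byte-counting filter merely stands in for real work such as PDF-to-text conversion or a call out to an indexing agent.

```c
#include <assert.h>
#include <stddef.h>
#include <strings.h>   /* strcasecmp (POSIX) */

/* Hypothetical hook signature: receives the decoded attachment body
 * and a caller-supplied context; returns 0 on success. */
typedef int (*attachment_filter_fn)(const char *data, size_t len, void *ctx);

struct filter_entry {
    const char *mime_type;          /* e.g. "application/pdf" */
    attachment_filter_fn fn;
};

#define MAX_FILTERS 16
static struct filter_entry filters[MAX_FILTERS];
static size_t n_filters;

/* Register a filter for one MIME type. Returns -1 if the table is full. */
int register_filter(const char *mime_type, attachment_filter_fn fn)
{
    assert(mime_type != NULL && fn != NULL);
    if (n_filters == MAX_FILTERS)
        return -1;
    filters[n_filters].mime_type = mime_type;
    filters[n_filters].fn = fn;
    n_filters++;
    return 0;
}

/* Run every filter registered for this MIME type, in registration order.
 * Returns the number of filters that ran, or -1 if one of them failed. */
int run_filters(const char *mime_type, const char *data, size_t len, void *ctx)
{
    int ran = 0;
    for (size_t i = 0; i < n_filters; i++) {
        /* MIME type names are case-insensitive. */
        if (strcasecmp(filters[i].mime_type, mime_type) != 0)
            continue;
        if (filters[i].fn(data, len, ctx) != 0)
            return -1;
        ran++;
    }
    return ran;
}

/* Stand-in filter: adds the attachment size to a size_t counter in ctx. */
int count_bytes_filter(const char *data, size_t len, void *ctx)
{
    (void)data;
    *(size_t *)ctx += len;
    return 0;
}
```

The parser would call run_filters() once per decoded attachment. A "message fully parsed" stage could be handled the same way, with a second table of hooks that take the whole message instead of one attachment.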
>
> I would suggest using 'glib' (a set of utility classes for C, i.e. sane
> string manipulation, hash tables, tree, linked list, etc.). It would give
> us much better memory management, among other things, as well as GString -
> the infinitely-long buffer :)
>
> Perhaps we must rewrite hypermail for sanity, though - existing code has
> no concept of "buffer overflow" no matter how hard you try ;-)
How much code are we talking about today? 100, 1000, 10000 lines of code?
Sorry for the long reply. My hope is that I'll be able to help contribute some Java and search engine experience in getting more out of the generated archive indexes and formatted messages, i.e. hood ornaments rather than new V8 engines.
Received on Fri 24 Apr 1998 11:51:45 PM GMT
This archive was generated by hypermail 2.3.0 : Sat 13 Mar 2010 03:46:10 AM GMT