Re: [Fwd: Ideas]

From: Byron C. Darrah <bdarr_at_sse.FU.HAC.COM_at_hypermail-project.org>
Date: Wed, 6 May 1998 10:53:04 -0700 (PDT)
Message-Id: <199805061753.KAA06244_at_pepperoni.pizza.hac.com>

> Date: Wed, 06 May 1998 07:52:19 +0200
> From: Daniel Stenberg <Daniel.Stenberg_at_sth.frontec.se>
>
> Ok, I accidentally mailed this to Mr Darrah privately only, here it comes to
> the list...
>
> Byron C. Darrah wrote:
>
> > Okay, second comment now. This is regarding the suggestion someone offered
> > in a reply about putting all of the attachments for all the messages in a
> > "bins" directory.
>
> Oh well, the current method saves the binary attachments in the same directory
> as the HTML files, with a random name prefixed with 'bin'. Of course there may
> be reasons to change that.

Cool -- whatever works. Still, it might be even nicer if all attachments were grouped together in a subdirectory and named so that humans and hypermail alike could find them. It would be such a small amount of effort to do it like that.

> > I suggest that there should be a subdirectory for *each* message that has
> > attachments.
>
> [picture cut out]
>
> > This way, all of the attachments for a particular message are grouped
> > together, which makes them easy to find, and easy to delete when one
> > wishes to remove message 0042 from the archive.
>
> Excuse me, but why would you want to remove message 0042? You'd ruin a bunch
> of links and generally mess up. It would be much better to remove that
> particular mail from the mbox-file, remove all hypermail-generated stuff and
> then regenerate the entire archive...

Okay, I see that example wasn't the best I could have offered. Perhaps look at it this way: if it's easier for a person to inspect and manipulate a hypermail archive, then it's also going to be easier to manage the archive, and easier to develop new capabilities.

For example, suppose as an administrator of a hypermail list, I notice that one of the messages contains a program that has a security bug, and I know there's a fixed version available. I would know right where to look to find the attachment that needs to be removed or replaced.

Or how about this one: people sometimes send the same attachment more than once (don't 'cha just hate it when a bunch of separate lusers each send a copy of IE4 to your hypermail archives? :-). I could use a one-line shell script to find all of the duplicate attachments in all of my hypermail archives and do something about them, like maybe replace all but one copy with a symbolic link.

Those are just two examples... I don't want to go on with imaginary scenarios, so I'll just leave it at this: a more logical and intuitive archive directory structure can't hurt.

> > It also lets the
> > attachments keep their original file names, if they are given in their
> > MIME headers (and they usually are).
>
> I thought about this too, and I may go back to trying this method too,
> but...
>
> 1. The file name used in the mail doesn't necessarily have to be usable or even
> preferable to use. E.g. if it contains '/' on a unix machine, if it contains
> whitespace, or if it contains '\' on an MS-OS machine. There would have to
> be a very complicated is_this_a_sane_filename() function to verify that.

I think it would actually be very easy to solve this problem. It would be reasonable to take the approach that there is a list of characters that are allowed in filenames, and all other characters are forbidden. There could be a line in the hypermail.h or config.h file that looks like this:

#define FILENAME_CHARACTERS \
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz .-+_0123456789"

Thus, users who see a need to change this set can do it at compile time, though they probably won't need to. Then, an "is_this_a_sane_filename()" function could be written in three lines of code:

   /* i must be a signed int; returns 0 as soon as a disallowed character is found */
   for(i = strlen(filename) - 1; i >= 0; i--)
      if(!strchr(FILENAME_CHARACTERS, filename[i])) return(0);
   return(1);

But instead of an "is_this_a_sane_filename()" function, I would suggest something even easier: don't even bother checking whether the filename is sane, just run a filter function over the filename that replaces any disallowed characters with underscores.
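
A rough sketch of what such a filter might look like. The function name sanitize_filename() is made up for illustration and is not part of hypermail; the FILENAME_CHARACTERS set is the one proposed above:

```c
#include <string.h>

/* Assumed to be the same allowed set as the FILENAME_CHARACTERS
   definition suggested for hypermail.h or config.h. */
#define FILENAME_CHARACTERS \
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz .-+_0123456789"

/* Hypothetical helper: overwrite every character that is not in the
   allowed set with an underscore. The string is modified in place and
   the same pointer is returned for convenience. */
char *sanitize_filename(char *filename)
{
    char *p;

    for (p = filename; *p != '\0'; p++) {
        if (strchr(FILENAME_CHARACTERS, *p) == NULL)
            *p = '_';
    }
    return filename;
}
```

With this, a name like "my file/name\v1.txt" would come out as "my file_name_v1.txt". Note that the sketch does nothing special about empty names or names such as "." and "..", so a real version would probably want to handle those too.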

> 2. Nothing prevents the same mail to contain several attachments that use the
> same file name although with different contents.

True, but this is easy to work around, if one wishes to do so. For example, here are two easy suggestions: if an attachment is about to be saved with the same name as an existing attachment, prepend a number to it. Or this: keep each attachment in its own subdirectory. (At 512 bytes per subdirectory, compared to probably a lot more than that per attachment, one subdirectory per attachment is not a waste.)
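
To make the first suggestion concrete, here is a minimal sketch. The names file_exists() and unique_attachment_path() are hypothetical, as is the "N_name" numbering scheme and the cap of 9999 attempts; none of this is existing hypermail code:

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 if the file can be opened for reading. */
int file_exists(const char *path)
{
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return 0;
    fclose(fp);
    return 1;
}

/* Hypothetical helper: writes "dir/name" into out, or "dir/N_name" for
   the smallest N >= 1 that is not already taken. Returns 0 on success,
   or -1 if the buffer is too small or no free name was found. */
int unique_attachment_path(const char *dir, const char *name,
                           char *out, size_t outsize)
{
    int n, len;

    len = snprintf(out, outsize, "%s/%s", dir, name);
    if (len < 0 || (size_t)len >= outsize)
        return -1;
    if (!file_exists(out))
        return 0;

    for (n = 1; n < 10000; n++) {
        len = snprintf(out, outsize, "%s/%d_%s", dir, n, name);
        if (len < 0 || (size_t)len >= outsize)
            return -1;
        if (!file_exists(out))
            return 0;
    }
    return -1;
}
```

There is a small window between checking for the name and actually creating the file, which probably doesn't matter when a single hypermail process is writing the archive, but would need more care otherwise.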

Thanks for reading this,
--Byron Darrah

Received on Wed 06 May 1998 07:56:13 PM GMT

This archive was generated by hypermail 2.2.0 : Thu 22 Feb 2007 07:33:49 PM GMT