Re: Duplicate message ids

From: Paul Haldane <Paul.Haldane_at_newcastle.ac.uk_at_hypermail-project.org>
Date: Wed, 5 May 1999 13:29:29 +0100 (GMT)
Message-ID: <Pine.GSO.3.95-960729.990505131904.25054A-100000_at_carr6.ncl.ac.uk>


On Wed, 5 May 1999, Daniel Stenberg wrote:

> On Tue, 4 May 1999, Paul Haldane wrote:
>
> > Any more thoughts on what (if anything) we should do with messages with
> > duplicate message ids?
> >
> > My inclination is to stick with what we do now - don't try to add the
> > duplicates to the web archive but put out a warning message so that a
> > human can fix things by hand.
>
> I think we could start with trying to think of reasons why this happens in
> the first place. How do you add several mails to the archive using the same
> Message-ID? Does it ever actually occur that two different mails have the
> same ID?

It does (see my previous messages). It _shouldn't_ happen but (because of broken mail systems) it does. In a previous existence I did some work on loop avoidance in a mailing list manager. One of the techniques I used was suppressing messages with the same msgid. I soon found that there were some (a few) systems out there that don't generate unique msgids. We made the decision to recognise those messages and skip the check.

We're talking about a small number of messages here (in the test mailboxes I'm using at the moment, perhaps 4-5 out of 1000) but obviously this depends on the MUAs in use by people sending the mail that hypermail is archiving. I think we should do something reasonable but not expend too much effort on it (at this point). I reckon reasonable in this case means replacing the duplicate msgid with a new one and not worrying about replies being attached to the wrong thread (after all it's not our fault - the user's MUA shouldn't be generating duplicates).

I'm all for doing something simple whilst making it clear that this is a kludge to accommodate broken clients.
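
To make the idea concrete, here is a rough sketch of the kind of replacement I have in mind. It is Python rather than hypermail's C, and the function name `dedupe_message_ids` and the `dupN.` prefix are made up for illustration; nothing here is existing hypermail code. The first message with a given id keeps it, and each later duplicate gets a synthetic id plus a warning on stderr.

```python
import sys


def dedupe_message_ids(message_ids):
    """Return the ids in order, rewriting each repeated id to a unique one.

    Hypothetical helper: the first occurrence of an id is kept as-is,
    later occurrences become <dupN.original-id>.
    """
    seen = set()
    result = []
    for msgid in message_ids:
        if msgid not in seen:
            # First time we see this id: keep it unchanged.
            seen.add(msgid)
            result.append(msgid)
            continue

        # Duplicate: build a replacement id that is not in use yet.
        inner = msgid.strip("<>")
        n = 1
        candidate = f"<dup{n}.{inner}>"
        while candidate in seen:
            n += 1
            candidate = f"<dup{n}.{inner}>"

        # Warn so that a human can still fix things by hand.
        print(
            f"warning: duplicate Message-ID {msgid} rewritten as {candidate}",
            file=sys.stderr,
        )
        seen.add(candidate)
        result.append(candidate)
    return result
```

Replies that reference the original id will still attach to the first message that carried it, which is the threading inaccuracy I'm suggesting we accept.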

Paul

> If it does, perhaps we should take the check a bit further and just
> make sure that if the subject and from lines are identical too we skip it. If
> they differ we can modify the ID so that it'll become different. I'm not really
> sure if we want this though, or how we could present this in the document for
> the humans. If there is a reply to the mail we change ID for, it'll end up
> looking like a reply to the first mail since that is then the only mail with
> that ID... Oh well, I guess we should just decide what behaviour we want and
> go with that. We can't possibly always do the right thing here.
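
For comparison, the stricter check suggested in the quoted text above could be sketched roughly as follows. This is only an illustration in Python: the function name `classify` and the dict keys `id`, `subject` and `from` are assumptions, not anything hypermail defines. A message whose id, subject and sender all match an archived message is treated as a true duplicate and skipped; one that shares only the id is flagged for an id rewrite.

```python
def classify(new, archived):
    """Decide what to do with a new message given those already archived.

    Hypothetical helper. Each message is a dict with 'id', 'subject'
    and 'from' keys. Returns "add", "skip" or "rewrite".
    """
    same_id = [m for m in archived if m["id"] == new["id"]]
    if not same_id:
        # The id is new to the archive: add the message normally.
        return "add"
    for m in same_id:
        if m["subject"] == new["subject"] and m["from"] == new["from"]:
            # Same id, subject and sender: assume the same mail, skip it.
            return "skip"
    # Same id but different subject or sender: keep it under a new id.
    return "rewrite"
```

A message classified as "rewrite" would then be archived under a fresh id, with the reply-threading caveat described in the quoted text.
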
Received on Wed 05 May 1999 02:54:38 PM GMT

This archive was generated by hypermail 2.3.0 : Sat 13 Mar 2010 03:46:11 AM GMT