Re: [hypermail] Why hypermail don't handle html in this message?

From: Daniel Stenberg <daniel_at_haxx.se_at_hypermail-project.org>
Date: Mon, 26 Nov 2001 13:54:36 +0100 (MET)
Message-ID: <Pine.GSO.4.40.0111261335290.12405-100000_at_pm1.contactor.se>


On Sun, 25 Nov 2001, Henry P. Mills wrote:

> I've experienced the problem in question before, at least a similar
> problem (Running Hypermail 2.1.3 on UNIX). In the past, I found certain
> email messages would choke the hypermail process from my inbox, shutting
> down all further output. It turns out that the culprit is certain
> multipart/alternative emails.

As I once wrote the original multipart/alternative code, I figure I can at least provide my view of the matter.

> Basically, I've found that if a multipart/alternative message does not
> contain in its below parts either "Content-type: text/plain" or
> "Content-type: text/html" to define the text there present, hypermail
> doesn't know how to proceed. Here's an example of a problem message from
> a listserv I run.

As you'll see below, I think this happens because the mail doesn't include the proper headers.

[large part cut off]

> > Content-Type: multipart/alternative;
> > boundary="----=_NextPart_000_0015_01C0B580.7174DF60"
> >
> > X-Priority: 3

[many headers cut off]

> > To: WPERCY-L_at_listserv.lsu.edu
> > Status:
> >
> > <x-html><!x-stuff-for-pete base="" src="" id="0"><!DOCTYPE HTML PUBLIC

This is a perfect example of a badly (or at least stupidly) formatted mail, if I'm not totally wrong.

We can all see the multipart/alternative header. It says that this mail will provide a series of different alternatives, each within its own boundary. The mail client (Hypermail in this case) is supposed to pick the last listed alternative that it can show (this is what the prefered_types config entry specifies).

To get that system working, each part has to specify its Content-Type so that the client knows which formats it has to select from. If a part doesn't provide a Content-Type, then what is the point of using the multipart/alternative kind of mail?
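
A minimal sketch of that selection rule, written in Python with the standard email package rather than in hypermail's own C. The sample message, the boundary "XYZ", the DISPLAYABLE set and the pick_alternative helper are all made up for illustration; DISPLAYABLE only stands in for whatever a preference setting would allow.

```python
from email import message_from_string

# Hypothetical set of types this "client" is able to show.
DISPLAYABLE = {"text/plain", "text/html"}

# Invented, well-formed sample: two alternatives, each with its own Content-Type.
RAW = """\
From: sender@example.com
To: list@example.com
Subject: demo
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="XYZ"

--XYZ
Content-Type: text/plain; charset="us-ascii"

Hello in plain text.
--XYZ
Content-Type: text/html; charset="us-ascii"

<p>Hello in HTML.</p>
--XYZ--
"""


def pick_alternative(msg, displayable=DISPLAYABLE):
    """Return the last listed part whose type we can show, or None."""
    if msg.get_content_type() != "multipart/alternative" or not msg.is_multipart():
        return None
    chosen = None
    for part in msg.get_payload():
        # Later parts override earlier ones, so the last match wins.
        if part.get_content_type() in displayable:
            chosen = part
    return chosen


msg = message_from_string(RAW)
part = pick_alternative(msg)
print(part.get_content_type())  # prints "text/html"
```

Restricting the set to {"text/plain"} makes the same function return the plain-text part instead, which is the kind of choice a preference list is meant to control.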

[ HTML mail example cut off]

> > <P>Paul Hunt</P><FONT face="Courier New"></FONT></BODY></HTML>
> >
> > </x-html>

>
> ***********End Sample***********

And there was no trailing boundary either? Well then that is one more formatting error.

> --First note that this email was sent as HTML (by Microsoft Outlook
> Express 5.0).

Yes, but without saying so.

> --Second, note that the multipart/alternative specification is listed in
> the header correctly (I have added artificial spacing around it to
> demonstrate).

I agree it is.

> However, as you scroll down to the message body, there is no additional
> Content-type specification defining the following text as HTML (other
> than some type of reference in the META content tags).

Yes, but that is in the contents of the mail body, not in the meta data.

> Because the additional expression "Content-type: text/html" is not
> present to define the content that follows, hypermail freezes. (See the
> HTML email at the bottom of my email for a well-formed message that does
> contain correct and complete specification and which does not hang
> hypermail).

I don't think it is the actual lack of Content-type that makes hypermail freeze. I think it is the lack of an "initial" boundary string that should signal the start of the first alternative.
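
To make the missing-boundary case concrete, here is a small Python sketch. It shows how Python's standard email parser reports this condition, not what hypermail itself does. The two sample messages, the boundary "XYZ" and the missing_start_boundary helper are invented for the example.

```python
from email import errors, message_from_string

# Invented sample: declares multipart/alternative but never opens a part.
BROKEN = """\
From: sender@example.com
Subject: broken demo
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="XYZ"

<html><body><p>HTML body with no boundary line before it.</p></body></html>
"""

# Invented sample: same header, but with opening and closing boundary lines.
GOOD = """\
From: sender@example.com
Subject: good demo
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="XYZ"

--XYZ
Content-Type: text/html; charset="us-ascii"

<html><body><p>HTML body inside a proper part.</p></body></html>
--XYZ--
"""


def missing_start_boundary(raw):
    """True if the parser never saw the opening boundary line."""
    msg = message_from_string(raw)
    return any(
        isinstance(defect, errors.StartBoundaryNotFoundDefect)
        for defect in msg.defects
    )


print(missing_start_boundary(BROKEN))
print(missing_start_boundary(GOOD))
```

In the broken case the parser keeps the whole body as one string instead of a list of parts, so code that expects to iterate over alternatives has nothing to iterate over.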

> Why does this problem happen? Some email clients apparently give the
> user the option of sending emails as HTML only or as HTML & PLAIN/TEXT at
> the same time.

That might be the reason, but I think there's someone along the way that breaks this.

> How to correct the problem?

First I think we need to figure out who's causing this to happen. If it truly is the mail programs you mention, then we probably need to patch hypermail to work around the flawed mails in the style you've shown.
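
One possible shape for such a workaround, sketched in Python purely as an illustration and not as a hypermail patch: when a message claims to be multipart but has no opening boundary, treat the undivided body as a single part and guess its type. The effective_body helper, the sample message and the crude HTML check are assumptions made for this sketch.

```python
from email import errors, message_from_string

# Invented sample of a flawed mail: multipart header, no boundary lines at all.
FLAWED = """\
From: sender@example.com
Subject: flawed demo
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="XYZ"

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html><body><p>Body without any boundary.</p></body></html>
"""


def effective_body(raw):
    """Return (content_type, text) to render, tolerating a missing boundary."""
    msg = message_from_string(raw)
    no_start = any(
        isinstance(defect, errors.StartBoundaryNotFoundDefect)
        for defect in msg.defects
    )
    if no_start:
        body = msg.get_payload()  # a plain string when no part was ever opened
        lowered = body.lower()
        # Crude guess at the type; an assumption made for this sketch only.
        looks_html = "<html" in lowered or "<!doctype html" in lowered
        return ("text/html" if looks_html else "text/plain"), body
    if msg.is_multipart():
        # Well-formed case: take the last listed part (no preference list here).
        last = msg.get_payload()[-1]
        return last.get_content_type(), last.get_payload()
    return msg.get_content_type(), msg.get_payload()


content_type, text = effective_body(FLAWED)
print(content_type)  # prints "text/html"
```

The point of the fallback is only that a flawed message gets rendered somehow instead of stopping all further output.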

> 1. Manual Option (work around): go into the batch file and wildcard find
> and replace all "multipart/alternative" instances with "text/html."
> This can obviously be a maddening and laborious task if you are not
> constantly watching your archive.

That is not a complete fix anyway. You've still not fixed the boundaries then.

> 3. Hypermail Patch?: Could a new sub-routine be written that specifies an
> action for multipart/alternative emails that do not have complete
> content-type designations? Perhaps either substituting "Content-Type:
> text/html" outright or discarding incomplete messages to a scratch file
> so as not to freeze future output?

It couldn't possibly be a config file entry: if this is what those mailers send, then the broken mails will appear everywhere and everyone would want this fixed.

My guess is that the mail was sent properly by the mail client, but that the mail server that received it somehow converted it on your end. I think we would've spotted this problem a lot more often otherwise.

BTW, an HTML version of the relevant RFC section:

http://www.oac.uci.edu/indiv/ehood/MIME/1521/07_Predefined_Content-Type.html#7.2.3

-- 
      Daniel Stenberg - http://daniel.haxx.se - +46-705-44 31 77
   ech`echo xiun|tr nu oc|sed 'sx\([sx]\)\([xoi]\)xo un\2\1 is xg'`ol
Received on Mon 26 Nov 2001 03:02:33 PM GMT

This archive was generated by hypermail 2.2.0 : Thu 22 Feb 2007 07:33:53 PM GMT