Re: [hypermail] Why hypermail don't handle html in this message?

From: Henry P. Mills <hpmills3_at_telocity.com_at_hypermail-project.org>
Date: Sun, 25 Nov 2001 13:09:53 -0600
Message-ID: <B8269E20.452D%hpmills3_at_telocity.com>


Hello. Let me first say how much I appreciate Hypermail. It's a terrific program...

I've run into this problem before, or at least a similar one (running Hypermail 2.1.3 on UNIX). In the past, I found that certain email messages in my inbox would choke the hypermail process, shutting down all further output. It turns out that the culprit is certain multipart/alternative emails.

Basically, I've found that if none of the parts of a multipart/alternative message carries either "Content-type: text/plain" or "Content-type: text/html" to define the text it contains, hypermail doesn't know how to proceed. Here's an example of a problem message from a listserv I run. Notes follow:

###########Begin Sample#############

> From ???_at_??? Sun Apr 22 12:23:13 2001
> Return-Path: <owner-wpercy-l_at_listserv.lsu.edu>
> Received: from listserv.lsu.edu (its1.ocs.lsu.edu [130.39.129.108])
> by luna.metalab.unc.edu (8.11.0/8.11.0) with ESMTP id f2Q4Fs812987;
> Sun, 25 Mar 2001 23:15:54 -0500
> Received: from its1.ocs.lsu.edu (listserv.lsu.edu [130.39.129.108]) by
> listserv.lsu.edu (AIX4.3/UCB 8.8.8/LSU-v1.0) with ESMTP id WAA09486; Sun, 25
> Mar 2001 22:17:08 -0600
> Received: from LISTSERV.LSU.EDU by LISTSERV.LSU.EDU (LISTSERV-TCP/IP release
> 1.8d) with spool id 1367 for WPERCY-L_at_LISTSERV.LSU.EDU; Sun, 25
> Mar
> 2001 22:16:45 -0600
> Received: from mta02-srv.alltel.net (mta02.alltel.net [166.102.165.144]) by
> listserv.lsu.edu (AIX4.3/UCB 8.8.8/LSU-v1.0) with ESMTP id
> WAA23080
> for <WPERCY-L_at_listserv.lsu.edu>; Sun, 25 Mar 2001 22:16:43 -0600
> Received: from p.hunt ([165.39.142.79]) by mta02-srv.alltel.net with SMTP id
> <20010326041641.PWIU10181.mta02-srv.alltel.net_at_p.hunt> for
> <WPERCY-L_at_listserv.lsu.edu>; Sun, 25 Mar 2001 22:16:41 -0600
> MIME-Version: 1.0
>
> Content-Type: multipart/alternative;
> boundary="----=_NextPart_000_0015_01C0B580.7174DF60"
>
> X-Priority: 3
> X-MSMail-Priority: Normal
> X-Mailer: Microsoft Outlook Express 5.00.0810.800
> X-MimeOLE: Produced By Microsoft MimeOLE V5.00.0810.800
> Message-ID: <001801c0b5b2$c00da620$4f9027a2_at_p.hunt>
> Date: Sun, 25 Mar 2001 23:08:02 -0600
> Reply-To: Paul Hunt <p.hunt_at_mail.alltel.net>
> Sender: Walker Percy Literary and Philosophic Topics
> <WPERCY-L_at_listserv.lsu.edu>
> From: Paul Hunt <p.hunt_at_ALLTEL.NET>
> Subject: Defining Characteristics
> To: WPERCY-L_at_listserv.lsu.edu
> Status:
>
> <x-html><!x-stuff-for-pete base="" src="" id="0"><!DOCTYPE HTML PUBLIC
> "-//W3C//DTD HTML 4.0 Transitional//EN">
> <HTML><HEAD>
> <META content="text/html; charset=windows-1252" http-equiv=Content-Type>
> <META content="MSHTML 5.00.2314.1000" name=GENERATOR>
> <STYLE></STYLE>
> </HEAD>
> <BODY bgColor=#ffffff>
> <P>Dear listserv: </P>
> <P>A fellow undergraduate asked me last week what the defining
> characteristic of
> contemporary American literature is. So, I asked the same question to my
> literature professor who gave me three possibilities: moral ambiguity,
> sexuality, or existentialism. Albeit an absolute answer to this question
> does
> not exist and albeit these three possibilities are narrow in scope, I
> thought
> that I would ask you which one of the above three possibilities you agree
> with.
> Why? </P>
> <P>Cordially, </P>
> <P>Paul Hunt</P><FONT face="Courier New"></FONT></BODY></HTML>
>
> </x-html>

***********End Sample***********

--First, note that this email was sent as HTML (by Microsoft Outlook Express 5.0).
--Second, note that the multipart/alternative specification is listed in the header correctly (I have added blank lines around it to make it stand out). However, as you scroll down to the message body, there is no additional Content-type specification defining the following text as HTML (other than the reference in the META content tag). Because the additional expression "Content-type: text/html" is not present to define the content that follows, hypermail freezes. (See the HTML email at the bottom of my email for a well-formed message that does contain a correct and complete specification and that does not hang hypermail.)

In the past, I have corrected this problem by doing a search and replace on my batch file, using a wildcard string to capture the multipart/alternative expressions and replace them with "Content-Type: text/html". This can be laborious, though, because of the double-checking and eye-strain involved.
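The hunting-by-eye part of that search could probably be automated. What follows is only a rough sketch, assuming the inbox is an mbox file readable by Python's standard mailbox and email modules; the names is_suspect and find_suspects are made up for illustration and have nothing to do with hypermail itself. It flags messages whose header says multipart/alternative but in which no part explicitly declares text/plain or text/html.

```python
import email
import mailbox

TEXT_TYPES = ("text/plain", "text/html")


def is_suspect(msg):
    """Header says multipart/alternative, but no part declares text/plain or text/html."""
    if msg.get_content_type() != "multipart/alternative":
        return False
    for part in msg.walk():
        if part is msg:
            continue  # skip the top-level container itself
        # Require an explicit header: get_content_type() alone falls back to
        # text/plain when a part has no Content-Type line at all.
        if "Content-Type" in part and part.get_content_type() in TEXT_TYPES:
            return False
    return True


def find_suspects(mbox_path):
    """Return (position, subject) for every suspect message in an mbox file."""
    box = mailbox.mbox(mbox_path, create=False)
    try:
        return [(pos, msg.get("Subject", "(no subject)"))
                for pos, msg in enumerate(box) if is_suspect(msg)]
    finally:
        box.close()


if __name__ == "__main__":
    # Hypothetical miniature of the problem message: multipart header, bare HTML body.
    raw = (
        "MIME-Version: 1.0\n"
        'Content-Type: multipart/alternative;\n boundary="XYZ"\n'
        "Subject: Defining Characteristics\n"
        "\n"
        "<HTML><BODY><P>Dear listserv:</P></BODY></HTML>\n"
    )
    print(is_suspect(email.message_from_string(raw)))
```

Running find_suspects on the inbox before feeding it to hypermail would at least give a short list of messages to inspect, instead of reading the whole file.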

Why does this problem happen? Some email clients apparently give the user the option of sending emails as HTML only or as HTML and plain text at the same time. It may be that users choosing the former are the ones creating the problem MIME messages. I've noted that the following programs have generated emails of this sort that hang hypermail. I am sure there are other clients out there as well:

AOL 6.0 for Windows
Microsoft Outlook Express 5.0
Microsoft Outlook Express 5.5
Mozilla 4.76 [en] (Win98; U)

How to correct the problem? I'm not sure. I am neither a programmer nor network admin by trade. But these are the possibilities I've explored (assuming I have indeed diagnosed the problem correctly).

  1. Manual Option (workaround): go into the batch file and use a wildcard find and replace to change all "multipart/alternative" instances to "text/html". This can obviously be a maddening and laborious task if you are not constantly watching your archive.
  2. Current Configuration Options: I've looked through all the configuration options and only HM_PREFERED_TYPES seems related, but it doesn't address the problem of an email message with incomplete "multipart/alternative" specifications throughout the entire email, only whether "text/plain" or "text/html" content is given preferential treatment in the batch write-out to the archive.
  3. Hypermail Patch?: Could a new sub-routine be written that specifies an action for multipart/alternative emails that do not have complete content-type designations? Perhaps either substituting "Content-Type: text/html" outright or discarding incomplete messages to a scratch file so as not to freeze future output?
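
To make option 3 a little more concrete, here is a sketch of the idea as a pre-processing step run before hypermail, not as a patch to hypermail itself. It assumes an mbox-format input and Python's standard mailbox and email modules; fix_message, split_mbox and the three file arguments are hypothetical names. A message whose header says multipart/alternative but in which the parser finds no parts is either relabeled "Content-Type: text/html" (when the body looks like HTML) or diverted to a scratch file. Note that the relabeling drops any charset information, so this is a crude substitution.

```python
import email
import mailbox


def fix_message(msg):
    """Return "ok", "fixed" or "divert"; a "fixed" message is modified in place."""
    if msg.get_content_type() != "multipart/alternative" or msg.is_multipart():
        return "ok"  # not multipart/alternative, or its parts were found
    body = msg.get_payload()
    if isinstance(body, str) and "<html" in body.lower():
        # Substitute the header outright, as in the manual workaround.
        msg.replace_header("Content-Type", "text/html")
        return "fixed"
    return "divert"  # cannot tell what the body is, so set it aside


def split_mbox(src, dst, scratch):
    """Copy src into dst, repairing what we can and diverting the rest to scratch."""
    inbox = mailbox.mbox(src, create=False)
    clean = mailbox.mbox(dst)
    held = mailbox.mbox(scratch)
    counts = {"ok": 0, "fixed": 0, "divert": 0}
    try:
        for msg in inbox:
            verdict = fix_message(msg)
            counts[verdict] += 1
            (held if verdict == "divert" else clean).add(msg)
        clean.flush()
        held.flush()
    finally:
        inbox.close()
        clean.close()
        held.close()
    return counts


if __name__ == "__main__":
    # Hypothetical miniature of the problem message.
    raw = (
        "MIME-Version: 1.0\n"
        'Content-Type: multipart/alternative;\n boundary="XYZ"\n'
        "\n"
        "<HTML><BODY><P>Dear listserv:</P></BODY></HTML>\n"
    )
    sample = email.message_from_string(raw)
    print(fix_message(sample), sample.get_content_type())
```

Hypermail would then be pointed at the dst file, and the scratch file could be reviewed by hand whenever there is time.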

That's the work I have to offer on this issue. Has anyone else noticed this problem or come up with other ways to address it? Below is a correct multipart/alternative email with complete specifications throughout.

Thanks,
Henry Mills

***************Begin Correct Multipart Example*****************

NOTE: each part below has an appropriate Content-type specification, the text version of the message being specified as <Content-type: text/plain, etc.> and the HTML version of the message as <Content-type: text/html, etc.>. "Content-type: multipart/alternative" appears correctly in the message header.

> From hpmills3_at_telocity.com Sun Nov 25 12:49:20 2001
> Received: from c007.snv.cp.net (c007-h011.c007.snv.cp.net [209.228.33.217])
> by trance.metalab.unc.edu (8.11.6/8.11.0) with SMTP id fAPHnJB22089
> for <wpercy1_at_metalab.unc.edu>; Sun, 25 Nov 2001 12:49:20 -0500
> Received: (cpmta 539 invoked from network); 25 Nov 2001 09:49:05 -0800
> Received: from 64.194.107.9 (HELO ?64.194.107.9?)
> by smtp.telocity.com (209.228.33.217) with SMTP; 25 Nov 2001 09:49:05
> -0800
> X-Sent: 25 Nov 2001 17:49:05 GMT
> User-Agent: Microsoft-Outlook-Express-Macintosh-Edition/5.02.2022
> Date: Sun, 25 Nov 2001 11:46:39 -0600
> Subject: HTML test
> From: "Henry P. Mills" <hpmills3_at_telocity.com>
> To: wpercy1 <wpercy1_at_luna.metalab.unc.edu>
> Message-ID: <B8268A9F.451B%hpmills3_at_telocity.com>
> Mime-version: 1.0
> Content-type: multipart/alternative;
> boundary="MS_Mac_OE_3089533599_356733_MIME_Part"
>
>> This message is in MIME format. Since your mail reader does not understand
> this format, some or all of this message may not be legible.
>
> --MS_Mac_OE_3089533599_356733_MIME_Part
> Content-type: text/plain; charset="US-ASCII"
> Content-transfer-encoding: 7bit
>
> This is an HTML file. The Mail format is set to "html" in OE Compose
> Preferences.
>
> Here is some text in italics.
>
> --MS_Mac_OE_3089533599_356733_MIME_Part
> Content-type: text/html; charset="US-ASCII"
> Content-transfer-encoding: quoted-printable
>
> <HTML>
> <HEAD>
> <TITLE>HTML test</TITLE>
> </HEAD>
> <BODY BGCOLOR=3D"#00FF00">
> <H1>This is an HTML file. &nbsp;The Mail format is set to &quot;html&quot; =
> in OE Compose Preferences.<BR>
> </H1><BR>
> <I>Here is some text in italics.</I>=20
> </BODY>
> </HTML>
>
>
> --MS_Mac_OE_3089533599_356733_MIME_Part--

**************End Correct Multipart**************

Received on Sun 25 Nov 2001 09:21:21 PM GMT

This archive was generated by hypermail 2.3.0 : Sat 13 Mar 2010 03:46:12 AM GMT