Re: [hypermail] XHTML hypermail... getting there!

From: Jose Kahan <jose.kahan_at_w3.org_at_hypermail-project.org>
Date: Tue, 8 Apr 2003 20:10:17 +0200
Message-ID: <20030408181017.GE10367_at_inrialpes.fr>


On Tue, Apr 08, 2003 at 10:23:21AM -0700, Peter C. McCluskey wrote:
> jose.kahan_at_w3.org (Jose Kahan) writes:
> >1) XHTML and inline html.
>
> >For the moment, we can just disable this option by default and
> >add a message saying "if you activate this option, you'll be generating
> >invalid documents." until there is more time to fix it.
>
> Is it always invalid? If so, please explain what's invalid about it.
> If you're referring to the problems you've mentioned before about invalid
> or inappropriate html often getting included, the warning message that
> you propose to add shouldn't imply that the result will necessarily be
> invalid.

Inline HTML can produce an invalid document for a number of reasons. Unless stated otherwise, each problem exists whether the enclosing document is XHTML or HTML, and I'm using "HTML" to refer to both cases:

  1. The document to be included can be HTML while we're inside an XHTML document. Or it can be the opposite: we're inside the HTML world and including an XHTML document.
  2. The document to be included can be a full HTML document that includes its own DTD, HEAD, BODY and other tags that should appear only once per HTML document.
  3. The document that is to be included is not a valid HTML document.

Reasons 1 and 2 will make a document invalid. Reason 2 already happens today.
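
As a rough illustration of reason 2, here is a minimal sketch in C of a check that flags a fragment carrying once-per-document markup before it is inlined. This is not hypermail code: has_document_level_markup() and contains_nocase() are hypothetical names, and the plain substring search is a crude assumption (it would also fire on such text inside a comment or on any tag name that merely starts with "head" or "body").

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive substring search; returns 1 if needle occurs in haystack. */
static int contains_nocase(const char *haystack, const char *needle)
{
    size_t n = strlen(needle);

    for (; *haystack; haystack++) {
        size_t i = 0;
        while (i < n && haystack[i] &&
               tolower((unsigned char)haystack[i]) ==
               tolower((unsigned char)needle[i]))
            i++;
        if (i == n)
            return 1;
    }
    return 0;
}

/*
 * Hypothetical helper (not part of hypermail): returns 1 if the fragment
 * contains markup that may appear only once per document, 0 otherwise.
 */
int has_document_level_markup(const char *fragment)
{
    static const char *once_only[] = { "<!doctype", "<html", "<head", "<body" };
    size_t i;

    for (i = 0; i < sizeof once_only / sizeof once_only[0]; i++)
        if (contains_nocase(fragment, once_only[i]))
            return 1;
    return 0;
}
```

A caller could use the result to skip inlining, or to strip the outer tags first. A check like this says nothing about reasons 1 and 3, which need a real parser or validator.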

> >2) the <A HREF= weird convertion
> >
> >Does anyone knows why there is an exception in print.c:ConvURLsString for handling
> >text that begins with <A HREF= ?
>
> >6. And finally (what made me notice it), for XHTML, it should convert
> > the <A HREF= </A> markup to lowercase.
> >
> >I was going to do 6. but as the function seems to have broken
> >heuristics, I think that it's better to supress its call.
>
> It's my impression that it's currently doing what the user would prefer
> over 99% of the time, and that simply disabling ConvURLsWithHrefs() would
> usually cause a subsequent call to parseurl() from ConvURLsString() to
> add an unwanted tag.
> I'm not sure what the best solution is, but making the existing behavior
> an option would be better than disabling it entirely.

N.B. This has nothing to do with moving to XHTML.

If the user sent an HTML document, then it's correct for it to mark up URLs using the HTML mechanism of <A HREF and such. Otherwise, hypermail is assuming that any user who sends a message that has <A HREF inside it is not quoting a literal piece of markup, but is using an advanced hypermail API for making a link through mail.

I mean, on the one hand you save a function call and a little markup. On the other hand, you have heuristics that can misinterpret a message sent by someone who was not aware of them, and you can't quote this specific markup inside a message anymore. Besides, the function's heuristics are broken (e.g., it doesn't check for the closing ", and it doesn't convert characters). I don't think it's a good idea to keep it just to save a few bytes and a little CPU power per message, given the handicap it carries.
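
To show what a stricter test could look like, here is a minimal sketch in C that accepts the text as an anchor only when the opening tag, the closing quote of the HREF value, and the closing </a> are all present. It is not the code in print.c: looks_like_complete_anchor() and starts_with_nocase() are hypothetical names, and requiring ">" directly after the closing quote (so no other attributes) is an assumption made to keep the example short.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive prefix match; returns 1 if s starts with prefix. */
static int starts_with_nocase(const char *s, const char *prefix)
{
    for (; *prefix; s++, prefix++)
        if (tolower((unsigned char)*s) != tolower((unsigned char)*prefix))
            return 0;
    return 1;
}

/*
 * Hypothetical check (not hypermail's heuristic): returns 1 only for
 * text of the form <a href="...">...</a>, in any letter case.
 */
int looks_like_complete_anchor(const char *s)
{
    const char *p;

    if (!starts_with_nocase(s, "<a href=\""))
        return 0;

    /* Skip the 9-character prefix and find the closing quote of the value. */
    p = strchr(s + 9, '"');
    if (p == NULL || p[1] != '>')
        return 0;

    /* The element must also be closed somewhere after the opening tag. */
    for (p += 2; *p; p++)
        if (starts_with_nocase(p, "</a>"))
            return 1;
    return 0;
}
```

Even a check like this cannot tell a link from a quoted piece of literal markup, which is why I'd still prefer the behavior to be optional.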

If you don't want to remove its call, let's make it an option. You can turn it off then if you want.

Please tell me whether we should disable it or leave it as an option, so that I can do that and add the conversion of <A HREF to lowercase. I won't try to fix its heuristics or the rest of the function, though. One of the remaining problems is the escaping of characters.

For example, translateurl() didn't escape the characters inside a URL. You can have a msg quoting a URL as:

     http://some"invalid"characters

but the quote characters need to be escaped when they're part of the value of an attribute. string.c:translateurl was only escaping some characters that need special processing and leaving the others untouched.
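
As an illustration of the escaping step, here is a minimal sketch in C that rewrites the four characters that are unsafe inside a double-quoted attribute value. It is not the code in string.c: escape_attr_value() is a hypothetical name, and the choice to return a freshly allocated string is an assumption.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not hypermail's translateurl): returns a malloc'd
 * copy of s with & < > " replaced by character entities, or NULL if the
 * allocation fails. The caller frees the result.
 */
char *escape_attr_value(const char *s)
{
    /* Worst case: every input character expands to the 6 bytes of "&quot;". */
    char *out = malloc(strlen(s) * 6 + 1);
    char *w = out;

    if (out == NULL)
        return NULL;

    for (; *s; s++) {
        const char *rep = NULL;

        switch (*s) {
        case '&': rep = "&amp;";  break;
        case '<': rep = "&lt;";   break;
        case '>': rep = "&gt;";   break;
        case '"': rep = "&quot;"; break;
        }
        if (rep != NULL) {
            size_t n = strlen(rep);
            memcpy(w, rep, n);
            w += n;
        } else {
            *w++ = *s;
        }
    }
    *w = '\0';
    return out;
}
```

With the URL above, the value written into the attribute would be http://some&quot;invalid&quot;characters. Note that the sketch escapes every &, so input that is already escaped would be escaped twice.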

Again, this is unrelated to XHTML or HTML. It's a common validity problem.

-jose

Received on Tue 08 Apr 2003 08:16:49 PM GMT

This archive was generated by hypermail 2.2.0 : Thu 22 Feb 2007 07:33:54 PM GMT