When hypermail converts a message whose charset is ISO-8859-1 or US-ASCII, the charset is dropped. All other charsets are preserved in HTML META tags.
Martin Duerst, leader of the Internationalization Activity at W3C, says that we should always preserve it. The HTML 4.0 spec says we cannot assume a charset for a document unless it is explicitly stated, either in the HTTP headers or in an HTML META tag. The specification does not say what should be done when the charset is missing.
In practice, when this happens, the browser usually assumes a default charset according to the user's preferences. We have a case where browsers in the USA interpret a document as ISO-8859-1, while browsers in Japan interpret the same document with another charset (and thus render it badly).
The solution is quite easy. I found there was an exception in parse.c to avoid storing the charset for the above two charsets. I removed the exception and it is now working correctly... and users in Japan can now see this document correctly too.
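To illustrate the change, here is a minimal sketch. It is not the actual parse.c code: the function names (equal_ignore_case, keep_charset_old, keep_charset_new, format_charset_meta) are hypothetical, and the exact form of the META tag is an assumption. It only shows the kind of exception that was removed and what gets written once every declared charset is kept.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: case-insensitive comparison of two charset names. */
static int equal_ignore_case(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;    /* equal only if both strings ended together */
}

/* Old behaviour (sketch): the charset is dropped for US-ASCII and
 * ISO-8859-1, and kept for everything else. */
static int keep_charset_old(const char *charset)
{
    if (charset == NULL || *charset == '\0')
        return 0;
    if (equal_ignore_case(charset, "US-ASCII") ||
        equal_ignore_case(charset, "ISO-8859-1"))
        return 0;       /* this is the exception that was removed */
    return 1;
}

/* New behaviour (sketch): keep any charset the message declares. */
static int keep_charset_new(const char *charset)
{
    return charset != NULL && *charset != '\0';
}

/* Format the META tag into buf. Returns the length snprintf reports,
 * or 0 (with an empty buf) when there is no charset to record. */
static int format_charset_meta(char *buf, size_t size, const char *charset)
{
    if (!keep_charset_new(charset)) {
        if (size > 0)
            buf[0] = '\0';
        return 0;
    }
    return snprintf(buf, size,
                    "<meta http-equiv=\"Content-Type\" "
                    "content=\"text/html; charset=%s\">",
                    charset);
}
```

With the old check, a message declared as ISO-8859-1 produces no META tag, so the browser falls back to the user's default charset. With the new check, the declared charset is always written out.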
Any problems with my committing this patch?
Thanks,
-jose
Received on Fri 25 Jan 2002 03:17:46 AM GMT
This archive was generated by hypermail 2.3.0 : Sat 13 Mar 2010 03:46:12 AM GMT