MHonArc::CharEnt - HTML Character routines for MHonArc.
use MHonArc::CharEnt;
MHonArc resource file:
<CharsetConverters>
...
iso-8859-15; MHonArc::CharEnt::str2sgml; MHonArc/CharEnt.pm
...
</CharsetConverters>
MHonArc::CharEnt provides the main character conversion routine used by MHonArc for converting non-ASCII encoded message header data and text/plain character data into HTML. The module was originally written to support only 8-bit charsets, but it has since been extended to support multibyte charsets.
All characters are mapped to HTML 4.0 character entity references (e.g. &lt; &gt;) or to Unicode numeric character entity references (e.g. &#8254;). Most modern browsers will support the Unicode references directly.
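The mapping described above can be illustrated with a minimal Perl sketch. It is not the module's actual code: the helper name latin1_to_entities and the %NAMED table are hypothetical, and the input is assumed to be ISO-8859-1, where each byte in the range 0xA0-0xFF has the same value as its Unicode code point.

```perl
use strict;
use warnings;

# Hypothetical table: markup-significant ASCII characters and their
# HTML 4.0 named entity references.
my %NAMED = ('<' => '&lt;', '>' => '&gt;', '&' => '&amp;', '"' => '&quot;');

# Hypothetical helper, not part of MHonArc::CharEnt.
# Assumes $bytes holds ISO-8859-1 encoded text.
sub latin1_to_entities {
    my ($bytes) = @_;
    # Replace markup-significant characters with named references first,
    # so the '&' characters generated below are not escaped again.
    $bytes =~ s/([<>&"])/$NAMED{$1}/g;
    # Replace printable 8-bit characters with numeric references.
    $bytes =~ s/([\xA0-\xFF])/'&#' . ord($1) . ';'/ge;
    return $bytes;
}

print latin1_to_entities("caf\xE9 <b>"), "\n";   # prints: caf&#233; &lt;b&gt;
```

The output contains only ASCII, which is why it displays correctly regardless of the charset declared for the page.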
UTF-8 conversion is done algorithmically.
Because non-ASCII characters are written as entity references, the raw HTML source for non-English text is difficult to read, but this may be a non-issue for most users.
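As an illustration of what an algorithmic UTF-8 conversion can look like, the following Perl sketch decodes multibyte sequences arithmetically, without a lookup table, and emits numeric character references. It is a sketch under stated assumptions, not the module's implementation: the names $UTF8_SEQ, decode_seq and utf8_to_ncr are hypothetical, the input is assumed to be well-formed UTF-8 bytes, and ASCII characters are passed through unchanged (escaping of < > & is left out to keep the example short).

```perl
use strict;
use warnings;

# Hypothetical pattern matching one 2-, 3-, or 4-byte UTF-8 sequence.
# Overlong and surrogate encodings are not rejected (assumes valid input).
my $UTF8_SEQ = qr/[\xC2-\xDF][\x80-\xBF]|[\xE0-\xEF][\x80-\xBF]{2}|[\xF0-\xF4][\x80-\xBF]{3}/;

# Compute the code point of one multibyte sequence.
sub decode_seq {
    my @b = unpack 'C*', shift;
    # The lead byte carries (7 - length) payload bits; mask off the rest.
    my $cp = $b[0] & (0xFF >> (@b + 1));
    # Each continuation byte contributes its low six bits.
    $cp = ($cp << 6) | ($_ & 0x3F) for @b[1 .. $#b];
    return $cp;
}

# Hypothetical helper, not part of MHonArc::CharEnt.
sub utf8_to_ncr {
    my ($bytes) = @_;
    $bytes =~ s/($UTF8_SEQ)/'&#' . decode_seq($1) . ';'/ge;
    return $bytes;
}

print utf8_to_ncr("\xE2\x80\xBE"), "\n";      # prints: &#8254;
print utf8_to_ncr("caf\xC3\xA9"), "\n";       # prints: caf&#233;
```

The first call reproduces the &#8254; reference used as an example above: the bytes E2 80 BE decode to code point 8254 (U+203E).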
$Id: CharEnt.pm,v 1.14 2003/03/05 22:17:15 ehood Exp $
Earl Hood, earl@earlhood.com
MHonArc comes with ABSOLUTELY NO WARRANTY and MHonArc may be copied only under the terms of the GNU General Public License, which may be found in the MHonArc distribution.