Commit | Line | Data |
---|---|---|
86530b38 AT |
1 | <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML//EN"> |
2 | <html> | |
3 | <head> | |
4 | <title>MHonArc Resources: CHARSETCONVERTERS</title> | |
5 | </head> | |
6 | <body> | |
7 | <!--x-rc-nav--> | |
8 | <table border=0><tr valign="top"> | |
9 | <td align="left" width="50%">[Prev: <a href="botlinks.html">BOTLINKS</a>]</td><td><nobr>[<a href="../resources.html#charsetconverters">Resources</a>][<a href="../mhonarc.html">TOC</a>]</nobr></td><td align="right" width="50%">[Next: <a href="checknoarchive.html">CHECKNOARCHIVE</a>]</td></tr></table> | |
10 | <!--/x-rc-nav--> | |
11 | <hr> | |
12 | <h1>CHARSETCONVERTERS</h1> | |
13 | ||
14 | <!-- *************************************************************** --> | |
15 | <hr> | |
16 | <h2>Syntax</h2> | |
17 | ||
18 | <dl> | |
19 | ||
20 | <dt><strong>Envariable</strong></dt> | |
21 | <dd><p>N/A | |
22 | </p> | |
23 | </dd> | |
24 | ||
25 | <dt><strong>Element</strong></dt> | |
26 | <dd><p> | |
27 | <code><CHARSETCONVERTERS></code><br> | |
28 | <var>charset-filter-specification</var><br> | |
29 | <code></CHARSETCONVERTERS></code><br> | |
30 | </p> | |
31 | </dd> | |
32 | ||
33 | <dt><strong>Command-line Option</strong></dt> | |
34 | <dd><p>N/A | |
35 | </p> | |
36 | </dd> | |
37 | ||
38 | </dl> | |
39 | ||
40 | <!-- *************************************************************** --> | |
41 | <hr> | |
42 | <h2>Description</h2> | |
43 | ||
44 | <p>The CHARSETCONVERTERS resource specifies the Perl routines that | |
45 | convert text in a given character set into characters that are legal | |
46 | in HTML. The conversion is applied to message header data encoded | |
47 | according to the MIME standard. | |
48 | The following example shows a header with encoded data: | |
49 | </p> | |
50 | ||
51 | <pre> | |
52 | From: =?US-ASCII?Q?Keith_Moore?= <moore@cs.utk.edu> | |
53 | To: =?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= <keld@dkuug.dk> | |
54 | CC: =?ISO-8859-1?Q?Andr=E9_?= Pirard <PIRARD@vm1.ulg.ac.be> | |
55 | Subject: =?ISO-8859-1?B?SWYgeW91IGNhbiByZWFkIHRoaXMgeW8=?= | |
56 | =?ISO-8859-2?B?dSB1bmRlcnN0YW5kIHRoZSBleGFtcGxlLg==?= | |
57 | </pre> | |
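<p>As a rough illustration of how such an encoded word breaks down, the following Perl sketch splits a Q-encoded word into its character set and raw bytes. The routine name <code>decode_q_word</code> is hypothetical and is not part of MHonArc; B (base64) encoded words and other edge cases are not handled.</p>

```perl
use strict;
use warnings;

# Hypothetical helper, not part of MHonArc: split one Q-encoded word
# of the form =?charset?Q?text?= into (charset, raw bytes).
sub decode_q_word {
    my ($word) = @_;
    my ($charset, $text) = $word =~ /^=\?([^?]+)\?[Qq]\?([^?]*)\?=$/
        or return;                                  # not a Q-encoded word
    $text =~ tr/_/ /;                               # "_" stands for a space
    $text =~ s/=([0-9A-Fa-f]{2})/chr(hex($1))/ge;   # =XX is a hex-encoded byte
    return (lc $charset, $text);
}

my ($charset, $bytes) = decode_q_word('=?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?=');
# $charset is now "iso-8859-1"; $bytes is "Keld J\xF8rn Simonsen",
# where \xF8 is the iso-8859-1 byte for the o-slash character.
```

<p>The character set and the raw bytes recovered this way are the two pieces of information a charset converter works with.</p>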
58 | ||
59 | <p>The CHARSETCONVERTERS resource can be defined only in a | |
60 | resource file. Each line of the element specifies a character set, | |
61 | the Perl routine for filtering the character set, and the Perl source | |
62 | file containing the routine. | |
63 | </p> | |
64 | ||
65 | <p>Example:</p> | |
66 | <pre> | |
67 | <b><CharsetConverters></b> | |
68 | iso-8859-1;MHonArc::CharEnt::str2sgml;MHonArc/CharEnt.pm | |
69 | <b></CharsetConverters></b> | |
70 | </pre> | |
71 | ||
72 | <p>The first field is the character set specification. The second field | |
73 | is the routine name (which should contain a package qualifier). | |
74 | The third field is the source file in which the routine is defined. The | |
75 | source file is optional if the routine is known to be defined in | |
76 | an already listed source file. | |
77 | </p> | |
78 | ||
79 | <p>There are some special character set specifications. They are | |
80 | as follows: | |
81 | </p> | |
82 | ||
83 | <dl> | |
84 | ||
85 | <dt><em>plain</em></dt> | |
86 | <dd><p>This specifies text that is not explicitly encoded in a | |
87 | specific character set. | |
88 | </p> | |
89 | </dd> | |
90 | ||
91 | <dt><em>default</em></dt> | |
92 | <dd><p>This specifies the default routine to invoke for encoded | |
93 | data if no specific character set specification exists for the data. | |
94 | </p> | |
95 | </dd> | |
96 | ||
97 | </dl> | |
98 | ||
99 | <p>There are some special character set converter routine | |
100 | values. They are as follows: | |
101 | </p> | |
102 | ||
103 | <dl> | |
104 | ||
105 | <dt><em>-ignore-</em></dt> | |
106 | <dd><p>Leave the data "as-is", i.e., the MIME encoding will be | |
107 | preserved. | |
108 | </p> | |
109 | </dd> | |
110 | ||
111 | <dt><em>-decode-</em></dt> | |
112 | <dd><p>Just decode the data. This is useful if it is known that | |
113 | the character set is the native character set for the system. | |
114 | </p> | |
115 | <table border=0 cellpadding=4> | |
116 | <tr valign=top> | |
117 | <td><strong><font color=red>WARNING</font></strong></td> | |
118 | <td><p>If the decoded data contains the characters '<', '>', | |
119 | or '&', this may conflict with HTML markup. See | |
120 | <a href="decodeheads.html">DECODEHEADS</a> | |
121 | for when "-decode-" can be used. | |
122 | </td> | |
123 | </tr> | |
124 | </table> | |
125 | </dd> | |
126 | ||
127 | </dl> | |
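<p>The lookup rules described above can be sketched in Perl as follows. This is only an illustration of the described behavior, not MHonArc's actual code: the <code>%converters</code> table, <code>apply_converter</code>, and the stand-in routine <code>escape_markup</code> are all hypothetical names.</p>

```perl
use strict;
use warnings;

# Hypothetical stand-in for a real converter routine.
sub escape_markup {
    my ($data, $charset) = @_;
    my %ent = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;');
    $data =~ s/([&<>])/$ent{$1}/g;
    return $data;
}

# Hypothetical table mirroring a <CharsetConverters> element.
my %converters = (
    'us-ascii'   => \&escape_markup,
    'iso-8859-1' => '-decode-',
    'default'    => '-ignore-',
);

# $encoded is the original MIME encoded word; $decoded is the same
# data after quoted-printable or base64 decoding.
sub apply_converter {
    my ($charset, $encoded, $decoded) = @_;
    # Fall back to the "default" entry when the charset is not listed.
    my $conv = $converters{lc $charset} // $converters{'default'};
    return $encoded if !defined $conv || $conv eq '-ignore-';  # keep MIME encoding
    return $decoded if $conv eq '-decode-';                    # decode only
    return $conv->($decoded, $charset);                        # call the routine
}
```

<p>With this table, a word in an unlisted character set falls through to the <em>default</em> entry and is returned still MIME-encoded.</p>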
128 | ||
129 | <p>Each charset converter function is invoked as follows: | |
130 | </p> | |
131 | ||
132 | <pre> | |
133 | $converted_data = &function($data, $charset); | |
134 | </pre> | |
135 | ||
136 | <p>The data passed in will already be decoded from quoted-printable | |
137 | or base64 (as specified by the MIME syntax). Therefore, the | |
138 | called routine will be passed the raw byte data. It is important | |
139 | that the routine convert the data into a format suitable to be | |
140 | included in HTML markup. | |
141 | </p> | |
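<p>A minimal sketch of a converter that follows this calling convention is shown below. The package name <code>MyCharset</code>, the routine name <code>str2html</code>, and the file name <code>MyCharset.pm</code> are hypothetical, and the byte-to-code-point mapping it relies on is valid only for iso-8859-1.</p>

```perl
package MyCharset;    # hypothetical package, saved as MyCharset.pm

use strict;
use warnings;

# Characters that must not appear literally in HTML text.
my %ENTITY = (
    '&' => '&amp;',
    '<' => '&lt;',
    '>' => '&gt;',
    '"' => '&quot;',
);

# Called as: $converted_data = MyCharset::str2html($data, $charset);
# $data holds raw bytes, already decoded from quoted-printable or base64.
sub str2html {
    my ($data, $charset) = @_;
    # Escape markup characters first, so the "&" of the numeric
    # references added below is not escaped again.
    $data =~ s/([&<>"])/$ENTITY{$1}/g;
    # In iso-8859-1 each byte value equals its Unicode code point, so
    # bytes 0xA0-0xFF can be written as numeric character references.
    $data =~ s/([\xA0-\xFF])/'&#' . ord($1) . ';'/ge;
    return $data;
}

1;
```

<p>Assuming the file is saved as <code>MyCharset.pm</code> in a directory that Perl searches (see <a href="perlinc.html">PERLINC</a>), it could be registered with a line such as <code>iso-8859-1; MyCharset::str2html; MyCharset.pm</code> inside the CHARSETCONVERTERS element.</p>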
142 | ||
143 | ||
144 | <!-- *************************************************************** --> | |
145 | <hr> | |
146 | <h2>Default Setting</h2> | |
147 | ||
148 | <pre> | |
149 | <b><CharsetConverters></b> | |
150 | plain; mhonarc::htmlize; | |
151 | us-ascii; mhonarc::htmlize; | |
152 | iso-8859-1; mhonarc::htmlize; | |
153 | iso-8859-2; MHonArc::CharEnt::str2sgml; MHonArc/CharEnt.pm | |
154 | iso-8859-3; MHonArc::CharEnt::str2sgml; MHonArc/CharEnt.pm | |
155 | iso-8859-4; MHonArc::CharEnt::str2sgml; MHonArc/CharEnt.pm | |
156 | iso-8859-5; MHonArc::CharEnt::str2sgml; MHonArc/CharEnt.pm | |
157 | iso-8859-6; MHonArc::CharEnt::str2sgml; MHonArc/CharEnt.pm | |
158 | iso-8859-7; MHonArc::CharEnt::str2sgml; MHonArc/CharEnt.pm | |
159 | iso-8859-8; MHonArc::CharEnt::str2sgml; MHonArc/CharEnt.pm | |
160 | iso-8859-9; MHonArc::CharEnt::str2sgml; MHonArc/CharEnt.pm | |
161 | iso-8859-10; MHonArc::CharEnt::str2sgml; MHonArc/CharEnt.pm | |
162 | iso-2022-jp; iso_2022_jp::str2html; iso2022jp.pl | |
163 | latin1; mhonarc::htmlize; | |
164 | latin2; MHonArc::CharEnt::str2sgml; MHonArc/CharEnt.pm | |
165 | latin3; MHonArc::CharEnt::str2sgml; MHonArc/CharEnt.pm | |
166 | latin4; MHonArc::CharEnt::str2sgml; MHonArc/CharEnt.pm | |
167 | latin5; MHonArc::CharEnt::str2sgml; MHonArc/CharEnt.pm | |
168 | latin6; MHonArc::CharEnt::str2sgml; MHonArc/CharEnt.pm | |
169 | windows-1250; MHonArc::CharEnt::str2sgml; MHonArc/CharEnt.pm | |
170 | windows-1252; MHonArc::CharEnt::str2sgml; MHonArc/CharEnt.pm | |
171 | default; -ignore- | |
172 | <b></CharsetConverters></b> | |
173 | </pre> | |
174 | ||
175 | <!-- *************************************************************** --> | |
176 | <hr> | |
177 | <h2>Resource Variables</h2> | |
178 | ||
179 | <p>N/A | |
180 | </p> | |
181 | ||
182 | <!-- *************************************************************** --> | |
183 | <hr> | |
184 | <h2>Examples</h2> | |
185 | ||
186 | <p>The following example just decodes iso-8859-1 | |
187 | character data, since it is the default character set used by most | |
188 | browsers: | |
189 | </p> | |
190 | ||
191 | <pre> | |
192 | <b><a href="decodeheads.html"><DecodeHeads></a></b> | |
193 | <b><CharsetConverters></b> | |
194 | iso-8859-1;-decode- | |
195 | <b></CharsetConverters></b> | |
196 | </pre> | |
197 | ||
198 | <!-- *************************************************************** --> | |
199 | <hr> | |
200 | <h2>Version</h2> | |
201 | ||
202 | <p>2.0 | |
203 | </p> | |
204 | ||
205 | <!-- *************************************************************** --> | |
206 | <hr> | |
207 | <h2>See Also</h2> | |
208 | ||
209 | <p> | |
210 | <a href="decodeheads.html">DECODEHEADS</a>, | |
211 | <a href="mimedecoders.html">MIMEDECODERS</a>, | |
212 | <a href="mimefilters.html">MIMEFILTERS</a>, | |
213 | <a href="perlinc.html">PERLINC</a> | |
214 | </p> | |
215 | ||
216 | <!-- *************************************************************** --> | |
217 | <hr> | |
218 | <!--x-rc-nav--> | |
219 | <table border=0><tr valign="top"> | |
220 | <td align="left" width="50%">[Prev: <a href="botlinks.html">BOTLINKS</a>]</td><td><nobr>[<a href="../resources.html#charsetconverters">Resources</a>][<a href="../mhonarc.html">TOC</a>]</nobr></td><td align="right" width="50%">[Next: <a href="checknoarchive.html">CHECKNOARCHIVE</a>]</td></tr></table> | |
221 | <!--/x-rc-nav--> | |
222 | <hr> | |
223 | <address> | |
224 | $Date: 2002/07/27 05:13:10 $ <br> | |
225 | <img align="top" src="../monicon.png" alt=""> | |
226 | <a href="http://www.mhonarc.org/"><strong>MHonArc</strong></a><br> | |
227 | Copyright © 1997-2001, <a href="http://www.earlhood.com/">Earl Hood</a>, <a href="mailto:mhonarc@mhonarc.org">mhonarc@mhonarc.org</a><br> | |
228 | </address> | |
229 | ||
230 | </body> | |
231 | </html> |