Commit | Line | Data |
---|---|---|
86530b38 AT |
1 | <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML//EN"> |
2 | <html> | |
3 | <head> | |
4 | <title>MHonArc Resources: TEXTCLIPFUNC</title> | |
5 | </head> | |
6 | <body> | |
7 | <!--x-rc-nav--> | |
8 | <table border=0><tr valign="top"> | |
9 | <td align="left" width="50%">[Prev: <a href="tcontend.html">TCONTEND</a>]</td><td><nobr>[<a href="../resources.html#textclipfunc">Resources</a>][<a href="../mhonarc.html">TOC</a>]</nobr></td><td align="right" width="50%">[Next: <a href="tfirstpglink.html">TFIRSTPGLINK</a>]</td></tr></table> | |
10 | <!--/x-rc-nav--> | |
11 | <hr> | |
12 | <h1>TEXTCLIPFUNC</h1> | |
13 | ||
14 | <!-- *************************************************************** --> | |
15 | <hr> | |
16 | <h2>Syntax</h2> | |
17 | ||
18 | <dl> | |
19 | ||
20 | <dt><strong>Envariable</strong></dt> | |
21 | <dd><p>N/A | |
22 | </p> | |
23 | </dd> | |
24 | ||
25 | <dt><strong>Element</strong></dt> | |
26 | <dd><p> | |
27 | <code><TEXTCLIPFUNC></code><br> | |
28 | <var>function_name</var>;<var>source_file</var><br> | |
29 | <code></TEXTCLIPFUNC></code><br> | |
30 | </p> | |
31 | </dd> | |
32 | ||
33 | <dt><strong>Command-line Option</strong></dt> | |
34 | <dd><p>N/A | |
35 | </p> | |
36 | </dd> | |
37 | ||
38 | </dl> | |
39 | ||
40 | <!-- *************************************************************** --> | |
41 | <hr> | |
42 | <h2>Description</h2> | |
43 | ||
44 | <p>TEXTCLIPFUNC defines the Perl function to invoke when | |
45 | MHonArc clips text. For example, the function specified would | |
46 | be invoked when a length specifier is used for a | |
47 | <a href="../rcvars.html">resource variable</a>, e.g. | |
48 | <tt>$SUBJECTNA:72$</tt>. | |
49 | </p> | |
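<p>As an illustration only, a length specifier might be used in a
resource file as in the following fragment. It assumes the LITEMPLATE
resource is being customized; the list markup around the variable is
just an example and not a recommended layout.
</p>
<pre>
<LITEMPLATE>
<li>$SUBJECTNA:72$</li>
</LITEMPLATE>
</pre>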
50 | ||
51 | <p>The syntax for TEXTCLIPFUNC is as follows: | |
52 | </p> | |
53 | ||
54 | <pre> | |
55 | <var>routine-name</var>;<var>file-of-routine</var></pre> | |
56 | ||
57 | <p>The definition of each semi-colon-separated value is as follows: | |
58 | </p> | |
59 | ||
60 | <dl> | |
61 | <dt><var>routine-name</var></dt> | |
62 | <dd><p>The name of the clipping routine. The name | |
63 | should be fully qualified by the package it is defined in | |
64 | (e.g. "<code>mypackage::clip</code>"). | |
65 | </p> | |
66 | </dd> | |
67 | <dt><var>file-of-routine</var></dt> | |
68 | <dd><p>The name of the file that defines | |
69 | <var>routine-name</var>. If the file is not a full | |
70 | pathname, MHonArc finds the file by looking in the | |
71 | standard include paths of Perl, and the paths specified by the | |
72 | <A HREF="perlinc.html">PERLINC</A> | |
73 | resource. | |
74 | </p> | |
75 | <p><var>file-of-routine</var> can be left blank if it is | |
76 | known that <var>routine-name</var> will already be loaded, as | |
77 | is the case for the default value for this resource since the | |
78 | routine is an internal MHonArc function. | |
79 | </p> | |
80 | </dd> | |
81 | </dl> | |
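<p>For instance, a resource file entry might look like the following
fragment. The routine name <code>MyClip::clip</code> and the file name
<code>myclip.pl</code> are hypothetical placeholders; substitute the
package-qualified name of your own routine and the file that defines it.
</p>
<pre>
<TEXTCLIPFUNC>
MyClip::clip;myclip.pl
</TEXTCLIPFUNC>
</pre>
<p>Since <code>myclip.pl</code> is not a full pathname in this sketch,
it would need to be located in one of Perl's standard include paths or
in a path given by the PERLINC resource.
</p>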
82 | ||
83 | <h3><a name="writing">Writing a Clipping Function</a></h3> | |
84 | ||
85 | <p>If you want to write your own function, you need to know the Perl | |
86 | programming language. The following information assumes you know Perl. | |
87 | </p> | |
88 | ||
89 | <h4>Function Interface</h4> | |
90 | ||
91 | <P>MHonArc interfaces with a text clipping function by calling the routine | |
92 | with a specific set of arguments. The prototype of the interface | |
93 | routine is as follows: </P> | |
94 | ||
95 | <pre> | |
96 | sub <var>clip</var> { | |
97 | my(<b>$text</b>, <b>$clip_length</b>, <b>$is_html</b>, <b>$has_tags</b>) = @_; | |
98 | <var># code here</var> | |
99 | } | |
100 | </pre> | |
101 | ||
102 | <h5>Parameter Descriptions</h5> | |
103 | ||
104 | <table cellspacing=1 border=0 cellpadding=4> | |
105 | ||
106 | <tr valign=top> | |
107 | <td><strong><code>$text</code></strong></td> | |
108 | <td><p>The text to be clipped. | |
109 | </p> | |
110 | <p><b>NOTE:</b> Perl passes arguments by alias through <tt>@_</tt>, so | |
111 | changes to <tt>$_[0]</tt> would alter the caller's data; the first argument | |
112 | should <strong>NOT</strong> be modified. If you copy arguments from | |
113 | <tt>@_</tt> as shown above, then you will be okay since the <tt>my</tt> | |
114 | operation creates a copy of the arguments in <tt>@_</tt>. | |
115 | </p> | |
116 | </td> | |
117 | </tr> | |
118 | ||
119 | <tr valign=top> | |
120 | <td><strong><code>$clip_length</code></strong></td> | |
121 | <td><p>The number of characters <code>$text</code> should be clipped to. | |
122 | </p> | |
123 | </td> | |
124 | </tr> | |
125 | ||
126 | <tr valign=top> | |
127 | <td><strong><code>$is_html</code></strong></td> | |
128 | <td><p>If true, the text may contain entity references, e.g. "<tt>&amp;</tt>". | |
129 | Entity references should be considered a single character when | |
130 | clipping <code>$text</code>. | |
131 | </p> | |
132 | </td> | |
133 | </tr> | |
134 | ||
135 | <tr valign=top> | |
136 | <td><strong><code>$has_tags</code></strong></td> | |
137 | <td><p>If true, the text may contain HTML tags, and the tags should be stripped | |
138 | from <code>$text</code> when generating the clip string. For example, | |
139 | if <code>$text</code> equals "<tt><b>MHonArc</b></tt>" and | |
140 | <code>$clip_length</code> equals 2, then the return value of the | |
141 | function should be "<tt>MH</tt>". | |
142 | </p> | |
143 | <table border=0 cellpadding=4> | |
144 | <tr valign=top> | |
145 | <td><strong>NOTE</strong></td> | |
146 | <td><p>The <code>$has_tags</code> argument is currently | |
147 | not used within MHonArc, but it will likely be used in a future release. | |
148 | </p> | |
149 | </td> | |
150 | </tr> | |
151 | </table> | |
152 | </td> | |
153 | </tr> | |
154 | ||
155 | </table> | |
156 | ||
157 | <h5>Return Value</h5> | |
158 | ||
159 | <p>The return value should be the clipped version of <code>$text</code>. | |
160 | </p> | |
161 | ||
162 | <h4>Writing Tips</h4> | |
163 | ||
164 | <ul> | |
165 | <li><p>Define your function in its own package. This eliminates possible | |
166 | variable/routine conflicts with MHonArc. | |
167 | </p> | |
168 | ||
169 | <li><p>Make sure your Perl source file ends with a true statement | |
170 | (like "<code>1;</code>"). MHonArc just performs a | |
171 | <strong><code>require</code></strong> | |
172 | on the file, and if the file does not return | |
173 | true, MHonArc will revert to the default value for TEXTCLIPFUNC. | |
174 | </p> | |
175 | ||
176 | <li><p>Test your function before production use. | |
177 | </p> | |
178 | ||
179 | </ul> | |
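<p>The interface and tips above can be sketched as a small source
file. This is a minimal illustration, not MHonArc's own implementation:
the package name <code>MyClip</code> is hypothetical, and the pattern
used to recognize an entity reference is a simplifying assumption.
</p>
<pre>
package MyClip;   # hypothetical package name, keeps names out of MHonArc's way

use strict;
use warnings;

sub clip {
    # Copying from @_ means the caller's data is never modified.
    my ($text, $clip_length, $is_html, $has_tags) = @_;

    # Drop markup first when asked to, so tags do not count toward the length.
    $text =~ s/<[^>]*>//g if $has_tags;

    # Plain text: a simple substring is enough.
    return substr($text, 0, $clip_length) unless $is_html;

    # HTML text: consume one token at a time, where a token is either a
    # complete entity reference or a single character (assumed pattern).
    my $out   = '';
    my $count = 0;
    while ($count < $clip_length and $text =~ /\G(&[#\w]+;|.)/gs) {
        $out .= $1;
        $count++;
    }
    return $out;
}

1;   # a file loaded with require must end with a true value
</pre>
<p>Such a routine can be exercised from a short script or the command
line before it is referenced from a resource file, for example by
calling <code>MyClip::clip("A &amp; B", 3, 1, 0)</code> and checking
that the entity reference is kept whole.
</p>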
180 | ||
181 | <!-- *************************************************************** --> | |
182 | <hr> | |
183 | <h2>Default Setting</h2> | |
184 | ||
185 | <pre> | |
186 | mhonarc::clip_text; | |
187 | </pre> | |
188 | ||
189 | <!-- *************************************************************** --> | |
190 | <hr> | |
191 | <h2>Resource Variables</h2> | |
192 | ||
193 | <p>N/A | |
194 | </p> | |
195 | ||
196 | <!-- *************************************************************** --> | |
197 | <hr> | |
198 | <h2>Examples</h2> | |
199 | ||
200 | <p>The <a href="../rcfileexs/utf-8.mrc.html">Unicode</a> example | |
201 | resource file sets TEXTCLIPFUNC to a routine that understands UTF-8 | |
202 | text. | |
203 | </p> | |
204 | ||
205 | <p>The following is the implementation (as of this writing) of | |
206 | MHonArc's default clipping function: | |
207 | </p> | |
208 | ||
209 | <pre> | |
210 | sub clip_text { | |
211 | my $str = \shift; # Prevent unnecessary copy. | |
212 | my $len = shift; # Clip length | |
213 | my $is_html = shift; # If entity references should be considered | |
214 | my $has_tags = shift; # If html tags should be stripped | |
215 | ||
216 | if (!$is_html) { | |
217 | return substr($$str, 0, $len); | |
218 | } | |
219 | ||
220 | my $text = ""; | |
221 | my $subtext = ""; | |
222 | my $html_len = length($$str); | |
223 | my($pos, $sublen, $erlen, $real_len); | |
224 | my $er_len = 0; | |
225 | ||
226 | for ( $pos=0, $sublen=$len; $pos < $html_len; ) { | |
227 | $subtext = substr($$str, $pos, $sublen); | |
228 | $pos += $sublen; | |
229 | ||
230 | # strip tags | |
231 | if ($has_tags) { | |
232 | $subtext =~ s/\A[^<]*>//; # clipped tag | |
233 | $subtext =~ s/<[^>]*>//g; | |
234 | $subtext =~ s/<[^>]*\Z//; # clipped tag | |
235 | } | |
236 | ||
237 | # check for clipped entity reference | |
238 | if (($pos < $html_len) && ($subtext =~ /\&[^;]*\Z/)) { | |
239 | my $semi = index($$str, ';', $pos); | |
240 | if ($semi < 0) { | |
241 | # malformed entity reference | |
242 | $subtext .= substr($$str, $pos); | |
243 | $pos = $html_len; | |
244 | } else { | |
245 | $subtext .= substr($$str, $pos, $semi-$pos+1) | |
246 | if $semi > $pos; | |
247 | $pos = $semi+1; | |
248 | } | |
249 | } | |
250 | ||
251 | # compute entity reference lengths to determine "real" character | |
252 | # count and not raw character count. | |
253 | while ($subtext =~ /(\&[^;]+);/g) { | |
254 | $er_len += length($1); | |
255 | } | |
256 | ||
257 | $text .= $subtext; | |
258 | ||
259 | # done if we have enough | |
260 | $real_len = length($text)-$er_len; | |
261 | if ($real_len >= $len) { | |
262 | last; | |
263 | } | |
264 | $sublen = $len - (length($text)-$er_len); | |
265 | } | |
266 | $text; | |
267 | } | |
268 | </pre> | |
269 | ||
270 | <!-- *************************************************************** --> | |
271 | <hr> | |
272 | <h2>Version</h2> | |
273 | ||
274 | <p>2.5.10 | |
275 | </p> | |
276 | ||
277 | <!-- *************************************************************** --> | |
278 | <hr> | |
279 | <h2>See Also</h2> | |
280 | ||
281 | <p><a href="../rcvars.html"><cite>Resource Variables</cite></a> | |
282 | </p> | |
283 | ||
284 | <!-- *************************************************************** --> | |
285 | <hr> | |
286 | <!--x-rc-nav--> | |
287 | <table border=0><tr valign="top"> | |
288 | <td align="left" width="50%">[Prev: <a href="tcontend.html">TCONTEND</a>]</td><td><nobr>[<a href="../resources.html#textclipfunc">Resources</a>][<a href="../mhonarc.html">TOC</a>]</nobr></td><td align="right" width="50%">[Next: <a href="tfirstpglink.html">TFIRSTPGLINK</a>]</td></tr></table> | |
289 | <!--/x-rc-nav--> | |
290 | <hr> | |
291 | <address> | |
292 | $Date: 2002/08/04 03:58:27 $<br> | |
293 | <img align="top" src="../monicon.png" alt=""> | |
294 | <a href="http://www.mhonarc.org/"><strong>MHonArc</strong></a><br> | |
295 | Copyright © 2002, <a href="http://www.earlhood.com/" | |
296 | >Earl Hood</a>, <a href="mailto:mhonarc@mhonarc.org" | |
297 | >mhonarc@mhonarc.org</a><br> | |
298 | </address> | |
299 | ||
300 | </body> | |
301 | </html> |