.\" Automatically generated by Pod::Man v1.37, Pod::Parser v1.32
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sh \" Subsection heading
.br
.if t .Sp
.ne 5
.PP
\fB\\$1\fR
.PP
..
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" Set up some character translations and predefined strings. \*(-- will
.\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left
.\" double quote, and \*(R" will give a right double quote. | will give a
.\" real vertical bar. \*(C+ will give a nicer C++. Capital omega is used to
.\" do unbreakable dashes and therefore won't be available. \*(C` and \*(C'
.\" expand to `' in nroff, nothing in troff, for use with C<>.
.tr \(*W-|\(bv\*(Tr
.ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p'
.ie n \{\
. ds -- \(*W-
. ds PI pi
. if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
. if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch
. ds L" ""
. ds R" ""
. ds C` ""
. ds C' ""
'br\}
.el\{\
. ds -- \|\(em\|
. ds PI \(*p
. ds L" ``
. ds R" ''
'br\}
.\"
.\" If the F register is turned on, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.Sh), items (.Ip), and index
.\" entries marked with X<> in POD. Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.if \nF \{\
. de IX
. tm Index:\\$1\t\\n%\t"\\$2"
..
. nr % 0
. rr F
.\}
.\"
.\" For nroff, turn off justification. Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.hy 0
.if n .na
.\"
.\" Accent mark definitions (@(#)ms.acc 1.5 88/02/08 SMI; from UCB 4.2).
.\" Fear. Run. Save yourself. No user-serviceable parts.
. \" fudge factors for nroff and troff
.if n \{\
. ds #H 0
. ds #V .8m
. ds #F .3m
. ds #[ \f1
. ds #] \fP
.\}
.if t \{\
. ds #H ((1u-(\\\\n(.fu%2u))*.13m)
. ds #V .6m
. ds #F 0
. ds #[ \&
. ds #] \&
.\}
. \" simple accents for nroff and troff
.if n \{\
. ds ' \&
. ds ` \&
. ds ^ \&
. ds , \&
. ds ~ ~
. ds /
.\}
.if t \{\
. ds ' \\k:\h'-(\\n(.wu*8/10-\*(#H)'\'\h"|\\n:u"
. ds ` \\k:\h'-(\\n(.wu*8/10-\*(#H)'\`\h'|\\n:u'
. ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'^\h'|\\n:u'
. ds , \\k:\h'-(\\n(.wu*8/10)',\h'|\\n:u'
. ds ~ \\k:\h'-(\\n(.wu-\*(#H-.1m)'~\h'|\\n:u'
. ds / \\k:\h'-(\\n(.wu*8/10-\*(#H)'\z\(sl\h'|\\n:u'
.\}
. \" troff and (daisy-wheel) nroff accents
.ds : \\k:\h'-(\\n(.wu*8/10-\*(#H+.1m+\*(#F)'\v'-\*(#V'\z.\h'.2m+\*(#F'.\h'|\\n:u'\v'\*(#V'
.ds 8 \h'\*(#H'\(*b\h'-\*(#H'
.ds o \\k:\h'-(\\n(.wu+\w'\(de'u-\*(#H)/2u'\v'-.3n'\*(#[\z\(de\v'.3n'\h'|\\n:u'\*(#]
.ds d- \h'\*(#H'\(pd\h'-\w'~'u'\v'-.25m'\f2\(hy\fP\v'.25m'\h'-\*(#H'
.ds D- D\\k:\h'-\w'D'u'\v'-.11m'\z\(hy\v'.11m'\h'|\\n:u'
.ds th \*(#[\v'.3m'\s+1I\s-1\v'-.3m'\h'-(\w'I'u*2/3)'\s-1o\s+1\*(#]
.ds Th \*(#[\s+2I\s-2\h'-\w'I'u*3/5'\v'-.3m'o\v'.3m'\*(#]
.ds ae a\h'-(\w'a'u*4/10)'e
.ds Ae A\h'-(\w'A'u*4/10)'E
. \" corrections for vroff
.if v .ds ~ \\k:\h'-(\\n(.wu*9/10-\*(#H)'\s-2\u~\d\s+2\h'|\\n:u'
.if v .ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'\v'-.4m'^\v'.4m'\h'|\\n:u'
. \" for low resolution devices (crt and lpr)
.if \n(.H>23 .if \n(.V>19 \
\{\
. ds : e
. ds 8 ss
. ds o a
. ds d- d\h'-1'\(ga
. ds D- D\h'-1'\(hy
. ds th \o'bp'
. ds Th \o'LP'
. ds ae ae
. ds Ae AE
.\}
.rm #[ #] #H #V #F C
.\" ========================================================================
.\"
.IX Title "Encode::Encoding 3"
.TH Encode::Encoding 3 "2001-09-21" "perl v5.8.8" "Perl Programmers Reference Guide"
.SH "NAME"
Encode::Encoding \- Encode Implementation Base Class
.SH "SYNOPSIS"
.IX Header "SYNOPSIS"
.Vb 2
\&  package Encode::MyEncoding;
\&  use base qw(Encode::Encoding);
.Ve
.PP
.Vb 1
\&  __PACKAGE__->Define(qw(myCanonical myAlias));
.Ve
.SH "DESCRIPTION"
.IX Header "DESCRIPTION"
As mentioned in Encode, encodings are (in the current
implementation at least) defined as objects. The mapping of encoding
name to object is via the \f(CW%Encode::Encoding\fR hash. Although you can
manipulate this hash directly, you are strongly encouraged to use this
base class module and add \fIencode()\fR and \fIdecode()\fR methods.
.Sh "Methods you should implement"
.IX Subsection "Methods you should implement"
You are strongly encouraged to implement the methods below, or at
least one of \fIencode()\fR and \fIdecode()\fR.
.IP "\->encode($string [,$check])" 4
.IX Item "->encode($string [,$check])"
\&\s-1MUST\s0 return the octet sequence representing \fI$string\fR.
.RS 4
.IP "*" 2
If \fI$check\fR is true, it \s-1SHOULD\s0 modify \fI$string\fR in place to remove
the converted part (i.e. the whole string unless there is an error).
If \fIperlio_ok()\fR is true, \s-1SHOULD\s0 becomes \s-1MUST\s0.
.IP "*" 2
If an error occurs, it \s-1SHOULD\s0 return the octet sequence for the
fragment of string that has been converted and modify \f(CW$string\fR in-place
to remove the converted part leaving it starting with the problem
fragment. If \fIperlio_ok()\fR is true, \s-1SHOULD\s0 becomes \s-1MUST\s0.
.IP "*" 2
If \fI$check\fR is false then \f(CW\*(C`encode\*(C'\fR \s-1MUST\s0 make a \*(L"best effort\*(R" to
convert the string \- for example, by using a replacement character.
.RE
.RS 4
.RE
.IP "\->decode($octets [,$check])" 4
.IX Item "->decode($octets [,$check])"
\&\s-1MUST\s0 return the string that \fI$octets\fR represents.
.RS 4
.IP "*" 2
If \fI$check\fR is true, it \s-1SHOULD\s0 modify \fI$octets\fR in place to remove
the converted part (i.e. the whole sequence unless there is an
error). If \fIperlio_ok()\fR is true, \s-1SHOULD\s0 becomes \s-1MUST\s0.
.IP "*" 2
If an error occurs, it \s-1SHOULD\s0 return the fragment of string that has
been converted and modify \f(CW$octets\fR in-place to remove the converted
part leaving it starting with the problem fragment. If \fIperlio_ok()\fR is
true, \s-1SHOULD\s0 becomes \s-1MUST\s0.
.IP "*" 2
If \fI$check\fR is false then \f(CW\*(C`decode\*(C'\fR should make a \*(L"best effort\*(R" to
convert the string \- for example by using Unicode's \*(L"\ex{\s-1FFFD\s0}\*(R" as a
replacement character.
.RE
.RS 4
.RE
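.PP
As a minimal sketch of the $check rules above, the following hypothetical
encoding accepts only 7-bit octets. The package name My::StrictAscii and the
encoding name x-strict-ascii are assumptions made for illustration and are not
part of Encode. With a true $check, decode() returns the leading run it could
convert and leaves the caller's buffer starting at the problem octet; with a
false $check it substitutes U+FFFD.

```perl
use strict;
use warnings;
use Encode ();

{
    # Hypothetical package and encoding name, for illustration only.
    package My::StrictAscii;
    use base qw(Encode::Encoding);
    __PACKAGE__->Define('x-strict-ascii');

    sub decode {
        my ($obj, $octets, $check) = @_;
        $octets = '' unless defined $octets;

        # Longest leading run of 7-bit octets: the part we can convert.
        my ($good) = $octets =~ /\A([\x00-\x7F]*)/;

        if ($check) {
            # In-place edit of the caller's buffer: drop what was converted,
            # so the buffer now starts at the problem octet (or is empty).
            $_[1] = substr($octets, length $good);
            return $good;
        }

        # Unchecked: best effort, using U+FFFD as the replacement character.
        (my $str = $octets) =~ s/[^\x00-\x7F]/\x{FFFD}/g;
        return $str;
    }
}

my $enc = Encode::find_encoding('x-strict-ascii');
my $buf = "ab\xE9cd";
my $out = $enc->decode($buf, 1);    # converts "ab" and trims it from $buf
```

The method is called on the encoding object directly here, so the in-place
edit of $buf is the one performed by the sketch itself rather than by the
public Encode functions.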
.PP
If you want your encoding to work with the encoding pragma, you should
also implement the method below.
.ie n .IP "\->cat_decode($destination, $octets\fR, \f(CW$offset\fR, \f(CW$terminator [,$check])" 4
.el .IP "\->cat_decode($destination, \f(CW$octets\fR, \f(CW$offset\fR, \f(CW$terminator\fR [,$check])" 4
.IX Item "->cat_decode($destination, $octets, $offset, $terminator [,$check])"
\&\s-1MUST\s0 decode \fI$octets\fR starting at \fI$offset\fR and concatenate the result to \fI$destination\fR.
Decoding terminates when \f(CW$terminator\fR (a string) appears in the output.
\&\fI$offset\fR is updated to the last position reached in \f(CW$octets\fR when decoding ends.
Returns true if \f(CW$terminator\fR appears in the output; otherwise returns false.
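.Sp
A minimal sketch of cat_decode() for a hypothetical single-byte encoding whose
octets map one-to-one to characters, so that decoding is a plain copy. The
names My::Passthrough and x-passthrough are assumptions. The sketch also
assumes that the terminator is included in the output and that $offset ends up
just past it, and it ignores $check.

```perl
use strict;
use warnings;
use Encode ();

{
    # Hypothetical package and encoding name, for illustration only.
    package My::Passthrough;
    use base qw(Encode::Encoding);
    __PACKAGE__->Define('x-passthrough');

    sub cat_decode {
        my ($obj, undef, $octets, $offset, $terminator) = @_;

        my $hit = index($octets, $terminator, $offset);
        if ($hit >= 0) {
            # Copy up to and including the terminator, then advance $offset.
            my $end = $hit + length $terminator;
            $_[1] .= substr($octets, $offset, $end - $offset);
            $_[3] = $end;
            return 1;
        }

        # No terminator found: consume the rest of the input.
        $_[1] .= substr($octets, $offset);
        $_[3] = length $octets;
        return 0;
    }
}

my $enc = Encode::find_encoding('x-passthrough');
my ($dst, $src, $pos) = ('', "one\ntwo", 0);
my $first  = $enc->cat_decode($dst, $src, $pos, "\n");    # finds "\n"
my $second = $enc->cat_decode($dst, $src, $pos, "\n");    # runs to the end
```

Both $destination and $offset are updated through @_ because the caller
expects them to change in place.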
.Sh "Other methods defined in Encode::Encoding"
.IX Subsection "Other methods defined in Encode::Encoding"
You do not need to override the methods shown below unless your
encoding requires different behaviour.
.IP "\->name" 4
.IX Item "->name"
Predefined As:
.Sp
.Vb 1
\&  sub name { return shift->{'Name'} }
.Ve
.Sp
\&\s-1MUST\s0 return the string representing the canonical name of the encoding.
.IP "\->renew" 4
.IX Item "->renew"
Predefined As:
.Sp
.Vb 6
\&  sub renew {
\&      my $self = shift;
\&      my $clone = bless { %$self } => ref($self);
\&      $clone->{renewed}++;
\&      return $clone;
\&  }
.Ve
.Sp
This method reconstructs the encoding object if necessary. If you need
to store the state during encoding, this is where you clone your object.
.Sp
PerlIO \s-1ALWAYS\s0 calls this method to make sure it has its own private
encoding object.
.IP "\->renewed" 4
.IX Item "->renewed"
Predefined As:
.Sp
.Vb 1
\&  sub renewed { $_[0]->{renewed} || 0 }
.Ve
.Sp
Tells whether the object is renewed (and how many times). Some
modules emit a \f(CW\*(C`Use of uninitialized value in null operation\*(C'\fR warning
unless the value is numeric, so return 0 for false.
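.Sp
To illustrate renew() and renewed() together, the sketch below keeps a
per-object counter in a hypothetical encoding (My::Counting and x-counting are
assumed names). The renewed copy accumulates state while the registered object
stays untouched. The predefined renew() shown above copies only the top level
of the hash, so nested references would still be shared between the two.

```perl
use strict;
use warnings;
use Encode ();

{
    # Hypothetical package and encoding name, for illustration only.
    package My::Counting;
    use base qw(Encode::Encoding);
    __PACKAGE__->Define('x-counting');

    sub encode {
        my ($obj, $str, $check) = @_;
        $obj->{calls}++;         # state kept on this particular object
        $_[1] = '' if $check;    # in-place edit, as in the ROT13 example
        return $str;             # identity mapping; a real encoding converts here
    }
}

my $shared  = Encode::find_encoding('x-counting');
my $private = $shared->renew;    # private clone, as PerlIO would request

$private->encode('abc') for 1 .. 3;

printf "%s: shared renewed=%d, private renewed=%d, private calls=%d\n",
    $shared->name, $shared->renewed, $private->renewed, $private->{calls};
```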
.IP "\->\fIperlio_ok()\fR" 4
.IX Item "->perlio_ok()"
Predefined As:
.Sp
.Vb 4
\&  sub perlio_ok {
\&      eval{ require PerlIO::encoding };
\&      return $@ ? 0 : 1;
\&  }
.Ve
.Sp
If your encoding does not support PerlIO for some reason, just define:
.Sp
.Vb 1
\&  sub perlio_ok { 0 }
.Ve
.IP "\->\fIneeds_lines()\fR" 4
.IX Item "->needs_lines()"
Predefined As:
.Sp
.Vb 1
\&  sub needs_lines { 0 };
.Ve
.Sp
If your encoding can work with PerlIO but needs line buffering, you
\&\s-1MUST\s0 define this method so it returns true. 7bit \s-1ISO\-2022\s0 encodings
are one example that needs this. When this method is missing, false
is assumed.
.Sh "Example: Encode::ROT13"
.IX Subsection "Example: Encode::ROT13"
.Vb 3
\&  package Encode::ROT13;
\&  use strict;
\&  use base qw(Encode::Encoding);
.Ve
.PP
.Vb 1
\&  __PACKAGE__->Define('rot13');
.Ve
.PP
.Vb 6
\&  sub encode($$;$){
\&      my ($obj, $str, $chk) = @_;
\&      $str =~ tr/A-Za-z/N-ZA-Mn-za-m/;
\&      $_[1] = '' if $chk; # this is what in-place edit means
\&      return $str;
\&  }
.Ve
.PP
.Vb 2
\&  # Jr pna or ynml yvxr guvf;
\&  *decode = \e&encode;
.Ve
.PP
.Vb 1
\&  1;
.Ve
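.PP
A short usage sketch for the example above. It repeats the Encode::ROT13
package inline, without the prototype (Perl ignores prototypes on method
calls), so that the snippet runs as a single file. In practice the module would
normally live in its own file and be loaded with use.

```perl
use strict;
use warnings;
use Encode qw(encode decode find_encoding);

{
    package Encode::ROT13;
    use base qw(Encode::Encoding);
    __PACKAGE__->Define('rot13');

    sub encode {
        my ($obj, $str, $chk) = @_;
        $str =~ tr/A-Za-z/N-ZA-Mn-za-m/;
        $_[1] = '' if $chk;    # in-place edit of the caller's string
        return $str;
    }

    no warnings 'once';
    *decode = \&encode;        # ROT13 is its own inverse
}

# Through the public API, by encoding name.
my $octets = encode('rot13', 'Hello');
my $string = decode('rot13', $octets);

# Directly on the object, with a true $check: $input is emptied in place.
my $input = 'Hello';
my $again = find_encoding('rot13')->encode($input, 1);

print "$octets $string $again [$input]\n";
```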
.SH "Why the heck is the Encode API different?"
.IX Header "Why the heck is the Encode API different?"
It should be noted that the \fI$check\fR behaviour is different from the
outer public \s-1API\s0. The logic is that the \*(L"unchecked\*(R" case is useful
when the encoding is part of a stream which may be reporting errors
(e.g. \s-1STDERR\s0). In such cases, it is desirable to get everything
through somehow without causing additional errors which obscure the
original one. Also, the encoding is best placed to know what the
correct replacement character is, so if that is the desired behaviour
then letting the low-level code do it is the most efficient approach.
.PP
By contrast, if \fI$check\fR is true, the scheme above allows the
encoding to do as much as it can and tell the layer above how much
that was. What is lacking at present is a mechanism to report what
went wrong. The most likely interface will be an additional method
call to the object, or perhaps (to avoid forcing per-stream objects
on otherwise stateless encodings) an additional parameter.
.PP
It is also highly desirable that encoding classes inherit from
\&\f(CW\*(C`Encode::Encoding\*(C'\fR as a base class. This allows that class to define
additional behaviour for all encoding objects.
.PP
.Vb 2
\&  package Encode::MyEncoding;
\&  use base qw(Encode::Encoding);
.Ve
.PP
.Vb 1
\&  __PACKAGE__->Define(qw(myCanonical myAlias));
.Ve
.PP
The \f(CW\*(C`Define\*(C'\fR call creates an object with \f(CW\*(C`bless {Name => ...}, $class\*(C'\fR
and registers it by calling define_encoding. Such objects inherit their
\&\f(CW\*(C`name\*(C'\fR method from \f(CW\*(C`Encode::Encoding\*(C'\fR.
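.PP
For comparison, the same registration can be written out by hand. This is a
sketch based on the description above, not a copy of the module's source.
My::Manual, x-manual and x-manual-alias are hypothetical names.

```perl
use strict;
use warnings;
use Encode ();

{
    # Hypothetical encoding class, for illustration only.
    package My::Manual;
    use base qw(Encode::Encoding);

    sub encode {
        my ($obj, $str, $check) = @_;
        $_[1] = '' if $check;
        return $str;    # identity mapping; a real encoding converts here
    }
}

# Build the object and register it under a canonical name and one alias.
my $obj = bless { Name => 'x-manual' }, 'My::Manual';
Encode::define_encoding($obj, 'x-manual', 'x-manual-alias');

# Lookup by alias returns the registered object; name() is inherited
# from Encode::Encoding and reports the canonical name.
my $found = Encode::find_encoding('x-manual-alias');
print $found->name, "\n";
```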
.Sh "Compiled Encodings"
.IX Subsection "Compiled Encodings"
For the sake of speed and efficiency, most of the encodings are now
supported via a \fIcompiled form\fR: \s-1XS\s0 modules generated from \s-1UCM\s0
files. Encode provides the enc2xs tool to achieve that. Please see
enc2xs for more details.
.SH "SEE ALSO"
.IX Header "SEE ALSO"
perlmod, enc2xs