.\" Automatically generated by Pod::Man v1.37, Pod::Parser v1.32
.\" ========================================================================
.de Sh \" Subsection heading
.de Sp \" Vertical space (when we can't use .PP)
.de Vb \" Begin verbatim text
.de Ve \" End verbatim text
.\" Set up some character translations and predefined strings. \*(-- will
.\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left
.\" double quote, and \*(R" will give a right double quote. | will give a
.\" real vertical bar. \*(C+ will give a nicer C++. Capital omega is used to
.\" do unbreakable dashes and therefore won't be available. \*(C` and \*(C'
.\" expand to `' in nroff, nothing in troff, for use with C<>.
.ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p'
. if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
. if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch
.\" If the F register is turned on, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.Sh), items (.Ip), and index
.\" entries marked with X<> in POD. Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
. tm Index:\\$1\t\\n%\t"\\$2"
.\" For nroff, turn off justification. Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.\" Accent mark definitions (@(#)ms.acc 1.5 88/02/08 SMI; from UCB 4.2).
.\" Fear. Run. Save yourself. No user-serviceable parts.
. \" fudge factors for nroff and troff
. ds #H ((1u-(\\\\n(.fu%2u))*.13m)
. \" simple accents for nroff and troff
. ds ' \\k:\h'-(\\n(.wu*8/10-\*(#H)'\'\h"|\\n:u"
. ds ` \\k:\h'-(\\n(.wu*8/10-\*(#H)'\`\h'|\\n:u'
. ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'^\h'|\\n:u'
. ds , \\k:\h'-(\\n(.wu*8/10)',\h'|\\n:u'
. ds ~ \\k:\h'-(\\n(.wu-\*(#H-.1m)'~\h'|\\n:u'
. ds / \\k:\h'-(\\n(.wu*8/10-\*(#H)'\z\(sl\h'|\\n:u'
. \" troff and (daisy-wheel) nroff accents
.ds : \\k:\h'-(\\n(.wu*8/10-\*(#H+.1m+\*(#F)'\v'-\*(#V'\z.\h'.2m+\*(#F'.\h'|\\n:u'\v'\*(#V'
.ds 8 \h'\*(#H'\(*b\h'-\*(#H'
.ds o \\k:\h'-(\\n(.wu+\w'\(de'u-\*(#H)/2u'\v'-.3n'\*(#[\z\(de\v'.3n'\h'|\\n:u'\*(#]
.ds d- \h'\*(#H'\(pd\h'-\w'~'u'\v'-.25m'\f2\(hy\fP\v'.25m'\h'-\*(#H'
.ds D- D\\k:\h'-\w'D'u'\v'-.11m'\z\(hy\v'.11m'\h'|\\n:u'
.ds th \*(#[\v'.3m'\s+1I\s-1\v'-.3m'\h'-(\w'I'u*2/3)'\s-1o\s+1\*(#]
.ds Th \*(#[\s+2I\s-2\h'-\w'I'u*3/5'\v'-.3m'o\v'.3m'\*(#]
.ds ae a\h'-(\w'a'u*4/10)'e
.ds Ae A\h'-(\w'A'u*4/10)'E
. \" corrections for vroff
.if v .ds ~ \\k:\h'-(\\n(.wu*9/10-\*(#H)'\s-2\u~\d\s+2\h'|\\n:u'
.if v .ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'\v'-.4m'^\v'.4m'\h'|\\n:u'
. \" for low resolution devices (crt and lpr)
.if \n(.H>23 .if \n(.V>19 \
.\" ========================================================================
.TH Encode::TW 3 "2001-09-21" "perl v5.8.8" "Perl Programmers Reference Guide"
Encode::TW \- Taiwan\-based Chinese Encodings
\& use Encode qw/encode decode/;
\& $big5 = encode("big5", $utf8); # loads Encode::TW implicitly
\& $utf8 = decode("big5", $big5); # ditto
This module implements traditional Chinese charset encodings as used
in Taiwan and Hong Kong.
Encodings supported are as follows.
\&  Canonical       Alias                  Description
\&  --------------------------------------------------------------------
\&  big5-eten       /\ebbig-?5$/i          Big5 encoding (with ETen extensions)
\&  big5-hkscs      /\ebbig5-?hk(scs)?$/i
\&                                         Big5 + Cantonese characters in Hong Kong
\&  MacChineseTrad                         Big5 + Apple Vendor Mappings
\&  cp950                                  Code Page 950
\&                                         = Big5 + Microsoft vendor mappings
\&  --------------------------------------------------------------------
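As a rough illustration of the table above, the following sketch round-trips one short string through each canonical name. The sample text and the names $text and %roundtrip are assumptions made for this example only; they are not part of the module.

```perl
use strict;
use warnings;
use feature 'say';
use Encode qw(encode decode);

# Sample text chosen for this sketch: U+4E2D U+6587.
my $text = chr(0x4E2D) . chr(0x6587);

my %roundtrip;    # canonical name => 1 if decoding the encoded bytes restores $text
for my $name (qw(big5-eten big5-hkscs MacChineseTrad cp950)) {
    my $bytes = encode($name, $text);
    my $hex   = join ' ', map { sprintf '%02X', ord } split //, $bytes;
    $roundtrip{$name} = decode($name, $bytes) eq $text ? 1 : 0;
    say sprintf '%-15s %-12s %s', $name, $hex,
        $roundtrip{$name} ? 'round-trips' : 'differs';
}
```

Characters that exist only in one vendor extension would not necessarily survive a trip through the other encodings, so a check like this is best run on text representative of the real data.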
To find out how to use this module in detail, see Encode.
Due to size concerns, \f(CW\*(C`EUC\-TW\*(C'\fR (Extended Unix Character), \f(CW\*(C`CCCII\*(C'\fR
(Chinese Character Code for Information Interchange), \f(CW\*(C`BIG5PLUS\*(C'\fR
(\s-1CMEX\s0's Big5+) and \f(CW\*(C`BIG5EXT\*(C'\fR (\s-1CMEX\s0's Big5e) are distributed separately
on \s-1CPAN\s0, under the name Encode::HanExtra. That module also contains
extra China-based encodings.
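Because Encode::HanExtra is a separate distribution, a script may want to probe for it before relying on it. The sketch below is one hedged way to do that; the lookup name euc-tw and the variables $have_extra and $euc_tw are assumptions of this example.

```perl
use strict;
use warnings;
use feature 'say';
use Encode qw(find_encoding);

# Encode::HanExtra is optional, so load it inside eval and record the outcome.
my $have_extra = eval { require Encode::HanExtra; 1 } ? 1 : 0;

# find_encoding returns undef for a name that no available module provides.
my $euc_tw = find_encoding('euc-tw');

if (defined $euc_tw) {
    say 'EUC-TW is available as ', $euc_tw->name;
}
else {
    say 'EUC-TW is not available; Encode::HanExtra ',
        $have_extra ? 'loaded without it' : 'is not installed';
}
```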
Since the original \f(CW\*(C`big5\*(C'\fR encoding (1984) is not supported anywhere
(glibc and DOS-based systems use \f(CW\*(C`big5\*(C'\fR to mean \f(CW\*(C`big5\-eten\*(C'\fR; Microsoft
uses \f(CW\*(C`big5\*(C'\fR to mean \f(CW\*(C`cp950\*(C'\fR), a conscious decision was made to alias
\&\f(CW\*(C`big5\*(C'\fR to \f(CW\*(C`big5\-eten\*(C'\fR, which is the de facto superset of the original
big5.
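The effect of this alias can be observed by asking Encode which canonical name a label resolves to. This is a minimal sketch; the three spellings tried and the hash %canonical are choices made for the example.

```perl
use strict;
use warnings;
use feature 'say';
use Encode qw(find_encoding);

# Spellings picked for this sketch; each should match the alias pattern for big5.
my %canonical;
for my $label (qw(big5 BIG5 Big-5)) {
    my $enc = find_encoding($label);
    $canonical{$label} = defined $enc ? $enc->name : 'unknown';
    say "$label resolves to $canonical{$label}";
}
```

Code that needs the Microsoft variant should therefore request cp950 explicitly rather than rely on the big5 label.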
The \f(CW\*(C`CNS11643\*(C'\fR encoding files are not complete. For common \f(CW\*(C`CNS11643\*(C'\fR
manipulation, please use \f(CW\*(C`EUC\-TW\*(C'\fR in Encode::HanExtra, which contains
planes 1\-7.
The \s-1ASCII\s0 region (0x00\-0x7f) is preserved for all encodings, even
though this conflicts with mappings by the Unicode Consortium. See
<http://www.debian.or.jp/~kubota/unicode\-symbols.html.en>
to find out why it is implemented that way.
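The statement about the ASCII region can be checked directly. The sketch below assumes the behaviour described above holds for the installed Encode version and simply reports what it finds; $ascii and %preserved are names introduced for the example.

```perl
use strict;
use warnings;
use feature 'say';
use Encode qw(encode decode);

# Every code point in the ASCII region, 0x00 through 0x7f, as one string.
my $ascii = join '', map { chr($_) } 0x00 .. 0x7f;

my %preserved;    # canonical name => 1 if both directions leave $ascii unchanged
for my $name (qw(big5-eten big5-hkscs MacChineseTrad cp950)) {
    my $same = encode($name, $ascii) eq $ascii
            && decode($name, $ascii) eq $ascii;
    $preserved{$name} = $same ? 1 : 0;
    say sprintf '%-15s %s', $name, $same ? 'preserved' : 'altered';
}
```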