.\" Automatically generated by Pod::Man v1.37, Pod::Parser v1.32
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sh \" Subsection heading
.br
.if t .Sp
.ne 5
.PP
\fB\\$1\fR
.PP
..
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" Set up some character translations and predefined strings.  \*(-- will
.\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left
.\" double quote, and \*(R" will give a right double quote.  | will give a
.\" real vertical bar.  \*(C+ will give a nicer C++.  Capital omega is used to
.\" do unbreakable dashes and therefore won't be available.  \*(C` and \*(C'
.\" expand to `' in nroff, nothing in troff, for use with C<>.
.tr \(*W-|\(bv\*(Tr
.ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p'
.ie n \{\
.    ds -- \(*W-
.    ds PI pi
.    if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
.    if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch
.    ds L" ""
.    ds R" ""
.    ds C` ""
.    ds C' ""
'br\}
.el\{\
.    ds -- \|\(em\|
.    ds PI \(*p
.    ds L" ``
.    ds R" ''
'br\}
.\"
.\" If the F register is turned on, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.Sh), items (.Ip), and index
.\" entries marked with X<> in POD.  Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.if \nF \{\
.    de IX
.    tm Index:\\$1\t\\n%\t"\\$2"
..
.    nr % 0
.    rr F
.\}
.\"
.\" For nroff, turn off justification.  Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.hy 0
.if n .na
.\"
.\" Accent mark definitions (@(#)ms.acc 1.5 88/02/08 SMI; from UCB 4.2).
.\" Fear.  Run.  Save yourself.  No user-serviceable parts.
.    \" fudge factors for nroff and troff
.if n \{\
.    ds #H 0
.    ds #V .8m
.    ds #F .3m
.    ds #[ \f1
.    ds #] \fP
.\}
.if t \{\
.    ds #H ((1u-(\\\\n(.fu%2u))*.13m)
.    ds #V .6m
.    ds #F 0
.    ds #[ \&
.    ds #] \&
.\}
.    \" simple accents for nroff and troff
.if n \{\
.    ds ' \&
.    ds ` \&
.    ds ^ \&
.    ds , \&
.    ds ~ ~
.    ds /
.\}
.if t \{\
.    ds ' \\k:\h'-(\\n(.wu*8/10-\*(#H)'\'\h"|\\n:u"
.    ds ` \\k:\h'-(\\n(.wu*8/10-\*(#H)'\`\h'|\\n:u'
.    ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'^\h'|\\n:u'
.    ds , \\k:\h'-(\\n(.wu*8/10)',\h'|\\n:u'
.    ds ~ \\k:\h'-(\\n(.wu-\*(#H-.1m)'~\h'|\\n:u'
.    ds / \\k:\h'-(\\n(.wu*8/10-\*(#H)'\z\(sl\h'|\\n:u'
.\}
.    \" troff and (daisy-wheel) nroff accents
.ds : \\k:\h'-(\\n(.wu*8/10-\*(#H+.1m+\*(#F)'\v'-\*(#V'\z.\h'.2m+\*(#F'.\h'|\\n:u'\v'\*(#V'
.ds 8 \h'\*(#H'\(*b\h'-\*(#H'
.ds o \\k:\h'-(\\n(.wu+\w'\(de'u-\*(#H)/2u'\v'-.3n'\*(#[\z\(de\v'.3n'\h'|\\n:u'\*(#]
.ds d- \h'\*(#H'\(pd\h'-\w'~'u'\v'-.25m'\f2\(hy\fP\v'.25m'\h'-\*(#H'
.ds D- D\\k:\h'-\w'D'u'\v'-.11m'\z\(hy\v'.11m'\h'|\\n:u'
.ds th \*(#[\v'.3m'\s+1I\s-1\v'-.3m'\h'-(\w'I'u*2/3)'\s-1o\s+1\*(#]
.ds Th \*(#[\s+2I\s-2\h'-\w'I'u*3/5'\v'-.3m'o\v'.3m'\*(#]
.ds ae a\h'-(\w'a'u*4/10)'e
.ds Ae A\h'-(\w'A'u*4/10)'E
.    \" corrections for vroff
.if v .ds ~ \\k:\h'-(\\n(.wu*9/10-\*(#H)'\s-2\u~\d\s+2\h'|\\n:u'
.if v .ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'\v'-.4m'^\v'.4m'\h'|\\n:u'
.    \" for low resolution devices (crt and lpr)
.if \n(.H>23 .if \n(.V>19 \
\{\
.    ds : e
.    ds 8 ss
.    ds o a
.    ds d- d\h'-1'\(ga
.    ds D- D\h'-1'\(hy
.    ds th \o'bp'
.    ds Th \o'LP'
.    ds ae ae
.    ds Ae AE
.\}
.rm #[ #] #H #V #F C
.\" ========================================================================
.\"
.IX Title "Encode::JP 3"
.TH Encode::JP 3 "2001-09-21" "perl v5.8.8" "Perl Programmers Reference Guide"
.SH "NAME"
Encode::JP \- Japanese Encodings
.SH "SYNOPSIS"
.IX Header "SYNOPSIS"
.Vb 3
\&    use Encode qw/encode decode/;
\&    $euc_jp = encode("euc-jp", $utf8);   # loads Encode::JP implicitly
\&    $utf8   = decode("euc-jp", $euc_jp); # ditto
.Ve
.SH "ABSTRACT"
.IX Header "ABSTRACT"
This module implements Japanese charset encodings.  Encodings
supported are as follows.
.PP
.Vb 21
\&  Canonical     Alias              Description
\&  --------------------------------------------------------------------
\&  euc-jp        /\ebeuc.*jp$/i      EUC (Extended Unix Code)
\&                /\ebjp.*euc/i
\&                /\ebujis$/i
\&  shiftjis      /\ebshift.*jis$/i   Shift JIS (aka MS Kanji)
\&                /\ebsjis$/i
\&  7bit-jis      /\ebjis$/i          7bit JIS
\&  iso-2022-jp                      ISO-2022-JP               [RFC1468]
\&                                   = 7bit JIS with all Halfwidth Kana
\&                                     converted to Fullwidth
\&  iso-2022-jp-1                    ISO-2022-JP-1             [RFC2237]
\&                                   = ISO-2022-JP with JIS X 0212-1990
\&                                     support.  See below
\&  MacJapanese                      Shift JIS + Apple vendor mappings
\&  cp932         /\ebwindows-31j$/i  Code Page 932
\&                                   = Shift JIS + MS/IBM vendor mappings
\&  jis0201-raw                      JIS0201, raw format
\&  jis0208-raw                      JIS0208, raw format
\&  jis0212-raw                      JIS0212, raw format
\&  --------------------------------------------------------------------
.Ve
.SH "DESCRIPTION"
.IX Header "DESCRIPTION"
To find out how to use this module in detail, see Encode.
.SH "Note on ISO\-2022\-JP(\-1)?"
.IX Header "Note on ISO-2022-JP(-1)?"
\&\s-1ISO\-2022\-JP\-1\s0 (\s-1RFC2237\s0) is a superset of \s-1ISO\-2022\-JP\s0 (\s-1RFC1468\s0) that adds
support for \s-1JIS\s0 X 0212\-1990.  As a result, either encoding name can be
used to decode an \s-1ISO\-2022\-JP\s0 stream to utf8, but the two are not
interchangeable when encoding.
.PP
.Vb 1
\&  $utf8 = decode('iso-2022-jp-1', $stream);
.Ve
.PP
and
.PP
.Vb 1
\&  $utf8 = decode('iso-2022-jp', $stream);
.Ve
.PP
yield the same result for an \s-1ISO\-2022\-JP\s0 stream, but
.PP
.Vb 1
\&  $with_0212 = encode('iso-2022-jp-1', $utf8);
.Ve
.PP
can differ from
.PP
.Vb 1
\&  $without_0212 = encode('iso-2022-jp', $utf8);
.Ve
.PP
In the latter case, characters that map to 0212 are first converted
to U+3013 (0xA2AE in \s-1EUC\-JP\s0; a white square also known as 'Tofu' or
\&'geta mark') and then fed to the encoding engine.  U+FFFD is not used,
in order to preserve text layout as much as possible.
.SH "BUGS"
.IX Header "BUGS"
The \s-1ASCII\s0 region (0x00\-0x7f) is preserved for all encodings, even
though this conflicts with mappings by the Unicode Consortium.
See
.PP
.PP
to find out why it is implemented that way.
.SH "SEE ALSO"
.IX Header "SEE ALSO"
Encode
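.SH "EXAMPLES"
.IX Header "EXAMPLES"
The following is a minimal sketch, not taken from the module itself, of the
encoding difference described in the note on \s-1ISO\-2022\-JP\s0 and
\&\s-1ISO\-2022\-JP\-1\s0.  It assumes that U+4E02 is a character covered by
\&\s-1JIS\s0 X 0212\-1990 but not by \s-1JIS\s0 X 0208; the variable names are
illustrative only.
.PP
.Vb 18
\&  use strict;
\&  use warnings;
\&  use Encode qw/encode decode/;
\&
\&  # Assumption: U+4E02 is in JIS X 0212 but not in JIS X 0208.
\&  my $text = chr(0x4E02);
\&
\&  my $with_0212    = encode('iso-2022-jp-1', $text);
\&  my $without_0212 = encode('iso-2022-jp',   $text);
\&
\&  # Show both byte strings as hex so they can be compared.
\&  print unpack('H*', $with_0212),    "\en";
\&  print unpack('H*', $without_0212), "\en";
\&
\&  # Decode each byte string with the encoding that produced it and
\&  # print the code point of the resulting character.
\&  printf "U+%04X\en", ord(decode('iso-2022-jp-1', $with_0212));
\&  printf "U+%04X\en", ord(decode('iso-2022-jp',   $without_0212));
.Ve
.PP
If the assumption holds, the two hex strings should differ, and the second
decode should report the substitute character rather than the original one.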