Commit | Line | Data |
---|---|---|
c2e56add MT |
1 | .\" Copyright (c) 1989 The Regents of the University of California. |
2 | .\" All rights reserved. | |
3 | .\" | |
4 | .\" Redistribution and use in source and binary forms are permitted | |
5 | .\" provided that the above copyright notice and this paragraph are | |
6 | .\" duplicated in all such forms and that any documentation, | |
7 | .\" advertising materials, and other materials related to such | |
8 | .\" distribution and use acknowledge that the software was developed | |
9 | .\" by the University of California, Berkeley. The name of the | |
10 | .\" University may not be used to endorse or promote products derived | |
11 | .\" from this software without specific prior written permission. | |
12 | .\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR | |
13 | .\" IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED | |
14 | .\" WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. | |
15 | .\" | |
6fae5e3d | 16 | .\" @(#)vis.3 5.2 (Berkeley) %G% |
c2e56add MT |
17 | .\" |
18 | .TH <CENCODE> <3> "" | |
19 | .UC 7 | |
20 | .AT 3 | |
21 | .SH NAME | |
22 | cencode, cdecode \- encode (decode) non-printing characters | |
23 | .SH SYNOPSIS | |
24 | .nf | |
25 | .B #include <cencode.h> | |
26 | .PP | |
27 | .B char *cencode(c, cflag) | |
28 | .B char c; | |
29 | .B int cflag; | |
30 | .PP | |
31 | .B cdecode(c, cp, dflag) | |
32 | .B char c, *cp; | |
33 | .B int dflag; | |
34 | .SH DESCRIPTION | |
35 | \fICencode\fP converts a non-printing character into a printable, | |
36 | invertible representation; \fIcdecode\fP converts | |
37 | that representation back to the original character. | |
38 | Both functions pass through printable characters, and | |
39 | are useful for filtering a stream of characters | |
40 | to and from a visual representation. | |
41 | .PP | |
42 | By default, \fIcencode\fP considers isgraph(c), space, tab, and | |
43 | newline as printable characters. Setting CENC_WHITE in | |
44 | cflag causes space, tab, and newline to be | |
45 | encoded as well. | |
46 | .PP | |
47 | There are three forms of representation, and all | |
48 | three can be requested, independently of each other, | |
49 | since some encode only a subset | |
50 | of the non-printable characters. | |
51 | All | |
52 | forms use the backslash character to introduce the visual | |
53 | sequence; two backslashes are used to represent a | |
54 | real backslash. Each form is listed below by the name of the flag | |
55 | that selects it in cflag, followed by a description: | |
56 | .TP | |
57 | .I CENC_CTYPE | |
58 | Use C-style backslash sequences where possible. The following | |
59 | sequences are used to represent the indicated character: | |
60 | .nf | |
61 | ||
62 | \\n - NL (012) | |
63 | \\r - CR (015) | |
64 | \\b - BS (010) | |
65 | \\a - BEL (007) | |
66 | \\v - VT (013) | |
67 | \\t - HT (011) | |
68 | \\f - NP (014) | |
69 | \\000 - NUL (000) | |
70 | ||
71 | .fi | |
72 | These are the only characters that are converted using CENC_CTYPE. | |
73 | The more familiar abbreviation of \\0 for NUL cannot be used since | |
74 | it could be misread as part of a longer octal number if the sequence | |
75 | is followed by other octal digits. | |
76 | .PP | |
77 | .TP | |
78 | .I CENC_GRAPHIC | |
79 | Use an M to represent meta characters (chars with the 8th bit set), | |
80 | and use hat (^) to represent control characters (iscntrl(c)). The | |
81 | following forms are possible: | |
82 | .nf | |
83 | ||
84 | \\^C - Represents control character 'C'. Spans | |
85 | characters 000 through 037, and 0177 (as \\^?). | |
86 | \\M-C - Represents character 'C' with the 8th bit set. | |
87 | Spans characters 0240 (0241 if CENC_WHITE is set) | |
88 | through 0376. | |
89 | \\M^C - Represents control character 'C' with the 8th | |
90 | bit set. Spans characters 0200 through 0237, | |
91 | and 0377 (as \\M^?). | |
92 | ||
93 | .fi | |
94 | The only characters that cannot be displayed using CENC_GRAPHIC | |
95 | are space and meta-space, and only when CENC_WHITE is set. | |
96 | .TP | |
97 | .I CENC_OCTAL | |
98 | Use a three digit octal sequence. The form is: | |
99 | .nf | |
100 | ||
101 | \\ddd | |
102 | ||
103 | .fi | |
104 | where d represents an octal digit. All non-printing characters | |
105 | can be displayed in this form. | |
106 | .PP | |
107 | \fICencode\fP returns a pointer to a string that contains the | |
108 | printable representation of the character passed in c. If the character | |
109 | could not be encoded (because none of the selected formats can | |
110 | encode that character), it is placed in the returned | |
111 | string un-encoded. Note that if NUL is not encoded, it is placed | |
112 | in the string as two NULs. If the caller expects to encounter | |
113 | this situation, it suffices to always extract one character from | |
114 | the returned string before checking for NUL. If CENC_OCTAL | |
115 | is selected, in addition to any other formats, this situation | |
116 | can never arise. Also, calling \fIcencode\fP with no requested formats | |
117 | results in no encoding being done; however, backslashes are | |
118 | still doubled. | |
119 | .PP | |
120 | Using \fIcdecode\fP to decode previously encoded data is a little | |
121 | trickier. Essentially, characters are passed to \fIcdecode\fP | |
122 | until the decoder recognizes a character to return. There are | |
123 | five return codes which need to be handled: | |
124 | .TP | |
125 | .I CDEC_NEEDMORE | |
126 | The decoder is not done recognizing a control sequence; pass it | |
127 | another character in c. | |
128 | .TP | |
129 | .I CDEC_OK | |
6fae5e3d | 130 | A character was recognized and has been placed in *cp. |
c2e56add MT |
131 | .TP |
132 | .I CDEC_OKPUSH | |
6fae5e3d MT |
133 | A character was recognized and has been placed in *cp; however, |
134 | the character that was just passed in c is not yet needed. | |
c2e56add MT |
135 | When processing a stream of characters, the current character |
136 | should be used again. | |
137 | .TP | |
138 | .I CDEC_NOCHAR | |
139 | A sequence which represents no character was detected. | |
140 | .TP | |
141 | .I CDEC_SYNBAD | |
142 | An unrecognized backslash sequence was detected. The decoder | |
143 | was automatically reset to a normal state. All characters since | |
144 | the last un-escaped backslash character constitute the | |
145 | unrecognized sequence. | |
146 | .PP | |
147 | When the caller is finished feeding characters to \fIcdecode\fP, | |
148 | it | |
149 | should be called one last time with dflag set to CDEC_END. This will extract | |
150 | any remaining character. | |
151 | The following code fragment | |
152 | illustrates the use of \fIcdecode\fP: | |
153 | .nf | |
154 | ||
155 | int c; char nc; | |
156 | while ((c = getchar()) != EOF) { | |
157 | again: | |
158 | switch(cdecode((char)c, &nc, 0)) { | |
159 | case CDEC_NEEDMORE: | |
160 | case CDEC_NOCHAR: | |
161 | break; | |
162 | case CDEC_OK: | |
163 | putchar(nc); | |
164 | break; | |
165 | case CDEC_OKPUSH: | |
166 | putchar(nc); | |
167 | goto again; | |
168 | case CDEC_SYNBAD: | |
169 | fprintf(stderr, "Bad sequence\n"); | |
170 | exit(1); | |
171 | } | |
172 | } | |
173 | if (cdecode((char)0, &nc, CDEC_END) == CDEC_OK) | |
174 | putchar(nc); | |
175 | ||
176 | .fi | |
177 | .SH "SEE ALSO" | |
178 | vis(1) |
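
The notation of the CENC_OCTAL and CENC_GRAPHIC forms described above can be illustrated with a small stand-alone sketch. It does not call cencode and is not its implementation: the helpers sketch_octal, sketch_graphic and sketch_filter are hypothetical names introduced only for this illustration. The sketch assumes ASCII, 8-bit bytes, and that CENC_WHITE is not set (so space, tab and newline pass through), and it assumes the conventional caret mapping in which control character 001 is written as \^A.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helpers, not part of the cencode interface.
 * Assumptions: ASCII, 8-bit bytes, CENC_WHITE not set.
 */

/* Write the three-digit octal form (\ddd) of c into buf (5 bytes or more). */
static char *sketch_octal(unsigned char c, char *buf)
{
    assert(buf != NULL);
    sprintf(buf, "\\%03o", (unsigned int)c);
    return buf;
}

/*
 * Write the graphic form of c into buf (5 bytes or more):
 * \^C for control characters, \M-C for characters with the 8th bit
 * set, \M^C for control characters with the 8th bit set.
 * Printable characters pass through; a backslash is doubled.
 */
static char *sketch_graphic(unsigned char c, char *buf)
{
    char *p = buf;
    unsigned char low = c & 0177;            /* c with the 8th bit cleared */
    int meta = (c & 0200) != 0;              /* 8th bit set? */
    int ctrl = (low < 040 || low == 0177);   /* control character? */

    assert(buf != NULL);
    if (c == '\t' || c == '\n' || (!meta && !ctrl)) {
        if (c == '\\')
            *p++ = '\\';                     /* real backslash is doubled */
        *p++ = (char)c;
    } else {
        *p++ = '\\';
        if (meta)
            *p++ = 'M';
        if (ctrl) {
            *p++ = '^';
            /* assumed mapping: 000 -> '@', 001 -> 'A', ..., 0177 -> '?' */
            *p++ = (low == 0177) ? '?' : (char)(low + '@');
        } else {
            *p++ = '-';
            *p++ = (char)low;
        }
    }
    *p = '\0';
    return buf;
}

/*
 * Encode the n bytes at in with sketch_graphic, storing the result in
 * out, which must have room for 4 * n + 1 bytes.
 */
static char *sketch_filter(const char *in, size_t n, char *out)
{
    char buf[5];
    size_t i;

    assert(in != NULL && out != NULL);
    out[0] = '\0';
    for (i = 0; i < n; i++)
        strcat(out, sketch_graphic((unsigned char)in[i], buf));
    return out;
}
```

With these helpers, sketch_octal(007, buf) yields the four characters \007, and sketch_graphic(0203, buf) yields \M^C. A length is passed to sketch_filter rather than relying on a terminating NUL, so that a NUL byte in the input can itself be encoded.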