.\" Copyright (c) 1989 The Regents of the University of California.
.\" Redistribution and use in source and binary forms are permitted
.\" provided that the above copyright notice and this paragraph are
.\" duplicated in all such forms and that any documentation,
.\" advertising materials, and other materials related to such
.\" distribution and use acknowledge that the software was developed
.\" by the University of California, Berkeley. The name of the
.\" University may not be used to endorse or promote products derived
.\" from this software without specific prior written permission.
.\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR
.\" IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
.\" WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
.\" @(#)vis.3 5.3 (Berkeley) %G%
cencode, cdecode \- encode (decode) non-printing characters
char *cencode(character, cflag)
cdecode(character, store, dflag)
converts a non-printing character into a printable, invertible
representation.
converts that representation back into the original character.
These functions are useful for filtering a stream of characters to
and from a visual representation.
returns a pointer to a string that contains the printable
representation of the character passed as the argument
considers characters selected by
space, tab, and newline to be printable characters.
There are three possible forms of representation, as specified by the
All forms use the backslash character (``\e'') to introduce a special
sequence; two backslashes are used to represent a real backslash.
one or more of the following values:
causes space, tab, and newline characters to be considered non-printable,
Use C-style backslash sequences to represent standard non-printable
The following sequences are used to represent the indicated characters:
These are the only characters that are converted using
The more familiar abbreviation of ``\e0'' for NUL cannot be used
as it could be confused with other octal numbers if the sequence
Use an ``M'' to represent meta characters (characters with the 8th
bit set), and use caret (``^'') to represent control characters see
The following formats are used:
Represents the control character ``C''.
Spans characters \e000 through \e037, and \e0177 (as ``\e^?'').
Represents character ``C'' with the 8th bit set.
Spans characters \e0240 (\e0241 if
Represents control character ``C'' with the 8th bit set.
Spans characters \e0200 through \e0237, and \e0377 (as ``\eM^?'').
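The caret and meta formats above can be sketched in C as follows.
This is only an illustration, not the library's implementation:
meta_form is a hypothetical helper, and the ``\eM-C'' spelling for a
printable character with the 8th bit set is an assumption, since the
text above shows only the ``\e^?'' and ``\eM^?'' spellings.
The sketch does not decide which characters count as printable; it
only shows the form a selected character would take.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of the library): write the caret/meta
 * form of byte c into buf, which must hold at least 5 bytes.
 */
static char *
meta_form(unsigned char c, char *buf)
{
	char *p = buf;
	unsigned char low = c & 0177;		/* c with the 8th bit cleared */
	int is_ctrl = (low < 040 || low == 0177);

	assert(buf != NULL);
	if (c == '\\') {
		/* a real backslash is represented by two backslashes */
		*p++ = '\\';
		*p++ = '\\';
	} else if (!(c & 0200) && !is_ctrl) {
		/* 7-bit, non-control character: copied through unchanged */
		*p++ = (char)c;
	} else {
		*p++ = '\\';
		if (c & 0200)
			*p++ = 'M';		/* 8th bit set */
		if (is_ctrl) {
			*p++ = '^';
			*p++ = (char)(low ^ 0100);	/* 001 -> A, 0177 -> ? */
		} else {
			*p++ = '-';		/* assumed spelling of the meta form */
			*p++ = (char)low;
		}
	}
	*p = '\0';
	return buf;
}
```

Toggling the 0100 bit maps each control character onto its printable
partner, which is why \e0177 appears as ``\e^?''.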
The only characters that cannot be displayed using
are space and meta-space, and only when
Use a three digit octal sequence. The form is ``\eddd'' where
d represents an octal digit.
All non-printing characters may be displayed in this form.
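As an illustration of the octal form, the sketch below builds the
``\eddd'' sequence for a single byte.
The name octal_form is hypothetical and is not part of the library.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of the library): write the sequence
 * backslash plus three octal digits for c into buf, which must hold
 * at least 5 bytes.
 */
static char *
octal_form(unsigned char c, char *buf)
{
	assert(buf != NULL);
	buf[0] = '\\';
	buf[1] = (char)('0' + ((c >> 6) & 03));	/* top two bits */
	buf[2] = (char)('0' + ((c >> 3) & 07));	/* middle three bits */
	buf[3] = (char)('0' + (c & 07));	/* low three bits */
	buf[4] = '\0';
	return buf;
}
```

Because the sequence always has exactly three digits, a decoder knows
where it ends without looking at the following character.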
If the supplied character could not be encoded (because the selected
formats were unable to encode that character) it is placed in the return
Note that if a NUL is not encoded, it is placed in the string as two
If the caller expects to encounter this situation, it suffices to always
extract one character from the returned string before checking for NUL.
is selected, in addition to any other formats, this situation can never
with no requested formats results in no encoding being done; however,
backslashes are still doubled.
is used to decode data encoded by
until the decoder recognizes a character to return.
one or more of the following values:
Treat the caret (``^'') character specially, i.e. decode the sequence
``^C'' as the control character ``C''.
This is separate from the sequence ``\e^C'' as output by
flag set as it does not require the preceding backslash character.
Reset the state of the decoder to the initial state, and flush out
any characters that have been retained in the decoder.
There are five possible return values from
The decoder has not yet recognized a control sequence; supply it
A valid sequence which did not result in a character was decoded.
A character was recognized and has been placed in the location
A character was recognized and has been placed in the location
however, the character that was just supplied to
When processing a stream of characters, the current character should be
An unrecognized backslash sequence was detected.
The decoder was automatically reset to a normal state.
All characters since the last un-escaped backslash character constitute
the unrecognized sequence.
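The one-character-at-a-time protocol described above can be
illustrated with a much reduced decoder.
This is a sketch under stated assumptions, not the library's decoder:
the names oct_decode, dec_state, DEC_NEEDMORE, DEC_OK and DEC_BAD are
hypothetical, the state is passed explicitly by the caller, and only
doubled backslashes and three digit octal sequences are understood.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status values; the library's own names may differ. */
enum dec_status { DEC_NEEDMORE, DEC_OK, DEC_BAD };

struct dec_state {
	int in_escape;		/* nonzero after an un-escaped backslash */
	int ndigits;		/* octal digits collected so far */
	unsigned value;		/* value accumulated from those digits */
};

/* Feed one character; on DEC_OK the decoded character is in *store. */
static enum dec_status
oct_decode(struct dec_state *st, char ch, char *store)
{
	assert(st != NULL && store != NULL);
	if (!st->in_escape) {
		if (ch == '\\') {
			/* start of a sequence: nothing to return yet */
			st->in_escape = 1;
			st->ndigits = 0;
			st->value = 0;
			return DEC_NEEDMORE;
		}
		*store = ch;		/* ordinary character */
		return DEC_OK;
	}
	if (st->ndigits == 0 && ch == '\\') {
		/* two backslashes stand for one real backslash */
		st->in_escape = 0;
		*store = '\\';
		return DEC_OK;
	}
	if (ch >= '0' && ch <= '7') {
		st->value = (st->value << 3) | (unsigned)(ch - '0');
		if (++st->ndigits < 3)
			return DEC_NEEDMORE;
		st->in_escape = 0;
		if (st->value > 0377)
			return DEC_BAD;	/* does not fit in one byte */
		*store = (char)(unsigned char)st->value;
		return DEC_OK;
	}
	/* unrecognized sequence: reset to the normal state */
	st->in_escape = 0;
	return DEC_BAD;
}
```

The sketch covers only three of the five cases: more input needed, a
character decoded, and an unrecognized sequence followed by an
automatic reset.
It never returns a character while leaving the supplied one
unconsumed, and it has no sequence that decodes to nothing.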
The following code fragment illustrates the use of
while ((ch = getchar()) != EOF) {
switch(cdecode((char)ch, &nc, 0)) {
(void)fprintf(stderr, "bad sequence!\n");
if (cdecode((char)0, &nc, CDEC_END) == CDEC_OK)