is the primary interface to the system
how to get the most out of
The next few sections will discuss
and labor-saving devices.
Not all of these will be instantly useful
to any one person, of course,
and the others should give you ideas to store
until you try these things,
they will remain theoretical knowledge,
not something you have confidence in.
provides two commands for printing the contents of the lines
Most people are familiar with
to print all the lines you're editing,
(the letter `\fIl\|\fR'),
which gives slightly more information than
makes visible characters that are normally invisible,
such as tabs and backspaces.
If you list a line that contains some of these,
This makes it much easier to correct the sort of typing mistake
that inserts extra spaces adjacent to tabs,
or inserts a backspace followed by a space.
also `folds' long lines for printing:
any line that exceeds 72 characters is printed on multiple lines;
each printed line except the last is terminated by a backslash
so you can tell it was folded.
This is useful for printing long lines on short terminals.
command will print in a line a string of numbers preceded by a backslash,
These combinations are used to make visible characters that normally don't print,
like form feed or vertical tab or bell.
Each such combination is a single character.
When you see such characters, be wary:
they may have surprising meanings when printed on some terminals.
Often their presence means that your finger slipped while you were typing;
you almost never want them.
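As a rough illustration of the two behaviors described above (showing non-printing characters as a backslash followed by octal digits, and folding at 72 characters with a trailing backslash), here is a minimal Python sketch. The function name `list_line` is hypothetical, and the sketch shows every control character in octal, tabs and backspaces included; that is a simplifying assumption, not a claim about what the editor itself prints for them.

```python
def list_line(line, width=72):
    # Show each control character as a backslash plus its octal code
    # (an assumed rendering; the editor may show tabs and backspaces differently).
    shown = "".join(
        "\\%02o" % ord(ch) if ord(ch) < 32 or ord(ch) == 127 else ch
        for ch in line
    )
    # Fold into pieces of at most `width` characters.
    pieces = [shown[i:i + width] for i in range(0, len(shown), width)] or [""]
    # Every piece except the last ends in a backslash to mark the fold.
    return [p + "\\" for p in pieces[:-1]] + [pieces[-1]]


print(list_line("th\x07is"))     # ['th\\07is']
print(len(list_line("a" * 100))) # 2
```

The sketch folds purely by character count, so a fold can fall in the middle of an octal code; a more careful version would avoid that.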
The Substitute Command `s'
Most of the next few sections will be taken up with a discussion
Since this is the command for changing the contents of individual
it probably has the most complexity of any
and the most potential for effective use.
As the simplest place to begin,
recall the meaning of a trailing
after a substitute command.
If there is more than one `this' on the line,
command can be followed by
to `print' or `list' (as described in the previous section)
the contents of the line:
are all legal, and mean slightly different things.
Make sure you know what the differences are.
command can be preceded by one or two `line numbers'
to specify that the substitution is to take place
`mispell' to `misspell' on every line of the file.
(and this is more likely to be what you wanted in this
You should also notice that if you add a
to the end of any of these substitute commands,
only the last line that got changed will be printed,
We will talk later about how to print all the lines
Occasionally you will make a substitution in a line,
only to realize too late that it was a ghastly mistake.
lets you `undo' the last substitution:
the last line that was substituted can be restored to
its previous state by typing the command
As you have undoubtedly noticed
certain characters have unexpected meanings
when they occur in the left side of a substitute command,
or in a search for a particular line.
In the next several sections, we will talk about
these special characters,
which are often called `metacharacters'.
The first one is the period `\*.'.
On the left side of a substitute command,
or in a search with `/.../',
finds any line where `x' and `y' occur separated by
a single character, as in
(We will use \*B to stand for a space whenever we need to
Since `\*.' matches a single character,
that gives you a way to deal with funny characters
Suppose you have a line that, when printed with the
and you want to get rid of the
(which represents the bell character, by the way).
The most obvious solution is to try
but this will fail. (Try it.)
The brute force solution, which most people would now take,
is to re-type the entire line.
This is guaranteed, and is actually quite a reasonable tactic
if the line in question isn't too big,
but for a very long line,
This is where the metacharacter `\*.' comes in handy.
Since `\*e07' really represents a single character,
The `\*.' matches the mysterious character between the `h' and the `i',
Bear in mind that since `\*.' matches any single character,
converts the first character on a line into a `,',
which very often is not what you intended.
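The same behavior can be tried outside the editor. The sketch below uses Python's `re` module, whose `.` also matches any single character (other than a newline); it is an analogy, not the editor's own machinery, and `count=1` is used to imitate a substitute that changes only the first match.

```python
import re

line = "th\x07is"          # 'th', a bell character, then 'is'

# '.' matches the single odd character, whatever it is.
fixed = re.sub(r"th.is", "this", line, count=1)
print(fixed)               # this

# The trap: '.' also matches the first character of any line.
trap = re.sub(r".", ",", "Now is the time", count=1)
print(trap)                # ,ow is the time
```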
As is true of many characters in
the `\*.' has several meanings, depending
This line shows all three:
The first `\*.' is a line number,
which is called `line dot'.
(We will discuss line dot more in Section 3.)
The second `\*.' is a metacharacter
that matches any single character on that line.
The third `\*.' is the only one that really is
an honest literal period.
side of a substitution, `\*.'
If you apply this command to the line
which is probably not what you intended.
Since a period means `any character',
the question naturally arises of what to do
when you really want a period.
For example, how do you convert the line
The backslash `\*e' does the job.
A backslash turns off any special meaning that the next character
might have; in particular,
`\*e\*.' converts the `\*.' from a `match anything'
you can use it to replace
The pair of characters `\*e\*.' is considered by
to be a single real period.
The backslash can also be used when searching for lines
that contain a special character.
Suppose you are looking for a line that contains
isn't adequate, for it will find
because the `\*.' matches the letter `A'.
you will find only lines that contain `\*.PP'.
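A small Python sketch shows the difference between the escaped and unescaped period. The two sample lines are made up for illustration, and Python's `re` module stands in for the editor's search.

```python
import re

lines = ["THE APPLICATION OF METACHARACTERS", ".PP"]

# Unescaped, '.' matches any character, so 'APP' satisfies the pattern.
loose = [l for l in lines if re.search(r".PP", l)]
# Escaped, only a real period followed by 'PP' will do.
exact = [l for l in lines if re.search(r"\.PP", l)]

print(loose)   # both lines
print(exact)   # ['.PP']
```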
The backslash can also be used to turn off special meanings for
characters other than `\*.'.
For example, consider finding a line that contains a backslash.
because the `\*e' isn't a literal `\*e', but instead means that the second `/'
no longer \%delimits the search.
But by preceding a backslash with another one,
you can search for a literal backslash.
Similarly, you can search for a forward slash `/' with
The backslash turns off the meaning of the immediately following `/' so that
it doesn't terminate the /.../ construction prematurely.
As an exercise, before reading further, find two substitute commands each of which will
Here are several solutions;
verify that each works as advertised.
A couple of miscellaneous notes about
backslashes and special characters.
First, you can use any character to delimit the pieces
command: there is nothing sacred about slashes.
(But you must use slashes for context searching.)
For instance, in a line that contains a lot of slashes already, like
//exec //sys.fort.go // etc...
you could use a colon as the delimiter;
to delete all the slashes, type
Second, if # and @ are your character erase and line kill characters,
you have to type \*e# and \*e@;
this is true whether you're talking to
When you are adding text with
backslash is not special, and you should only put in
one backslash for each one you really want.
The next metacharacter, the `$', stands for `the end of the line'.
As its most obvious use, suppose you have the line
and you wish to add the word `time' to the end.
Notice that a space is needed before `time' in
As another example, replace the second comma in
the following line with a period without altering the first:
Now is the time, for all good men,
The $ sign here provides context to make specific which comma we mean.
Without it, of course, the
command would operate on the first comma to produce
Now is the time\*. for all good men,
As another example, to convert
as we did earlier, we can use
has multiple meanings depending on context.
the first `$' refers to the
the second refers to the end of that line,
and the third is a literal dollar sign,
to be added to that line.
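The end-of-line examples can be checked with a short Python sketch. This assumes Python's `re` module as a stand-in for the editor; its `$` behaves the same way on a line with no trailing newline, and `count=1` imitates a substitute without a trailing `g`.

```python
import re

# An empty match at the end of the line lets you append text.
appended = re.sub(r"$", " time", "Now is the", count=1)
print(appended)        # Now is the time

line = "Now is the time, for all good men,"
# ',$' can only match the comma that ends the line.
last = re.sub(r",$", ".", line, count=1)
# Without '$' the first comma is the one that changes.
first = re.sub(r",", ".", line, count=1)
print(last)            # Now is the time, for all good men.
print(first)           # Now is the time. for all good men,
```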
The circumflex (or hat or caret)
`^' stands for the beginning of the line.
For example, suppose you are looking for a line that begins
you will in all likelihood find several lines that contain `the' in the middle before
arriving at the one you want.
you narrow the context, and thus arrive at the desired one
The other use of `^' is of course to enable you to insert
something at the beginning of a line:
places a space at the beginning of the current line.
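Both uses of the circumflex, narrowing a search and inserting at the beginning of a line, can be sketched in Python. The sample lines are invented, and `re` is only an approximation of the editor's matching.

```python
import re

lines = ["Now is the time", "the end"]

# '^the' ignores a 'the' in the middle of a line.
anchored = [l for l in lines if re.search(r"^the", l)]
print(anchored)            # ['the end']

# Substituting for '^' inserts text at the beginning of the line.
indented = re.sub(r"^", " ", "the end", count=1)
print(repr(indented))      # ' the end'
```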
Metacharacters can be combined. To search for a
Suppose you have a line that looks like this:
\fItext \fR x y \fI text \fR
and there is some indeterminate number of spaces between the
Suppose the job is to replace all the spaces between
The line is too long to retype, and there are too many spaces
This is where the metacharacter `*'
A character followed by a star
stands for as many consecutive occurrences of that character as possible.
To refer to all the spaces at once, say
`as many spaces as possible'.
Thus `x\*B*y' means `an x, as many spaces as possible, then a y'.
The star can be used with any character, not just space.
If the original example was instead
\fItext \fR x--------y \fI text \fR
then all `\-' signs can be replaced by a single space
Finally, suppose that the line was
\fItext \fR x\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.y \fI text \fR
Can you see what trap lies in wait for the unwary?
The answer, naturally, is that it depends.
If there are no other x's or y's on the line,
then everything works, but it's blind luck, not good management.
Remember that `\*.' matches any single character.
Then `\*.*' matches as many single characters as possible,
and unless you're careful, it can eat up a lot more of the line
If the line was, for example, like this:
\fItext \fRx\fI text \fR x\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.y \fI text \fRy\fI text \fR
will take everything from the
which, in this example, is undoubtedly more than you wanted.
The solution, of course, is to turn off the special meaning of
Now everything works, for `\*e\*.*' means `as many periods as possible'.
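The greedy behavior of the star is easy to see in a Python sketch. The sample lines are invented (with filler words that contain no x or y), and Python's `re` is used on the assumption that its `*` is greedy in the same way.

```python
import re

# 'x *y' covers the whole run of spaces between x and y.
spaces = re.sub(r"x *y", "x y", "abc x      y def", count=1)
print(spaces)      # abc x y def

line = "one x two x......y three y four"
# '.*' is greedy: it runs from the first x to the last y.
greedy = re.sub(r"x.*y", "x y", line, count=1)
# '\.*' matches only literal periods, so just the dotted run is replaced.
literal = re.sub(r"x\.*y", "x y", line, count=1)
print(greedy)      # one x y four
print(literal)     # one x two x y three y four
```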
There are times when the pattern `\*.*' is exactly what you want.
Now is the time for all good men ....
use `\*.*' to eat up everything after the `for':
There are a couple of additional pitfalls associated with `*' that you should be aware of.
Most notable is the fact that `as many as possible' means `zero or more'.
The fact that zero is a legitimate possibility is
sometimes rather surprising.
For example, if our line contained
\fItext \fR xy \fI text \fR x y \fI text \fR
`xy' matches this pattern, for it consists of an `x',
The result is that the substitute acts on the first `xy',
and does not touch the later one that actually contains some intervening spaces.
The way around this, if it matters, is to specify a pattern like
which says `an x, a space, then as many more spaces as possible, then a y',
in other words, one or more spaces.
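A Python sketch of the zero-occurrences pitfall, again with an invented line and `re` standing in for the editor:

```python
import re

line = "one xy two x   y three"

# Zero spaces is a legal match, so the first 'xy' is the one that changes.
zero_ok = re.sub(r"x *y", "x y", line, count=1)
# Requiring a space before the starred space skips 'xy'.
one_plus = re.sub(r"x  *y", "x y", line, count=1)

print(zero_ok)     # one x y two x   y three
print(one_plus)    # one xy two x y three
```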
The other startling behavior of `*' is again related to the fact
that zero is a legitimate number of occurrences of something
followed by a star. The command
which is almost certainly not what was intended.
The reason for this behavior is that zero is a legal number
and there are no x's at the beginning of the line
(so that gets converted into a `y'),
nor between the `a' and the `b'
(so that gets converted into a `y'), nor ...
Make sure you really want zero matches;
if not, in this case write
`xx*' is one or more x's.
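The effect of a pattern that can match nothing at all is easiest to believe once you have seen it. This sketch assumes Python 3.7 or later, where `re.sub` with no `count` replaces every match, empty ones included; the sample strings are illustrative.

```python
import re

# 'x*' matches the empty string at every position of a line with no x's.
everywhere = re.sub(r"x*", "y", "abcdef")
print(everywhere)      # yaybycydyeyfy

# 'xx*' insists on at least one x, so a line without x's is left alone.
untouched = re.sub(r"xx*", "y", "abcdef")
collapsed = re.sub(r"xx*", "y", "abxxxcd")
print(untouched)       # abcdef
print(collapsed)       # abycd
```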
Suppose that you want to delete any numbers
at the beginning of all lines of a file.
You might first think of trying a series of commands like
but this is clearly going to take forever if the numbers are at all long.
Unless you want to repeat the commands over and over until
finally all numbers are gone,
you must get all the digits on one pass.
This is the purpose of the brackets [ and ].
matches any single digit;
the whole thing is called a `character class'.
With a character class, the job is easy.
The pattern `[0123456789]*' matches zero or more digits (an entire number), so
deletes all digits from the beginning of all lines.
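Here is the same job sketched in Python, one line at a time. The list `lines` is a made-up stand-in for the file being edited.

```python
import re

lines = ["123 first line", "45second", "no number here"]

# '^[0123456789]*' matches the whole run of leading digits (possibly empty).
stripped = [re.sub(r"^[0123456789]*", "", l, count=1) for l in lines]
print(stripped)    # [' first line', 'second', 'no number here']
```

A line with no leading number still matches (zero digits), so it passes through unchanged.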
Any characters can appear within a character class,
and just to confuse the issue there are essentially no special characters inside the brackets;
even the backslash doesn't have a special meaning.
To search for special characters, for example, you can say
Within [...], the `[' is not special.
To get a `]' into a character class,
make it the first character.
It's a nuisance to have to spell out the digits,
so you can abbreviate them as
similarly, [a\-z] stands for the lower case letters,
As a final frill on character classes, you can specify a class
that means `none of the following characters'.
This is done by beginning the class with a `^':
stands for `any character
Thus you might find the first line that doesn't begin with a tab or space
Within a character class,
the circumflex has a special meaning
only if it occurs at the beginning.
Just to convince yourself, verify that
finds a line that doesn't begin with a circumflex.
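Negated classes can be tried in Python as well. One difference to keep in mind: in Python's `re` module the backslash is still special inside brackets (which is why `\t` below means a tab), unlike the editor behavior described above. The sample lines are invented.

```python
import re

lines = ["\tindented with a tab", " indented with a space", "flush left"]

# '^[^ \t]' wants a first character that is neither a space nor a tab.
first = next(l for l in lines if re.search(r"^[^ \t]", l))
print(first)       # flush left

# Inside the brackets only a leading '^' negates; the second one is literal.
hat = bool(re.search(r"^[^^]", "^starts with a circumflex"))
plain = bool(re.search(r"^[^^]", "plain line"))
print(hat, plain)  # False True
```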
The ampersand `&' is used primarily to save typing.
Suppose you have the line
Of course you can always say
but it seems silly to have to repeat the `the'.
The `&' is used to eliminate the repetition.
side of a substitute, the ampersand means `whatever
was just matched', so you can say
and the `&' will stand for `the'.
Of course this isn't much of a saving if the thing
matched is just `the', but if it is something truly long or awful,
or if it is something like `.*'
which matches a lot of text,
you can save some tedious typing.
There is also much less chance of making a typing error
For example, to parenthesize a line,
regardless of its length,
The ampersand can occur more than once on the right side:
s/the/& best and & worst/
Now is the best and the worst time
converts the original line into
Now is the time? Now is the time!!
To get a literal ampersand, naturally the backslash is used to turn off the special meaning:
converts the word into the symbol.
Notice that `&' is not special on the left side
of a substitute, only on the right side.
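Python's `re` module has no `&`, but `\g<0>` in the replacement text plays the same role of `whatever was just matched', so the idea can be sketched there. The last example goes the other way: since `&` is not special in Python's replacement text, it needs no backslash.

```python
import re

line = "Now is the time"

# '\g<0>' stands for the matched text and may be used more than once.
twice = re.sub(r"the", r"\g<0> best and \g<0> worst", line, count=1)
print(twice)       # Now is the best and the worst time

# Parenthesize a line regardless of its length.
wrapped = re.sub(r".*", r"(\g<0>)", line, count=1)
print(wrapped)     # (Now is the time)

# A literal ampersand in the replacement.
symbol = re.sub(r"and", "&", "this and that", count=1)
print(symbol)      # this & that
```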
provides a facility for splitting a single line into two or more shorter lines by `substituting in a newline'.
As the simplest example, suppose a line has gotten unmanageably long
because of editing (or merely because it was unwisely typed).
\fItext \fR xy \fI text \fR
you can break it between the `x' and the `y' like this:
This is actually a single command,
although it is typed on two lines.
Bearing in mind that `\*e' turns off special meanings,
it seems relatively intuitive that a `\*e' at the end of
a line would make the newline there
You can in fact make a single line into several lines
with this same mechanism.
As a large example, consider underlining the word `very'
by splitting `very' onto a separate line,
formatting command `.ul'.
\fItext \fR a very big \fI text \fR
converts the line into four shorter lines,
preceding the word `very' by the
and eliminating the spaces around the `very',
When a newline is substituted
in, dot is left pointing at the last line created.
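The `very' example can be imitated in Python by substituting newline characters and then splitting. The sample line is invented, and treating the result as a list of lines is an assumption made for the sake of the sketch.

```python
import re

line = "one a very big two"

# Replace ' very ' (spaces included) with a break, the request, and the word.
result = re.sub(r" very ", "\n.ul\nvery\n", line, count=1)
pieces = result.split("\n")
print(pieces)      # ['one a', '.ul', 'very', 'big two']
```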
Lines may also be joined together,
but this is done with the
and supposing that dot is set to the first of them,
which is why we carefully showed a blank
at the beginning of the second line.
joins line dot to line dot+1,
but any contiguous set of lines can be joined.
Just specify the starting and ending line numbers.
joins all the lines into one big one
(More on line numbers in Section 3.)
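Joining is simple enough to sketch in a line or two of Python. The assumption here, suggested by the remark above about the blank at the beginning of the second line, is that joining adds nothing between the pieces; the two sample lines are invented.

```python
lines = ["Now is", " the time"]

# Nothing is inserted between the pieces, so the blank must already be there.
joined = "".join(lines)
print(joined)      # Now is the time
```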
Rearranging a Line with \*e( ... \*e)
(This section should be skipped on first reading.)
Recall that `&' is a shorthand that stands for whatever
was matched by the left side of an
In much the same way you can capture separate pieces
the only difference is that you have to specify
on the left side just what pieces you're interested in.
Suppose, for instance, that
you have a file of lines that consist of names in the form
and you want the initials to precede the name, as in
It is possible to do this with a series of editing commands,
but it is tedious and error-prone.
(It is instructive to figure out how it is done, though.)
is to `tag' the pieces of the pattern (in this case,
the last name, and the initials),
and then rearrange the pieces.
On the left side of a substitution,
if part of the pattern is enclosed between
whatever matched that part is remembered,
and available for use on the right side.
the symbol `\*e1' refers to whatever
matched the first \*e(...\*e) pair,
`\*e2' to the second \*e(...\*e),
1,$s/^\*e([^,]*\*e),\*B*\*e(\*.*\*e)/\*e2\*B\*e1/
although hard to read, does the job.
The first \*e(...\*e) matches the last name,
which is any string up to the comma;
this is referred to on the right side with `\*e1'.
The second \*e(...\*e) is whatever follows
the comma and any spaces,
and is referred to as `\*e2'.
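The rearrangement can be tried in Python, with two differences in notation: `re` uses bare parentheses for tagging where the editor uses backslashed ones, and a plain space stands where the command above writes \*B. The names in the list are made up for illustration.

```python
import re

names = ["Smith, A. B.", "Jones, C."]

# Group 1 is everything up to the comma; group 2 is whatever follows
# the comma and any spaces. The replacement swaps them.
swapped = [re.sub(r"^([^,]*), *(.*)", r"\2 \1", n, count=1) for n in names]
print(swapped)     # ['A. B. Smith', 'C. Jones']
```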
Of course, with any editing sequence this complicated,
it's foolhardy to simply run it and hope.
provide a way for you to print exactly those
lines which were affected by the
and thus verify that it did what you wanted