Commit | Line | Data |
---|---|---|
8340f87c BJ |
1 | .NH |
2 | SPECIAL CHARACTERS | |
3 | .PP | |
4 | The editor | |
5 | .UL ed | |
6 | is the primary interface to the system | |
7 | for many people, so | |
8 | it is worthwhile to know | |
9 | how to get the most out of | |
10 | .UL ed | |
11 | for the least effort. | |
12 | .PP | |
13 | The next few sections will discuss | |
14 | shortcuts | |
15 | and labor-saving devices. | |
16 | Not all of these will be instantly useful | |
17 | to any one person, of course, | |
18 | but a few will be, | |
19 | and the others should give you ideas to store | |
20 | away for future use. | |
21 | And as always, | |
22 | until you try these things, | |
23 | they will remain theoretical knowledge, | |
24 | not something you have confidence in. | |
25 | .SH | |
26 | The List command `l' | |
27 | .PP | |
28 | .UL ed | |
29 | provides two commands for printing the contents of the lines | |
30 | you're editing. | |
31 | Most people are familiar with | |
32 | .UL p , | |
33 | in combinations like | |
34 | .P1 | |
35 | 1,$p | |
36 | .P2 | |
37 | to print all the lines you're editing, | |
38 | or | |
39 | .P1 | |
40 | s/abc/def/p | |
41 | .P2 | |
42 | to change | |
43 | `abc' | |
44 | to | |
45 | `def' | |
46 | on the current line and print the result. | |
47 | Less familiar is the | |
48 | .ul | |
49 | list | |
50 | command | |
51 | .UL l | |
52 | (the letter `\fIl\|\fR'), | |
53 | which gives slightly more information than | |
54 | .UL p . | |
55 | In particular, | |
56 | .UL l | |
57 | makes visible characters that are normally invisible, | |
58 | such as tabs and backspaces. | |
59 | If you list a line that contains some of these, | |
60 | .UL l | |
61 | will print each tab as | |
62 | .UL \z\(mi> | |
63 | and each backspace as | |
64 | .UL \z\(mi< . | |
65 | This makes it much easier to correct the sort of typing mistake | |
66 | that inserts extra spaces adjacent to tabs, | |
67 | or inserts a backspace followed by a space. | |
68 | .PP | |
69 | The | |
70 | .UL l | |
71 | command | |
72 | also `folds' long lines for printing _ | |
73 | any line that exceeds 72 characters is printed on multiple lines; | |
74 | each printed line except the last is terminated by a backslash | |
75 | .UL \*e , | |
76 | so you can tell it was folded. | |
77 | This is useful for printing long lines on short terminals. | |
78 | .PP | |
79 | Occasionally the | |
80 | .UL l | |
81 | command will print in a line a string of numbers preceded by a backslash, | |
82 | such as \*e07 or \*e16. | |
83 | These combinations are used to make visible characters that normally don't print, | |
84 | like form feed or vertical tab or bell. | |
85 | Each such combination is a single character. | |
86 | When you see such characters, be wary _ | |
87 | they may have surprising meanings when printed on some terminals. | |
88 | Often their presence means that your finger slipped while you were typing; | |
89 | you almost never want them. | |
90 | .SH | |
91 | The Substitute Command `s' | |
92 | .PP | |
93 | Most of the next few sections will be taken up with a discussion | |
94 | of the | |
95 | substitute | |
96 | command | |
97 | .UL s . | |
98 | Since this is the command for changing the contents of individual | |
99 | lines, | |
100 | it probably has the most complexity of any | |
101 | .UL ed | |
102 | command, | |
103 | and the most potential for effective use. | |
104 | .PP | |
105 | As the simplest place to begin, | |
106 | recall the meaning of a trailing | |
107 | .UL g | |
108 | after a substitute command. | |
109 | With | |
110 | .P1 | |
111 | s/this/that/ | |
112 | .P2 | |
113 | and | |
114 | .P1 | |
115 | s/this/that/g | |
116 | .P2 | |
117 | the | |
118 | first | |
119 | one replaces the | |
120 | .ul | |
121 | first | |
122 | `this' on the line | |
123 | with `that'. | |
124 | If there is more than one `this' on the line, | |
125 | the second form | |
126 | with the trailing | |
127 | .UL g | |
128 | changes | |
129 | .ul | |
130 | all | |
131 | of them. | |
132 | .PP | |
133 | Either form of the | |
134 | .UL s | |
135 | command can be followed by | |
136 | .UL p | |
137 | or | |
138 | .UL l | |
139 | to `print' or `list' (as described in the previous section) | |
140 | the contents of the line: | |
141 | .P1 | |
142 | s/this/that/p | |
143 | s/this/that/l | |
144 | s/this/that/gp | |
145 | s/this/that/gl | |
146 | .P2 | |
147 | are all legal, and mean slightly different things. | |
148 | Make sure you know what the differences are. | |
149 | .PP | |
150 | Of course, any | |
151 | .UL s | |
152 | command can be preceded by one or two `line numbers' | |
153 | to specify that the substitution is to take place | |
154 | on a group of lines. | |
155 | Thus | |
156 | .P1 | |
157 | 1,$s/mispell/misspell/ | |
158 | .P2 | |
159 | changes the | |
160 | .ul | |
161 | first | |
162 | occurrence of | |
163 | `mispell' to `misspell' on every line of the file. | |
164 | But | |
165 | .P1 | |
166 | 1,$s/mispell/misspell/g | |
167 | .P2 | |
168 | changes | |
169 | .ul | |
170 | every | |
171 | occurrence in every line | |
172 | (and this is more likely to be what you wanted in this | |
173 | particular case). | |
174 | .PP | |
175 | You should also notice that if you add a | |
176 | .UL p | |
177 | or | |
178 | .UL l | |
179 | to the end of any of these substitute commands, | |
180 | only the last line that got changed will be printed, | |
181 | not all the lines. | |
182 | We will talk later about how to print all the lines | |
183 | that were modified. | |
184 | .SH | |
185 | The Undo Command `u' | |
186 | .PP | |
187 | Occasionally you will make a substitution in a line, | |
188 | only to realize too late that it was a ghastly mistake. | |
189 | The `undo' command | |
190 | .UL u | |
191 | lets you `undo' the last substitution: | |
192 | the last line that was substituted can be restored to | |
193 | its previous state by typing the command | |
194 | .P1 | |
195 | u | |
196 | .P2 | |
197 | .SH | |
198 | The Metacharacter `\*.' | |
199 | .PP | |
200 | As you have undoubtedly noticed | |
201 | when you use | |
202 | .UL ed , | |
203 | certain characters have unexpected meanings | |
204 | when they occur in the left side of a substitute command, | |
205 | or in a search for a particular line. | |
206 | In the next several sections, we will talk about | |
207 | these special characters, | |
208 | which are often called `metacharacters'. | |
209 | .PP | |
210 | The first one is the period `\*.'. | |
211 | On the left side of a substitute command, | |
212 | or in a search with `/.../', | |
213 | `\*.' stands for | |
214 | .ul | |
215 | any | |
216 | single character. | |
217 | Thus the search | |
218 | .P1 | |
219 | /x\*.y/ | |
220 | .P2 | |
221 | finds any line where `x' and `y' occur separated by | |
222 | a single character, as in | |
223 | .P1 | |
224 | x+y | |
225 | x\-y | |
226 | x\*By | |
227 | x\*.y | |
228 | .P2 | |
229 | and so on. | |
230 | (We will use \*B to stand for a space whenever we need to | |
231 | make it visible.) | |
232 | .PP | |
233 | Since `\*.' matches a single character, | |
234 | that gives you a way to deal with funny characters | |
235 | printed by | |
236 | .UL l . | |
237 | Suppose you have a line that, when printed with the | |
238 | .UL l | |
239 | command, appears as | |
240 | .P1 | |
241 | .... th\*e07is .... | |
242 | .P2 | |
243 | and you want to get rid of the | |
244 | \*e07 | |
245 | (which represents the bell character, by the way). | |
246 | .PP | |
247 | The most obvious solution is to try | |
248 | .P1 | |
249 | s/\*e07// | |
250 | .P2 | |
251 | but this will fail. (Try it.) | |
252 | The brute force solution, which most people would now take, | |
253 | is to re-type the entire line. | |
254 | This is guaranteed to work, and is actually quite a reasonable tactic | |
255 | if the line in question isn't too big, | |
256 | but for a very long line, | |
257 | re-typing is a bore. | |
258 | This is where the metacharacter `\*.' comes in handy. | |
259 | Since `\*e07' really represents a single character, | |
260 | if we say | |
261 | .P1 | |
262 | s/th\*.is/this/ | |
263 | .P2 | |
264 | the job is done. | |
265 | The `\*.' matches the mysterious character between the `h' and the `i', | |
266 | .ul | |
267 | whatever it is. | |
268 | .PP | |
269 | Bear in mind that since `\*.' matches any single character, | |
270 | the command | |
271 | .P1 | |
272 | s/\*./,/ | |
273 | .P2 | |
274 | converts the first character on a line into a `,', | |
275 | which very often is not what you intended. | |
276 | .PP | |
277 | As is true of many characters in | |
278 | .UL ed , | |
279 | the `\*.' has several meanings, depending | |
280 | on its context. | |
281 | This line shows all three: | |
282 | .P1 | |
283 | \&\*.s/\*./\*./ | |
284 | .P2 | |
285 | The first `\*.' is a line number, | |
286 | the number of | |
287 | the line we are editing, | |
288 | which is called `line dot'. | |
289 | (We will discuss line dot more in Section 3.) | |
290 | The second `\*.' is a metacharacter | |
291 | that matches any single character on that line. | |
292 | The third `\*.' is the only one that really is | |
293 | an honest literal period. | |
294 | On the | |
295 | .ul | |
296 | right | |
297 | side of a substitution, `\*.' | |
298 | is not special. | |
299 | If you apply this command to the line | |
300 | .P1 | |
301 | Now is the time\*. | |
302 | .P2 | |
303 | the result will | |
304 | be | |
305 | .P1 | |
306 | \&\*.ow is the time\*. | |
307 | .P2 | |
308 | which is probably not what you intended. | |
309 | .SH | |
310 | The Backslash `\*e' | |
311 | .PP | |
312 | Since a period means `any character', | |
313 | the question naturally arises of what to do | |
314 | when you really want a period. | |
315 | For example, how do you convert the line | |
316 | .P1 | |
317 | Now is the time\*. | |
318 | .P2 | |
319 | into | |
320 | .P1 | |
321 | Now is the time? | |
322 | .P2 | |
323 | The backslash `\*e' does the job. | |
324 | A backslash turns off any special meaning that the next character | |
325 | might have; in particular, | |
326 | `\*e\*.' converts the `\*.' from a `match anything' | |
327 | into a period, so | |
328 | you can use it to replace | |
329 | the period in | |
330 | .P1 | |
331 | Now is the time\*. | |
332 | .P2 | |
333 | like this: | |
334 | .P1 | |
335 | s/\*e\*./?/ | |
336 | .P2 | |
337 | The pair of characters `\*e\*.' is considered by | |
338 | .UL ed | |
339 | to be a single real period. | |
340 | .PP | |
341 | The backslash can also be used when searching for lines | |
342 | that contain a special character. | |
343 | Suppose you are looking for a line that contains | |
344 | .P1 | |
345 | \&\*.PP | |
346 | .P2 | |
347 | The search | |
348 | .P1 | |
349 | /\*.PP/ | |
350 | .P2 | |
351 | isn't adequate, for it will find | |
352 | a line like | |
353 | .P1 | |
354 | THE APPLICATION OF ... | |
355 | .P2 | |
356 | because the `\*.' matches the letter `A'. | |
357 | But if you say | |
358 | .P1 | |
359 | /\*e\*.PP/ | |
360 | .P2 | |
361 | you will find only lines that contain `\*.PP'. | |
362 | .PP | |
363 | The backslash can also be used to turn off special meanings for | |
364 | characters other than `\*.'. | |
365 | For example, consider finding a line that contains a backslash. | |
366 | The search | |
367 | .P1 | |
368 | /\*e/ | |
369 | .P2 | |
370 | won't work, | |
371 | because the `\*e' isn't a literal `\*e', but instead means that the second `/' | |
372 | no longer \%delimits the search. | |
373 | But by preceding a backslash with another one, | |
374 | you can search for a literal backslash. | |
375 | Thus | |
376 | .P1 | |
377 | /\*e\*e/ | |
378 | .P2 | |
379 | does work. | |
380 | Similarly, you can search for a forward slash `/' with | |
381 | .P1 | |
382 | /\*e// | |
383 | .P2 | |
384 | The backslash turns off the meaning of the immediately following `/' so that | |
385 | it doesn't terminate the /.../ construction prematurely. | |
386 | .PP | |
387 | As an exercise, before reading further, find two substitute commands each of which will | |
388 | convert the line | |
389 | .P1 | |
390 | \*ex\*e\*.\*ey | |
391 | .P2 | |
392 | into the line | |
393 | .P1 | |
394 | \*ex\*ey | |
395 | .P2 | |
396 | .PP | |
397 | Here are several solutions; | |
398 | verify that each works as advertised. | |
399 | .P1 | |
400 | s/\*e\*e\*e\*.// | |
401 | s/x\*.\*./x/ | |
402 | s/\*.\*.y/y/ | |
403 | .P2 | |
404 | .PP | |
405 | A couple of miscellaneous notes about | |
406 | backslashes and special characters. | |
407 | First, you can use any character to delimit the pieces | |
408 | of an | |
409 | .UL s | |
410 | command: there is nothing sacred about slashes. | |
411 | (But you must use slashes for context searching.) | |
412 | For instance, in a line that contains a lot of slashes already, like | |
413 | .P1 | |
414 | //exec //sys.fort.go // etc... | |
415 | .P2 | |
416 | you could use a colon as the delimiter _ | |
417 | to delete all the slashes, type | |
418 | .P1 | |
419 | s:/::g | |
420 | .P2 | |
421 | .PP | |
422 | Second, if # and @ are your character erase and line kill characters, | |
423 | you have to type \*e# and \*e@; | |
424 | this is true whether you're talking to | |
425 | .UL ed | |
426 | or any other program. | |
427 | .PP | |
428 | When you are adding text with | |
429 | .UL a | |
430 | or | |
431 | .UL i | |
432 | or | |
433 | .UL c , | |
434 | backslash is not special, and you should only put in | |
435 | one backslash for each one you really want. | |
436 | .SH | |
437 | The Dollar Sign `$' | |
438 | .PP | |
439 | The next metacharacter, the `$', stands for `the end of the line'. | |
440 | As its most obvious use, suppose you have the line | |
441 | .P1 | |
442 | Now is the | |
443 | .P2 | |
444 | and you wish to add the word `time' to the end. | |
445 | Use the $ like this: | |
446 | .P1 | |
447 | s/$/\*Btime/ | |
448 | .P2 | |
449 | to get | |
450 | .P1 | |
451 | Now is the time | |
452 | .P2 | |
453 | Notice that a space is needed before `time' in | |
454 | the substitute command, | |
455 | or you will get | |
456 | .P1 | |
457 | Now is thetime | |
458 | .P2 | |
459 | .PP | |
460 | As another example, replace the second comma in | |
461 | the following line with a period without altering the first: | |
462 | .P1 | |
463 | Now is the time, for all good men, | |
464 | .P2 | |
465 | The command needed is | |
466 | .P1 | |
467 | s/,$/\*./ | |
468 | .P2 | |
469 | The $ sign here provides context to make specific which comma we mean. | |
470 | Without it, of course, the | |
471 | .UL s | |
472 | command would operate on the first comma to produce | |
473 | .P1 | |
474 | Now is the time\*. for all good men, | |
475 | .P2 | |
476 | .PP | |
477 | As another example, to convert | |
478 | .P1 | |
479 | Now is the time\*. | |
480 | .P2 | |
481 | into | |
482 | .P1 | |
483 | Now is the time? | |
484 | .P2 | |
485 | as we did earlier, we can use | |
486 | .P1 | |
487 | s/\*.$/?/ | |
488 | .P2 | |
489 | .PP | |
490 | Like `\*.', the `$' | |
491 | has multiple meanings depending on context. | |
492 | In the line | |
493 | .P1 | |
494 | $s/$/$/ | |
495 | .P2 | |
496 | the first `$' refers to the | |
497 | last line of the file, | |
498 | the second refers to the end of that line, | |
499 | and the third is a literal dollar sign, | |
500 | to be added to that line. | |
501 | .SH | |
502 | The Circumflex `^' | |
503 | .PP | |
504 | The circumflex (or hat or caret) | |
505 | `^' stands for the beginning of the line. | |
506 | For example, suppose you are looking for a line that begins | |
507 | with `the'. | |
508 | If you simply say | |
509 | .P1 | |
510 | /the/ | |
511 | .P2 | |
512 | you will in all likelihood find several lines that contain `the' in the middle before | |
513 | arriving at the one you want. | |
514 | But with | |
515 | .P1 | |
516 | /^the/ | |
517 | .P2 | |
518 | you narrow the context, and thus arrive at the desired one | |
519 | more easily. | |
520 | .PP | |
521 | The other use of `^' is of course to enable you to insert | |
522 | something at the beginning of a line: | |
523 | .P1 | |
524 | s/^/\*B/ | |
525 | .P2 | |
526 | places a space at the beginning of the current line. | |
527 | .PP | |
528 | Metacharacters can be combined. To search for a | |
529 | line that contains | |
530 | .ul | |
531 | only | |
532 | the characters | |
533 | .P1 | |
534 | \&\*.PP | |
535 | .P2 | |
536 | you can use the command | |
537 | .P1 | |
538 | /^\*e\*.PP$/ | |
539 | .P2 | |
540 | .SH | |
541 | The Star `*' | |
542 | .PP | |
543 | Suppose you have a line that looks like this: | |
544 | .P1 | |
545 | \fItext \fR x          y \fI text \fR | |
546 | .P2 | |
547 | where | |
548 | .ul | |
549 | text | |
550 | stands | |
551 | for lots of text, | |
552 | and there is some indeterminate number of spaces between the | |
553 | .UL x | |
554 | and the | |
555 | .UL y . | |
556 | Suppose the job is to replace all the spaces between | |
557 | .UL x | |
558 | and | |
559 | .UL y | |
560 | by a single space. | |
561 | The line is too long to retype, and there are too many spaces | |
562 | to count. | |
563 | What now? | |
564 | .PP | |
565 | This is where the metacharacter `*' | |
566 | comes in handy. | |
567 | A character followed by a star | |
568 | stands for as many consecutive occurrences of that | |
569 | character as possible. | |
570 | To refer to all the spaces at once, say | |
571 | .P1 | |
572 | s/x\*B*y/x\*By/ | |
573 | .P2 | |
574 | The construction | |
575 | `\*B*' | |
576 | means | |
577 | `as many spaces as possible'. | |
578 | Thus `x\*B*y' means `an x, as many spaces as possible, then a y'. | |
579 | .PP | |
580 | The star can be used with any character, not just space. | |
581 | If the original example was instead | |
582 | .P1 | |
583 | \fItext \fR x--------y \fI text \fR | |
584 | .P2 | |
585 | then all `\-' signs can be replaced by a single space | |
586 | with the command | |
587 | .P1 | |
588 | s/x-*y/x\*By/ | |
589 | .P2 | |
590 | .PP | |
591 | Finally, suppose that the line was | |
592 | .P1 | |
593 | \fItext \fR x\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.y \fI text \fR | |
594 | .P2 | |
595 | Can you see what trap lies in wait for the unwary? | |
596 | If you blindly type | |
597 | .P1 | |
598 | s/x\*.*y/x\*By/ | |
599 | .P2 | |
600 | what will happen? | |
601 | The answer, naturally, is that it depends. | |
602 | If there are no other x's or y's on the line, | |
603 | then everything works, but it's blind luck, not good management. | |
604 | Remember that `\*.' matches | |
605 | .ul | |
606 | any | |
607 | single character? | |
608 | Then `\*.*' matches as many single characters as possible, | |
609 | and unless you're careful, it can eat up a lot more of the line | |
610 | than you expected. | |
611 | If the line was, for example, like this: | |
612 | .P1 | |
613 | \fItext \fRx\fI text \fR x\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.y \fI text \fRy\fI text \fR | |
614 | .P2 | |
615 | then saying | |
616 | .P1 | |
617 | s/x\*.*y/x\*By/ | |
618 | .P2 | |
619 | will take everything from the | |
620 | .ul | |
621 | first | |
622 | `x' to the | |
623 | .ul | |
624 | last | |
625 | `y', | |
626 | which, in this example, is undoubtedly more than you wanted. | |
627 | .PP | |
628 | The solution, of course, is to turn off the special meaning of | |
629 | `\*.' with | |
630 | `\*e\*.': | |
631 | .P1 | |
632 | s/x\*e\*.*y/x\*By/ | |
633 | .P2 | |
634 | Now everything works, for `\*e\*.*' means `as many | |
635 | .ul | |
636 | periods | |
637 | as possible'. | |
638 | .PP | |
639 | There are times when the pattern `\*.*' is exactly what you want. | |
640 | For example, to change | |
641 | .P1 | |
642 | Now is the time for all good men .... | |
643 | .P2 | |
644 | into | |
645 | .P1 | |
646 | Now is the time\*. | |
647 | .P2 | |
648 | use `\*.*' to eat up everything after the `for': | |
649 | .P1 | |
650 | s/\*Bfor\*.*/\*./ | |
651 | .P2 | |
652 | .PP | |
653 | There are a couple of additional pitfalls associated with `*' that you should be aware of. | |
654 | Most notable is the fact that `as many as possible' means | |
655 | .ul | |
656 | zero | |
657 | or more. | |
658 | The fact that zero is a legitimate possibility is | |
659 | sometimes rather surprising. | |
660 | For example, if our line contained | |
661 | .P1 | |
662 | \fItext \fR xy \fI text \fR x y \fI text \fR | |
663 | .P2 | |
664 | and we said | |
665 | .P1 | |
666 | s/x\*B*y/x\*By/ | |
667 | .P2 | |
668 | the | |
669 | .ul | |
670 | first | |
671 | `xy' matches this pattern, for it consists of an `x', | |
672 | zero spaces, and a `y'. | |
673 | The result is that the substitute acts on the first `xy', | |
674 | and does not touch the later one that actually contains some intervening spaces. | |
675 | .PP | |
676 | The way around this, if it matters, is to specify a pattern like | |
677 | .P1 | |
678 | /x\*B\*B*y/ | |
679 | .P2 | |
680 | which says `an x, a space, then as many more spaces as possible, then a y', | |
681 | in other words, one or more spaces. | |
682 | .PP | |
683 | The other startling behavior of `*' is again related to the fact | |
684 | that zero is a legitimate number of occurrences of something | |
685 | followed by a star. The command | |
686 | .P1 | |
687 | s/x*/y/g | |
688 | .P2 | |
689 | when applied to the line | |
690 | .P1 | |
691 | abcdef | |
692 | .P2 | |
693 | produces | |
694 | .P1 | |
695 | yaybycydyeyfy | |
696 | .P2 | |
697 | which is almost certainly not what was intended. | |
698 | The reason for this behavior is that zero is a legal number | |
699 | of matches, | |
700 | and there are no x's at the beginning of the line | |
701 | (so that gets converted into a `y'), | |
702 | nor between the `a' and the `b' | |
703 | (so that gets converted into a `y'), nor ... | |
704 | and so on. | |
705 | Make sure you really want zero matches; | |
706 | if not, in this case write | |
707 | .P1 | |
708 | s/xx*/y/g | |
709 | .P2 | |
710 | `xx*' is one or more x's. | |
711 | .SH | |
712 | The Brackets `[ ]' | |
713 | .PP | |
714 | Suppose that you want to delete any numbers | |
715 | that appear | |
716 | at the beginning of all lines of a file. | |
717 | You might first think of trying a series of commands like | |
718 | .P1 | |
719 | 1,$s/^1*// | |
720 | 1,$s/^2*// | |
721 | 1,$s/^3*// | |
722 | .P2 | |
723 | and so on, | |
724 | but this is clearly going to take forever if the numbers are at all long. | |
725 | Unless you want to repeat the commands over and over until | |
726 | finally all numbers are gone, | |
727 | you must get all the digits on one pass. | |
728 | This is the purpose of the brackets [ and ]. | |
729 | .PP | |
730 | The construction | |
731 | .P1 | |
732 | [0123456789] | |
733 | .P2 | |
734 | matches any single digit _ | |
735 | the whole thing is called a `character class'. | |
736 | With a character class, the job is easy. | |
737 | The pattern `[0123456789]*' matches zero or more digits (an entire number), so | |
738 | .P1 | |
739 | 1,$s/^[0123456789]*// | |
740 | .P2 | |
741 | deletes all digits from the beginning of all lines. | |
742 | .PP | |
743 | Any characters can appear within a character class, | |
744 | and just to confuse the issue there are essentially no special characters | |
745 | inside the brackets; | |
746 | even the backslash doesn't have a special meaning. | |
747 | To search for special characters, for example, you can say | |
748 | .P1 | |
749 | /[\*.\*e$^[]/ | |
750 | .P2 | |
751 | Within [...], the `[' is not special. | |
752 | To get a `]' into a character class, | |
753 | make it the first character. | |
754 | .PP | |
755 | It's a nuisance to have to spell out the digits, | |
756 | so you can abbreviate them as | |
757 | [0\-9]; | |
758 | similarly, [a\-z] stands for the lower case letters, | |
759 | and | |
760 | [A\-Z] for upper case. | |
761 | .PP | |
762 | As a final frill on character classes, you can specify a class | |
763 | that means `none of the following characters'. | |
764 | This is done by beginning the class with a `^': | |
765 | .P1 | |
766 | [^0-9] | |
767 | .P2 | |
768 | stands for `any character | |
769 | .ul | |
770 | except | |
771 | a digit'. | |
772 | Thus you might find the first line that doesn't begin with a tab or space | |
773 | by a search like | |
774 | .P1 | |
775 | /^[^(space)(tab)]/ | |
776 | .P2 | |
777 | .PP | |
778 | Within a character class, | |
779 | the circumflex has a special meaning | |
780 | only if it occurs at the beginning. | |
781 | Just to convince yourself, verify that | |
782 | .P1 | |
783 | /^[^^]/ | |
784 | .P2 | |
785 | finds a line that doesn't begin with a circumflex. | |
786 | .SH | |
787 | The Ampersand `&' | |
788 | .PP | |
789 | The ampersand `&' is used primarily to save typing. | |
790 | Suppose you have the line | |
791 | .P1 | |
792 | Now is the time | |
793 | .P2 | |
794 | and you want to make it | |
795 | .P1 | |
796 | Now is the best time | |
797 | .P2 | |
798 | Of course you can always say | |
799 | .P1 | |
800 | s/the/the best/ | |
801 | .P2 | |
802 | but it seems silly to have to repeat the `the'. | |
803 | The `&' is used to eliminate the repetition. | |
804 | On the | |
805 | .ul | |
806 | right | |
807 | side of a substitute, the ampersand means `whatever | |
808 | was just matched', so you can say | |
809 | .P1 | |
810 | s/the/& best/ | |
811 | .P2 | |
812 | and the `&' will stand for `the'. | |
813 | Of course this isn't much of a saving if the thing | |
814 | matched is just `the', but if it is something truly long or awful, | |
815 | or if it is something like `.*' | |
816 | which matches a lot of text, | |
817 | you can save some tedious typing. | |
818 | There is also much less chance of making a typing error | |
819 | in the replacement text. | |
820 | For example, to parenthesize a line, | |
821 | regardless of its length, | |
822 | .P1 | |
823 | s/\*.*/(&)/ | |
824 | .P2 | |
825 | .PP | |
826 | The ampersand can occur more than once on the right side: | |
827 | .P1 | |
828 | s/the/& best and & worst/ | |
829 | .P2 | |
830 | makes | |
831 | .P1 | |
832 | Now is the best and the worst time | |
833 | .P2 | |
834 | and | |
835 | .P1 | |
836 | s/\*.*/&? &!!/ | |
837 | .P2 | |
838 | converts the original line into | |
839 | .P1 | |
840 | Now is the time? Now is the time!! | |
841 | .P2 | |
842 | .PP | |
843 | To get a literal ampersand, naturally the backslash is used to turn off the special meaning: | |
844 | .P1 | |
845 | s/ampersand/\*e&/ | |
846 | .P2 | |
847 | converts the word into the symbol. | |
848 | Notice that `&' is not special on the left side | |
849 | of a substitute, only on the | |
850 | .ul | |
851 | right | |
852 | side. | |
853 | .SH | |
854 | Substituting Newlines | |
855 | .PP | |
856 | .UL ed | |
857 | provides a facility for splitting a single line into two or more shorter lines by `substituting in a newline'. | |
858 | As the simplest example, suppose a line has gotten unmanageably long | |
859 | because of editing (or merely because it was unwisely typed). | |
860 | If it looks like | |
861 | .P1 | |
862 | \fItext \fR xy \fI text \fR | |
863 | .P2 | |
864 | you can break it between the `x' and the `y' like this: | |
865 | .P1 | |
866 | s/xy/x\*e | |
867 | y/ | |
868 | .P2 | |
869 | This is actually a single command, | |
870 | although it is typed on two lines. | |
871 | Bearing in mind that `\*e' turns off special meanings, | |
872 | it seems relatively intuitive that a `\*e' at the end of | |
873 | a line would make the newline there | |
874 | no longer special. | |
875 | .PP | |
876 | You can in fact make a single line into several lines | |
877 | with this same mechanism. | |
878 | As a large example, consider underlining the word `very' | |
879 | in a long line | |
880 | by splitting `very' onto a separate line, | |
881 | and preceding it by the | |
882 | .UL roff | |
883 | or | |
884 | .UL nroff | |
885 | formatting command `.ul'. | |
886 | .P1 | |
887 | \fItext \fR a very big \fI text \fR | |
888 | .P2 | |
889 | The command | |
890 | .P1 | |
891 | s/\*Bvery\*B/\*e | |
892 | \&.ul\*e | |
893 | very\*e | |
894 | / | |
895 | .P2 | |
896 | converts the line into four shorter lines, | |
897 | preceding the word `very' by the | |
898 | line | |
899 | `.ul', | |
900 | and eliminating the spaces around the `very', | |
901 | all at the same time. | |
902 | .PP | |
903 | When a newline is substituted | |
904 | in, dot is left pointing at the last line created. | |
905 | .PP | |
906 | .SH | |
907 | Joining Lines | |
908 | .PP | |
909 | Lines may also be joined together, | |
910 | but this is done with the | |
911 | .UL j | |
912 | command | |
913 | instead of | |
914 | .UL s . | |
915 | Given the lines | |
916 | .P1 | |
917 | Now is | |
918 | \*Bthe time | |
919 | .P2 | |
920 | and supposing that dot is set to the first of them, | |
921 | then the command | |
922 | .P1 | |
923 | j | |
924 | .P2 | |
925 | joins them together. | |
926 | No blanks are added, | |
927 | which is why we carefully showed a blank | |
928 | at the beginning of the second line. | |
929 | .PP | |
930 | All by itself, | |
931 | a | |
932 | .UL j | |
933 | command | |
934 | joins line dot to line dot+1, | |
935 | but any contiguous set of lines can be joined. | |
936 | Just specify the starting and ending line numbers. | |
937 | For example, | |
938 | .P1 | |
939 | 1,$jp | |
940 | .P2 | |
941 | joins all the lines into one big one | |
942 | and prints it. | |
943 | (More on line numbers in Section 3.) | |
944 | .SH | |
945 | Rearranging a Line with \*e( ... \*e) | |
946 | .PP | |
947 | (This section should be skipped on first reading.) | |
948 | Recall that `&' is a shorthand that stands for whatever | |
949 | was matched by the left side of an | |
950 | .UL s | |
951 | command. | |
952 | In much the same way you can capture separate pieces | |
953 | of what was matched; | |
954 | the only difference is that you have to specify | |
955 | on the left side just what pieces you're interested in. | |
956 | .PP | |
957 | Suppose, for instance, that | |
958 | you have a file of lines that consist of names in the form | |
959 | .P1 | |
960 | Smith, A. B. | |
961 | Jones, C. | |
962 | .P2 | |
963 | and so on, | |
964 | and you want the initials to precede the name, as in | |
965 | .P1 | |
966 | A. B. Smith | |
967 | C. Jones | |
968 | .P2 | |
969 | It is possible to do this with a series of editing commands, | |
970 | but it is tedious and error-prone. | |
971 | (It is instructive to figure out how it is done, though.) | |
972 | .PP | |
973 | The alternative | |
974 | is to `tag' the pieces of the pattern (in this case, | |
975 | the last name, and the initials), | |
976 | and then rearrange the pieces. | |
977 | On the left side of a substitution, | |
978 | if part of the pattern is enclosed between | |
979 | \*e( and \*e), | |
980 | whatever matched that part is remembered, | |
981 | and available for use on the right side. | |
982 | On the right side, | |
983 | the symbol `\*e1' refers to whatever | |
984 | matched the first \*e(...\*e) pair, | |
985 | `\*e2' to the second \*e(...\*e), | |
986 | and so on. | |
987 | .PP | |
988 | The command | |
989 | .P1 | |
990 | 1,$s/^\*e([^,]*\*e),\*B*\*e(\*.*\*e)/\*e2\*B\*e1/ | |
991 | .P2 | |
992 | although hard to read, does the job. | |
993 | The first \*e(...\*e) matches the last name, | |
994 | which is any string up to the comma; | |
995 | this is referred to on the right side with `\*e1'. | |
996 | The second \*e(...\*e) is whatever follows | |
997 | the comma and any spaces, | |
998 | and is referred to as `\*e2'. | |
999 | .PP | |
1000 | Of course, with any editing sequence this complicated, | |
1001 | it's foolhardy to simply run it and hope. | |
1002 | The global commands | |
1003 | .UL g | |
1004 | and | |
1005 | .UL v | |
1006 | discussed in Section 4 | |
1007 | provide a way for you to print exactly those | |
1008 | lines which were affected by the | |
1009 | substitute command, | |
1010 | and thus verify that it did what you wanted | |
1011 | in all cases. |
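
To experiment with the tagged rearrangement outside the editor, the name-swapping substitution can be mimicked in Python. This is only a sketch under stated assumptions: it uses Python's `re` module, which is not ed, writes groups as `( ... )` rather than `\( ... \)`, and refers to them in the replacement as `\1` and `\2`. The names `PATTERN` and `initials_first` are made up for this illustration.

```python
import re

# ed:      1,$s/^\([^,]*\), *\(.*\)/\2 \1/
# Python:  groups are written ( ... ) instead of \( ... \)
PATTERN = re.compile(r"^([^,]*), *(.*)")


def initials_first(line):
    """Move the initials in 'Smith, A. B.' ahead of the last name."""
    # count=1 mirrors an s command without a trailing g.
    # A line with no comma does not match and is returned unchanged.
    return PATTERN.sub(r"\2 \1", line, count=1)


names = ["Smith, A. B.", "Jones, C."]
rearranged = [initials_first(name) for name in names]

# Show each line before and after, so the effect can be checked by eye.
for before, after in zip(names, rearranged):
    print(f"{before!r} -> {after!r}")
```

Printing each line next to its result is a rough stand-in for checking the affected lines by hand; it does not reproduce the global commands themselves.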