.if !\n(xx .so tmac.e .SH Substitute replacement patterns .PP There are several metacharacters which may be used in substitute replacement patterns. As is the case for the regular expression metacharacters, there are fewer replacement pattern metacharacters if .I nomagic is set. This is discussed more below. In fact, with .I nomagic the only replacement pattern metacharacter is the escaping `\e' (this is the default for \fIedit\fR). .PP The basic metacharacters for the replacement pattern are `&' and `~'. These are given as `\e&' and `\e~' when .I nomagic is set. The metacharacter `&' is by far the most important of these. Each instance of this metacharacter is replaced by the characters which the regular expression matched. Thus the substitute command .DS \fBsubstitute\fR/some/& other/ .DE will replace the string `some' with the string `some other' the first time it occurs on the current line. The metacharacter `~' stands, in the replacement pattern, as it did in regular expression formation, for the defining text of the previous replacement pattern. .PP Other metasequences are possible in the replacement pattern, and are introduced by the escaping character `\e'; this is the default for .I edit . The sequence `\e\fIn\fR' is replaced by the text matched by the \fIn\fR-th regular subexpression enclosed between `\e(' and `\e)'.\u\s-2\(dg\s0\d .FS \(dg When nested, parenthesized subexpressions are present, \fIn\fR is determined by counting occurrences of `\e(' starting from the left. .FE The metasequences `\eu', `\el', `\eU', `\eL', and `\eE' and `\ee' are used to perform systematic case conversion of letters. The sequences `\eu' and `\el' cause the immediately following character in the replacement to be converted to upper- or lower-case respectively if this character is a letter. The sequences `\eU' and `\eL' turn such conversion on, either until `\eE' or `\ee' is encountered, or until the end of the replacement pattern. By bracketing selected portions of a regular expression with `\e(' and `\e)' and using `\eU' or `\eL' it is possible to systematically capitalize entire words or phrases. .SH Regular expressions .PP .Ex supports a form of regular expression notation. A regular expression specifies a set of strings of characters. A member of this set of strings is said to be .I matched by the regular expression. Regular expressions may be used in locating or selecting lines by their content, in .I open and .I visual modes to position the cursor within the file, and in the .I substitute command to select the portion of a line to be substituted. .PP .Ex remembers two previous regular expressions: the previous regular expression used in a .I substitute command and the previous regular expression used elsewhere (referred to as the previous \fIscanning\fR regular expression.) The previous regular expression can always be referred to by a null \fIre\fR, e.g. `//' or `??'. .SH Magic and nomagic .PP The regular expressions allowed by .I ex are constructed in one of two ways depending on the setting of the .I magic option. The .I ex default setting of .I magic gives quick access to a powerful set of regular expression metacharacters. The disadvantage of .I magic is that the user must remember that these metacharacters are .I magic and precede them with the character `\e' to use them as ``ordinary'' characters. With .I nomagic , the default for .I edit , regular expressions are much simpler, there being only two metacharacters. The power of the other metacharacters is still available by preceding the (now) ordinary character with a `\e'. Note that `\e' is thus always a metacharacter. .PP The remainder of the discussion of regular expressions assumes that that the setting of this option is .I magic . To discern what is true with .I nomagic it suffices to remember that the only special characters in this case will be `\(ua' at the beginning of a regular expression, `$' at the end of a regular expression, and `\e'.\u\s-2\(dd\s0\d .FS \(dd With .I nomagic the characters `\s+2~\s0' and `&' also lose their special meanings related to the replacement pattern of a substitute. .FE .SH Basic regular expression summary .PP The following basic constructs are used to construct .I magic mode regular expressions. .TS H allbox center; c s c l c aw(65). Basic regular expression forms Form Meaning .TH char T{ An ordinary character which matches itself. The character `\(ua' (`^') at the beginning of a line, `$' at the end of line, `*' as any character other than the first, `.', `\e', `[', and `\s+2~\s0' are not ordinary characters and must be escaped (preceded) by `\e' to be treated as such. T} \(ua T{ Up-arrow (or circumflex `^') at the beginning of a pattern forces the match to succeed only at the beginning of a line. T} $ T{ At the end of a regular expression forces the match to succeed only at the end of the line. T} \&\fB.\fR T{ A period character matches any single character except the new-line character. T} \e< T{ This sequence in a regular expression forces the match to occur only at the beginning of a ``variable'' or ``word''; that is, either at the beginning of a line, or just before a letter, digit, or underline and after a character not one of these. T} \e> T{ Similar to `\e<', but matching the end of a ``variable'' or ``word'', i.e. either the end of the line or before character which is neither a letter, nor a digit, nor the underline character. T} [\fIstring\fR] T{ A string of characters enclosed in square brackets matches any (single) character in the class defined by .I string . Most characters in .I string define themselves. A pair of characters separated by `\-' in .I string defines the set of characters collating between the specified lower and upper bounds, thus `[a\-z]' as a regular expression matches any (single) lower-case letter. If the first character of .I string is an `\(ua' or `^' then the construct matches those characters which it otherwise would not; thus `[^a\-z]' matches anything but a lower-case letter (and of course a newline). To place any of the characters `\(ua', `^', `[', or `\-' in .I string you may escape them by preceding them with a `\e'. T} .TE .PP More complicated regular expressions are built by putting these simple pieces together. The concatenation of two regular expressions matches the longest string which can be divided with the first piece matching the first regular expression and the second piece matching the second. Thus the regular expression `\fB..\fRe' will match any three characters ending in the character `e', while `^[aeiou]' matches any vowel which appears at the beginning of a line. .PP Any of the (single character matching) regular expressions mentioned above may be followed by the character `*' to form a regular expression which matches any number of adjacent occurrences (including 0) of characters matched by the regular expression it follows. The character `\s+2~\s0' may be used in a regular expression, and matches the text which defined the replacement part of the last .I substitute command. A regular expression may be enclosed between the sequences `\e(' and `\e)' with side effects in the .I substitute command, and an escaped digit, e.g. `\e1', matches the text which was matched by the corresponding previous `\e(' and `\e)' bracketed expression, numbered in order of occurrence of the `\e(' delimiters. .bp