# @(#)POSIX 5.1 (Berkeley) %G%

        Comments on the IEEE P1003.2 Draft 11.2 September 1991

                Part 2: Shell and Utilities
                Section 4.55: sed - Stream editor

In the following paragraphs, `wrong' means `inconsistent with historic
practice'. Many of the comments refer to undocumented inconsistencies
between the historical versions of sed and the POSIX standard. All the
comments are notes taken while implementing a POSIX-compatible version
of sed, and should not be interpreted as official opinions or criticism
towards the POSIX committee. Many are insignificant, pedantic and even
wrong.
                        Diomidis Spinellis <dds@doc.ic.ac.uk>

[Some are significant and right, too. -- Keith Bostic]

1.  For the text argument of the a command it is not specified whether
    leading blanks are stripped from each line. There are some hints in
    D2 11335-11337 and in D2 11512-11514, but nothing concrete. Historical
    practice is to strip the blanks, i.e.:

        #!/bin/sed -f
        a\
            foo\
            bar

    produces:

        foo
        bar

2.  In the s command we assume that the w file is the last flag. This is
    historical practice, but not specified in the standard.

3.  In the s command the standard does not specify that a space must follow
    w. Also the standard does not specify that any number of spaces after
    the w flag are allowed and removed.

4.  The specification of the a command is wrong. With the current
    specification both of these scripts should produce the same output:

        #!/bin/sed -f
        d
        a\
        hello

        #!/bin/sed -f
        a\
        hello
        d
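The difference between the two scripts in comment 4 can be checked with a short sketch. It assumes a sed that queues the a text until the end of the cycle, which is the historic practice; the one-line input and the variable names are made up for illustration.

```shell
#!/bin/sh
# Script 1: d ends the cycle first, so the a command is never reached.
out_d_first=$(printf 'line\n' | sed -e 'd
a\
hello')

# Script 2: a queues its text, then d ends the cycle without printing
# the pattern space; the queued text is still written at end of cycle.
out_a_first=$(printf 'line\n' | sed -e 'a\
hello
d')

printf 'd first: [%s]\n' "$out_d_first"
printf 'a first: [%s]\n' "$out_a_first"
```

Under that assumption the first script prints nothing and the second prints only hello.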

5.  The specification of the c command in conjunction with the specification
    of the default operation (D2 11293-11299) is wrong. The default operation
    specifies that a newline is printed after the pattern space. This is not
    the case when the pattern space has been deleted by a c command.

6.  The rule for the l command differs from historic practice. Table 2-15
    includes the various escape sequences including \\. Is this what the
    standard intends? Furthermore some versions of sed print two digit
    octal numbers. Why does the standard require a three digit octal number?
    Normally the pattern space does not end with a newline. Will an implicit
    \n be printed? Finally the standard does not specify that a newline must
    follow the '$' sign (it seems logical to me).

7.  The specification of ! does not state that a single command following
    it must not contain an address specification, whereas a command list
    can contain address specifications.

8.  The standard does not specify what happens with consecutive ! commands
    (e.g. /foo/!!!p). Current implementations allow any number of !'s without
    changing behaviour. It seems logical that each one should reverse the
    default behaviour.

9.  The ; command separator is not allowed for the commands a c i w r : b t
    # and at the end of a w flag in the s command.
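A small sketch of what comment 9 means in practice for the a command: everything up to the end of the line belongs to the text, so a ; there is not treated as a separator. The input line and the variable name are illustrative only.

```shell
#!/bin/sh
# The text of the a command runs to the end of the line, so ";d" is
# part of the appended text and no d command is executed.
out=$(printf 'line\n' | sed -e 'a\
hello;d')

printf '%s\n' "$out"
```

The input line is still printed, followed by the literal text hello;d.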

10. The standard does not specify that if an end of file occurs on the
    execution of the n command the program terminates. For example:

        sed -e '
        n
        i\
        hello
        ' </dev/null

    will not produce any output.
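The same point can be seen with a one-line input, where n is actually executed and then hits end of file. This is a sketch only; the input and variable name are made up, and whether the pattern space itself is printed when n fails is left unasserted because implementations may differ on it.

```shell
#!/bin/sh
# n finds no next line, so sed stops before reaching the i command.
out=$(printf 'only\n' | sed -e 'n
i\
hello')

printf '[%s]\n' "$out"
```

Whatever else is printed, hello should not appear in the output.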

11. The standard does not specify that the q command causes all lines that
    have been appended to be output and that the pattern space is printed
    before exiting.
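A minimal sketch of the q behaviour described in comment 11, assuming an implementation that follows historic practice. The two-line input and the variable name are illustrative.

```shell
#!/bin/sh
# a queues its text on the first line, then q exits. The pattern space
# and the queued text are both written before sed terminates, and the
# second input line is never read.
out=$(printf 'one\ntwo\n' | sed -e 'a\
appended
q')

printf '%s\n' "$out"
```

Under that assumption the output is one followed by appended, and two never appears.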

12. Historic implementations ignore comments in the text of the i and a
    commands.

13. The historic implementation does not consider the last line of a file
    to match $ if a null file follows:

        sed -n -e '$p' /usr/dict/words /dev/null

    will not print anything.
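A sketch for checking how a given sed treats $ when an empty file follows. The scratch directory and file contents are made up for illustration. It assumes an implementation that looks ahead past empty trailing files, in which case the last line of the first file is printed; the historic implementation described in comment 13 would print nothing instead.

```shell
#!/bin/sh
# Hypothetical scratch file standing in for /usr/dict/words.
tmp=$(mktemp -d)
printf 'first\nlast\n' > "$tmp/words"

# $ should select "last" only if sed notices that /dev/null adds no lines.
out=$(sed -n -e '$p' "$tmp/words" /dev/null)
rm -rf "$tmp"

printf '[%s]\n' "$out"
```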

14. Historical implementations do not output the change text of a c command
    in the case of an address range whose first line number is greater than
    the second (e.g. 3,1). The standard seems to imply otherwise.

15. Historical implementations output the c text on EVERY line not included
    in the two address range in the case of a negation '!'.

16. The standard does not specify that the p flag of the s command will
    write the pattern space plus a newline on the standard output.
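The effect of the p flag in comment 16 is easy to see by running the same substitution with and without -n. This is a small sketch; the input and variable names are illustrative.

```shell
#!/bin/sh
# With -n, the only output comes from the p flag of the s command.
with_n=$(printf 'abc\n' | sed -n -e 's/b/X/p')

# Without -n, the line is written once by the p flag and once more by
# the default output at the end of the cycle.
without_n=$(printf 'abc\n' | sed -e 's/b/X/p')

printf 'with -n:\n%s\n' "$with_n"
printf 'without -n:\n%s\n' "$without_n"
```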

17. The standard does not specify whether address ranges are checked and
    reset if a command is not executed due to a jump. The following
    program can behave in two different ways depending on whether the range
    operator is reset at line 6 or not. This is important in the case of
    pattern matches.

        sed -n -e '
        4,8b
        s/^/XXX/p
        1,6 {
            p
        }'

18. Historical implementations allow an output suppressing #n at the
    beginning of -e arguments as well.

19. POSIX does not specify whether more than one numeric flag is
    allowed on the s command.

20. Existing versions of sed have the undocumented feature of allowing
    a semicolon to delimit commands. It is not specified in the standard.
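A minimal sketch of the semicolon feature from comment 20, using two p commands that take no further arguments. The three-line input and the variable name are illustrative.

```shell
#!/bin/sh
# Two addressed p commands on one script line, separated by a semicolon.
out=$(printf 'one\ntwo\nthree\n' | sed -n -e '1p;3p')

printf '%s\n' "$out"
```

On implementations with this feature the output is the first and third input lines.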

21. The standard does not specify whether a script is mandatory. The
    sed implementations I tested behave differently with ls | sed (no
    output) and ls | sed -e '' (behaves like cat).

22. The requirement to open all wfiles from the beginning makes sed behave
    nonintuitively when the w commands are preceded by addresses or are
    within conditional blocks.
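The nonintuitive behaviour in comment 22 can be sketched as follows, assuming a sed that opens every wfile before processing begins. The scratch directory, file name and address pattern are hypothetical.

```shell
#!/bin/sh
# Hypothetical scratch directory and file name, for illustration only.
tmp=$(mktemp -d)

# The w command is guarded by an address that never matches the input.
printf 'alpha\n' | sed -n -e "/no-such-text/w $tmp/out"

# The file is expected to exist anyway, with nothing written to it.
if [ -e "$tmp/out" ]; then
    created=yes
    size=$(wc -c < "$tmp/out" | tr -d ' ')
else
    created=no
    size=
fi
rm -rf "$tmp"

printf 'created=%s size=%s\n' "$created" "$size"
```

An existing file of the same name would be truncated in the same way, even though the w command is never executed.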

23. The rule specified in lines 11412-11413 of the standard does not
    seem consistent with existing practice. The sed implementations I
    tested copied the rfile to standard output every time the r command was
    executed and not before reading a line of input. The wording should be
    changed to be consistent with the 'a' command.

24. The standard does not specify how escape sequences other than \n
    and \D (where D is the delimiter character) are to be treated. A
    strict interpretation would be that they should be treated literally.
    In the sed implementations I have tried the \ is simply ignored.

25. The standard specifies in line 11304 that an address can be empty.
    This is wrong since it implies that constructs like ,d or 1,d or ,5d
    are allowed. The sed implementations I tested do not allow them.

26. The b t and : commands ignore leading white space, but not trailing
    white space. This is not specified in the standard.

27. Although the standard specifies that reading from files that do not
    exist from within the script must not terminate the script, it does not
    specify what happens if a write command fails.

28. In the sed implementation I tested the \n construct for newlines
    works on both strings of a y command. This is not specified in the
    standard.
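A short sketch of \n in each string of the y command, as described in comment 28. It assumes an implementation that accepts \n on both sides; the inputs and variable names are illustrative.

```shell
#!/bin/sh
# \n in the second string: every space becomes a newline.
to_newline=$(printf 'a b\n' | sed -e 'y/ /\n/')

# \n in the first string: N joins two input lines with an embedded
# newline, which y then turns into a space.
from_newline=$(printf 'a\nb\n' | sed -e 'N' -e 'y/\n/ /')

printf '%s\n' "$to_newline"
printf '%s\n' "$from_newline"
```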

29. The standard does not specify if the "nth occurrence" of a regular
    expression in a substitute command is an overlapping or a
    non-overlapping one. I.e. what is the result of s/a*/A/2 on the
    pattern "aaaaa aaaaa"? (It crashes the implementation of sed I
    tested.)

30. Existing implementations of sed ignore the regular expression
    delimiter characters within character classes. This is not specified
    in the standard.