Commit | Line | Data |
---|---|---|
c8efee25 KB |
1 | # @(#)POSIX 5.1 (Berkeley) %G% |
2 | ||
3 | Comments on the IEEE P1003.2 Draft 11.2 September 1991 | |
4 | ||
5 | Part 2: Shell and Utilities | |
6 | Section 4.55: sed - Stream editor | |
7 | ||
8 | In the following paragraphs, `wrong' means `inconsistent with historic | |
9 | practice'. Many of the comments refer to undocumented inconsistencies | |
10 | between the historical versions of sed and the POSIX standard. All the | |
11 | comments are notes taken while implementing a POSIX-compatible version | |
12 | of sed, and should not be interpreted as official opinions or criticism | |
13 | towards the POSIX committee. Many are insignificant, pedantic and even | |
14 | wrong. | |
15 | Diomidis Spinellis <dds@doc.ic.ac.uk> | |
16 | ||
17 | [Some are significant and right, too. -- Keith Bostic] | |
18 | ||
19 | 1. For the text argument of the a command it is not specified whether lines | |
20 | are stripped of their initial blanks. There are some hints in D2 | |
21 | 11335-11337 and in D2 11512-11514, but nothing concrete. Historical | |
22 | practice is to strip the blanks, i.e.: | |
23 | ||
24 | #!/bin/sed -f | |
25 | a\ | |
26 | foo\ | |
27 | bar | |
28 | ||
29 | produces: | |
30 | ||
31 | foo | |
32 | bar | |
33 | ||
34 | 2. In the s command we assume that the w file is the last flag. This is | |
35 | historical practice, but not specified in the standard. | |
36 | ||
37 | 3. In the s command the standard does not specify that a space must follow | |
38 | w. Also the standard does not specify that any number of spaces after | |
39 | the w command are allowed and removed. | |
40 | ||
41 | 4. The specification of the a command is wrong. With the current | |
42 | specification both of these scripts should produce the same output: | |
43 | ||
44 | #!/bin/sed -f | |
45 | d | |
46 | a\ | |
47 | hello | |
48 | ||
49 | #!/bin/sed -f | |
50 | a\ | |
51 | hello | |
52 | d | |
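The difference between the two orderings in item 4 can be observed with a small shell sketch. The one-line input and the variable names are assumptions made for illustration, not part of the original comment. With the implementations I am aware of, the first order prints nothing and the second prints `hello`, but this is worth confirming against the sed at hand.

```shell
# Sketch: run both command orders on a one-line input.
# input, d_then_a and a_then_d are illustrative names.
input='x'

# d first: the cycle ends at d, so the a command is never reached.
d_then_a=$(printf '%s\n' "$input" | sed -e 'd' -e 'a\
hello')

# a first: the text is queued, then d deletes the pattern space;
# the queued text is still written when the cycle ends.
a_then_d=$(printf '%s\n' "$input" | sed -e 'a\
hello' -e 'd')

printf 'd then a: [%s]\n' "$d_then_a"
printf 'a then d: [%s]\n' "$a_then_d"
```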
53 | ||
54 | 5. The specification of the c command in conjunction with the specification | |
55 | of the default operation (D2 11293-11299) is wrong. The default operation | |
56 | specifies that a newline is printed after the pattern space. This is not | |
57 | the case when the pattern space has been deleted by a c command. | |
58 | ||
59 | 6. The rule for the l command differs from historic practice. Table 2-15 | |
60 | includes the various escape sequences including \\. Is this intended by | |
61 | the standard? Furthermore some versions of sed print two-digit octal | |
62 | numbers. Why does the standard require a three-digit octal number? | |
63 | Normally the pattern space does not end with a newline. Will an implicit | |
64 | \n be printed? Finally, the standard does not specify that a newline must | |
65 | follow the '$' sign (it seems logical to me). | |
66 | ||
67 | 7. The specification for ! does not specify that for a single command the | |
68 | command must not contain an address specification whereas the command | |
69 | list can contain address specifications. | |
70 | ||
71 | 8. The standard does not specify what happens with consecutive ! commands | |
72 | (e.g. /foo/!!!p). Current implementations allow any number of !'s without | |
73 | changing behaviour. It seems logical that each one should reverse the | |
74 | default behaviour. | |
75 | ||
76 | 9. The ; command separator is not allowed for the commands a c i w r : b t | |
77 | # and at the end of a w flag in the s command. | |
78 | ||
79 | 10. The standard does not specify that if an end of file occurs on the | |
80 | execution of the n command the program terminates. For example, | |
81 | ||
82 | sed -e ' | |
83 | n | |
84 | i\ | |
85 | hello | |
86 | ' </dev/null | |
87 | ||
88 | will not produce any output. | |
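A variant of item 10 with a one-line input, so that the n command is actually executed and meets end of file, can be sketched as follows. The input and the variable name `n_out` are assumptions for illustration. The sketch only reports whether the i command ran; it does not assume what else the implementation prints.

```shell
# Sketch: one input line, so n hits end of file during the first cycle.
# n_out is an illustrative variable name.
n_out=$(printf 'x\n' | sed -e 'n' -e 'i\
hello')

# If sed terminates at the failed n, "hello" never appears in the output.
case $n_out in
  *hello*) echo 'i command ran after n reached end of file' ;;
  *)       echo 'sed stopped at n; i command was not run' ;;
esac
```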
89 | ||
90 | 11. The standard does not specify that the q command causes all lines that | |
91 | have been appended to be output and that the pattern space is printed | |
92 | before exiting. | |
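The behaviour described in item 11 can be checked with a short sketch. The two-line input and the name `q_out` are assumptions for illustration. If q flushes appended text and prints the pattern space, the output should be the first input line followed by the appended line.

```shell
# Sketch: queue an append on line 1, then quit on the same line.
# q_out is an illustrative variable name.
q_out=$(printf 'one\ntwo\n' | sed -e '1a\
appended' -e '1q')

# The second input line is never read because q exits first.
printf '%s\n' "$q_out"
```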
93 | ||
94 | 12. Historic implementations ignore comments in the text of the i and a | |
95 | commands. | |
96 | ||
97 | 13. The historic implementation does not consider the last line of a file | |
98 | to match $ if a null file follows: | |
99 | ||
100 | sed -n -e '$p' /usr/dict/words /dev/null | |
101 | ||
102 | will not print anything. | |
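Since /usr/dict/words may not exist on a given system, the check in item 13 can be sketched with a temporary file instead. The file contents and the names `tmp` and `dollar_out` are assumptions for illustration. The result differs between implementations, so the sketch only reports what the local sed does.

```shell
# Sketch: a two-line file followed by an empty file on the command line.
# tmp and dollar_out are illustrative names.
tmp=$(mktemp)
printf 'first\nlast\n' > "$tmp"

dollar_out=$(sed -n -e '$p' "$tmp" /dev/null)
rm -f "$tmp"

# Empty brackets mean $ did not match the last line of the first file.
printf '$p printed: [%s]\n' "$dollar_out"
```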
103 | ||
104 | 14. Historical implementations do not output the change text of a c command | |
105 | in the case of an address range whose second line number is less than | |
106 | the first (e.g. 3,1). The standard seems to imply otherwise. | |
107 | ||
108 | 15. Historical implementations output the c text on EVERY line not included | |
109 | in the two address range in the case of a negation '!'. | |
110 | ||
111 | 16. The standard does not specify that the p flag of the s command will | |
112 | write the pattern space plus a newline on the standard output. | |
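A minimal sketch of the p flag behaviour in item 16, assuming a one-line input chosen for illustration. Counting bytes shows whether a newline follows the pattern space; `p_text` and `p_bytes` are illustrative names.

```shell
# Sketch: -n suppresses default output, so only the p flag writes anything.
p_text=$(printf 'foo\n' | sed -n 's/o/0/p')

# wc -c counts the trailing newline that command substitution strips;
# the arithmetic expansion discards any padding wc adds.
p_bytes=$(( $(printf 'foo\n' | sed -n 's/o/0/p' | wc -c) ))

printf 'text: %s, bytes: %d\n' "$p_text" "$p_bytes"
```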
113 | ||
114 | 17. The standard does not specify whether address ranges are checked and | |
115 | reset if a command is not executed due to a jump. The following | |
116 | program can behave in two different ways depending on whether the range | |
117 | operator is reset at line 6 or not. This is important in the case of | |
118 | pattern matches. | |
119 | ||
120 | sed -n -e ' | |
121 | 4,8b | |
122 | s/^/XXX/p | |
123 | 1,6 { | |
124 | p | |
125 | }' | |
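To try the program from item 17, it needs numbered input lines. The sketch below feeds it ten lines; the helper `gen_lines`, the variable `range_out`, and the choice of ten lines are assumptions for illustration.

```shell
# Sketch: feed ten numbered lines to the program from item 17.
# gen_lines and range_out are illustrative names.
gen_lines() {
  i=1
  while [ "$i" -le 10 ]; do
    echo "$i"
    i=$((i + 1))
  done
}

range_out=$(gen_lines | sed -n -e '
4,8b
s/^/XXX/p
1,6 {
p
}')

printf '%s\n' "$range_out"
```

Lines 4 to 8 branch past the 1,6 block, so line 6 is never tested against it. How many times XXX9 and XXX10 appear in the output shows whether the implementation still treats the range as open afterwards.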
126 | ||
127 | 18. Historical implementations allow an output suppressing #n at the | |
128 | beginning of -e arguments as well. | |
129 | ||
130 | 19. POSIX does not specify whether more than one numeric flag is | |
131 | allowed on the s command. | |
132 | ||
133 | 20. Existing versions of sed have the undocumented feature of allowing | |
134 | a semicolon to delimit commands. It is not specified in the standard. | |
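The semicolon separator from item 20 can be seen with a short sketch. The two-line input and the name `semi_out` are assumptions for illustration.

```shell
# Sketch: two p commands separated by a semicolon in a single -e script.
# With -n, each input line should be printed exactly twice.
semi_out=$(printf 'a\nb\n' | sed -n 'p;p')

printf '%s\n' "$semi_out"
```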
135 | ||
136 | 21. The standard does not specify whether a script is mandatory. The | |
137 | sed implementations I tested behave differently with ls | sed (no | |
138 | output) and ls | sed -e '' (behaves like cat). | |
139 | ||
140 | 22. The requirement to open all wfiles from the beginning makes sed behave | |
141 | nonintuitively when the w commands are preceded by addresses or are | |
142 | within conditional blocks. | |
143 | ||
144 | 23. The rule specified in lines 11412-11413 of the standard does not | |
145 | seem consistent with existing practice. The sed implementations I | |
146 | tested copied the rfile on standard output every time the r command was | |
147 | executed and not before reading a line of input. The wording should be | |
148 | changed to be consistent with the 'a' command i.e. | |
149 | ||
150 | 24. The standard does not specify how escape sequences other than \n | |
151 | and \D (where D is the delimiter character) are to be treated. A | |
152 | strict interpretation would be that they should be treated literally. | |
153 | In the sed implementations I have tried the \ is simply ignored. | |
154 | ||
155 | 25. The standard specifies in line 11304 that an address can be empty. | |
156 | This is wrong since it implies that constructs like ,d or 1,d or ,5d | |
157 | are allowed. The sed implementations I tested do not allow them. | |
158 | ||
159 | 26. The b t and : commands ignore leading white space, but not trailing | |
160 | white space. This is not specified in the standard. | |
161 | ||
162 | 27. Although the standard specifies that reading from files that do not | |
163 | exist from within the script must not terminate the script, it does not | |
164 | specify what happens if a write command fails. | |
165 | ||
166 | 28. In the sed implementation I tested the \n construct for newlines | |
167 | works on both strings of a y command. This is not specified in the | |
168 | standard. | |
169 | ||
170 | 29. The standard does not specify if the "nth occurrence" of a regular | |
171 | expression in a substitute command is an overlapping or a | |
172 | non-overlapping one, i.e. what is the result of s/a*/A/2 on the | |
173 | pattern "aaaaa aaaaa"? (It crashes the implementation of sed I | |
174 | tested.) | |
175 | ||
176 | 30. Existing implementations of sed ignore the regular expression | |
177 | delimiter characters within character classes. This is not specified | |
178 | in the standard. |