Commit | Line | Data |
---|---|---|
ca2dddd6 C |
1 | .rn '' }` |
2 | ''' $RCSfile: perl.man,v $$Revision: 4.0.1.6 $$Date: 92/06/08 15:07:29 $ | |
3 | ''' | |
4 | ''' $Log: perl.man,v $ | |
5 | ''' Revision 4.0.1.6 92/06/08 15:07:29 lwall | |
6 | ''' patch20: documented that numbers may contain underline | |
7 | ''' patch20: clarified that DATA may only be read from main script | |
8 | ''' patch20: relaxed requirement for semicolon at the end of a block | |
9 | ''' patch20: added ... as variant on .. | |
10 | ''' patch20: documented need for 1; at the end of a required file | |
11 | ''' patch20: extended bracket-style quotes to two-arg operators: s()() and tr()() | |
12 | ''' patch20: paragraph mode now skips extra newlines automatically | |
13 | ''' patch20: documented PERLLIB and PERLDB | |
14 | ''' patch20: documented limit on size of regexp | |
15 | ''' | |
16 | ''' Revision 4.0.1.5 91/11/11 16:42:00 lwall | |
17 | ''' patch19: added little-endian pack/unpack options | |
18 | ''' | |
19 | ''' Revision 4.0.1.4 91/11/05 18:11:05 lwall | |
20 | ''' patch11: added sort {} LIST | |
21 | ''' patch11: added eval {} | |
22 | ''' patch11: documented meaning of scalar(%foo) | |
23 | ''' patch11: sprintf() now supports any length of s field | |
24 | ''' | |
25 | ''' Revision 4.0.1.3 91/06/10 01:26:02 lwall | |
26 | ''' patch10: documented some newer features in addenda | |
27 | ''' | |
28 | ''' Revision 4.0.1.2 91/06/07 11:41:23 lwall | |
29 | ''' patch4: added global modifier for pattern matches | |
30 | ''' patch4: default top-of-form format is now FILEHANDLE_TOP | |
31 | ''' patch4: added $^P variable to control calling of perldb routines | |
32 | ''' patch4: added $^F variable to specify maximum system fd, default 2 | |
33 | ''' patch4: changed old $^P to $^X | |
34 | ''' | |
35 | ''' Revision 4.0.1.1 91/04/11 17:50:44 lwall | |
36 | ''' patch1: fixed some typos | |
37 | ''' | |
38 | ''' Revision 4.0 91/03/20 01:38:08 lwall | |
39 | ''' 4.0 baseline. | |
40 | ''' | |
41 | ''' | |
42 | .de Sh | |
43 | .br | |
44 | .ne 5 | |
45 | .PP | |
46 | \fB\\$1\fR | |
47 | .PP | |
48 | .. | |
49 | .de Sp | |
50 | .if t .sp .5v | |
51 | .if n .sp | |
52 | .. | |
53 | .de Ip | |
54 | .br | |
55 | .ie \\n(.$>=3 .ne \\$3 | |
56 | .el .ne 3 | |
57 | .IP "\\$1" \\$2 | |
58 | .. | |
59 | ''' | |
60 | ''' Set up \*(-- to give an unbreakable dash; | |
61 | ''' string Tr holds user defined translation string. | |
62 | ''' Bell System Logo is used as a dummy character. | |
63 | ''' | |
64 | .tr \(*W-|\(bv\*(Tr | |
65 | .ie n \{\ | |
66 | .ds -- \(*W- | |
67 | .if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch | |
68 | .if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch | |
69 | .ds L" "" | |
70 | .ds R" "" | |
71 | .ds L' ' | |
72 | .ds R' ' | |
73 | 'br\} | |
74 | .el\{\ | |
75 | .ds -- \(em\| | |
76 | .tr \*(Tr | |
77 | .ds L" `` | |
78 | .ds R" '' | |
79 | .ds L' ` | |
80 | .ds R' ' | |
81 | 'br\} | |
82 | .TH PERL 1 "\*(RP" | |
83 | .UC | |
84 | .SH NAME | |
85 | perl \- Practical Extraction and Report Language | |
86 | .SH SYNOPSIS | |
87 | .B perl | |
88 | [options] filename args | |
89 | .SH DESCRIPTION | |
90 | .I Perl | |
91 | is an interpreted language optimized for scanning arbitrary text files, | |
92 | extracting information from those text files, and printing reports based | |
93 | on that information. | |
94 | It's also a good language for many system management tasks. | |
95 | The language is intended to be practical (easy to use, efficient, complete) | |
96 | rather than beautiful (tiny, elegant, minimal). | |
97 | It combines (in the author's opinion, anyway) some of the best features of C, | |
98 | \fIsed\fR, \fIawk\fR, and \fIsh\fR, | |
99 | so people familiar with those languages should have little difficulty with it. | |
100 | (Language historians will also note some vestiges of \fIcsh\fR, Pascal, and | |
101 | even BASIC-PLUS.) | |
102 | Expression syntax corresponds quite closely to C expression syntax. | |
103 | Unlike most Unix utilities, | |
104 | .I perl | |
105 | does not arbitrarily limit the size of your data\*(--if you've got | |
106 | the memory, | |
107 | .I perl | |
108 | can slurp in your whole file as a single string. | |
109 | Recursion is of unlimited depth. | |
110 | And the hash tables used by associative arrays grow as necessary to prevent | |
111 | degraded performance. | |
112 | .I Perl | |
113 | uses sophisticated pattern matching techniques to scan large amounts of | |
114 | data very quickly. | |
115 | Although optimized for scanning text, | |
116 | .I perl | |
117 | can also deal with binary data, and can make dbm files look like associative | |
118 | arrays (where dbm is available). | |
119 | Setuid | |
120 | .I perl | |
121 | scripts are safer than C programs | |
122 | through a dataflow tracing mechanism which prevents many stupid security holes. | |
123 | If you have a problem that would ordinarily use \fIsed\fR | |
124 | or \fIawk\fR or \fIsh\fR, but it | |
125 | exceeds their capabilities or must run a little faster, | |
126 | and you don't want to write the silly thing in C, then | |
127 | .I perl | |
128 | may be for you. | |
129 | There are also translators to turn your | |
130 | .I sed | |
131 | and | |
132 | .I awk | |
133 | scripts into | |
134 | .I perl | |
135 | scripts. | |
136 | OK, enough hype. | |
137 | .PP | |
138 | Upon startup, | |
139 | .I perl | |
140 | looks for your script in one of the following places: | |
141 | .Ip 1. 4 2 | |
142 | Specified line by line via | |
143 | .B \-e | |
144 | switches on the command line. | |
145 | .Ip 2. 4 2 | |
146 | Contained in the file specified by the first filename on the command line. | |
147 | (Note that systems supporting the #! notation invoke interpreters this way.) | |
148 | .Ip 3. 4 2 | |
149 | Passed in implicitly via standard input. | |
150 | This only works if there are no filename arguments\*(--to pass | |
151 | arguments to a | |
152 | .I stdin | |
153 | script you must explicitly specify a \- for the script name. | |
154 | .PP | |
155 | After locating your script, | |
156 | .I perl | |
157 | compiles it to an internal form. | |
158 | If the script is syntactically correct, it is executed. | |
159 | .Sh "Options" | |
160 | Note: on first reading this section may not make much sense to you. It's here | |
161 | at the front for easy reference. | |
162 | .PP | |
163 | A single-character option may be combined with the following option, if any. | |
164 | This is particularly useful when invoking a script using the #! construct which | |
165 | only allows one argument. Example: | |
166 | .nf | |
167 | ||
168 | .ne 2 | |
169 | #!/usr/bin/perl \-spi.bak # same as \-s \-p \-i.bak | |
170 | .\|.\|. | |
171 | ||
172 | .fi | |
173 | Options include: | |
174 | .TP 5 | |
175 | .BI \-0 digits | |
176 | specifies the record separator ($/) as an octal number. | |
177 | If there are no digits, the null character is the separator. | |
178 | Other switches may precede or follow the digits. | |
179 | For example, if you have a version of | |
180 | .I find | |
181 | which can print filenames terminated by the null character, you can say this: | |
182 | .nf | |
183 | ||
184 | find . \-name '*.bak' \-print0 | perl \-n0e unlink | |
185 | ||
186 | .fi | |
187 | The special value 00 will cause Perl to slurp files in paragraph mode. | |
188 | The value 0777 will cause Perl to slurp files whole since there is no | |
189 | legal character with that value. | |
190 | .TP 5 | |
191 | .B \-a | |
192 | turns on autosplit mode when used with a | |
193 | .B \-n | |
194 | or | |
195 | .BR \-p . | |
196 | An implicit split command to the @F array | |
197 | is done as the first thing inside the implicit while loop produced by | |
198 | the | |
199 | .B \-n | |
200 | or | |
201 | .BR \-p . | |
202 | .nf | |
203 | ||
204 | perl \-ane \'print pop(@F), "\en";\' | |
205 | ||
206 | is equivalent to | |
207 | ||
208 | while (<>) { | |
209 | @F = split(\' \'); | |
210 | print pop(@F), "\en"; | |
211 | } | |
212 | ||
213 | .fi | |
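As a rough sketch of what the implicit split does to a single input line: the sample line below and the names $count and $last are made up for illustration, and the code stands in for one pass of the loop rather than reproducing what the interpreter generates.

```perl
# Hypothetical input line; under -a every line read by <> is handled like this.
$_ = "  alpha beta   gamma\n";

# split(' ') skips leading whitespace and splits on runs of whitespace,
# so the trailing newline does not produce an extra field.
@F = split(' ');

$count = @F;        # array in a scalar context: number of fields
$last  = pop(@F);   # the field that the -ane example prints

print "$count fields, last is $last\n";   # prints: 3 fields, last is gamma
```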
214 | .TP 5 | |
215 | .B \-c | |
216 | causes | |
217 | .I perl | |
218 | to check the syntax of the script and then exit without executing it. | |
219 | .TP 5 | |
220 | .BI \-d | |
221 | runs the script under the perl debugger. | |
222 | See the section on Debugging. | |
223 | .TP 5 | |
224 | .BI \-D number | |
225 | sets debugging flags. | |
226 | To watch how it executes your script, use | |
227 | .BR \-D14 . | |
228 | (This only works if debugging is compiled into your | |
229 | .IR perl .) | |
230 | Another nice value is \-D1024, which lists your compiled syntax tree. | |
231 | And \-D512 displays compiled regular expressions. | |
232 | .TP 5 | |
233 | .BI \-e " commandline" | |
234 | may be used to enter one line of script. | |
235 | Multiple | |
236 | .B \-e | |
237 | commands may be given to build up a multi-line script. | |
238 | If | |
239 | .B \-e | |
240 | is given, | |
241 | .I perl | |
242 | will not look for a script filename in the argument list. | |
243 | .TP 5 | |
244 | .BI \-i extension | |
245 | specifies that files processed by the <> construct are to be edited | |
246 | in-place. | |
247 | It does this by renaming the input file, opening the output file by the | |
248 | same name, and selecting that output file as the default for print statements. | |
249 | The extension, if supplied, is added to the name of the | |
250 | old file to make a backup copy. | |
251 | If no extension is supplied, no backup is made. | |
252 | Saying \*(L"perl \-p \-i.bak \-e "s/foo/bar/;" .\|.\|. \*(R" is the same as using | |
253 | the script: | |
254 | .nf | |
255 | ||
256 | .ne 2 | |
257 | #!/usr/bin/perl \-pi.bak | |
258 | s/foo/bar/; | |
259 | ||
260 | which is equivalent to | |
261 | ||
262 | .ne 14 | |
263 | #!/usr/bin/perl | |
264 | while (<>) { | |
265 | if ($ARGV ne $oldargv) { | |
266 | rename($ARGV, $ARGV . \'.bak\'); | |
267 | open(ARGVOUT, ">$ARGV"); | |
268 | select(ARGVOUT); | |
269 | $oldargv = $ARGV; | |
270 | } | |
271 | s/foo/bar/; | |
272 | } | |
273 | continue { | |
274 | print; # this prints to original filename | |
275 | } | |
276 | select(STDOUT); | |
277 | ||
278 | .fi | |
279 | except that the | |
280 | .B \-i | |
281 | form doesn't need to compare $ARGV to $oldargv to know when | |
282 | the filename has changed. | |
283 | It does, however, use ARGVOUT for the selected filehandle. | |
284 | Note that | |
285 | .I STDOUT | |
286 | is restored as the default output filehandle after the loop. | |
287 | .Sp | |
288 | You can use eof to locate the end of each input file, in case you want | |
289 | to append to each file, or reset line numbering (see example under eof). | |
290 | .TP 5 | |
291 | .BI \-I directory | |
292 | may be used in conjunction with | |
293 | .B \-P | |
294 | to tell the C preprocessor where to look for include files. | |
295 | By default /usr/include and /usr/lib/perl are searched. | |
296 | .TP 5 | |
297 | .BI \-l octnum | |
298 | enables automatic line-ending processing. It has two effects: | |
299 | first, it automatically chops the line terminator when used with | |
300 | .B \-n | |
301 | or | |
302 | .BR \-p , | |
303 | and second, it assigns $\e to have the value of | |
304 | .I octnum | |
305 | so that any print statements will have that line terminator added back on. If | |
306 | .I octnum | |
307 | is omitted, sets $\e to the current value of $/. | |
308 | For instance, to trim lines to 80 columns: | |
309 | .nf | |
310 | ||
311 | perl -lpe \'substr($_, 80) = ""\' | |
312 | ||
313 | .fi | |
314 | Note that the assignment $\e = $/ is done when the switch is processed, | |
315 | so the input record separator can differ from the output record | |
316 | separator if the | |
317 | .B \-l | |
318 | switch is followed by a | |
319 | .B \-0 | |
320 | switch: | |
321 | .nf | |
322 | ||
323 | gnufind / -print0 | perl -ln0e 'print "found $_" if -p' | |
324 | ||
325 | .fi | |
326 | This sets $\e to newline and then sets $/ to the null character. | |
327 | .TP 5 | |
328 | .B \-n | |
329 | causes | |
330 | .I perl | |
331 | to assume the following loop around your script, which makes it iterate | |
332 | over filename arguments somewhat like \*(L"sed \-n\*(R" or \fIawk\fR: | |
333 | .nf | |
334 | ||
335 | .ne 3 | |
336 | while (<>) { | |
337 | .\|.\|. # your script goes here | |
338 | } | |
339 | ||
340 | .fi | |
341 | Note that the lines are not printed by default. | |
342 | See | |
343 | .B \-p | |
344 | to have lines printed. | |
345 | Here is an efficient way to delete all files older than a week: | |
346 | .nf | |
347 | ||
348 | find . \-mtime +7 \-print | perl \-nle \'unlink;\' | |
349 | ||
350 | .fi | |
351 | This is faster than using the \-exec switch of find because you don't have to | |
352 | start a process on every filename found. | |
353 | .TP 5 | |
354 | .B \-p | |
355 | causes | |
356 | .I perl | |
357 | to assume the following loop around your script, which makes it iterate | |
358 | over filename arguments somewhat like \fIsed\fR: | |
359 | .nf | |
360 | ||
361 | .ne 5 | |
362 | while (<>) { | |
363 | .\|.\|. # your script goes here | |
364 | } continue { | |
365 | print; | |
366 | } | |
367 | ||
368 | .fi | |
369 | Note that the lines are printed automatically. | |
370 | To suppress printing use the | |
371 | .B \-n | |
372 | switch. | |
373 | A | |
374 | .B \-p | |
375 | overrides a | |
376 | .B \-n | |
377 | switch. | |
378 | .TP 5 | |
379 | .B \-P | |
380 | causes your script to be run through the C preprocessor before | |
381 | compilation by | |
382 | .IR perl . | |
383 | (Since both comments and cpp directives begin with the # character, | |
384 | you should avoid starting comments with any words recognized | |
385 | by the C preprocessor such as \*(L"if\*(R", \*(L"else\*(R" or \*(L"define\*(R".) | |
386 | .TP 5 | |
387 | .B \-s | |
388 | enables some rudimentary switch parsing for switches on the command line | |
389 | after the script name but before any filename arguments (or before a \-\|\-). | |
390 | Any switch found there is removed from @ARGV and sets the corresponding variable in the | |
391 | .I perl | |
392 | script. | |
393 | The following script prints \*(L"true\*(R" if and only if the script is | |
394 | invoked with a \-xyz switch. | |
395 | .nf | |
396 | ||
397 | .ne 2 | |
398 | #!/usr/bin/perl \-s | |
399 | if ($xyz) { print "true\en"; } | |
400 | ||
401 | .fi | |
402 | .TP 5 | |
403 | .B \-S | |
404 | makes | |
405 | .I perl | |
406 | use the PATH environment variable to search for the script | |
407 | (unless the name of the script starts with a slash). | |
408 | Typically this is used to emulate #! startup on machines that don't | |
409 | support #!, in the following manner: | |
410 | .nf | |
411 | ||
412 | #!/usr/bin/perl | |
413 | eval "exec /usr/bin/perl \-S $0 $*" | |
414 | if $running_under_some_shell; | |
415 | ||
416 | .fi | |
417 | The system ignores the first line and feeds the script to /bin/sh, | |
418 | which proceeds to try to execute the | |
419 | .I perl | |
420 | script as a shell script. | |
421 | The shell executes the second line as a normal shell command, and thus | |
422 | starts up the | |
423 | .I perl | |
424 | interpreter. | |
425 | On some systems $0 doesn't always contain the full pathname, | |
426 | so the | |
427 | .B \-S | |
428 | tells | |
429 | .I perl | |
430 | to search for the script if necessary. | |
431 | After | |
432 | .I perl | |
433 | locates the script, it parses the lines and ignores them because | |
434 | the variable $running_under_some_shell is never true. | |
435 | A better construct than $* would be ${1+"$@"}, which handles embedded spaces | |
436 | and such in the filenames, but doesn't work if the script is being interpreted | |
437 | by csh. | |
438 | In order to start up sh rather than csh, some systems may have to replace the | |
439 | #! line with a line containing just | |
440 | a colon, which will be politely ignored by perl. | |
441 | Other systems can't control that, and need a totally devious construct that | |
442 | will work under any of csh, sh or perl, such as the following: | |
443 | .nf | |
444 | ||
445 | .ne 3 | |
446 | eval '(exit $?0)' && eval 'exec /usr/bin/perl -S $0 ${1+"$@"}' | |
447 | & eval 'exec /usr/bin/perl -S $0 $argv:q' | |
448 | if 0; | |
449 | ||
450 | .fi | |
451 | .TP 5 | |
452 | .B \-u | |
453 | causes | |
454 | .I perl | |
455 | to dump core after compiling your script. | |
456 | You can then take this core dump and turn it into an executable file | |
457 | by using the undump program (not supplied). | |
458 | This speeds startup at the expense of some disk space (which you can | |
459 | minimize by stripping the executable). | |
460 | (Still, a "hello world" executable comes out to about 200K on my machine.) | |
461 | If you are going to run your executable as a set-id program then you | |
462 | should probably compile it using taintperl rather than normal perl. | |
463 | If you want to execute a portion of your script before dumping, use the | |
464 | dump operator instead. | |
465 | Note: undump is platform specific and may not be available | |
466 | for a particular port of perl. | |
467 | .TP 5 | |
468 | .B \-U | |
469 | allows | |
470 | .I perl | |
471 | to do unsafe operations. | |
472 | Currently the only \*(L"unsafe\*(R" operations are the unlinking of directories while | |
473 | running as superuser, and running setuid programs with fatal taint checks | |
474 | turned into warnings. | |
475 | .TP 5 | |
476 | .B \-v | |
477 | prints the version and patchlevel of your | |
478 | .I perl | |
479 | executable. | |
480 | .TP 5 | |
481 | .B \-w | |
482 | prints warnings about identifiers that are mentioned only once, and scalar | |
483 | variables that are used before being set. | |
484 | Also warns about redefined subroutines, and references to undefined | |
485 | filehandles or filehandles opened readonly that you are attempting to | |
486 | write on. | |
487 | Also warns you if you use == on values that don't look like numbers, and if | |
488 | your subroutines recurse more than 100 deep. | |
489 | .TP 5 | |
490 | .BI \-x directory | |
491 | tells | |
492 | .I perl | |
493 | that the script is embedded in a message. | |
494 | Leading garbage will be discarded until the first line that starts | |
495 | with #! and contains the string "perl". | |
496 | Any meaningful switches on that line will be applied (but only one | |
497 | group of switches, as with normal #! processing). | |
498 | If a directory name is specified, Perl will switch to that directory | |
499 | before running the script. | |
500 | The | |
501 | .B \-x | |
502 | switch only controls the disposal of leading garbage. | |
503 | The script must be terminated with _\|_END_\|_ if there is trailing garbage | |
504 | to be ignored (the script can process any or all of the trailing garbage | |
505 | via the DATA filehandle if desired). | |
506 | .Sh "Data Types and Objects" | |
507 | .PP | |
508 | .I Perl | |
509 | has three data types: scalars, arrays of scalars, and | |
510 | associative arrays of scalars. | |
511 | Normal arrays are indexed by number, and associative arrays by string. | |
512 | .PP | |
513 | The interpretation of operations and values in perl sometimes | |
514 | depends on the requirements | |
515 | of the context around the operation or value. | |
516 | There are three major contexts: string, numeric and array. | |
517 | Certain operations return array values | |
518 | in contexts wanting an array, and scalar values otherwise. | |
519 | (If this is true of an operation it will be mentioned in the documentation | |
520 | for that operation.) | |
521 | Operations which return scalars don't care whether the context is looking | |
522 | for a string or a number, but | |
523 | scalar variables and values are interpreted as strings or numbers | |
524 | as appropriate to the context. | |
525 | A scalar is interpreted as TRUE in the boolean sense if it is not the null | |
526 | string or 0. | |
527 | Booleans returned by operators are 1 for true and 0 or \'\' (the null | |
528 | string) for false. | |
529 | .PP | |
530 | There are actually two varieties of null string: defined and undefined. | |
531 | Undefined null strings are returned when there is no real value for something, | |
532 | such as when there was an error, or at end of file, or when you refer | |
533 | to an uninitialized variable or element of an array. | |
534 | An undefined null string may become defined the first time you access it, but | |
535 | prior to that you can use the defined() operator to determine whether the | |
536 | value is defined or not. | |
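A small sketch of testing for definedness may help here. The variable name $never_set is invented for the example, and the variable is made defined by an explicit assignment, which is only one of the ways a value can become defined.

```perl
# $never_set is a made-up name that is deliberately left unassigned at first.
$before = defined($never_set) ? 'defined' : 'undefined';

# Assigning even the null string gives the variable a defined value.
$never_set = '';
$after = defined($never_set) ? 'defined' : 'undefined';

print "before: $before, after: $after\n";   # prints: before: undefined, after: defined
```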
537 | .PP | |
538 | References to scalar variables always begin with \*(L'$\*(R', even when referring | |
539 | to a scalar that is part of an array. | |
540 | Thus: | |
541 | .nf | |
542 | ||
543 | .ne 3 | |
544 | $days \h'|2i'# a simple scalar variable | |
545 | $days[28] \h'|2i'# 29th element of array @days | |
546 | $days{\'Feb\'}\h'|2i'# one value from an associative array | |
547 | $#days \h'|2i'# last index of array @days | |
548 | ||
549 | but entire arrays or array slices are denoted by \*(L'@\*(R': | |
550 | ||
551 | @days \h'|2i'# ($days[0], $days[1],\|.\|.\|. $days[n]) | |
552 | @days[3,4,5]\h'|2i'# same as @days[3.\|.5] | |
553 | @days{'a','c'}\h'|2i'# same as ($days{'a'},$days{'c'}) | |
554 | ||
555 | and entire associative arrays are denoted by \*(L'%\*(R': | |
556 | ||
557 | %days \h'|2i'# (key1, val1, key2, val2 .\|.\|.) | |
558 | .fi | |
559 | .PP | |
560 | Any of these eight constructs may serve as an lvalue, | |
561 | that is, may be assigned to. | |
562 | (It also turns out that an assignment is itself an lvalue in | |
563 | certain contexts\*(--see examples under s, tr and chop.) | |
564 | Assignment to a scalar evaluates the righthand side in a scalar context, | |
565 | while assignment to an array or array slice evaluates the righthand side | |
566 | in an array context. | |
567 | .PP | |
568 | You may find the length of array @days by evaluating | |
569 | \*(L"$#days\*(R", as in | |
570 | .IR csh . | |
571 | (Actually, it's not the length of the array, it's the subscript of the last element, since there is (ordinarily) a 0th element.) | |
572 | Assigning to $#days changes the length of the array. | |
573 | Shortening an array by this method does not actually destroy any values. | |
574 | Lengthening an array that was previously shortened recovers the values that | |
575 | were in those elements. | |
576 | You can also gain some measure of efficiency by preextending an array that | |
577 | is going to get big. | |
578 | (You can also extend an array by assigning to an element that is off the | |
579 | end of the array. | |
580 | This differs from assigning to $#whatever in that intervening values | |
581 | are set to null rather than recovered.) | |
582 | You can truncate an array down to nothing by assigning the null list () to | |
583 | it. | |
584 | The following are exactly equivalent: | |
585 | .nf | |
586 | ||
587 | @whatever = (); | |
588 | $#whatever = $[ \- 1; | |
589 | ||
590 | .fi | |
591 | .PP | |
592 | If you evaluate an array in a scalar context, it returns the length of | |
593 | the array. | |
594 | The following is always true: | |
595 | .nf | |
596 | ||
597 | scalar(@whatever) == $#whatever \- $[ + 1; | |
598 | ||
599 | .fi | |
600 | If you evaluate an associative array in a scalar context, it returns | |
601 | a value which is true if and only if the array contains any elements. | |
602 | (If there are any elements, the value returned is a string consisting | |
603 | of the number of used buckets and the number of allocated buckets, separated | |
604 | by a slash.) | |
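The relationship between an array's length and its last subscript can be sketched as follows. The array contents and variable names are illustrative only, and the code assumes $[ has been left at its default of 0. It checks only the truth of the associative array's scalar value, since the exact string returned may vary.

```perl
@days = ('Mon', 'Tue', 'Wed', 'Thu');

$length = @days;        # array in a scalar context: 4
$last   = $#days;       # subscript of the last element: 3

$#days = 1;             # shorten the array to elements 0 and 1
$shorter = @days;       # now 2

@days = ();             # truncate to nothing
$empty_last = $#days;   # -1 when $[ is 0

# An associative array in a scalar context is true if it has any elements.
%ages = ('tom', 30);
$has_any = scalar(%ages) ? 'yes' : 'no';

print "$length $last $shorter $empty_last $has_any\n";   # prints: 4 3 2 -1 yes
```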
605 | .PP | |
606 | Multi-dimensional arrays are not directly supported, but see the discussion | |
607 | of the $; variable later for a means of emulating multiple subscripts with | |
608 | an associative array. | |
609 | You could also write a subroutine to turn multiple subscripts into a single | |
610 | subscript. | |
611 | .PP | |
612 | Every data type has its own namespace. | |
613 | You can, without fear of conflict, use the same name for a scalar variable, | |
614 | an array, an associative array, a filehandle, a subroutine name, and/or | |
615 | a label. | |
616 | Since variable and array references always start with \*(L'$\*(R', \*(L'@\*(R', | |
617 | or \*(L'%\*(R', the \*(L"reserved\*(R" words aren't in fact reserved | |
618 | with respect to variable names. | |
619 | (They ARE reserved with respect to labels and filehandles, however, which | |
620 | don't have an initial special character. | |
621 | Hint: you could say open(LOG,\'logfile\') rather than open(log,\'logfile\'). | |
622 | Using uppercase filehandles also improves readability and protects you | |
623 | from conflict with future reserved words.) | |
624 | Case IS significant\*(--\*(L"FOO\*(R", \*(L"Foo\*(R" and \*(L"foo\*(R" are all | |
625 | different names. | |
626 | Names which start with a letter may also contain digits and underscores. | |
627 | Names which do not start with a letter are limited to one character, | |
628 | e.g. \*(L"$%\*(R" or \*(L"$$\*(R". | |
629 | (Most of the one character names have a predefined significance to | |
630 | .IR perl . | |
631 | More later.) | |
632 | .PP | |
633 | Numeric literals are specified in any of the usual floating point or | |
634 | integer formats: | |
635 | .nf | |
636 | ||
637 | .ne 6 | |
638 | 12345 | |
639 | 12345.67 | |
640 | .23E-10 | |
641 | 0xffff # hex | |
642 | 0377 # octal | |
643 | 4_294_967_296 | |
644 | ||
645 | .fi | |
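To see that these forms all denote ordinary numbers, a brief check along the following lines could be run. The variable names are arbitrary and chosen only for this sketch.

```perl
# Literal forms from the list above, each compared with its decimal value.
$hex   = 0xffff;          # hexadecimal
$octal = 0377;            # octal, because of the leading 0
$big   = 4_294_967_296;   # underlines inside a number are ignored

print "hex is 65535\n"      if $hex == 65535;
print "octal is 255\n"      if $octal == 255;
print "big is 4294967296\n" if $big == 4294967296;
```

All three lines are printed, since each comparison is true.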
646 | String literals are delimited by either single or double quotes. | |
647 | They work much like shell quotes: | |
648 | double-quoted string literals are subject to backslash and variable | |
649 | substitution; single-quoted strings are not (except for \e\' and \e\e). | |
650 | The usual backslash rules apply for making characters such as newline, tab, | |
651 | etc., as well as some more exotic forms: | |
652 | .nf | |
653 | ||
654 | \et tab | |
655 | \en newline | |
656 | \er return | |
657 | \ef form feed | |
658 | \eb backspace | |
659 | \ea alarm (bell) | |
660 | \ee escape | |
661 | \e033 octal char | |
662 | \ex1b hex char | |
663 | \ec[ control char | |
664 | \el lowercase next char | |
665 | \eu uppercase next char | |
666 | \eL lowercase till \eE | |
667 | \eU uppercase till \eE | |
668 | \eE end case modification | |
669 | ||
670 | .fi | |
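The case-modifying escapes are perhaps the least familiar of these, so here is a short sketch of how they behave in double-quoted strings. The variable names and sample text are invented for the example.

```perl
$name = 'perl';

$shout = "\U$name\E rocks";     # \U uppercases up to \E:  "PERL rocks"
$title = "\u$name";             # \u uppercases one char:  "Perl"
$mixed = "\LABC\E and \lXYZ";   # \L till \E, \l one char: "abc and xYZ"

# The escape character can be written four equivalent ways.
$same = ("\e" eq "\033" && "\e" eq "\x1b" && "\e" eq "\c[") ? 'same' : 'different';

print "$shout\n$title\n$mixed\n$same\n";
```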
671 | You can also embed newlines directly in your strings, i.e. they can end on | |
672 | a different line than they begin. | |
673 | This is nice, but if you forget your trailing quote, the error will not be | |
674 | reported until | |
675 | .I perl | |
676 | finds another line containing the quote character, which | |
677 | may be much further on in the script. | |
678 | Variable substitution inside strings is limited to scalar variables, normal | |
679 | array values, and array slices. | |
680 | (In other words, identifiers beginning with $ or @, followed by an optional | |
681 | bracketed expression as a subscript.) | |
682 | The following code segment prints out \*(L"The price is $100.\*(R" | |
683 | .nf | |
684 | ||
685 | .ne 2 | |
686 | $Price = \'$100\';\h'|3.5i'# not interpreted | |
687 | print "The price is $Price.\e\|n";\h'|3.5i'# interpreted | |
688 | ||
689 | .fi | |
690 | Note that you can put curly brackets around the identifier to delimit it | |
691 | from following alphanumerics. | |
692 | Also note that a single quoted string must be separated from a preceding | |
693 | word by a space, since single quote is a valid character in an identifier | |
694 | (see Packages). | |
695 | .PP | |
696 | Two special literals are _\|_LINE_\|_ and _\|_FILE_\|_, which represent the current | |
697 | line number and filename at that point in your program. | |
698 | They may only be used as separate tokens; they will not be interpolated | |
699 | into strings. | |
700 | In addition, the token _\|_END_\|_ may be used to indicate the logical end of the | |
701 | script before the actual end of file. | |
702 | Any following text is ignored, but may be read via the DATA filehandle. | |
703 | (The DATA filehandle may read data only from the main script, but not from | |
704 | any required file or evaluated string.) | |
705 | The two control characters ^D and ^Z are synonyms for _\|_END_\|_. | |
706 | .PP | |
707 | A word that doesn't have any other interpretation in the grammar will be | |
708 | treated as if it had single quotes around it. | |
709 | For this purpose, a word consists only of alphanumeric characters and underline, | |
710 | and must start with an alphabetic character. | |
711 | As with filehandles and labels, a bare word that consists entirely of | |
712 | lowercase letters risks conflict with future reserved words, and if you | |
713 | use the | |
714 | .B \-w | |
715 | switch, Perl will warn you about any such words. | |
716 | .PP | |
717 | Array values are interpolated into double-quoted strings by joining all the | |
718 | elements of the array with the delimiter specified in the $" variable, | |
719 | space by default. | |
720 | (Since in versions of perl prior to 3.0 the @ character was not a metacharacter | |
721 | in double-quoted strings, the interpolation of @array, $array[EXPR], | |
722 | @array[LIST], $array{EXPR}, or @array{LIST} only happens if array is | |
723 | referenced elsewhere in the program or is predefined.) | |
724 | The following are equivalent: | |
725 | .nf | |
726 | ||
727 | .ne 4 | |
728 | $temp = join($",@ARGV); | |
729 | system "echo $temp"; | |
730 | ||
731 | system "echo @ARGV"; | |
732 | ||
733 | .fi | |
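The effect of the $" variable on interpolation can be sketched as below. The array @parts and the choice of a hyphen as delimiter are assumptions made purely for illustration.

```perl
@parts = ('a', 'b', 'c');

$spaced = "@parts";    # default delimiter is a space: "a b c"

$" = '-';              # change the delimiter used for interpolation
$dashed = "@parts";    # now "a-b-c"

$" = ' ';              # restore the default

print "$spaced\n$dashed\n";
```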
734 | Within search patterns (which also undergo double-quotish substitution) | |
735 | there is a bad ambiguity: Is /$foo[bar]/ to be | |
736 | interpreted as /${foo}[bar]/ (where [bar] is a character class for the | |
737 | regular expression) or as /${foo[bar]}/ (where [bar] is the subscript to | |
738 | array @foo)? | |
739 | If @foo doesn't otherwise exist, then it's obviously a character class. | |
740 | If @foo exists, perl takes a good guess about [bar], and is almost always right. | |
741 | If it does guess wrong, or if you're just plain paranoid, | |
742 | you can force the correct interpretation with curly brackets as above. | |
743 | .PP | |
744 | A line-oriented form of quoting is based on the shell here-is syntax. | |
745 | Following a << you specify a string to terminate the quoted material, and all lines | |
746 | following the current line down to the terminating string are the value | |
747 | of the item. | |
748 | The terminating string may be either an identifier (a word), or some | |
749 | quoted text. | |
750 | If quoted, the type of quotes you use determines the treatment of the text, | |
751 | just as in regular quoting. | |
752 | An unquoted identifier works like double quotes. | |
753 | There must be no space between the << and the identifier. | |
754 | (If you put a space it will be treated as a null identifier, which is | |
755 | valid, and matches the first blank line\*(--see Merry Christmas example below.) | |
756 | The terminating string must appear by itself (unquoted and with no surrounding | |
757 | whitespace) on the terminating line. | |
758 | .nf | |
759 | ||
760 | print <<EOF; # same as above | |
761 | The price is $Price. | |
762 | EOF | |
763 | ||
764 | print <<"EOF"; # same as above | |
765 | The price is $Price. | |
766 | EOF | |
767 | ||
768 | print << x 10; # null identifier is delimiter | |
769 | Merry Christmas! | |
770 | ||
771 | print <<`EOC`; # execute commands | |
772 | echo hi there | |
773 | echo lo there | |
774 | EOC | |
775 | ||
776 | print <<foo, <<bar; # you can stack them | |
777 | I said foo. | |
778 | foo | |
779 | I said bar. | |
780 | bar | |
781 | ||
782 | .fi | |
783 | Array literals are denoted by separating individual values by commas, and | |
784 | enclosing the list in parentheses: | |
785 | .nf | |
786 | ||
787 | (LIST) | |
788 | ||
789 | .fi | |
790 | In a context not requiring an array value, the value of the array literal | |
791 | is the value of the final element, as in the C comma operator. | |
792 | For example, | |
793 | .nf | |
794 | ||
795 | .ne 4 | |
796 | @foo = (\'cc\', \'\-E\', $bar); | |
797 | ||
798 | assigns the entire array value to array foo, but | |
799 | ||
800 | $foo = (\'cc\', \'\-E\', $bar); | |
801 | ||
802 | .fi | |
803 | assigns the value of variable bar to variable foo. | |
804 | Note that the value of an actual array in a scalar context is the length | |
805 | of the array; the following assigns to $foo the value 3: | |
806 | .nf | |
807 | ||
808 | .ne 2 | |
809 | @foo = (\'cc\', \'\-E\', $bar); | |
810 | $foo = @foo; # $foo gets 3 | |
811 | ||
812 | .fi | |
813 | You may have an optional comma before the closing parenthesis of an | |
814 | array literal, so that you can say: | |
815 | .nf | |
816 | ||
817 | @foo = ( | |
818 | 1, | |
819 | 2, | |
820 | 3, | |
821 | ); | |
822 | ||
823 | .fi | |
824 | When a LIST is evaluated, each element of the list is evaluated in | |
825 | an array context, and the resulting array value is interpolated into LIST | |
826 | just as if each individual element were a member of LIST. Thus arrays | |
827 | lose their identity in a LIST\*(--the list | |
828 | ||
829 | (@foo,@bar,&SomeSub) | |
830 | ||
831 | contains all the elements of @foo followed by all the elements of @bar, | |
832 | followed by all the elements returned by the subroutine named SomeSub. | |
833 | .PP | |
834 | A list value may also be subscripted like a normal array. | |
835 | Examples: | |
836 | .nf | |
837 | ||
838 | $time = (stat($file))[8]; # stat returns array value | |
839 | $digit = ('a','b','c','d','e','f')[$digit-10]; | |
840 | return (pop(@foo),pop(@foo))[0]; | |
841 | ||
842 | .fi | |
843 | .PP | |
844 | Array lists may be assigned to if and only if each element of the list | |
845 | is an lvalue: | |
846 | .nf | |
847 | ||
848 | ($a, $b, $c) = (1, 2, 3); | |
849 | ||
850 | ($map{\'red\'}, $map{\'blue\'}, $map{\'green\'}) = (0x00f, 0x0f0, 0xf00); | |
851 | ||
852 | The final element may be an array or an associative array: | |
853 | ||
854 | ($a, $b, @rest) = split; | |
855 | local($a, $b, %rest) = @_; | |
856 | ||
857 | .fi | |
858 | You can actually put an array anywhere in the list, but the first array | |
859 | in the list will soak up all the values, and anything after it will get | |
860 | a null value. | |
861 | This may be useful in a local(). | |
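The soaking-up rule is easy to check with a short script. What follows is a minimal sketch, not something taken from the perl distribution; the variable names ($first, @rest, $head, @middle, $tail) are invented for the illustration.

```perl
# The first array in an lvalue list takes every remaining value.
($first, @rest) = (1, 2, 3, 4);
print "first=$first rest=@rest\n";             # first=1 rest=2 3 4

# An array in the middle leaves nothing for whatever follows it.
($head, @middle, $tail) = (1, 2, 3);
$n = @middle;                                  # array in scalar context: its length
print "head=$head middle has $n elements\n";   # head=1 middle has 2 elements
print "tail got no value\n" unless defined($tail);
```

Putting the array last, as in the split example above, is the usual way to keep every scalar filled.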
862 | .PP | |
863 | An associative array literal contains pairs of values to be interpreted | |
864 | as a key and a value: | |
865 | .nf | |
866 | ||
867 | .ne 2 | |
868 | # same as map assignment above | |
869 | %map = ('red',0x00f,'blue',0x0f0,'green',0xf00); | |
870 | ||
871 | .fi | |
872 | Array assignment in a scalar context returns the number of elements | |
873 | produced by the expression on the right side of the assignment: | |
874 | .nf | |
875 | ||
876 | $x = (($foo,$bar) = (3,2,1)); # set $x to 3, not 2 | |
877 | ||
878 | .fi | |
879 | .PP | |
880 | There are several other pseudo-literals that you should know about. | |
881 | If a string is enclosed by backticks (grave accents), it first undergoes | |
882 | variable substitution just like a double quoted string. | |
883 | It is then interpreted as a command, and the output of that command | |
884 | is the value of the pseudo-literal, as in a shell. | |
885 | In a scalar context, a single string consisting of all the output is | |
886 | returned. | |
887 | In an array context, an array of values is returned, one for each line | |
888 | of output. | |
889 | (You can set $/ to use a different line terminator.) | |
890 | The command is executed each time the pseudo-literal is evaluated. | |
891 | The status value of the command is returned in $? (see Predefined Names | |
892 | for the interpretation of $?). | |
893 | Unlike in \f2csh\f1, no translation is done on the return | |
894 | data\*(--newlines remain newlines. | |
895 | Unlike in any of the shells, single quotes do not hide variable names | |
896 | in the command from interpretation. | |
897 | To pass a $ through to the shell you need to hide it with a backslash. | |
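The two contexts can be compared side by side. This is a minimal sketch that assumes a Unix-like system whose shell provides echo; the variable names are illustrative only.

```perl
# Scalar context: all of the command's output comes back as one string.
$all = `echo one; echo two`;

# Array context: one element per line of output, newlines kept.
@lines = `echo one; echo two`;
$count = @lines;

print "scalar context got ", length($all), " characters\n";   # 8, i.e. "one\ntwo\n"
print "array context got $count lines\n";                     # 2

# $? holds the status of the most recent command; 0 means success.
print "exit status $?\n";

# A backslash hides the $ from perl, so the shell expands HOME itself.
$home = `echo \$HOME`;
print "shell says HOME is $home";
```

Without the backslash, perl would substitute its own (probably empty) $HOME variable before the shell ever saw the command.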
898 | .PP | |
899 | Evaluating a filehandle in angle brackets yields the next line | |
900 | from that file (newline included, so it's never false until EOF, at | |
901 | which time an undefined value is returned). | |
902 | Ordinarily you must assign that value to a variable, | |
903 | but there is one situation where an automatic assignment happens. | |
904 | If (and only if) the input symbol is the only thing inside the conditional of a | |
905 | .I while | |
906 | loop, the value is | |
907 | automatically assigned to the variable \*(L"$_\*(R". | |
908 | (This may seem like an odd thing to you, but you'll use the construct | |
909 | in almost every | |
910 | .I perl | |
911 | script you write.) | |
912 | Anyway, the following lines are equivalent to each other: | |
913 | .nf | |
914 | ||
915 | .ne 5 | |
916 | while ($_ = <STDIN>) { print; } | |
917 | while (<STDIN>) { print; } | |
918 | for (\|;\|<STDIN>;\|) { print; } | |
919 | print while $_ = <STDIN>; | |
920 | print while <STDIN>; | |
921 | ||
922 | .fi | |
923 | The filehandles | |
924 | .IR STDIN , | |
925 | .I STDOUT | |
926 | and | |
927 | .I STDERR | |
928 | are predefined. | |
929 | (The filehandles | |
930 | .IR stdin , | |
931 | .I stdout | |
932 | and | |
933 | .I stderr | |
934 | will also work except in packages, where they would be interpreted as | |
935 | local identifiers rather than global.) | |
936 | Additional filehandles may be created with the | |
937 | .I open | |
938 | function. | |
939 | .PP | |
940 | If a <FILEHANDLE> is used in a context that is looking for an array, an array | |
941 | consisting of all the input lines is returned, one line per array element. | |
942 | It's easy to make a LARGE data space this way, so use with care. | |
943 | .PP | |
944 | The null filehandle <> is special and can be used to emulate the behavior of | |
945 | \fIsed\fR and \fIawk\fR. | |
946 | Input from <> comes either from standard input, or from each file listed on | |
947 | the command line. | |
948 | Here's how it works: the first time <> is evaluated, the ARGV array is checked, | |
949 | and if it is null, $ARGV[0] is set to \'-\', which when opened gives you standard | |
950 | input. | |
951 | The ARGV array is then processed as a list of filenames. | |
952 | The loop | |
953 | .nf | |
954 | ||
955 | .ne 3 | |
956 | while (<>) { | |
957 | .\|.\|. # code for each line | |
958 | } | |
959 | ||
960 | .ne 10 | |
961 | is equivalent to the following Perl-like pseudo code: | |
962 | ||
963 | unshift(@ARGV, \'\-\') \|if \|$#ARGV < $[; | |
964 | while ($ARGV = shift) { | |
965 | open(ARGV, $ARGV); | |
966 | while (<ARGV>) { | |
967 | .\|.\|. # code for each line | |
968 | } | |
969 | } | |
970 | ||
971 | .fi | |
972 | except that it isn't as cumbersome to say, and will actually work. | |
973 | It really does shift array ARGV and put the current filename into | |
974 | variable ARGV. | |
975 | It also uses filehandle ARGV internally\*(--<> is just a synonym for | |
976 | <ARGV>, which is magical. | |
977 | (The pseudo code above doesn't work because it treats <ARGV> as non-magical.) | |
978 | .PP | |
979 | You can modify @ARGV before the first <> as long as the array ends up | |
980 | containing the list of filenames you really want. | |
981 | Line numbers ($.) continue as if the input were one big happy file. | |
982 | (But see the example under eof for how to reset line numbers on each file.) | |
983 | .PP | |
984 | .ne 5 | |
985 | If you want to set @ARGV to your own list of files, go right ahead. | |
986 | If you want to pass switches into your script, you can | |
987 | put a loop on the front like this: | |
988 | .nf | |
989 | ||
990 | .ne 10 | |
991 | while ($_ = $ARGV[0], /\|^\-/\|) { | |
992 | shift; | |
993 | last if /\|^\-\|\-$\|/\|; | |
994 | /\|^\-D\|(.*\|)/ \|&& \|($debug = $1); | |
995 | /\|^\-v\|/ \|&& \|$verbose++; | |
996 | .\|.\|. # other switches | |
997 | } | |
998 | while (<>) { | |
999 | .\|.\|. # code for each line | |
1000 | } | |
1001 | ||
1002 | .fi | |
1003 | The <> symbol will return FALSE only once. | |
1004 | If you call it again after this it will assume you are processing another | |
1005 | @ARGV list, and if you haven't set @ARGV, will input from | |
1006 | .IR STDIN . | |
1007 | .PP | |
1008 | If the string inside the angle brackets is a reference to a scalar variable | |
1009 | (e.g. <$foo>), | |
1010 | then that variable contains the name of the filehandle to input from. | |
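A minimal sketch of an indirect filehandle follows. The scratch file name and the handle names IN and OUT are assumptions made for the example, and /tmp is assumed to be writable.

```perl
# Create a small scratch file to read back.
$tmp = "/tmp/indirect.$$";
open(OUT, ">$tmp") || die "Can't create $tmp: $!";
print OUT "first line\n", "second line\n";
close(OUT);

open(IN, $tmp) || die "Can't open $tmp: $!";
$handle = 'IN';          # the variable holds the NAME of the filehandle
$line = <$handle>;       # reads from filehandle IN, not from a file called "IN"
print $line;             # first line
close(IN);
unlink($tmp);
```

This is handy when a subroutine should read from whichever handle its caller names.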
1011 | .PP | |
1012 | If the string inside angle brackets is not a filehandle, it is interpreted | |
1013 | as a filename pattern to be globbed, and either an array of filenames or the | |
1014 | next filename in the list is returned, depending on context. | |
1015 | One level of $ interpretation is done first, but you can't say <$foo> | |
1016 | because that's an indirect filehandle as explained in the previous | |
1017 | paragraph. | |
1018 | You could insert curly brackets to force interpretation as a | |
1019 | filename glob: <${foo}>. | |
1020 | Example: | |
1021 | .nf | |
1022 | ||
1023 | .ne 3 | |
1024 | while (<*.c>) { | |
1025 | chmod 0644, $_; | |
1026 | } | |
1027 | ||
1028 | is equivalent to | |
1029 | ||
1030 | .ne 5 | |
1031 | open(foo, "echo *.c | tr \-s \' \et\er\ef\' \'\e\e012\e\e012\e\e012\e\e012\'|"); | |
1032 | while (<foo>) { | |
1033 | chop; | |
1034 | chmod 0644, $_; | |
1035 | } | |
1036 | ||
1037 | .fi | |
1038 | In fact, it's currently implemented that way. | |
1039 | (Which means it will not work on filenames with spaces in them unless | |
1040 | you have /bin/csh on your machine.) | |
1041 | Of course, the shortest way to do the above is: | |
1042 | .nf | |
1043 | ||
1044 | chmod 0644, <*.c>; | |
1045 | ||
1046 | .fi | |
1047 | .Sh "Syntax" | |
1048 | .PP | |
1049 | A | |
1050 | .I perl | |
1051 | script consists of a sequence of declarations and commands. | |
1052 | The only things that need to be declared in | |
1053 | .I perl | |
1054 | are report formats and subroutines. | |
1055 | See the sections below for more information on those declarations. | |
1056 | All uninitialized user-created objects are assumed to | |
1057 | start with a null or 0 value until they | |
1058 | are defined by some explicit operation such as assignment. | |
1059 | The sequence of commands is executed just once, unlike in | |
1060 | .I sed | |
1061 | and | |
1062 | .I awk | |
1063 | scripts, where the sequence of commands is executed for each input line. | |
1064 | While this means that you must explicitly loop over the lines of your input file | |
1065 | (or files), it also means you have much more control over which files and which | |
1066 | lines you look at. | |
1067 | (Actually, I'm lying\*(--it is possible to do an implicit loop with either the | |
1068 | .B \-n | |
1069 | or | |
1070 | .B \-p | |
1071 | switch.) | |
1072 | .PP | |
1073 | A declaration can be put anywhere a command can, but has no effect on the | |
1074 | execution of the primary sequence of commands\*(--declarations all take effect | |
1075 | at compile time. | |
1076 | Typically all the declarations are put at the beginning or the end of the script. | |
1077 | .PP | |
1078 | .I Perl | |
1079 | is, for the most part, a free-form language. | |
1080 | (The only exception to this is format declarations, for fairly obvious reasons.) | |
1081 | Comments are indicated by the # character, and extend to the end of the line. | |
1082 | If you attempt to use /* */ C comments, it will be interpreted either as | |
1083 | division or pattern matching, depending on the context. | |
1084 | So don't do that. | |
1085 | .Sh "Compound statements" | |
1086 | In | |
1087 | .IR perl , | |
1088 | a sequence of commands may be treated as one command by enclosing it | |
1089 | in curly brackets. | |
1090 | We will call this a BLOCK. | |
1091 | .PP | |
1092 | The following compound commands may be used to control flow: | |
1093 | .nf | |
1094 | ||
1095 | .ne 4 | |
1096 | if (EXPR) BLOCK | |
1097 | if (EXPR) BLOCK else BLOCK | |
1098 | if (EXPR) BLOCK elsif (EXPR) BLOCK .\|.\|. else BLOCK | |
1099 | LABEL while (EXPR) BLOCK | |
1100 | LABEL while (EXPR) BLOCK continue BLOCK | |
1101 | LABEL for (EXPR; EXPR; EXPR) BLOCK | |
1102 | LABEL foreach VAR (ARRAY) BLOCK | |
1103 | LABEL BLOCK continue BLOCK | |
1104 | ||
1105 | .fi | |
1106 | Note that, unlike C and Pascal, these are defined in terms of BLOCKs, not | |
1107 | statements. | |
1108 | This means that the curly brackets are \fIrequired\fR\*(--no dangling statements allowed. | |
1109 | If you want to write conditionals without curly brackets there are several | |
1110 | other ways to do it. | |
1111 | The following all do the same thing: | |
1112 | .nf | |
1113 | ||
1114 | .ne 5 | |
1115 | if (!open(foo)) { die "Can't open $foo: $!"; } | |
1116 | die "Can't open $foo: $!" unless open(foo); | |
1117 | open(foo) || die "Can't open $foo: $!"; # foo or bust! | |
1118 | open(foo) ? \'hi mom\' : die "Can't open $foo: $!"; | |
1119 | # a bit exotic, that last one | |
1120 | ||
1121 | .fi | |
1122 | .PP | |
1123 | The | |
1124 | .I if | |
1125 | statement is straightforward. | |
1126 | Since BLOCKs are always bounded by curly brackets, there is never any | |
1127 | ambiguity about which | |
1128 | .I if | |
1129 | an | |
1130 | .I else | |
1131 | goes with. | |
1132 | If you use | |
1133 | .I unless | |
1134 | in place of | |
1135 | .IR if , | |
1136 | the sense of the test is reversed. | |
1137 | .PP | |
1138 | The | |
1139 | .I while | |
1140 | statement executes the block as long as the expression is true | |
1141 | (does not evaluate to the null string or 0). | |
1142 | The LABEL is optional, and if present, consists of an identifier followed by | |
1143 | a colon. | |
1144 | The LABEL identifies the loop for the loop control statements | |
1145 | .IR next , | |
1146 | .IR last , | |
1147 | and | |
1148 | .I redo | |
1149 | (see below). | |
1150 | If there is a | |
1151 | .I continue | |
1152 | BLOCK, it is always executed just before | |
1153 | the conditional is about to be evaluated again, similarly to the third part | |
1154 | of a | |
1155 | .I for | |
1156 | loop in C. | |
1157 | Thus it can be used to increment a loop variable, even when the loop has | |
1158 | been continued via the | |
1159 | .I next | |
1160 | statement (similar to the C \*(L"continue\*(R" statement). | |
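As a small sketch of this behavior (the variable names are arbitrary), the loop below sums the even numbers below 10. The next statement skips the rest of the body for odd values, but the continue BLOCK still runs, so $i always advances and the loop terminates.

```perl
$i = 0;
$sum = 0;
while ($i < 10) {
    next if $i % 2;      # odd: skip the addition, but not the continue BLOCK
    $sum += $i;
} continue {
    $i++;                # runs after every pass, including those cut short by next
}
print "$sum\n";          # 20 (0 + 2 + 4 + 6 + 8)
```

Had the increment been the last statement of the main BLOCK instead, the first odd value would have looped forever.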
1161 | .PP | |
1162 | If the word | |
1163 | .I while | |
1164 | is replaced by the word | |
1165 | .IR until , | |
1166 | the sense of the test is reversed, but the conditional is still tested before | |
1167 | the first iteration. | |
1168 | .PP | |
1169 | In either the | |
1170 | .I if | |
1171 | or the | |
1172 | .I while | |
1173 | statement, you may replace \*(L"(EXPR)\*(R" with a BLOCK, and the conditional | |
1174 | is true if the value of the last command in that block is true. | |
1175 | .PP | |
1176 | The | |
1177 | .I for | |
1178 | loop works exactly like the corresponding | |
1179 | .I while | |
1180 | loop: | |
1181 | .nf | |
1182 | ||
1183 | .ne 12 | |
1184 | for ($i = 1; $i < 10; $i++) { | |
1185 | .\|.\|. | |
1186 | } | |
1187 | ||
1188 | is the same as | |
1189 | ||
1190 | $i = 1; | |
1191 | while ($i < 10) { | |
1192 | .\|.\|. | |
1193 | } continue { | |
1194 | $i++; | |
1195 | } | |
1196 | .fi | |
1197 | .PP | |
1198 | The foreach loop iterates over a normal array value and sets the variable | |
1199 | VAR to be each element of the array in turn. | |
1200 | The variable is implicitly local to the loop, and regains its former value | |
1201 | upon exiting the loop. | |
1202 | The \*(L"foreach\*(R" keyword is actually identical to the \*(L"for\*(R" keyword, | |
1203 | so you can use \*(L"foreach\*(R" for readability or \*(L"for\*(R" for brevity. | |
1204 | If VAR is omitted, $_ is set to each value. | |
1205 | If ARRAY is an actual array (as opposed to an expression returning an array | |
1206 | value), you can modify each element of the array | |
1207 | by modifying VAR inside the loop. | |
1208 | Examples: | |
1209 | .nf | |
1210 | ||
1211 | .ne 5 | |
1212 | for (@ary) { s/foo/bar/; } | |
1213 | ||
1214 | foreach $elem (@elements) { | |
1215 | $elem *= 2; | |
1216 | } | |
1217 | ||
1218 | .ne 3 | |
1219 | for ((10,9,8,7,6,5,4,3,2,1,\'BOOM\')) { | |
1220 | print $_, "\en"; sleep(1); | |
1221 | } | |
1222 | ||
1223 | for (1..15) { print "Merry Christmas\en"; } | |
1224 | ||
1225 | .ne 3 | |
1226 | foreach $item (split(/:[\e\e\en:]*/, $ENV{\'TERMCAP\'})) { | |
1227 | print "Item: $item\en"; | |
1228 | } | |
1229 | ||
1230 | .fi | |
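Two of the points above, that VAR is local to the loop and that modifying VAR modifies the array, can be seen together in a short sketch. The names $elem and @nums are chosen only for the illustration.

```perl
$elem = 'before';
@nums = (1, 2, 3);

foreach $elem (@nums) {
    $elem *= 2;          # changes the array element itself
}

print "@nums\n";         # 2 4 6
print "$elem\n";         # before
```

The loop variable got its old value back on exit, while the doubling stayed in @nums.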
1231 | .PP | |
1232 | The BLOCK by itself (labeled or not) is equivalent to a loop that executes | |
1233 | once. | |
1234 | Thus you can use any of the loop control statements in it to leave or | |
1235 | restart the block. | |
1236 | The | |
1237 | .I continue | |
1238 | block is optional. | |
1239 | This construct is particularly nice for doing case structures. | |
1240 | .nf | |
1241 | ||
1242 | .ne 6 | |
1243 | foo: { | |
1244 | if (/^abc/) { $abc = 1; last foo; } | |
1245 | if (/^def/) { $def = 1; last foo; } | |
1246 | if (/^xyz/) { $xyz = 1; last foo; } | |
1247 | $nothing = 1; | |
1248 | } | |
1249 | ||
1250 | .fi | |
1251 | There is no official switch statement in perl, because there | |
1252 | are already several ways to write the equivalent. | |
1253 | In addition to the above, you could write | |
1254 | .nf | |
1255 | ||
1256 | .ne 6 | |
1257 | foo: { | |
1258 | $abc = 1, last foo if /^abc/; | |
1259 | $def = 1, last foo if /^def/; | |
1260 | $xyz = 1, last foo if /^xyz/; | |
1261 | $nothing = 1; | |
1262 | } | |
1263 | ||
1264 | or | |
1265 | ||
1266 | .ne 6 | |
1267 | foo: { | |
1268 | /^abc/ && do { $abc = 1; last foo; }; | |
1269 | /^def/ && do { $def = 1; last foo; }; | |
1270 | /^xyz/ && do { $xyz = 1; last foo; }; | |
1271 | $nothing = 1; | |
1272 | } | |
1273 | ||
1274 | or | |
1275 | ||
1276 | .ne 6 | |
1277 | foo: { | |
1278 | /^abc/ && ($abc = 1, last foo); | |
1279 | /^def/ && ($def = 1, last foo); | |
1280 | /^xyz/ && ($xyz = 1, last foo); | |
1281 | $nothing = 1; | |
1282 | } | |
1283 | ||
1284 | or even | |
1285 | ||
1286 | .ne 8 | |
1287 | if (/^abc/) | |
1288 | { $abc = 1; } | |
1289 | elsif (/^def/) | |
1290 | { $def = 1; } | |
1291 | elsif (/^xyz/) | |
1292 | { $xyz = 1; } | |
1293 | else | |
1294 | {$nothing = 1;} | |
1295 | ||
1296 | .fi | |
1297 | As it happens, these are all optimized internally to a switch structure, | |
1298 | so perl jumps directly to the desired statement, and you needn't worry | |
1299 | about perl executing a lot of unnecessary statements when you have a string | |
1300 | of 50 elsifs, as long as you are testing the same simple scalar variable | |
1301 | using ==, eq, or pattern matching as above. | |
1302 | (If you're curious as to whether the optimizer has done this for a particular | |
1303 | case statement, you can use the \-D1024 switch to list the syntax tree | |
1304 | before execution.) | |
1305 | .Sh "Simple statements" | |
1306 | The only kind of simple statement is an expression evaluated for its side | |
1307 | effects. | |
1308 | Every simple statement must be terminated with a semicolon, unless it is the | |
1309 | final statement in a block, in which case the semicolon is optional. | |
1310 | (A semicolon is still encouraged there if the block takes up more than one line.) | |
1311 | .PP | |
1312 | Any simple statement may optionally be followed by a | |
1313 | single modifier, just before the terminating semicolon. | |
1314 | The possible modifiers are: | |
1315 | .nf | |
1316 | ||
1317 | .ne 4 | |
1318 | if EXPR | |
1319 | unless EXPR | |
1320 | while EXPR | |
1321 | until EXPR | |
1322 | ||
1323 | .fi | |
1324 | The | |
1325 | .I if | |
1326 | and | |
1327 | .I unless | |
1328 | modifiers have the expected semantics. | |
1329 | The | |
1330 | .I while | |
1331 | and | |
1332 | .I until | |
1333 | modifiers also have the expected semantics (conditional evaluated first), | |
1334 | except when applied to a do-BLOCK or a do-SUBROUTINE command, | |
1335 | in which case the block executes once before the conditional is evaluated. | |
1336 | This is so that you can write loops like: | |
1337 | .nf | |
1338 | ||
1339 | .ne 4 | |
1340 | do { | |
1341 | $_ = <STDIN>; | |
1342 | .\|.\|. | |
1343 | } until $_ \|eq \|".\|\e\|n"; | |
1344 | ||
1345 | .fi | |
1346 | (See the | |
1347 | .I do | |
1348 | operator below. Note also that the loop control commands described later will | |
1349 | NOT work in this construct, since modifiers don't take loop labels. | |
1350 | Sorry.) | |
1351 | .Sh "Expressions" | |
1352 | Since | |
1353 | .I perl | |
1354 | expressions work almost exactly like C expressions, only the differences | |
1355 | will be mentioned here. | |
1356 | .PP | |
1357 | Here's what | |
1358 | .I perl | |
1359 | has that C doesn't: | |
1360 | .Ip ** 8 2 | |
1361 | The exponentiation operator. | |
1362 | .Ip **= 8 | |
1363 | The exponentiation assignment operator. | |
1364 | .Ip (\|) 8 3 | |
1365 | The null list, used to initialize an array to null. | |
1366 | .Ip . 8 | |
1367 | Concatenation of two strings. | |
1368 | .Ip .= 8 | |
1369 | The concatenation assignment operator. | |
1370 | .Ip eq 8 | |
1371 | String equality (== is numeric equality). | |
1372 | For a mnemonic just think of \*(L"eq\*(R" as a string. | |
1373 | (If you are used to the | |
1374 | .I awk | |
1375 | behavior of using == for either string or numeric equality | |
1376 | based on the current form of the comparands, beware! | |
1377 | You must be explicit here.) | |
1378 | .Ip ne 8 | |
1379 | String inequality (!= is numeric inequality). | |
1380 | .Ip lt 8 | |
1381 | String less than. | |
1382 | .Ip gt 8 | |
1383 | String greater than. | |
1384 | .Ip le 8 | |
1385 | String less than or equal. | |
1386 | .Ip ge 8 | |
1387 | String greater than or equal. | |
1388 | .Ip cmp 8 | |
1389 | String comparison, returning -1, 0, or 1. | |
1390 | .Ip <=> 8 | |
1391 | Numeric comparison, returning -1, 0, or 1. | |
1392 | .Ip =~ 8 2 | |
1393 | Certain operations search or modify the string \*(L"$_\*(R" by default. | |
1394 | This operator makes that kind of operation work on some other string. | |
1395 | The right argument is a search pattern, substitution, or translation. | |
1396 | The left argument is what is supposed to be searched, substituted, or | |
1397 | translated instead of the default \*(L"$_\*(R". | |
1398 | The return value indicates the success of the operation. | |
1399 | (If the right argument is an expression other than a search pattern, | |
1400 | substitution, or translation, it is interpreted as a search pattern | |
1401 | at run time. | |
1402 | This is less efficient than an explicit search, since the pattern must | |
1403 | be compiled every time the expression is evaluated.) | |
1404 | The precedence of this operator is lower than unary minus and autoincrement/decrement, but higher than everything else. | |
1405 | .Ip !~ 8 | |
1406 | Just like =~ except the return value is negated. | |
1407 | .Ip x 8 | |
1408 | The repetition operator. | |
1409 | Returns a string consisting of the left operand repeated the | |
1410 | number of times specified by the right operand. | |
1411 | In an array context, if the left operand is a list in parens, it repeats | |
1412 | the list. | |
1413 | .nf | |
1414 | ||
1415 | print \'\-\' x 80; # print row of dashes | |
1416 | print \'\-\' x80; # illegal, x80 is identifier | |
1417 | ||
1418 | print "\et" x ($tab/8), \' \' x ($tab%8); # tab over | |
1419 | ||
1420 | @ones = (1) x 80; # an array of 80 1's | |
1421 | @ones = (5) x @ones; # set all elements to 5 | |
1422 | ||
1423 | .fi | |
1424 | .Ip x= 8 | |
1425 | The repetition assignment operator. | |
1426 | Only works on scalars. | |
1427 | .Ip .\|. 8 | |
1428 | The range operator, which is really two different operators depending | |
1429 | on the context. | |
1430 | In an array context, returns an array of values counting (by ones) | |
1431 | from the left value to the right value. | |
1432 | This is useful for writing \*(L"for (1..10)\*(R" loops and for doing | |
1433 | slice operations on arrays. | |
1434 | .Sp | |
1435 | In a scalar context, .\|. returns a boolean value. | |
1436 | The operator is bistable, like a flip-flop, and | |
1437 | emulates the line-range (comma) operator of sed, awk, and various editors. | |
1438 | Each .\|. operator maintains its own boolean state. | |
1439 | It is false as long as its left operand is false. | |
1440 | Once the left operand is true, the range operator stays true | |
1441 | until the right operand is true, | |
1442 | AFTER which the range operator becomes false again. | |
1443 | (It doesn't become false till the next time the range operator is evaluated. | |
1444 | It can test the right operand and become false on the | |
1445 | same evaluation it became true (as in awk), but it still returns true once. | |
1446 | If you don't want it to test the right operand till the next | |
1447 | evaluation (as in sed), use three dots (.\|.\|.) instead of two.) | |
1448 | The right operand is not evaluated while the operator is in the \*(L"false\*(R" state, | |
1449 | and the left operand is not evaluated while the operator is in the \*(L"true\*(R" state. | |
1450 | The precedence is a little lower than || and &&. | |
1451 | The value returned is either the null string for false, or a sequence number | |
1452 | (beginning with 1) for true. | |
1453 | The sequence number is reset for each range encountered. | |
1454 | The final sequence number in a range has the string \'E0\' appended to it, which | |
1455 | doesn't affect its numeric value, but gives you something to search for if you | |
1456 | want to exclude the endpoint. | |
1457 | You can exclude the beginning point by waiting for the sequence number to be | |
1458 | greater than 1. | |
1459 | If either operand of scalar .\|. is static, that operand is implicitly compared | |
1460 | to the $. variable, the current line number. | |
1461 | Examples: | |
1462 | .nf | |
1463 | ||
1464 | .ne 6 | |
1465 | As a scalar operator: | |
1466 | if (101 .\|. 200) { print; } # print 2nd hundred lines | |
1467 | ||
1468 | next line if (1 .\|. /^$/); # skip header lines | |
1469 | ||
1470 | s/^/> / if (/^$/ .\|. eof()); # quote body | |
1471 | ||
1472 | .ne 4 | |
1473 | As an array operator: | |
1474 | for (101 .\|. 200) { print; } # print $_ 100 times | |
1475 | ||
1476 | @foo = @foo[$[ .\|. $#foo]; # an expensive no-op | |
1477 | @foo = @foo[$#foo-4 .\|. $#foo]; # slice last 5 items | |
1478 | ||
1479 | .fi | |
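The sequence numbers returned in a scalar context can be used to drop both endpoints of a range. The following is a minimal sketch under the rules just described; the input lines and the names @input, $line, $seq and $body are made up for the example. Neither operand is static, so $. is not involved.

```perl
@input = ("intro\n", "BEGIN\n", "alpha\n", "beta\n", "END\n", "outro\n");
$body = '';

foreach $line (@input) {
    $seq = ($line =~ /^BEGIN/) .. ($line =~ /^END/);
    next unless $seq;            # null string: outside the range
    next if $seq == 1;           # first line of the range
    next if $seq =~ /E0$/;       # last line of the range carries the E0 suffix
    $body .= $line;
}

print $body;                     # alpha, then beta
```

Testing for E0 with a pattern match is the point of the suffix: as a number the final value is still just the sequence count.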
1480 | .Ip \-x 8 | |
1481 | A file test. | |
1482 | This unary operator takes one argument, either a filename or a filehandle, | |
1483 | and tests the associated file to see if something is true about it. | |
1484 | If the argument is omitted, tests $_, except for \-t, which tests | |
1485 | .IR STDIN . | |
1486 | It returns 1 for true and \'\' for false, or the undefined value if the | |
1487 | file doesn't exist. | |
1488 | Precedence is higher than logical and relational operators, but lower than | |
1489 | arithmetic operators. | |
1490 | The operator may be any of: | |
1491 | .nf | |
1492 | \-r File is readable by effective uid/gid. | |
1493 | \-w File is writable by effective uid/gid. | |
1494 | \-x File is executable by effective uid/gid. | |
1495 | \-o File is owned by effective uid. | |
1496 | \-R File is readable by real uid/gid. | |
1497 | \-W File is writable by real uid/gid. | |
1498 | \-X File is executable by real uid/gid. | |
1499 | \-O File is owned by real uid. | |
1500 | \-e File exists. | |
1501 | \-z File has zero size. | |
1502 | \-s File has non-zero size (returns size). | |
1503 | \-f File is a plain file. | |
1504 | \-d File is a directory. | |
1505 | \-l File is a symbolic link. | |
1506 | \-p File is a named pipe (FIFO). | |
1507 | \-S File is a socket. | |
1508 | \-b File is a block special file. | |
1509 | \-c File is a character special file. | |
1510 | \-u File has setuid bit set. | |
1511 | \-g File has setgid bit set. | |
1512 | \-k File has sticky bit set. | |
1513 | \-t Filehandle is opened to a tty. | |
1514 | \-T File is a text file. | |
1515 | \-B File is a binary file (opposite of \-T). | |
1516 | \-M Age of file in days when script started. | |
1517 | \-A Same for access time. | |
1518 | \-C Same for inode change time. | |
1519 | ||
1520 | .fi | |
1521 | The interpretation of the file permission operators \-r, \-R, \-w, \-W, \-x and \-X | |
1522 | is based solely on the mode of the file and the uids and gids of the user. | |
1523 | There may be other reasons you can't actually read, write or execute the file. | |
1524 | Also note that, for the superuser, \-r, \-R, \-w and \-W always return 1, and | |
1525 | \-x and \-X return 1 if any execute bit is set in the mode. | |
1526 | Scripts run by the superuser may thus need to do a stat() in order to determine | |
1527 | the actual mode of the file, or temporarily set the uid to something else. | |
1528 | .Sp | |
1529 | Example: | |
1530 | .nf | |
1531 | .ne 7 | |
1532 | ||
1533 | while (<>) { | |
1534 | chop; | |
1535 | next unless \-f $_; # ignore specials | |
1536 | .\|.\|. | |
1537 | } | |
1538 | ||
1539 | .fi | |
1540 | Note that \-s/a/b/ does not do a negated substitution. | |
1541 | Saying \-exp($foo) still works as expected, however\*(--only single letters | |
1542 | following a minus are interpreted as file tests. | |
1543 | .Sp | |
1544 | The \-T and \-B switches work as follows. | |
1545 | The first block or so of the file is examined for odd characters such as | |
1546 | strange control codes or metacharacters. | |
1547 | If too many odd characters (>10%) are found, it's a \-B file, otherwise it's a \-T file. | |
1548 | Also, any file containing null in the first block is considered a binary file. | |
1549 | If \-T or \-B is used on a filehandle, the current stdio buffer is examined | |
1550 | rather than the first block. | |
1551 | Both \-T and \-B return TRUE on a null file, or a file at EOF when testing | |
1552 | a filehandle. | |
1553 | .PP | |
1554 | If any of the file tests (or either stat operator) are given the special | |
1555 | filehandle consisting of a solitary underline, then the stat structure | |
1556 | of the previous file test (or stat operator) is used, saving a system | |
1557 | call. | |
1558 | (This doesn't work with \-t, and you need to remember that lstat and \-l | |
1559 | will leave values in the stat structure for the symbolic link, not the | |
1560 | real file.) | |
1561 | Example: | |
1562 | .nf | |
1563 | ||
1564 | print "Can do.\en" if -r $a || -w _ || -x _; | |
1565 | ||
1566 | .ne 9 | |
1567 | stat($filename); | |
1568 | print "Readable\en" if -r _; | |
1569 | print "Writable\en" if -w _; | |
1570 | print "Executable\en" if -x _; | |
1571 | print "Setuid\en" if -u _; | |
1572 | print "Setgid\en" if -g _; | |
1573 | print "Sticky\en" if -k _; | |
1574 | print "Text\en" if -T _; | |
1575 | print "Binary\en" if -B _; | |
1576 | ||
1577 | .fi | |
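Before turning to what C has that perl lacks, here is a short sketch that exercises several of the operators listed above in one place. It is only an illustration: the variable names and the sample path are invented, and the inline sort BLOCK form is assumed to be available (the revision log lists it as a later addition).

```perl
print 2 ** 10, "\n";                            # 1024
$n = 3;  $n **= 2;                              # $n is now 9

$s = 'ab' . 'cd';                               # abcd
$s .= '!';                                      # abcd!
$rule = '-';  $rule x= 5;                       # -----

# == compares numbers, eq and ne compare strings.
print "numerically equal\n" if '1.0' == 1;
print "but different strings\n" if '1.0' ne '1';

# cmp orders as strings, <=> orders as numbers.
@by_string = sort { $a cmp $b } (10, 9, 100);   # 10 100 9
@by_number = sort { $a <=> $b } (10, 9, 100);   # 9 10 100
print "@by_string\n@by_number\n";

# =~ points a substitution at $base instead of the default $_.
$path = '/usr/local/bin/perl';
($base = $path) =~ s#.*/##;
print "$base\n" if $path !~ /^\./;              # perl
```

The two sorts show why the distinction matters: the same three values come out in a different order depending on which comparison is used.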
1578 | .PP | |
1579 | Here is what C has that | |
1580 | .I perl | |
1581 | doesn't: | |
1582 | .Ip "unary &" 12 | |
1583 | Address-of operator. | |
1584 | .Ip "unary *" 12 | |
1585 | Dereference-address operator. | |
1586 | .Ip "(TYPE)" 12 | |
1587 | Type casting operator. | |
1588 | .PP | |
1589 | Like C, | |
1590 | .I perl | |
1591 | does a certain amount of expression evaluation at compile time, whenever | |
1592 | it determines that all of the arguments to an operator are static and have | |
1593 | no side effects. | |
1594 | In particular, string concatenation happens at compile time between literals that don't do variable substitution. | |
1595 | Backslash interpretation also happens at compile time. | |
1596 | You can say | |
1597 | .nf | |
1598 | ||
1599 | .ne 2 | |
1600 | \'Now is the time for all\' . "\|\e\|n" . | |
1601 | \'good men to come to.\' | |
1602 | ||
1603 | .fi | |
1604 | and this all reduces to one string internally. | |
1605 | .PP | |
1606 | The autoincrement operator has a little extra built-in magic to it. | |
1607 | If you increment a variable that is numeric, or that has ever been used in | |
1608 | a numeric context, you get a normal increment. | |
1609 | If, however, the variable has only been used in string contexts since it | |
1610 | was set, and has a value that is not null and matches the | |
1611 | pattern /^[a\-zA\-Z]*[0\-9]*$/, the increment is done | |
1612 | as a string, preserving each character within its range, with carry: | |
1613 | .nf | |
1614 | ||
1615 | print ++($foo = \'99\'); # prints \*(L'100\*(R' | |
1616 | print ++($foo = \'a0\'); # prints \*(L'a1\*(R' | |
1617 | print ++($foo = \'Az\'); # prints \*(L'Ba\*(R' | |
1618 | print ++($foo = \'zz\'); # prints \*(L'aaa\*(R' | |
1619 | ||
1620 | .fi | |
1621 | The autodecrement is not magical. | |
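The carry rule can be tried directly. The following is a minimal sketch, assuming a Perl 5 interpreter (it uses "my" and "use strict", which are later than the perl described here); magic_incr is a hypothetical helper name, not a built-in.

```perl
use strict;
use warnings;

# Hypothetical helper: apply one ++ to a copy of a string that has
# never been used in a numeric context, and return the result.
sub magic_incr {
    my ($value) = @_;
    $value++;
    return $value;
}

print magic_incr('a9'), "\n";   # digit wraps, letter carries: b0
print magic_incr('Zz'), "\n";   # both letters wrap, string grows: AAa
print magic_incr('zz'), "\n";   # aaa

# Decrement has no string mode: 'aa' is taken as the number 0.
my $name = 'aa';
{
    no warnings 'numeric';      # silence the "isn't numeric" warning
    $name--;
}
print "$name\n";                # -1
```

Each character stays within its own range (lower case, upper case, digit), and a carry out of the leftmost character lengthens the string.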
1622 | .PP | |
1623 | The range operator (in an array context) makes use of the magical | |
1624 | autoincrement algorithm if the minimum and maximum are strings. | |
1625 | You can say | |
1626 | ||
1627 | @alphabet = (\'A\' .. \'Z\'); | |
1628 | ||
1629 | to get all the letters of the alphabet, or | |
1630 | ||
1631 | $hexdigit = (0 .. 9, \'a\' .. \'f\')[$num & 15]; | |
1632 | ||
1633 | to get a hexadecimal digit, or | |
1634 | ||
1635 | @z2 = (\'01\' .. \'31\'); print @z2[$mday]; | |
1636 | ||
1637 | to get dates with leading zeros. | |
1638 | (If the final value specified is not in the sequence that the magical increment | |
1639 | would produce, the sequence goes until the next value would be longer than | |
1640 | the final value specified.) | |
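As a small illustration of string ranges, here is a sketch that assumes a Perl 5 interpreter; the variable names are made up for the example. Note that element 0 of the day list holds '01', so a day of the month that counts from 1 needs a subscript of $mday \- 1.

```perl
use strict;
use warnings;

# A range of strings steps with the magical increment, so it can
# carry from one letter into two.
my @letters = ('x' .. 'ab');
print "@letters\n";             # x y z aa ab

# Leading zeros are preserved because the endpoints are strings.
my @days = ('01' .. '05');
print "@days\n";                # 01 02 03 04 05

# Subscripts start at 0, so day number N lives at index N - 1.
my $mday = 3;
print $days[$mday - 1], "\n";   # 03
```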
1641 | .PP | |
1642 | The || and && operators differ from C's in that, rather than returning 0 or 1, | |
1643 | they return the last value evaluated. | |
1644 | Thus, a portable way to find out the home directory might be: | |
1645 | .nf | |
1646 | ||
1647 | $home = $ENV{'HOME'} || $ENV{'LOGDIR'} || | |
1648 | (getpwuid($<))[7] || die "You're homeless!\en"; | |
1649 | ||
1650 | .fi | |
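The same idiom works for any chain of fallbacks. The sketch below assumes a Perl 5 interpreter; pick_home is a hypothetical helper that takes an environment-like list of key/value pairs instead of reading %ENV, so that the result does not depend on the machine it runs on.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first true value among HOME and
# LOGDIR, falling back to '/'.
sub pick_home {
    my (%env) = @_;
    return $env{'HOME'} || $env{'LOGDIR'} || '/';
}

print pick_home('HOME' => '/u/larry'), "\n";     # /u/larry
print pick_home('LOGDIR' => '/u/guest'), "\n";   # /u/guest
print pick_home(), "\n";                         # /

# && also yields the last value evaluated, not just 0 or 1.
my ($left, $right) = ('first', 'second');
my $both  = $left && $right;    # 'second'
my $zero  = 0;
my $short = $zero && $right;    # 0: evaluation stopped at $zero
print "$both $short\n";         # second 0
```

Because a null string and 0 are both false, a variable that is set but empty is skipped just like one that is missing.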
1651 | .PP | |
1652 | Along with the literals and variables mentioned earlier, | |
1653 | the operations in the following section can serve as terms in an expression. | |
1654 | Some of these operations take a LIST as an argument. | |
1655 | Such a list can consist of any combination of scalar arguments or array values; | |
1656 | the array values will be included in the list as if each individual element were | |
1657 | interpolated at that point in the list, forming a longer single-dimensional | |
1658 | array value. | |
1659 | Elements of the LIST should be separated by commas. | |
1660 | If an operation is listed both with and without parentheses around its | |
1661 | arguments, it means you can either use it as a unary operator or | |
1662 | as a function call. | |
1663 | To use it as a function call, the next token on the same line must | |
1664 | be a left parenthesis. | |
1665 | (There may be intervening white space.) | |
1666 | Such a function then has highest precedence, as you would expect from | |
1667 | a function. | |
1668 | If any token other than a left parenthesis follows, then it is a | |
1669 | unary operator, with a precedence depending only on whether it is a LIST | |
1670 | operator or not. | |
1671 | LIST operators have lowest precedence. | |
1672 | All other unary operators have a precedence greater than relational operators | |
1673 | but less than arithmetic operators. | |
1674 | See the section on Precedence. | |
1675 | .PP | |
1676 | For operators that can be used in either a scalar or array context, | |
1677 | failure is generally indicated in a scalar context by returning | |
1678 | the undefined value, and in an array context by returning the null list. | |
1679 | Remember though that | |
1680 | THERE IS NO GENERAL RULE FOR CONVERTING A LIST INTO A SCALAR. | |
1681 | Each operator decides which sort of scalar it would be most | |
1682 | appropriate to return. | |
1683 | Some operators return the length of the list | |
1684 | that would have been returned in an array context. | |
1685 | Some operators return the first value in the list. | |
1686 | Some operators return the last value in the list. | |
1687 | Some operators return a count of successful operations. | |
1688 | In general, they do what you want, unless you want consistency. | |
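A few concrete cases may make the point clearer. This is a sketch assuming a Perl 5 interpreter; context_name is a hypothetical subroutine that only reports the context it was called in.

```perl
use strict;
use warnings;

my @list = (4, 5, 6);

my $count   = @list;                  # array in scalar context: its length, 3
my ($first) = @list;                  # list assignment: first element, 4
my $bigger  = grep($_ > 4, @list);    # grep in scalar context: match count, 2

# Hypothetical subroutine: report which context the caller supplied.
sub context_name {
    return wantarray ? 'array' : 'scalar';
}

my @in_array  = context_name();
my $in_scalar = context_name();

print "$count $first $bigger $in_array[0] $in_scalar\n";   # 3 4 2 array scalar
```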
1689 | .Ip "/PATTERN/" 8 4 | |
1690 | See m/PATTERN/. | |
1691 | .Ip "?PATTERN?" 8 4 | |
1692 | This is just like the /pattern/ search, except that it matches only once between | |
1693 | calls to the | |
1694 | .I reset | |
1695 | operator. | |
1696 | This is a useful optimization when you only want to see the first occurrence of | |
1697 | something in each file of a set of files, for instance. | |
1698 | Only ?? patterns local to the current package are reset. | |
1699 | .Ip "accept(NEWSOCKET,GENERICSOCKET)" 8 2 | |
1700 | Does the same thing that the accept system call does. | |
1701 | Returns true if it succeeded, false otherwise. | |
1702 | See example in section on Interprocess Communication. | |
1703 | .Ip "alarm(SECONDS)" 8 4 | |
1704 | .Ip "alarm SECONDS" 8 | |
1705 | Arranges to have a SIGALRM delivered to this process after the specified number | |
1706 | of seconds (minus 1, actually) have elapsed. Thus, alarm(15) will cause | |
1707 | a SIGALRM at some point more than 14 seconds in the future. | |
1708 | Only one timer may be counting at once. Each call disables the previous | |
1709 | timer, and an argument of 0 may be supplied to cancel the previous timer | |
1710 | without starting a new one. | |
1711 | The returned value is the amount of time remaining on the previous timer. | |
1712 | .Ip "atan2(Y,X)" 8 2 | |
1713 | Returns the arctangent of Y/X in the range | |
1714 | .if t \-\(*p to \(*p. | |
1715 | .if n \-PI to PI. | |
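Because atan2 takes Y and X separately, it can tell the quadrants apart, which a plain arctangent of Y/X cannot. The sketch below assumes a Perl 5 interpreter; degrees is a hypothetical helper.

```perl
use strict;
use warnings;

# A common way to obtain pi: atan2(1,1) is pi/4.
my $pi = 4 * atan2(1, 1);
printf "%.6f\n", $pi;                    # 3.141593

# Hypothetical helper: angle of the point (X,Y) in degrees.
sub degrees {
    my ($y, $x) = @_;
    return atan2($y, $x) * 180 / $pi;
}

printf "%.0f\n", degrees( 1,  1);        # 45
printf "%.0f\n", degrees( 1, -1);        # 135
printf "%.0f\n", degrees(-1, -1);        # -135
```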
1716 | .Ip "bind(SOCKET,NAME)" 8 2 | |
1717 | Does the same thing that the bind system call does. | |
1718 | Returns true if it succeeded, false otherwise. | |
1719 | NAME should be a packed address of the proper type for the socket. | |
1720 | See example in section on Interprocess Communication. | |
1721 | .Ip "binmode(FILEHANDLE)" 8 4 | |
1722 | .Ip "binmode FILEHANDLE" 8 4 | |
1723 | Arranges for the file to be read in \*(L"binary\*(R" mode in operating systems | |
1724 | that distinguish between binary and text files. | |
1725 | Files that are not read in binary mode have CR LF sequences translated | |
1726 | to LF on input and LF translated to CR LF on output. | |
1727 | Binmode has no effect under Unix. | |
1728 | If FILEHANDLE is an expression, the value is taken as the name of | |
1729 | the filehandle. | |
1730 | .Ip "caller(EXPR)" | |
1731 | .Ip "caller" | |
1732 | Returns the context of the current subroutine call: | |
1733 | .nf | |
1734 | ||
1735 | ($package,$filename,$line) = caller; | |
1736 | ||
1737 | .fi | |
1738 | With EXPR, returns some extra information that the debugger uses to print | |
1739 | a stack trace. The value of EXPR indicates how many call frames to go | |
1740 | back before the current one. | |
1741 | .Ip "chdir(EXPR)" 8 2 | |
1742 | .Ip "chdir EXPR" 8 2 | |
1743 | Changes the working directory to EXPR, if possible. | |
1744 | If EXPR is omitted, changes to home directory. | |
1745 | Returns 1 upon success, 0 otherwise. | |
1746 | See example under | |
1747 | .IR die . | |
1748 | .Ip "chmod(LIST)" 8 2 | |
1749 | .Ip "chmod LIST" 8 2 | |
1750 | Changes the permissions of a list of files. | |
1751 | The first element of the list must be the numerical mode. | |
1752 | Returns the number of files successfully changed. | |
1753 | .nf | |
1754 | ||
1755 | .ne 2 | |
1756 | $cnt = chmod 0755, \'foo\', \'bar\'; | |
1757 | chmod 0755, @executables; | |
1758 | ||
1759 | .fi | |
1760 | .Ip "chop(LIST)" 8 7 | |
1761 | .Ip "chop(VARIABLE)" 8 | |
1762 | .Ip "chop VARIABLE" 8 | |
1763 | .Ip "chop" 8 | |
1764 | Chops off the last character of a string and returns the character chopped. | |
1765 | It's used primarily to remove the newline from the end of an input record, | |
1766 | but is much more efficient than s/\en// because it neither scans nor copies | |
1767 | the string. | |
1768 | If VARIABLE is omitted, chops $_. | |
1769 | Example: | |
1770 | .nf | |
1771 | ||
1772 | .ne 5 | |
1773 | while (<>) { | |
1774 | chop; # avoid \en on last field | |
1775 | @array = split(/:/); | |
1776 | .\|.\|. | |
1777 | } | |
1778 | ||
1779 | .fi | |
1780 | You can actually chop anything that's an lvalue, including an assignment: | |
1781 | .nf | |
1782 | ||
1783 | chop($cwd = \`pwd\`); | |
1784 | chop($answer = <STDIN>); | |
1785 | ||
1786 | .fi | |
1787 | If you chop a list, each element is chopped. | |
1788 | Only the value of the last chop is returned. | |
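The return value and the list form can be seen in a short sketch, assuming a Perl 5 interpreter; the data is made up for the example. Note that chop removes the last character whatever it is, not only a newline.

```perl
use strict;
use warnings;

my $line    = "root:0:0\n";
my $removed = chop($line);          # $removed is "\n", $line is "root:0:0"
my @fields  = split(/:/, $line);
print scalar(@fields), "\n";        # 3

# Chopping a list shortens every element; only the last chopped
# character is returned.
my @words = ('abc', 'xyz');
my $last  = chop(@words);
print "@words $last\n";             # ab xy z
```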
1789 | .Ip "chown(LIST)" 8 2 | |
1790 | .Ip "chown LIST" 8 2 | |
1791 | Changes the owner (and group) of a list of files. | |
1792 | The first two elements of the list must be the NUMERICAL uid and gid, | |
1793 | in that order. | |
1794 | Returns the number of files successfully changed. | |
1795 | .nf | |
1796 | ||
1797 | .ne 2 | |
1798 | $cnt = chown $uid, $gid, \'foo\', \'bar\'; | |
1799 | chown $uid, $gid, @filenames; | |
1800 | ||
1801 | .fi | |
1802 | .ne 23 | |
1803 | Here's an example that looks up non-numeric uids in the passwd file: | |
1804 | .nf | |
1805 | ||
1806 | print "User: "; | |
1807 | $user = <STDIN>; | |
1808 | chop($user); | |
1809 | print "Files: "; | |
1810 | $pattern = <STDIN>; | |
1811 | chop($pattern); | |
1812 | .ie t \{\ | |
1813 | open(pass, \'/etc/passwd\') || die "Can't open passwd: $!\en"; | |
1814 | 'br\} | |
1815 | .el \{\ | |
1816 | open(pass, \'/etc/passwd\') | |
1817 | || die "Can't open passwd: $!\en"; | |
1818 | 'br\} | |
1819 | while (<pass>) { | |
1820 | ($login,$pass,$uid,$gid) = split(/:/); | |
1821 | $uid{$login} = $uid; | |
1822 | $gid{$login} = $gid; | |
1823 | } | |
1824 | @ary = <${pattern}>; # get filenames | |
1825 | if ($uid{$user} eq \'\') { | |
1826 | die "$user not in passwd file"; | |
1827 | } | |
1828 | else { | |
1829 | chown $uid{$user}, $gid{$user}, @ary; | |
1830 | } | |
1831 | ||
1832 | .fi | |
1833 | .Ip "chroot(FILENAME)" 8 5 | |
1834 | .Ip "chroot FILENAME" 8 | |
1835 | Does the same as the system call of that name. | |
1836 | If you don't know what it does, don't worry about it. | |
1837 | If FILENAME is omitted, does chroot to $_. | |
1838 | .Ip "close(FILEHANDLE)" 8 5 | |
1839 | .Ip "close FILEHANDLE" 8 | |
1840 | Closes the file or pipe associated with the file handle. | |
1841 | You don't have to close FILEHANDLE if you are immediately going to | |
1842 | do another open on it, since open will close it for you. | |
1843 | (See | |
1844 | .IR open .) | |
1845 | However, an explicit close on an input file resets the line counter ($.), while | |
1846 | the implicit close done by | |
1847 | .I open | |
1848 | does not. | |
1849 | Also, closing a pipe will wait for the process executing on the pipe to complete, | |
1850 | in case you want to look at the output of the pipe afterwards. | |
1851 | Closing a pipe explicitly also puts the status value of the command into $?. | |
1852 | Example: | |
1853 | .nf | |
1854 | ||
1855 | .ne 4 | |
1856 | open(OUTPUT, \'|sort >foo\'); # pipe to sort | |
1857 | .\|.\|. # print stuff to output | |
1858 | close OUTPUT; # wait for sort to finish | |
1859 | open(INPUT, \'foo\'); # get sort's results | |
1860 | ||
1861 | .fi | |
1862 | FILEHANDLE may be an expression whose value gives the real filehandle name. | |
1863 | .Ip "closedir(DIRHANDLE)" 8 5 | |
1864 | .Ip "closedir DIRHANDLE" 8 | |
1865 | Closes a directory opened by opendir(). | |
1866 | .Ip "connect(SOCKET,NAME)" 8 2 | |
1867 | Does the same thing that the connect system call does. | |
1868 | Returns true if it succeeded, false otherwise. | |
1869 | NAME should be a packed address of the proper type for the socket. | |
1870 | See example in section on Interprocess Communication. | |
1871 | .Ip "cos(EXPR)" 8 6 | |
1872 | .Ip "cos EXPR" 8 6 | |
1873 | Returns the cosine of EXPR (expressed in radians). | |
1874 | If EXPR is omitted takes cosine of $_. | |
1875 | .Ip "crypt(PLAINTEXT,SALT)" 8 6 | |
1876 | Encrypts a string exactly like the crypt() function in the C library. | |
1877 | Useful for checking the password file for lousy passwords. | |
1878 | Only the guys wearing white hats should do this. | |
1879 | .Ip "dbmclose(ASSOC_ARRAY)" 8 6 | |
1880 | .Ip "dbmclose ASSOC_ARRAY" 8 | |
1881 | Breaks the binding between a dbm file and an associative array. | |
1882 | The values remaining in the associative array are meaningless unless | |
1883 | you happen to want to know what was in the cache for the dbm file. | |
1884 | This function is only useful if you have ndbm. | |
1885 | .Ip "dbmopen(ASSOC,DBNAME,MODE)" 8 6 | |
1886 | This binds a dbm or ndbm file to an associative array. | |
1887 | ASSOC is the name of the associative array. | |
1888 | (Unlike normal open, the first argument is NOT a filehandle, even though | |
1889 | it looks like one). | |
1890 | DBNAME is the name of the database (without the .dir or .pag extension). | |
1891 | If the database does not exist, it is created with protection specified | |
1892 | by MODE (as modified by the umask). | |
1893 | If your system only supports the older dbm functions, you may perform only one | |
1894 | dbmopen in your program. | |
1895 | If your system has neither dbm nor ndbm, calling dbmopen produces a fatal | |
1896 | error. | |
1897 | .Sp | |
1898 | Values assigned to the associative array prior to the dbmopen are lost. | |
1899 | A certain number of values from the dbm file are cached in memory. | |
1900 | By default this number is 64, but you can increase it by preallocating | |
1901 | that number of garbage entries in the associative array before the dbmopen. | |
1902 | You can flush the cache if necessary with the reset command. | |
1903 | .Sp | |
1904 | If you don't have write access to the dbm file, you can only read | |
1905 | associative array variables, not set them. | |
1906 | If you want to test whether you can write, either use file tests or | |
1907 | try setting a dummy array entry inside an eval, which will trap the error. | |
1908 | .Sp | |
1909 | Note that functions such as keys() and values() may return huge array values | |
1910 | when used on large dbm files. | |
1911 | You may prefer to use the each() function to iterate over large dbm files. | |
1912 | Example: | |
1913 | .nf | |
1914 | ||
1915 | .ne 6 | |
1916 | # print out history file offsets | |
1917 | dbmopen(HIST,'/usr/lib/news/history',0666); | |
1918 | while (($key,$val) = each %HIST) { | |
1919 | print $key, ' = ', unpack('L',$val), "\en"; | |
1920 | } | |
1921 | dbmclose(HIST); | |
1922 | ||
1923 | .fi | |
1924 | .Ip "defined(EXPR)" 8 6 | |
1925 | .Ip "defined EXPR" 8 | |
1926 | Returns a boolean value saying whether the lvalue EXPR has a real value | |
1927 | or not. | |
1928 | Many operations return the undefined value under exceptional conditions, | |
1929 | such as end of file, uninitialized variable, system error and such. | |
1930 | This function allows you to distinguish between an undefined null string | |
1931 | and a defined null string with operations that might return a real null | |
1932 | string, in particular referencing elements of an array. | |
1933 | You may also check to see if arrays or subroutines exist. | |
1934 | Use on predefined variables is not guaranteed to produce intuitive results. | |
1935 | Examples: | |
1936 | .nf | |
1937 | ||
1938 | .ne 7 | |
1939 | print if defined $switch{'D'}; | |
1940 | print "$val\en" while defined($val = pop(@ary)); | |
1941 | die "Can't readlink $sym: $!" | |
1942 | unless defined($value = readlink $sym); | |
1943 | eval '@foo = ()' if defined(@foo); | |
1944 | die "No XYZ package defined" unless defined %_XYZ; | |
1945 | sub foo { defined &$bar ? &$bar(@_) : die "No bar"; } | |
1946 | ||
1947 | .fi | |
1948 | See also undef. | |
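The difference between an undefined value and a defined null string is easiest to see with an associative array. This is a sketch assuming a Perl 5 interpreter; describe is a hypothetical helper.

```perl
use strict;
use warnings;

# Hypothetical helper: classify a value.
sub describe {
    my ($value) = @_;
    return 'undefined'        unless defined $value;
    return 'defined but null' if $value eq '';
    return "defined: $value";
}

my %switch = ('D' => '', 'v' => 0);

print describe($switch{'D'}), "\n";   # defined but null
print describe($switch{'v'}), "\n";   # defined: 0
print describe($switch{'x'}), "\n";   # undefined
```

A plain truth test would treat all three of these values as false; only defined separates the missing entry from the two that were set.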
1949 | .Ip "delete $ASSOC{KEY}" 8 6 | |
1950 | Deletes the specified value from the specified associative array. | |
1951 | Returns the deleted value, or the undefined value if nothing was deleted. | |
1952 | Deleting from $ENV{} modifies the environment. | |
1953 | Deleting from an array bound to a dbm file deletes the entry from the dbm | |
1954 | file. | |
1955 | .Sp | |
1956 | The following deletes all the values of an associative array: | |
1957 | .nf | |
1958 | ||
1959 | .ne 3 | |
1960 | foreach $key (keys %ARRAY) { | |
1961 | delete $ARRAY{$key}; | |
1962 | } | |
1963 | ||
1964 | .fi | |
1965 | (But it would be faster to use the | |
1966 | .I reset | |
1967 | command. | |
1968 | Saying undef %ARRAY is faster yet.) | |
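The return value of delete can be used to fetch and remove an entry in one step. A minimal sketch, assuming a Perl 5 interpreter and made-up data:

```perl
use strict;
use warnings;

my %color = ('apple' => 'red', 'pear' => 'green');

my $gone = delete $color{'apple'};    # 'red'; the entry is removed
my $none = delete $color{'kiwi'};     # undefined: there was no such key

print "$gone\n";                              # red
print defined($none) ? "yes\n" : "no\n";      # no
print scalar(keys %color), "\n";              # 1

undef %color;                         # empties the whole array at once
print scalar(keys %color), "\n";      # 0
```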
1969 | .Ip "die(LIST)" 8 | |
1970 | .Ip "die LIST" 8 | |
1971 | Outside of an eval, prints the value of LIST to | |
1972 | .I STDERR | |
1973 | and exits with the current value of $! | |
1974 | (errno). | |
1975 | If $! is 0, exits with the value of ($? >> 8) (\`command\` status). | |
1976 | If ($? >> 8) is 0, exits with 255. | |
1977 | Inside an eval, the error message is stuffed into $@ and the eval is terminated | |
1978 | with the undefined value. | |
1979 | .Sp | |
1980 | Equivalent examples: | |
1981 | .nf | |
1982 | ||
1983 | .ne 3 | |
1984 | .ie t \{\ | |
1985 | die "Can't cd to spool: $!\en" unless chdir \'/usr/spool/news\'; | |
1986 | 'br\} | |
1987 | .el \{\ | |
1988 | die "Can't cd to spool: $!\en" | |
1989 | unless chdir \'/usr/spool/news\'; | |
1990 | 'br\} | |
1991 | ||
1992 | chdir \'/usr/spool/news\' || die "Can't cd to spool: $!\en"; | |
1993 | ||
1994 | .fi | |
1995 | .Sp | |
1996 | If the value of LIST does not end in a newline, the current script line | |
1997 | number and input line number (if any) are also printed, and a newline is | |
1998 | supplied. | |
1999 | Hint: sometimes appending \*(L", stopped\*(R" to your message will cause it to make | |
2000 | better sense when the string \*(L"at foo line 123\*(R" is appended. | |
2001 | Suppose you are running script \*(L"canasta\*(R". | |
2002 | .nf | |
2003 | ||
2004 | .ne 7 | |
2005 | die "/etc/games is no good"; | |
2006 | die "/etc/games is no good, stopped"; | |
2007 | ||
2008 | produce, respectively | |
2009 | ||
2010 | /etc/games is no good at canasta line 123. | |
2011 | /etc/games is no good, stopped at canasta line 123. | |
2012 | ||
2013 | .fi | |
2014 | See also | |
2015 | .IR exit . | |
2016 | .Ip "do BLOCK" 8 4 | |
2017 | Returns the value of the last command in the sequence of commands indicated | |
2018 | by BLOCK. | |
2019 | When modified by a loop modifier, executes the BLOCK once before testing the | |
2020 | loop condition. | |
2021 | (On other statements the loop modifiers test the conditional first.) | |
2022 | .Ip "do SUBROUTINE (LIST)" 8 3 | |
2023 | Executes a SUBROUTINE declared by a | |
2024 | .I sub | |
2025 | declaration, and returns the value | |
2026 | of the last expression evaluated in SUBROUTINE. | |
2027 | If there is no subroutine by that name, produces a fatal error. | |
2028 | (You may use the \*(L"defined\*(R" operator to determine if a subroutine | |
2029 | exists.) | |
2030 | If you pass arrays as part of LIST you may wish to pass the length | |
2031 | of the array in front of each array. | |
2032 | (See the section on subroutines later on.) | |
2033 | The parentheses are required to avoid confusion with the \*(L"do EXPR\*(R" | |
2034 | form. | |
2035 | .Sp | |
2036 | SUBROUTINE may also be a single scalar variable, in which case | |
2037 | the name of the subroutine to execute is taken from the variable. | |
2038 | .Sp | |
2039 | As an alternate (and preferred) form, | |
2040 | you may call a subroutine by prefixing the name with | |
2041 | an ampersand: &foo(@args). | |
2042 | If you aren't passing any arguments, you don't have to use parentheses. | |
2043 | If you omit the parentheses, no @_ array is passed to the subroutine. | |
2044 | The & form is also used to specify subroutines to the defined and undef | |
2045 | operators: | |
2046 | .nf | |
2047 | ||
2048 | if (defined &$var) { &$var($parm); undef &$var; } | |
2049 | ||
2050 | .fi | |
2051 | .Ip "do EXPR" 8 3 | |
2052 | Uses the value of EXPR as a filename and executes the contents of the file | |
2053 | as a | |
2054 | .I perl | |
2055 | script. | |
2056 | Its primary use is to include subroutines from a | |
2057 | .I perl | |
2058 | subroutine library. | |
2059 | .nf | |
2060 | ||
2061 | do \'stat.pl\'; | |
2062 | ||
2063 | is just like | |
2064 | ||
2065 | eval \`cat stat.pl\`; | |
2066 | ||
2067 | .fi | |
2068 | except that it's more efficient, more concise, keeps track of the current | |
2069 | filename for error messages, and searches all the | |
2070 | .B \-I | |
2071 | libraries if the file | |
2072 | isn't in the current directory (see also the @INC array in Predefined Names). | |
2073 | It's the same, however, in that it does reparse the file every time you | |
2074 | call it, so if you are going to use the file inside a loop you might prefer | |
2075 | to use \-P and #include, at the expense of a little more startup time. | |
2076 | (The main problem with #include is that cpp doesn't grok # comments\*(--a | |
2077 | workaround is to use \*(L";#\*(R" for standalone comments.) | |
2078 | Note that the following are NOT equivalent: | |
2079 | .nf | |
2080 | ||
2081 | .ne 2 | |
2082 | do $foo; # eval a file | |
2083 | do $foo(); # call a subroutine | |
2084 | ||
2085 | .fi | |
2086 | Note that inclusion of library routines is better done with | |
2087 | the \*(L"require\*(R" operator. | |
2088 | .Ip "dump LABEL" 8 6 | |
2089 | This causes an immediate core dump. | |
2090 | Primarily this is so that you can use the undump program to turn your | |
2091 | core dump into an executable binary after having initialized all your | |
2092 | variables at the beginning of the program. | |
2093 | When the new binary is executed it will begin by executing a "goto LABEL" | |
2094 | (with all the restrictions that goto suffers). | |
2095 | Think of it as a goto with an intervening core dump and reincarnation. | |
2096 | If LABEL is omitted, restarts the program from the top. | |
2097 | WARNING: any files opened at the time of the dump will NOT be open any more | |
2098 | when the program is reincarnated, with possible resulting confusion on the part | |
2099 | of perl. | |
2100 | See also \-u. | |
2101 | .Sp | |
2102 | Example: | |
2103 | .nf | |
2104 | ||
2105 | .ne 16 | |
2106 | #!/usr/bin/perl | |
2107 | require 'getopt.pl'; | |
2108 | require 'stat.pl'; | |
2109 | %days = ( | |
2110 | 'Sun',1, | |
2111 | 'Mon',2, | |
2112 | 'Tue',3, | |
2113 | 'Wed',4, | |
2114 | 'Thu',5, | |
2115 | 'Fri',6, | |
2116 | 'Sat',7); | |
2117 | ||
2118 | dump QUICKSTART if $ARGV[0] eq '-d'; | |
2119 | ||
2120 | QUICKSTART: | |
2121 | do Getopt('f'); | |
2122 | ||
2123 | .fi | |
2124 | .Ip "each(ASSOC_ARRAY)" 8 6 | |
2125 | .Ip "each ASSOC_ARRAY" 8 | |
2126 | Returns a 2 element array consisting of the key and value for the next | |
2127 | value of an associative array, so that you can iterate over it. | |
2128 | Entries are returned in an apparently random order. | |
2129 | When the array is entirely read, a null array is returned (which when | |
2130 | assigned produces a FALSE (0) value). | |
2131 | The next call to each() after that will start iterating again. | |
2132 | The iterator can be reset only by reading all the elements from the array. | |
2133 | You must not modify the array while iterating over it. | |
2134 | There is a single iterator for each associative array, shared by all | |
2135 | each(), keys() and values() function calls in the program. | |
2136 | The following prints out your environment like the printenv program, only | |
2137 | in a different order: | |
2138 | .nf | |
2139 | ||
2140 | .ne 3 | |
2141 | while (($key,$value) = each %ENV) { | |
2142 | print "$key=$value\en"; | |
2143 | } | |
2144 | ||
2145 | .fi | |
2146 | See also keys() and values(). | |
2147 | .Ip "eof(FILEHANDLE)" 8 8 | |
2148 | .Ip "eof()" 8 | |
2149 | .Ip "eof" 8 | |
2150 | Returns 1 if the next read on FILEHANDLE will return end of file, or if | |
2151 | FILEHANDLE is not open. | |
2152 | FILEHANDLE may be an expression whose value gives the real filehandle name. | |
2153 | (Note that this function actually reads a character and then ungetc's it, | |
2154 | so it is not very useful in an interactive context.) | |
2155 | An eof without an argument returns the eof status for the last file read. | |
2156 | Empty parentheses () may be used to indicate the pseudo file formed of the | |
2157 | files listed on the command line, i.e. eof() is reasonable to use inside | |
2158 | a while (<>) loop to detect the end of only the last file. | |
2159 | Use eof(ARGV) or eof without the parentheses to test EACH file in a while (<>) loop. | |
2160 | Examples: | |
2161 | .nf | |
2162 | ||
2163 | .ne 7 | |
2164 | # insert dashes just before last line of last file | |
2165 | while (<>) { | |
2166 | if (eof()) { | |
2167 | print "\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\en"; | |
2168 | } | |
2169 | print; | |
2170 | } | |
2171 | ||
2172 | .ne 7 | |
2173 | # reset line numbering on each input file | |
2174 | while (<>) { | |
2175 | print "$.\et$_"; | |
2176 | if (eof) { # Not eof(). | |
2177 | close(ARGV); | |
2178 | } | |
2179 | } | |
2180 | ||
2181 | .fi | |
2182 | .Ip "eval(EXPR)" 8 6 | |
2183 | .Ip "eval EXPR" 8 6 | |
2184 | .Ip "eval BLOCK" 8 6 | |
2185 | EXPR is parsed and executed as if it were a little | |
2186 | .I perl | |
2187 | program. | |
2188 | It is executed in the context of the current | |
2189 | .I perl | |
2190 | program, so that | |
2191 | any variable settings, subroutine or format definitions remain afterwards. | |
2192 | The value returned is the value of the last expression evaluated, just | |
2193 | as with subroutines. | |
2194 | If there is a syntax error or runtime error, or a die statement is | |
2195 | executed, an undefined value is returned by | |
2196 | eval, and $@ is set to the error message. | |
2197 | If there was no error, $@ is guaranteed to be a null string. | |
2198 | If EXPR is omitted, evaluates $_. | |
2199 | The final semicolon, if any, may be omitted from the expression. | |
2200 | .Sp | |
2201 | Note that, since eval traps otherwise-fatal errors, it is useful for | |
2202 | determining whether a particular feature | |
2203 | (such as dbmopen or symlink) is implemented. | |
2204 | It is also Perl's exception trapping mechanism, where the die operator is | |
2205 | used to raise exceptions. | |
2206 | .Sp | |
2207 | If the code to be executed doesn't vary, you may use | |
2208 | the eval-BLOCK form to trap run-time errors without incurring | |
2209 | the penalty of recompiling each time. | |
2210 | The error, if any, is still returned in $@. | |
2211 | Evaluating a single-quoted string (as EXPR) has the same effect, except that | |
2212 | the eval-EXPR form reports syntax errors at run time via $@, whereas the | |
2213 | eval-BLOCK form reports syntax errors at compile time. The eval-EXPR form | |
2214 | is optimized to eval-BLOCK the first time it succeeds. (Since the replacement | |
2215 | side of a substitution is considered a single-quoted string when you | |
2216 | use the e modifier, the same optimization occurs there.) Examples: | |
2217 | .nf | |
2218 | ||
2219 | .ne 11 | |
2220 | # make divide-by-zero non-fatal | |
2221 | eval { $answer = $a / $b; }; warn $@ if $@; | |
2222 | ||
2223 | # optimized to same thing after first use | |
2224 | eval '$answer = $a / $b'; warn $@ if $@; | |
2225 | ||
2226 | # a compile-time error | |
2227 | eval { $answer = }; | |
2228 | ||
2229 | # a run-time error | |
2230 | eval '$answer ='; # sets $@ | |
2231 | ||
2232 | .fi | |
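Used with die, the eval-BLOCK form gives a simple way to trap and report exceptions. The sketch below assumes a Perl 5 interpreter; try_divide is a hypothetical helper, and the text of the division error comes from the interpreter rather than from the script.

```perl
use strict;
use warnings;

# Hypothetical helper: divide, turning a fatal error into a message.
sub try_divide {
    my ($num, $den) = @_;
    my $answer = eval { $num / $den };
    return "failed: $@" if $@;      # $@ holds the error text
    return $answer;
}

print try_divide(10, 4), "\n";      # 2.5
print try_divide(1, 0);             # failed: Illegal division by zero ...

# die raises an exception; a trailing newline suppresses the
# "at SCRIPT line N" suffix.
eval { die "out of cheese\n" };
my $caught = $@;
print "caught: $caught";            # caught: out of cheese
```

After a successful eval $@ is a null string, so testing it right after the eval distinguishes failure from a result that merely happens to be false.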
2233 | .Ip "exec(LIST)" 8 8 | |
2234 | .Ip "exec LIST" 8 6 | |
2235 | If there is more than one argument in LIST, or if LIST is an array with | |
2236 | more than one value, | |
2237 | calls execvp() with the arguments in LIST. | |
2238 | If there is only one scalar argument, the argument is checked for shell metacharacters. | |
2239 | If there are any, the entire argument is passed to \*(L"/bin/sh \-c\*(R" for parsing. | |
2240 | If there are none, the argument is split into words and passed directly to | |
2241 | execvp(), which is more efficient. | |
2242 | Note: exec (and system) do not flush your output buffer, so you may need to | |
2243 | set $| to avoid lost output. | |
2244 | Examples: | |
2245 | .nf | |
2246 | ||
2247 | exec \'/bin/echo\', \'Your arguments are: \', @ARGV; | |
2248 | exec "sort $outfile | uniq"; | |
2249 | ||
2250 | .fi | |
2251 | .Sp | |
2252 | If you don't really want to execute the first argument, but want to lie | |
2253 | to the program you are executing about its own name, you can specify | |
2254 | the program you actually want to run by assigning that to a variable and | |
2255 | putting the name of the variable in front of the LIST without a comma. | |
2256 | (This always forces interpretation of the LIST as a multi-valued list, even | |
2257 | if there is only a single scalar in the list.) | |
2258 | Example: | |
2259 | .nf | |
2260 | ||
2261 | .ne 2 | |
2262 | $shell = '/bin/csh'; | |
2263 | exec $shell '-sh'; # pretend it's a login shell | |
2264 | ||
2265 | .fi | |
2266 | .Ip "exit(EXPR)" 8 6 | |
2267 | .Ip "exit EXPR" 8 | |
2268 | Evaluates EXPR and exits immediately with that value. | |
2269 | Example: | |
2270 | .nf | |
2271 | ||
2272 | .ne 2 | |
2273 | $ans = <STDIN>; | |
2274 | exit 0 \|if \|$ans \|=~ \|/\|^[Xx]\|/\|; | |
2275 | ||
2276 | .fi | |
2277 | See also | |
2278 | .IR die . | |
2279 | If EXPR is omitted, exits with 0 status. | |
2280 | .Ip "exp(EXPR)" 8 3 | |
2281 | .Ip "exp EXPR" 8 | |
2282 | Returns | |
2283 | .I e | |
2284 | to the power of EXPR. | |
2285 | If EXPR is omitted, gives exp($_). | |
2286 | .Ip "fcntl(FILEHANDLE,FUNCTION,SCALAR)" 8 4 | |
2287 | Implements the fcntl(2) function. | |
2288 | You'll probably have to say | |
2289 | .nf | |
2290 | ||
2291 | require "fcntl.ph"; # probably /usr/local/lib/perl/fcntl.ph | |
2292 | ||
2293 | .fi | |
2294 | first to get the correct function definitions. | |
2295 | If fcntl.ph doesn't exist or doesn't have the correct definitions | |
2296 | you'll have to roll | |
2297 | your own, based on your C header files such as <sys/fcntl.h>. | |
2298 | (There is a perl script called h2ph that comes with the perl kit | |
2299 | which may help you in this.) | |
2300 | Argument processing and value return work just like ioctl below. | |
2301 | Note that fcntl will produce a fatal error if used on a machine that doesn't implement | |
2302 | fcntl(2). | |
2303 | .Ip "fileno(FILEHANDLE)" 8 4 | |
2304 | .Ip "fileno FILEHANDLE" 8 4 | |
2305 | Returns the file descriptor for a filehandle. | |
2306 | Useful for constructing bitmaps for select(). | |
2307 | If FILEHANDLE is an expression, the value is taken as the name of | |
2308 | the filehandle. | |
2309 | .Ip "flock(FILEHANDLE,OPERATION)" 8 4 | |
2310 | Calls flock(2) on FILEHANDLE. | |
2311 | See manual page for flock(2) for definition of OPERATION. | |
2312 | Returns true for success, false on failure. | |
2313 | Will produce a fatal error if used on a machine that doesn't implement | |
2314 | flock(2). | |
2315 | Here's a mailbox appender for BSD systems. | |
2316 | .nf | |
2317 | ||
2318 | .ne 20 | |
2319 | $LOCK_SH = 1; | |
2320 | $LOCK_EX = 2; | |
2321 | $LOCK_NB = 4; | |
2322 | $LOCK_UN = 8; | |
2323 | ||
2324 | sub lock { | |
2325 | flock(MBOX,$LOCK_EX); | |
2326 | # and, in case someone appended | |
2327 | # while we were waiting... | |
2328 | seek(MBOX, 0, 2); | |
2329 | } | |
2330 | ||
2331 | sub unlock { | |
2332 | flock(MBOX,$LOCK_UN); | |
2333 | } | |
2334 | ||
2335 | open(MBOX, ">>/usr/spool/mail/$ENV{'USER'}") | |
2336 | || die "Can't open mailbox: $!"; | |
2337 | ||
2338 | do lock(); | |
2339 | print MBOX $msg,"\en\en"; | |
2340 | do unlock(); | |
2341 | ||
2342 | .fi | |
2343 | .Ip "fork" 8 4 | |
2344 | Does a fork() call. | |
2345 | Returns the child pid to the parent process and 0 to the child process. | |
2346 | Note: unflushed buffers remain unflushed in both processes, which means | |
2347 | you may need to set $| to avoid duplicate output. | |
2348 | .Ip "getc(FILEHANDLE)" 8 4 | |
2349 | .Ip "getc FILEHANDLE" 8 | |
2350 | .Ip "getc" 8 | |
2351 | Returns the next character from the input file attached to FILEHANDLE, or | |
2352 | a null string at EOF. | |
2353 | If FILEHANDLE is omitted, reads from STDIN. | |
2354 | .Ip "getlogin" 8 3 | |
2355 | Returns the current login from /etc/utmp, if any. | |
2356 | If it returns a null value, use getpwuid instead: | |
2357 | ||
2358 | $login = getlogin || (getpwuid($<))[0] || "Somebody"; | |
2359 | ||
2360 | .Ip "getpeername(SOCKET)" 8 3 | |
2361 | Returns the packed sockaddr address of the other end of the SOCKET connection. | |
2362 | .nf | |
2363 | ||
2364 | .ne 4 | |
2365 | # An internet sockaddr | |
2366 | $sockaddr = 'S n a4 x8'; | |
2367 | $hersockaddr = getpeername(S); | |
2368 | .ie t \{\ | |
2369 | ($family, $port, $heraddr) = unpack($sockaddr,$hersockaddr); | |
2370 | 'br\} | |
2371 | .el \{\ | |
2372 | ($family, $port, $heraddr) = | |
2373 | unpack($sockaddr,$hersockaddr); | |
2374 | 'br\} | |
2375 | ||
2376 | .fi | |
2377 | .Ip "getpgrp(PID)" 8 4 | |
2378 | .Ip "getpgrp PID" 8 | |
2379 | Returns the current process group for the specified PID (a PID of 0 means the current | |
2380 | process). | |
2381 | Will produce a fatal error if used on a machine that doesn't implement | |
2382 | getpgrp(2). | |
2383 | If PID is omitted, returns the process group of the current process. | |
2384 | .Ip "getppid" 8 4 | |
2385 | Returns the process id of the parent process. | |
2386 | .Ip "getpriority(WHICH,WHO)" 8 4 | |
2387 | Returns the current priority for a process, a process group, or a user. | |
2388 | (See getpriority(2).) | |
2389 | Will produce a fatal error if used on a machine that doesn't implement | |
2390 | getpriority(2). | |
2391 | .Ip "getpwnam(NAME)" 8 | |
2392 | .Ip "getgrnam(NAME)" 8 | |
2393 | .Ip "gethostbyname(NAME)" 8 | |
2394 | .Ip "getnetbyname(NAME)" 8 | |
2395 | .Ip "getprotobyname(NAME)" 8 | |
2396 | .Ip "getpwuid(UID)" 8 | |
2397 | .Ip "getgrgid(GID)" 8 | |
2398 | .Ip "getservbyname(NAME,PROTO)" 8 | |
2399 | .Ip "gethostbyaddr(ADDR,ADDRTYPE)" 8 | |
2400 | .Ip "getnetbyaddr(ADDR,ADDRTYPE)" 8 | |
2401 | .Ip "getprotobynumber(NUMBER)" 8 | |
2402 | .Ip "getservbyport(PORT,PROTO)" 8 | |
2403 | .Ip "getpwent" 8 | |
2404 | .Ip "getgrent" 8 | |
2405 | .Ip "gethostent" 8 | |
2406 | .Ip "getnetent" 8 | |
2407 | .Ip "getprotoent" 8 | |
2408 | .Ip "getservent" 8 | |
2409 | .Ip "setpwent" 8 | |
2410 | .Ip "setgrent" 8 | |
2411 | .Ip "sethostent(STAYOPEN)" 8 | |
2412 | .Ip "setnetent(STAYOPEN)" 8 | |
2413 | .Ip "setprotoent(STAYOPEN)" 8 | |
2414 | .Ip "setservent(STAYOPEN)" 8 | |
2415 | .Ip "endpwent" 8 | |
2416 | .Ip "endgrent" 8 | |
2417 | .Ip "endhostent" 8 | |
2418 | .Ip "endnetent" 8 | |
2419 | .Ip "endprotoent" 8 | |
2420 | .Ip "endservent" 8 | |
2421 | These routines perform the same functions as their counterparts in the | |
2422 | system library. | |
2423 | Within an array context, | |
2424 | the return values from the various get routines are as follows: | |
2425 | .nf | |
2426 | ||
2427 | ($name,$passwd,$uid,$gid, | |
2428 | $quota,$comment,$gcos,$dir,$shell) = getpw.\|.\|. | |
2429 | ($name,$passwd,$gid,$members) = getgr.\|.\|. | |
2430 | ($name,$aliases,$addrtype,$length,@addrs) = gethost.\|.\|. | |
2431 | ($name,$aliases,$addrtype,$net) = getnet.\|.\|. | |
2432 | ($name,$aliases,$proto) = getproto.\|.\|. | |
2433 | ($name,$aliases,$port,$proto) = getserv.\|.\|. | |
2434 | ||
2435 | .fi | |
2436 | (If the entry doesn't exist you get a null list.) | |
2437 | .Sp | |
2438 | Within a scalar context, you get the name, unless the function was a | |
2439 | lookup by name, in which case you get the other thing, whatever it is. | |
2440 | (If the entry doesn't exist you get the undefined value.) | |
2441 | For example: | |
2442 | .nf | |
2443 | ||
2444 | $uid = getpwnam | |
2445 | $name = getpwuid | |
2446 | $name = getpwent | |
2447 | $gid = getgrnam | |
2448 | $name = getgrgid | |
2449 | $name = getgrent | |
2450 | etc. | |
2451 | ||
2452 | .fi | |
2453 | The $members value returned by getgr.\|.\|. is a space-separated list | |
2454 | of the login names of the members of the group. | |
2455 | .Sp | |
2456 | For the gethost.\|.\|. functions, if the h_errno variable is supported in C, | |
2457 | it will be returned to you via $? if the function call fails. | |
2458 | The @addrs value returned by a successful call is a list of the | |
2459 | raw addresses returned by the corresponding system library call. | |
2460 | In the Internet domain, each address is four bytes long and you can unpack | |
2461 | it by saying something like: | |
2462 | .nf | |
2463 | ||
2464 | ($a,$b,$c,$d) = unpack('C4',$addrs[0]); | |
2465 | ||
2466 | .fi | |
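The unpacking step can be tried without a network lookup. This is only a sketch: the packed value below is a made-up stand-in for one element of @addrs, and $dotted is an illustrative name.

```perl
# Hypothetical raw address, standing in for $addrs[0] as returned
# by a successful gethostbyname call.
$addr = pack('C4', 127, 0, 0, 1);

# Each of the four bytes becomes one unsigned number.
@octets = unpack('C4', $addr);

# Join them into the familiar dotted form.
$dotted = join('.', @octets);
print "$dotted\n";                   # prints 127.0.0.1
```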
2467 | .Ip "getsockname(SOCKET)" 8 3 | |
2468 | Returns the packed sockaddr address of this end of the SOCKET connection. | |
2469 | .nf | |
2470 | ||
2471 | .ne 4 | |
2472 | # An internet sockaddr | |
2473 | $sockaddr = 'S n a4 x8'; | |
2474 | $mysockaddr = getsockname(S); | |
2475 | .ie t \{\ | |
2476 | ($family, $port, $myaddr) = unpack($sockaddr,$mysockaddr); | |
2477 | 'br\} | |
2478 | .el \{\ | |
2479 | ($family, $port, $myaddr) = | |
2480 | unpack($sockaddr,$mysockaddr); | |
2481 | 'br\} | |
2482 | ||
2483 | .fi | |
2484 | .Ip "getsockopt(SOCKET,LEVEL,OPTNAME)" 8 3 | |
2485 | Returns the socket option requested, or undefined if there is an error. | |
2486 | .Ip "gmtime(EXPR)" 8 4 | |
2487 | .Ip "gmtime EXPR" 8 | |
2488 | Converts a time as returned by the time function to a 9-element array with | |
2489 | the time analyzed for the Greenwich timezone. | |
2490 | Typically used as follows: | |
2491 | .nf | |
2492 | ||
2493 | .ne 3 | |
2494 | .ie t \{\ | |
2495 | ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = gmtime(time); | |
2496 | 'br\} | |
2497 | .el \{\ | |
2498 | ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = | |
2499 | gmtime(time); | |
2500 | 'br\} | |
2501 | ||
2502 | .fi | |
2503 | All array elements are numeric, and come straight out of a struct tm. | |
2504 | In particular this means that $mon has the range 0.\|.11 and $wday has the | |
2505 | range 0.\|.6. | |
2506 | If EXPR is omitted, does gmtime(time). | |
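Because the elements come straight from a struct tm, $mon needs 1 added and $year (which struct tm counts from 1900) needs 1900 added before printing. A small sketch, using time 0 so that the result is fixed; $stamp is an illustrative name:

```perl
# Break down time 0 (the start of the Unix epoch) in Greenwich time.
($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = gmtime(0);

# $mon is 0..11 and $year counts from 1900, so adjust both.
$stamp = sprintf("%04d-%02d-%02d %02d:%02d:%02d",
    $year + 1900, $mon + 1, $mday, $hour, $min, $sec);

print "$stamp\n";                    # prints 1970-01-01 00:00:00
print "weekday $wday\n";             # prints weekday 4 (a Thursday)
```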
2507 | .Ip "goto LABEL" 8 6 | |
2508 | Finds the statement labeled with LABEL and resumes execution there. | |
2509 | Currently you may only go to statements in the main body of the program | |
2510 | that are not nested inside a do {} construct. | |
2511 | This statement is not implemented very efficiently, and is here only to make | |
2512 | the | |
2513 | .IR sed -to- perl | |
2514 | translator easier. | |
2515 | I may change its semantics at any time, consistent with support for translated | |
2516 | .I sed | |
2517 | scripts. | |
2518 | Use it at your own risk. | |
2519 | Better yet, don't use it at all. | |
2520 | .Ip "grep(EXPR,LIST)" 8 4 | |
2521 | Evaluates EXPR for each element of LIST (locally setting $_ to each element) | |
2522 | and returns the array value consisting of those elements for which the | |
2523 | expression evaluated to true. | |
2524 | In a scalar context, returns the number of times the expression was true. | |
2525 | .nf | |
2526 | ||
2527 | @foo = grep(!/^#/, @bar); # weed out comments | |
2528 | ||
2529 | .fi | |
2530 | Note that, since $_ is a reference into the array value, it can be | |
2531 | used to modify the elements of the array. | |
2532 | While this is useful and supported, it can cause bizarre results if | |
2533 | the LIST is not a named array. | |
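The scalar-context count and the in-place modification described above might look like this. It is a sketch with made-up data; @bar, @foo and $count are illustrative names.

```perl
@bar = ('# header', 'alpha', '  beta', '# trailer');

# Scalar context: how many elements are not comments?
$count = grep(!/^#/, @bar);          # 2

# Array context: keep only those elements.
@foo = grep(!/^#/, @bar);

# $_ refers to each element of the named array @foo in turn,
# so the substitution strips leading whitespace in place.
grep(s/^\s+//, @foo);

print "$count: ", join(',', @foo), "\n";    # prints 2: alpha,beta
```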
2534 | .Ip "hex(EXPR)" 8 4 | |
2535 | .Ip "hex EXPR" 8 | |
2536 | Returns the decimal value of EXPR interpreted as a hex string. | |
2537 | (To interpret strings that might start with 0 or 0x see oct().) | |
2538 | If EXPR is omitted, uses $_. | |
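A short sketch of the difference between hex() and oct(), using made-up values and illustrative variable names:

```perl
# hex() always treats its argument as hex digits.
$n = hex('ff');                      # 255
$m = hex('1F');                      # 31

# oct() reads octal, unless the string starts with 0x.
$perm = oct('755');                  # 493
$mask = oct('0x1F');                 # 31

print "$n $m $perm $mask\n";         # prints 255 31 493 31
```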
2539 | .Ip "index(STR,SUBSTR,POSITION)" 8 4 | |
2540 | .Ip "index(STR,SUBSTR)" 8 4 | |
2541 | Returns the position of the first occurrence of SUBSTR in STR at or after | |
2542 | POSITION. | |
2543 | If POSITION is omitted, starts searching from the beginning of the string. | |
2544 | The return value is based at 0, or whatever you've | |
2545 | set the $[ variable to. | |
2546 | If the substring is not found, returns one less than the base, ordinarily \-1. | |
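For instance, assuming $[ has been left at its default of 0 (the variable names are illustrative):

```perl
$str = 'hello world';

$first  = index($str, 'o');              # 4
$second = index($str, 'o', $first + 1);  # 7, search resumes after the first hit
$none   = index($str, 'z');              # -1, one less than the base

print "$first $second $none\n";          # prints 4 7 -1
```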
2547 | .Ip "int(EXPR)" 8 4 | |
2548 | .Ip "int EXPR" 8 | |
2549 | Returns the integer portion of EXPR. | |
2550 | If EXPR is omitted, uses $_. | |
2551 | .Ip "ioctl(FILEHANDLE,FUNCTION,SCALAR)" 8 4 | |
2552 | Implements the ioctl(2) function. | |
2553 | You'll probably have to say | |
2554 | .nf | |
2555 | ||
2556 | require "ioctl.ph"; # probably /usr/local/lib/perl/ioctl.ph | |
2557 | ||
2558 | .fi | |
2559 | first to get the correct function definitions. | |
2560 | If ioctl.ph doesn't exist or doesn't have the correct definitions | |
2561 | you'll have to roll | |
2562 | your own, based on your C header files such as <sys/ioctl.h>. | |
2563 | (There is a perl script called h2ph that comes with the perl kit | |
2564 | which may help you in this.) | |
2565 | SCALAR will be read and/or written depending on the FUNCTION\*(--a pointer | |
2566 | to the string value of SCALAR will be passed as the third argument of | |
2567 | the actual ioctl call. | |
2568 | (If SCALAR has no string value but does have a numeric value, that value | |
2569 | will be passed rather than a pointer to the string value. | |
2570 | To guarantee this to be true, add a 0 to the scalar before using it.) | |
2571 | The pack() and unpack() functions are useful for manipulating the values | |
2572 | of structures used by ioctl(). | |
2573 | The following example sets the erase character to DEL. | |
2574 | .nf | |
2575 | ||
2576 | .ne 9 | |
2577 | require 'ioctl.ph'; | |
2578 | $sgttyb_t = "ccccs"; # 4 chars and a short | |
2579 | if (ioctl(STDIN,$TIOCGETP,$sgttyb)) { | |
2580 | @ary = unpack($sgttyb_t,$sgttyb); | |
2581 | $ary[2] = 127; | |
2582 | $sgttyb = pack($sgttyb_t,@ary); | |
2583 | ioctl(STDIN,$TIOCSETP,$sgttyb) | |
2584 | || die "Can't ioctl: $!"; | |
2585 | } | |
2586 | ||
2587 | .fi | |
2588 | The return value of ioctl (and fcntl) is as follows: | |
2589 | .nf | |
2590 | ||
2591 | .ne 4 | |
2592 | if OS returns:\h'|3i'perl returns: | |
2593 | -1\h'|3i' undefined value | |
2594 | 0\h'|3i' string "0 but true" | |
2595 | anything else\h'|3i' that number | |
2596 | ||
2597 | .fi | |
2598 | Thus perl returns true on success and false on failure, yet you can still | |
2599 | easily determine the actual value returned by the operating system: | |
2600 | .nf | |
2601 | ||
2602 | ($retval = ioctl(...)) || ($retval = -1); | |
2603 | printf "System returned %d\en", $retval; | |
2604 | .fi | |
2605 | .Ip "join(EXPR,LIST)" 8 8 | |
2606 | .Ip "join(EXPR,ARRAY)" 8 | |
2607 | Joins the separate strings of LIST or ARRAY into a single string with fields | |
2608 | separated by the value of EXPR, and returns the string. | |
2609 | Example: | |
2610 | .nf | |
2611 | ||
2612 | .ie t \{\ | |
2613 | $_ = join(\|\':\', $login,$passwd,$uid,$gid,$gcos,$home,$shell); | |
2614 | 'br\} | |
2615 | .el \{\ | |
2616 | $_ = join(\|\':\', | |
2617 | $login,$passwd,$uid,$gid,$gcos,$home,$shell); | |
2618 | 'br\} | |
2619 | ||
2620 | .fi | |
2621 | See | |
2622 | .IR split . | |
2623 | .Ip "keys(ASSOC_ARRAY)" 8 6 | |
2624 | .Ip "keys ASSOC_ARRAY" 8 | |
2625 | Returns a normal array consisting of all the keys of the named associative | |
2626 | array. | |
2627 | The keys are returned in an apparently random order, but it is the same order | |
2628 | as either the values() or each() function produces (given that the associative array | |
2629 | has not been modified). | |
2630 | Here is yet another way to print your environment: | |
2631 | .nf | |
2632 | ||
2633 | .ne 5 | |
2634 | @keys = keys %ENV; | |
2635 | @values = values %ENV; | |
2636 | while ($#keys >= 0) { | |
2637 | print pop(@keys), \'=\', pop(@values), "\en"; | |
2638 | } | |
2639 | ||
2640 | or how about sorted by key: | |
2641 | ||
2642 | .ne 3 | |
2643 | foreach $key (sort(keys %ENV)) { | |
2644 | print $key, \'=\', $ENV{$key}, "\en"; | |
2645 | } | |
2646 | ||
2647 | .fi | |
2648 | .Ip "kill(LIST)" 8 8 | |
2649 | .Ip "kill LIST" 8 2 | |
2650 | Sends a signal to a list of processes. | |
2651 | The first element of the list must be the signal to send. | |
2652 | Returns the number of processes successfully signaled. | |
2653 | .nf | |
2654 | ||
2655 | $cnt = kill 1, $child1, $child2; | |
2656 | kill 9, @goners; | |
2657 | ||
2658 | .fi | |
2659 | If the signal is negative, kills process groups instead of processes. | |
2660 | (On System V, a negative \fIprocess\fR number will also kill process groups, | |
2661 | but that's not portable.) | |
2662 | You may use a signal name in quotes. | |
2663 | .Ip "last LABEL" 8 8 | |
2664 | .Ip "last" 8 | |
2665 | The | |
2666 | .I last | |
2667 | command is like the | |
2668 | .I break | |
2669 | statement in C (as used in loops); it immediately exits the loop in question. | |
2670 | If the LABEL is omitted, the command refers to the innermost enclosing loop. | |
2671 | The | |
2672 | .I continue | |
2673 | block, if any, is not executed: | |
2674 | .nf | |
2675 | ||
2676 | .ne 4 | |
2677 | line: while (<STDIN>) { | |
2678 | last line if /\|^$/; # exit when done with header | |
2679 | .\|.\|. | |
2680 | } | |
2681 | ||
2682 | .fi | |
2683 | .Ip "length(EXPR)" 8 4 | |
2684 | .Ip "length EXPR" 8 | |
2685 | Returns the length in characters of the value of EXPR. | |
2686 | If EXPR is omitted, returns length of $_. | |
2687 | .Ip "link(OLDFILE,NEWFILE)" 8 2 | |
2688 | Creates a new filename linked to the old filename. | |
2689 | Returns 1 for success, 0 otherwise. | |
2690 | .Ip "listen(SOCKET,QUEUESIZE)" 8 2 | |
2691 | Does the same thing that the listen system call does. | |
2692 | Returns true if it succeeded, false otherwise. | |
2693 | See example in section on Interprocess Communication. | |
2694 | .Ip "local(LIST)" 8 4 | |
2695 | Declares the listed variables to be local to the enclosing block, | |
2696 | subroutine, eval or \*(L"do\*(R". | |
2697 | All the listed elements must be legal lvalues. | |
2698 | This operator works by saving the current values of those variables in LIST | |
2699 | on a hidden stack and restoring them upon exiting the block, subroutine or eval. | |
2700 | This means that called subroutines can also reference the local variable, | |
2701 | but not the global one. | |
2702 | The LIST may be assigned to if desired, which allows you to initialize | |
2703 | your local variables. | |
2704 | (If no initializer is given for a particular variable, it is created with | |
2705 | an undefined value.) | |
2706 | Commonly this is used to name the parameters to a subroutine. | |
2707 | Examples: | |
2708 | .nf | |
2709 | ||
2710 | .ne 13 | |
2711 | sub RANGEVAL { | |
2712 | local($min, $max, $thunk) = @_; | |
2713 | local($result) = \'\'; | |
2714 | local($i); | |
2715 | ||
2716 | # Presumably $thunk makes reference to $i | |
2717 | ||
2718 | for ($i = $min; $i < $max; $i++) { | |
2719 | $result .= eval $thunk; | |
2720 | } | |
2721 | ||
2722 | $result; | |
2723 | } | |
2724 | ||
2725 | .ne 6 | |
2726 | if ($sw eq \'-v\') { | |
2727 | # init local array with global array | |
2728 | local(@ARGV) = @ARGV; | |
2729 | unshift(@ARGV,\'echo\'); | |
2730 | system @ARGV; | |
2731 | } | |
2732 | # @ARGV restored | |
2733 | ||
2734 | .ne 6 | |
2735 | # temporarily add to digits associative array | |
2736 | if ($base12) { | |
2737 | # (NOTE: not claiming this is efficient!) | |
2738 | local(%digits) = (%digits,'t',10,'e',11); | |
2739 | do parse_num(); | |
2740 | } | |
2741 | ||
2742 | .fi | |
2743 | Note that local() is a run-time command, and so gets executed every time | |
2744 | through a loop, using up more stack storage each time until it's all | |
2745 | released at once when the loop is exited. | |
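The point that called subroutines see the local value rather than the global one can be shown with a small sketch. The subroutine and variable names here are made up for illustration.

```perl
$x = 'global';

# Reads whatever $x currently holds.
sub show {
    $x;
}

# Gives $x a temporary value, then calls show.
sub wrapper {
    local($x) = 'local';
    &show();                         # show sees the local value
}

$inside = &wrapper();                # 'local'
$after  = &show();                   # 'global', restored on exit from wrapper

print "$inside $after\n";            # prints local global
```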
2746 | .Ip "localtime(EXPR)" 8 4 | |
2747 | .Ip "localtime EXPR" 8 | |
2748 | Converts a time as returned by the time function to a 9-element array with | |
2749 | the time analyzed for the local timezone. | |
2750 | Typically used as follows: | |
2751 | .nf | |
2752 | ||
2753 | .ne 3 | |
2754 | .ie t \{\ | |
2755 | ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time); | |
2756 | 'br\} | |
2757 | .el \{\ | |
2758 | ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = | |
2759 | localtime(time); | |
2760 | 'br\} | |
2761 | ||
2762 | .fi | |
2763 | All array elements are numeric, and come straight out of a struct tm. | |
2764 | In particular this means that $mon has the range 0.\|.11 and $wday has the | |
2765 | range 0.\|.6. | |
2766 | If EXPR is omitted, does localtime(time). | |
2767 | .Ip "log(EXPR)" 8 4 | |
2768 | .Ip "log EXPR" 8 | |
2769 | Returns logarithm (base | |
2770 | .IR e ) | |
2771 | of EXPR. | |
2772 | If EXPR is omitted, returns log of $_. | |
2773 | .Ip "lstat(FILEHANDLE)" 8 6 | |
2774 | .Ip "lstat FILEHANDLE" 8 | |
2775 | .Ip "lstat(EXPR)" 8 | |
2776 | .Ip "lstat SCALARVARIABLE" 8 | |
2777 | Does the same thing as the stat() function, but stats a symbolic link | |
2778 | instead of the file the symbolic link points to. | |
2779 | If symbolic links are unimplemented on your system, a normal stat is done. | |
2780 | .Ip "m/PATTERN/gio" 8 4 | |
2781 | .Ip "/PATTERN/gio" 8 | |
2782 | Searches a string for a pattern match, and returns true (1) or false (\'\'). | |
2783 | If no string is specified via the =~ or !~ operator, | |
2784 | the $_ string is searched. | |
2785 | (The string specified with =~ need not be an lvalue\*(--it may be the result of an expression evaluation, but remember the =~ binds rather tightly.) | |
2786 | See also the section on regular expressions. | |
2787 | .Sp | |
2788 | If / is the delimiter then the initial \*(L'm\*(R' is optional. | |
2789 | With the \*(L'm\*(R' you can use any pair of non-alphanumeric characters | |
2790 | as delimiters. | |
2791 | This is particularly useful for matching Unix path names that contain \*(L'/\*(R'. | |
2792 | If the final delimiter is followed by the optional letter \*(L'i\*(R', the matching is | |
2793 | done in a case-insensitive manner. | |
2794 | PATTERN may contain references to scalar variables, which will be interpolated | |
2795 | (and the pattern recompiled) every time the pattern search is evaluated. | |
2796 | (Note that $) and $| may not be interpolated because they look like end-of-string tests.) | |
2797 | If you want such a pattern to be compiled only once, add an \*(L"o\*(R" after | |
2798 | the trailing delimiter. | |
2799 | This avoids expensive run-time recompilations, and | |
2800 | is useful when the value you are interpolating won't change over the | |
2801 | life of the script. | |
2802 | If the PATTERN evaluates to a null string, the most recent successful | |
2803 | regular expression is used instead. | |
2804 | .Sp | |
2805 | If used in a context that requires an array value, a pattern match returns an | |
2806 | array consisting of the subexpressions matched by the parentheses in the | |
2807 | pattern, | |
2808 | i.e. ($1, $2, $3.\|.\|.). | |
2809 | It does NOT actually set $1, $2, etc. in this case, nor does it set $+, $`, $& | |
2810 | or $'. | |
2811 | If the match fails, a null array is returned. | |
2812 | If the match succeeds, but there were no parentheses, an array value of (1) | |
2813 | is returned. | |
2814 | .Sp | |
2815 | Examples: | |
2816 | .nf | |
2817 | ||
2818 | .ne 4 | |
2819 | open(tty, \'/dev/tty\'); | |
2820 | <tty> \|=~ \|/\|^y\|/i \|&& \|do foo(\|); # do foo if desired | |
2821 | ||
2822 | if (/Version: \|*\|([0\-9.]*\|)\|/\|) { $version = $1; } | |
2823 | ||
2824 | next if m#^/usr/spool/uucp#; | |
2825 | ||
2826 | .ne 5 | |
2827 | # poor man's grep | |
2828 | $arg = shift; | |
2829 | while (<>) { | |
2830 | print if /$arg/o; # compile only once | |
2831 | } | |
2832 | ||
2833 | if (($F1, $F2, $Etc) = ($foo =~ /^(\eS+)\es+(\eS+)\es*(.*)/)) | |
2834 | ||
2835 | .fi | |
2836 | This last example splits $foo into the first two words and the remainder | |
2837 | of the line, and assigns those three fields to $F1, $F2 and $Etc. | |
2838 | The conditional is true if any variables were assigned, i.e. if the pattern | |
2839 | matched. | |
2840 | .Sp | |
2841 | The \*(L"g\*(R" modifier specifies global pattern matching\*(--that is, | |
2842 | matching as many times as possible within the string. How it behaves | |
2843 | depends on the context. In an array context, it returns a list of | |
2844 | all the substrings matched by all the parentheses in the regular expression. | |
2845 | If there are no parentheses, it returns a list of all the matched strings, | |
2846 | as if there were parentheses around the whole pattern. In a scalar context, | |
2847 | it iterates through the string, returning TRUE each time it matches, and | |
2848 | FALSE when it eventually runs out of matches. (In other words, it remembers | |
2849 | where it left off last time and restarts the search at that point.) It | |
2850 | presumes that you have not modified the string since the last match. | |
2851 | Modifying the string between matches may result in undefined behavior. | |
2852 | (You can actually get away with in-place modifications via substr() | |
2853 | that do not change the length of the entire string. In general, however, | |
2854 | you should be using s///g for such modifications.) Examples: | |
2855 | .nf | |
2856 | ||
2857 | # array context | |
2858 | ($one,$five,$fifteen) = (\`uptime\` =~ /(\ed+\e.\ed+)/g); | |
2859 | ||
2860 | # scalar context | |
2861 | $/ = ""; $* = 1; | |
2862 | while ($paragraph = <>) { | |
2863 | while ($paragraph =~ /[a-z][\'")]*[.!?]+[\'")]*\es/g) { | |
2864 | $sentences++; | |
2865 | } | |
2866 | } | |
2867 | print "$sentences\en"; | |
2868 | ||
2869 | .fi | |
2870 | .Ip "mkdir(FILENAME,MODE)" 8 3 | |
2871 | Creates the directory specified by FILENAME, with permissions specified by | |
2872 | MODE (as modified by umask). | |
2873 | If it succeeds it returns 1, otherwise it returns 0 and sets $! (errno). | |
2874 | .Ip "msgctl(ID,CMD,ARG)" 8 4 | |
2875 | Calls the System V IPC function msgctl. If CMD is &IPC_STAT, then ARG | |
2876 | must be a variable which will hold the returned msqid_ds structure. | |
2877 | Returns like ioctl: the undefined value for error, "0 but true" for | |
2878 | zero, or the actual return value otherwise. | |
2879 | .Ip "msgget(KEY,FLAGS)" 8 4 | |
2880 | Calls the System V IPC function msgget. Returns the message queue id, | |
2881 | or the undefined value if there is an error. | |
2882 | .Ip "msgsnd(ID,MSG,FLAGS)" 8 4 | |
2883 | Calls the System V IPC function msgsnd to send the message MSG to the | |
2884 | message queue ID. MSG must begin with the long integer message type, | |
2885 | which may be created with pack("L", $type). Returns true if | |
2886 | successful, or false if there is an error. | |
2887 | .Ip "msgrcv(ID,VAR,SIZE,TYPE,FLAGS)" 8 4 | |
2888 | Calls the System V IPC function msgrcv to receive a message from | |
2889 | message queue ID into variable VAR with a maximum message size of | |
2890 | SIZE. Note that if a message is received, the message type will be | |
2891 | the first thing in VAR, and the maximum length of VAR is SIZE plus the | |
2892 | size of the message type. Returns true if successful, or false if | |
2893 | there is an error. | |
2894 | .Ip "next LABEL" 8 8 | |
2895 | .Ip "next" 8 | |
2896 | The | |
2897 | .I next | |
2898 | command is like the | |
2899 | .I continue | |
2900 | statement in C; it starts the next iteration of the loop: | |
2901 | .nf | |
2902 | ||
2903 | .ne 4 | |
2904 | line: while (<STDIN>) { | |
2905 | next line if /\|^#/; # discard comments | |
2906 | .\|.\|. | |
2907 | } | |
2908 | ||
2909 | .fi | |
2910 | Note that if there were a | |
2911 | .I continue | |
2912 | block on the above, it would get executed even on discarded lines. | |
2913 | If the LABEL is omitted, the command refers to the innermost enclosing loop. | |
2914 | .Ip "oct(EXPR)" 8 4 | |
2915 | .Ip "oct EXPR" 8 | |
2916 | Returns the decimal value of EXPR interpreted as an octal string. | |
2917 | (If EXPR happens to start off with 0x, interprets it as a hex string instead.) | |
2918 | The following will handle decimal, octal and hex in the standard notation: | |
2919 | .nf | |
2920 | ||
2921 | $val = oct($val) if $val =~ /^0/; | |
2922 | ||
2923 | .fi | |
2924 | If EXPR is omitted, uses $_. | |
2925 | .Ip "open(FILEHANDLE,EXPR)" 8 8 | |
2926 | .Ip "open(FILEHANDLE)" 8 | |
2927 | .Ip "open FILEHANDLE" 8 | |
2928 | Opens the file whose filename is given by EXPR, and associates it with | |
2929 | FILEHANDLE. | |
2930 | If FILEHANDLE is an expression, its value is used as the name of the | |
2931 | real filehandle wanted. | |
2932 | If EXPR is omitted, the scalar variable of the same name as the FILEHANDLE | |
2933 | contains the filename. | |
2934 | If the filename begins with \*(L"<\*(R" or nothing, the file is opened for | |
2935 | input. | |
2936 | If the filename begins with \*(L">\*(R", the file is opened for output. | |
2937 | If the filename begins with \*(L">>\*(R", the file is opened for appending. | |
2938 | (You can put a \'+\' in front of the \'>\' or \'<\' to indicate that you | |
2939 | want both read and write access to the file.) | |
2940 | If the filename begins with \*(L"|\*(R", the filename is interpreted | |
2941 | as a command to which output is to be piped, and if the filename ends | |
2942 | with a \*(L"|\*(R", the filename is interpreted as a command which pipes | |
2943 | input to us. | |
2944 | (You may not have a command that pipes both in and out.) | |
2945 | Opening \'\-\' opens | |
2946 | .I STDIN | |
2947 | and opening \'>\-\' opens | |
2948 | .IR STDOUT . | |
2949 | Open returns non-zero upon success, the undefined value otherwise. | |
2950 | If the open involved a pipe, the return value happens to be the pid | |
2951 | of the subprocess. | |
2952 | Examples: | |
2953 | .nf | |
2954 | ||
2955 | .ne 3 | |
2956 | $article = 100; | |
2957 | open article || die "Can't find article $article: $!\en"; | |
2958 | while (<article>) {\|.\|.\|. | |
2959 | ||
2960 | .ie t \{\ | |
2961 | open(LOG, \'>>/usr/spool/news/twitlog\'\|); # (log is reserved) | |
2962 | 'br\} | |
2963 | .el \{\ | |
2964 | open(LOG, \'>>/usr/spool/news/twitlog\'\|); | |
2965 | # (log is reserved) | |
2966 | 'br\} | |
2967 | ||
2968 | .ie t \{\ | |
2969 | open(article, "caesar <$article |"\|); # decrypt article | |
2970 | 'br\} | |
2971 | .el \{\ | |
2972 | open(article, "caesar <$article |"\|); | |
2973 | # decrypt article | |
2974 | 'br\} | |
2975 | ||
2976 | .ie t \{\ | |
2977 | open(extract, "|sort >/tmp/Tmp$$"\|); # $$ is our process# | |
2978 | 'br\} | |
2979 | .el \{\ | |
2980 | open(extract, "|sort >/tmp/Tmp$$"\|); | |
2981 | # $$ is our process# | |
2982 | 'br\} | |
2983 | ||
2984 | .ne 7 | |
2985 | # process argument list of files along with any includes | |
2986 | ||
2987 | foreach $file (@ARGV) { | |
2988 | do process($file, \'fh00\'); # no pun intended | |
2989 | } | |
2990 | ||
2991 | sub process { | |
2992 | local($filename, $input) = @_; | |
2993 | $input++; # this is a string increment | |
2994 | unless (open($input, $filename)) { | |
2995 | print STDERR "Can't open $filename: $!\en"; | |
2996 | return; | |
2997 | } | |
2998 | .ie t \{\ | |
2999 | while (<$input>) { # note the use of indirection | |
3000 | 'br\} | |
3001 | .el \{\ | |
3002 | while (<$input>) { # note use of indirection | |
3003 | 'br\} | |
3004 | if (/^#include "(.*)"/) { | |
3005 | do process($1, $input); | |
3006 | next; | |
3007 | } | |
3008 | .\|.\|. # whatever | |
3009 | } | |
3010 | } | |
3011 | ||
3012 | .fi | |
3013 | You may also, in the Bourne shell tradition, specify an EXPR beginning | |
3014 | with \*(L">&\*(R", in which case the rest of the string | |
3015 | is interpreted as the name of a filehandle | |
3016 | (or file descriptor, if numeric) which is to be duped and opened. | |
3017 | You may use & after >, >>, <, +>, +>> and +<. | |
3018 | The mode you specify should match the mode of the original filehandle. | |
3019 | Here is a script that saves, redirects, and restores | |
3020 | .I STDOUT | |
3021 | and | |
3022 | .IR STDERR : | |
3023 | .nf | |
3024 | ||
3025 | .ne 21 | |
3026 | #!/usr/bin/perl | |
3027 | open(SAVEOUT, ">&STDOUT"); | |
3028 | open(SAVEERR, ">&STDERR"); | |
3029 | ||
3030 | open(STDOUT, ">foo.out") || die "Can't redirect stdout"; | |
3031 | open(STDERR, ">&STDOUT") || die "Can't dup stdout"; | |
3032 | ||
3033 | select(STDERR); $| = 1; # make unbuffered | |
3034 | select(STDOUT); $| = 1; # make unbuffered | |
3035 | ||
3036 | print STDOUT "stdout 1\en"; # this works for | |
3037 | print STDERR "stderr 1\en"; # subprocesses too | |
3038 | ||
3039 | close(STDOUT); | |
3040 | close(STDERR); | |
3041 | ||
3042 | open(STDOUT, ">&SAVEOUT"); | |
3043 | open(STDERR, ">&SAVEERR"); | |
3044 | ||
3045 | print STDOUT "stdout 2\en"; | |
3046 | print STDERR "stderr 2\en"; | |
3047 | ||
3048 | .fi | |
3049 | If you open a pipe on the command \*(L"\-\*(R", i.e. either \*(L"|\-\*(R" or \*(L"\-|\*(R", | |
3050 | then there is an implicit fork done, and the return value of open | |
3051 | is the pid of the child within the parent process, and 0 within the child | |
3052 | process. | |
3053 | (Use defined($pid) to determine if the open was successful.) | |
3054 | The filehandle behaves normally for the parent, but i/o to that | |
3055 | filehandle is piped from/to the | |
3056 | .IR STDOUT / STDIN | |
3057 | of the child process. | |
3058 | In the child process the filehandle isn't opened\*(--i/o happens from/to | |
3059 | the new | |
3060 | .I STDOUT | |
3061 | or | |
3062 | .IR STDIN . | |
3063 | Typically this is used like the normal piped open when you want to exercise | |
3064 | more control over just how the pipe command gets executed, such as when | |
3065 | you are running setuid, and don't want to have to scan shell commands | |
3066 | for metacharacters. | |
3067 | The following pairs are more or less equivalent: | |
3068 | .nf | |
3069 | ||
3070 | .ne 5 | |
3071 | open(FOO, "|tr \'[a\-z]\' \'[A\-Z]\'"); | |
3072 | open(FOO, "|\-") || exec \'tr\', \'[a\-z]\', \'[A\-Z]\'; | |
3073 | ||
3074 | open(FOO, "cat \-n '$file'|"); | |
3075 | open(FOO, "\-|") || exec \'cat\', \'\-n\', $file; | |
3076 | ||
3077 | .fi | |
3078 | Explicitly closing any piped filehandle causes the parent process to wait for the | |
3079 | child to finish, and returns the status value in $?. | |
3080 | Note: on any operation which may do a fork, | |
3081 | unflushed buffers remain unflushed in both | |
3082 | processes, which means you may need to set $| to | |
3083 | avoid duplicate output. | |
3084 | .Sp | |
3085 | The filename that is passed to open will have leading and trailing | |
3086 | whitespace deleted. | |
3087 | In order to open a file with arbitrary weird characters in it, it's necessary | |
3088 | to protect any leading and trailing whitespace thusly: | |
3089 | .nf | |
3090 | ||
3091 | .ne 2 | |
3092 | $file =~ s#^(\es)#./$1#; | |
3093 | open(FOO, "< $file\e0"); | |
3094 | ||
3095 | .fi | |
3096 | .Ip "opendir(DIRHANDLE,EXPR)" 8 3 | |
3097 | Opens a directory named EXPR for processing by readdir(), telldir(), seekdir(), | |
3098 | rewinddir() and closedir(). | |
3099 | Returns true if successful. | |
3100 | DIRHANDLEs have their own namespace separate from FILEHANDLEs. | |
3101 | .Ip "ord(EXPR)" 8 4 | |
3102 | .Ip "ord EXPR" 8 | |
3103 | Returns the numeric ascii value of the first character of EXPR. | |
3104 | If EXPR is omitted, uses $_. | |
3105 | ''' Comments on f & d by gnb@melba.bby.oz.au 22/11/89 | |
3106 | .Ip "pack(TEMPLATE,LIST)" 8 4 | |
3107 | Takes an array or list of values and packs it into a binary structure, | |
3108 | returning the string containing the structure. | |
3109 | The TEMPLATE is a sequence of characters that give the order and type | |
3110 | of values, as follows: | |
3111 | .nf | |
3112 | ||
3113 | A An ASCII string, will be space padded. | |
3114 | a An ASCII string, will be null padded. | |
3115 | c A signed char value. | |
3116 | C An unsigned char value. | |
3117 | s A signed short value. | |
3118 | S An unsigned short value. | |
3119 | i A signed integer value. | |
3120 | I An unsigned integer value. | |
3121 | l A signed long value. | |
3122 | L An unsigned long value. | |
3123 | n A short in \*(L"network\*(R" order. | |
3124 | N A long in \*(L"network\*(R" order. | |
3125 | f A single-precision float in the native format. | |
3126 | d A double-precision float in the native format. | |
3127 | p A pointer to a string. | |
3128 | v A short in \*(L"VAX\*(R" (little-endian) order. | |
3129 | V A long in \*(L"VAX\*(R" (little-endian) order. | |
3130 | x A null byte. | |
3131 | X Back up a byte. | |
3132 | @ Null fill to absolute position. | |
3133 | u A uuencoded string. | |
3134 | b A bit string (ascending bit order, like vec()). | |
3135 | B A bit string (descending bit order). | |
3136 | h A hex string (low nybble first). | |
3137 | H A hex string (high nybble first). | |
3138 | ||
3139 | .fi | |
3140 | Each letter may optionally be followed by a number which gives a repeat | |
3141 | count. | |
3142 | With all types except "a", "A", "b", "B", "h" and "H", | |
3143 | the pack function will gobble up that many values | |
3144 | from the LIST. | |
3145 | A * for the repeat count means to use however many items are left. | |
3146 | The "a" and "A" types gobble just one value, but pack it as a string of length | |
3147 | count, | |
3148 | padding with nulls or spaces as necessary. | |
3149 | (When unpacking, "A" strips trailing spaces and nulls, but "a" does not.) | |
3150 | Likewise, the "b" and "B" fields pack a string that many bits long. | |
3151 | The "h" and "H" fields pack a string that many nybbles long. | |
3152 | Real numbers (floats and doubles) are in the native machine format | |
3153 | only; due to the multiplicity of floating formats around, and the lack | |
3154 | of a standard \*(L"network\*(R" representation, no facility for | |
3155 | interchange has been made. | |
3156 | This means that packed floating point data | |
3157 | written on one machine may not be readable on another - even if both | |
3158 | use IEEE floating point arithmetic (as the endian-ness of the memory | |
3159 | representation is not part of the IEEE spec). | |
3160 | Note that perl uses | |
3161 | doubles internally for all numeric calculation, and converting from | |
3162 | double -> float -> double will lose precision (i.e. unpack("f", | |
3163 | pack("f", $foo)) will not in general equal $foo). | |
3164 | .br | |
3165 | Examples: | |
3166 | .nf | |
3167 | ||
3168 | $foo = pack("cccc",65,66,67,68); | |
3169 | # foo eq "ABCD" | |
3170 | $foo = pack("c4",65,66,67,68); | |
3171 | # same thing | |
3172 | ||
3173 | $foo = pack("ccxxcc",65,66,67,68); | |
3174 | # foo eq "AB\e0\e0CD" | |
3175 | ||
3176 | $foo = pack("s2",1,2); | |
3177 | # "\e1\e0\e2\e0" on little-endian | |
3178 | # "\e0\e1\e0\e2" on big-endian | |
3179 | ||
3180 | $foo = pack("a4","abcd","x","y","z"); | |
3181 | # "abcd" | |
3182 | ||
3183 | $foo = pack("aaaa","abcd","x","y","z"); | |
3184 | # "axyz" | |
3185 | ||
3186 | $foo = pack("a14","abcdefg"); | |
3187 | # "abcdefg\e0\e0\e0\e0\e0\e0\e0" | |
3188 | ||
3189 | $foo = pack("i9pl", gmtime); | |
3190 | # a real struct tm (on my system anyway) | |
3191 | ||
3192 | sub bintodec { | |
3193 | unpack("N", pack("B32", substr("0" x 32 . shift, -32))); | |
3194 | } | |
3195 | .fi | |
3196 | The same template may generally also be used in the unpack function. | |
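As a rough sketch of that round trip, a record might be packed and then unpacked with one template. The template "A8nN" and the variable names $name, $count and $id are invented for illustration and do not describe any real structure.

```perl
# Pack three values with one template, then unpack them with the same one.
# "A8" is a space-padded string, "n" a network-order short,
# "N" a network-order long.
$template = "A8nN";
$record = pack($template, "perl", 4, 19);

# The packed record is 8 + 2 + 4 bytes long.
print length($record), "\n";

# Unpacking with "A" strips the trailing pad spaces again.
($name, $count, $id) = unpack($template, $record);
print "$name $count $id\n";
```

Because "n" and "N" have a fixed byte order, a record like this should read back the same way on a machine of either endianness, which is not true of the native "s", "i" and "l" types.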
3197 | .Ip "pipe(READHANDLE,WRITEHANDLE)" 8 3 | |
3198 | Opens a pair of connected pipes like the corresponding system call. | |
3199 | Note that if you set up a loop of piped processes, deadlock can occur | |
3200 | unless you are very careful. | |
3201 | In addition, note that perl's pipes use stdio buffering, so you may need | |
3202 | to set $| to flush your WRITEHANDLE after each command, depending on | |
3203 | the application. | |
3204 | [Requires version 3.0 patchlevel 9.] | |
3205 | .Ip "pop(ARRAY)" 8 | |
3206 | .Ip "pop ARRAY" 8 6 | |
3207 | Pops and returns the last value of the array, shortening the array by 1. | |
3208 | Has the same effect as | |
3209 | .nf | |
3210 | ||
3211 | $tmp = $ARRAY[$#ARRAY\-\|\-]; | |
3212 | ||
3213 | .fi | |
3214 | If there are no elements in the array, returns the undefined value. | |
3215 | .Ip "print(FILEHANDLE LIST)" 8 10 | |
3216 | .Ip "print(LIST)" 8 | |
3217 | .Ip "print FILEHANDLE LIST" 8 | |
3218 | .Ip "print LIST" 8 | |
3219 | .Ip "print" 8 | |
3220 | Prints a string or a comma-separated list of strings. | |
3221 | Returns non-zero if successful. | |
3222 | FILEHANDLE may be a scalar variable name, in which case the variable contains | |
3223 | the name of the filehandle, thus introducing one level of indirection. | |
3224 | (NOTE: If FILEHANDLE is a variable and the next token is a term, it may be | |
3225 | misinterpreted as an operator unless you interpose a + or put parens around | |
3226 | the arguments.) | |
3227 | If FILEHANDLE is omitted, prints by default to standard output (or to the | |
3228 | last selected output channel\*(--see select()). | |
3229 | If LIST is also omitted, prints $_ to | |
3230 | .IR STDOUT . | |
3231 | To set the default output channel to something other than | |
3232 | .I STDOUT | |
3233 | use the select operation. | |
3234 | Note that, because print takes a LIST, anything in the LIST is evaluated | |
3235 | in an array context, and any subroutine that you call will have one or more | |
3236 | of its expressions evaluated in an array context. | |
3237 | Also be careful not to follow the print keyword with a left parenthesis | |
3238 | unless you want the corresponding right parenthesis to terminate the | |
3239 | arguments to the print\*(--interpose a + or put parens around all the arguments. | |
3240 | .Ip "printf(FILEHANDLE LIST)" 8 10 | |
3241 | .Ip "printf(LIST)" 8 | |
3242 | .Ip "printf FILEHANDLE LIST" 8 | |
3243 | .Ip "printf LIST" 8 | |
3244 | Equivalent to \*(L"print FILEHANDLE sprintf(LIST)\*(R". | |
3245 | .Ip "push(ARRAY,LIST)" 8 7 | |
3246 | Treats ARRAY (@ is optional) as a stack, and pushes the values of LIST | |
3247 | onto the end of ARRAY. | |
3248 | The length of ARRAY increases by the length of LIST. | |
3249 | Has the same effect as | |
3250 | .nf | |
3251 | ||
3252 | for $value (LIST) { | |
3253 | $ARRAY[++$#ARRAY] = $value; | |
3254 | } | |
3255 | ||
3256 | .fi | |
3257 | but is more efficient. | |
3258 | .Ip "q/STRING/" 8 5 | |
3259 | .Ip "qq/STRING/" 8 | |
3260 | .Ip "qx/STRING/" 8 | |
3261 | These are not really functions, but simply syntactic sugar to let you | |
3262 | avoid putting too many backslashes into quoted strings. | |
3263 | The q operator is a generalized single quote, and the qq operator a | |
3264 | generalized double quote. | |
3265 | The qx operator is a generalized backquote. | |
3266 | Any non-alphanumeric delimiter can be used in place of /, including newline. | |
3267 | If the delimiter is an opening bracket or parenthesis, the final delimiter | |
3268 | will be the corresponding closing bracket or parenthesis. | |
3269 | (Embedded occurrences of the closing bracket need to be backslashed as usual.) | |
3270 | Examples: | |
3271 | .nf | |
3272 | ||
3273 | .ne 5 | |
3274 | $foo = q!I said, "You said, \'She said it.\'"!; | |
3275 | $bar = q(\'This is it.\'); | |
3276 | $today = qx{ date }; | |
3277 | $_ .= qq | |
3278 | *** The previous line contains the naughty word "$&".\en | |
3279 | if /(ibm|apple|awk)/; # :-) | |
3280 | ||
3281 | .fi | |
3282 | .Ip "rand(EXPR)" 8 8 | |
3283 | .Ip "rand EXPR" 8 | |
3284 | .Ip "rand" 8 | |
3285 | Returns a random fractional number between 0 and the value of EXPR. | |
3286 | (EXPR should be positive.) | |
3287 | If EXPR is omitted, returns a value between 0 and 1. | |
3288 | See also srand(). | |
3289 | .Ip "read(FILEHANDLE,SCALAR,LENGTH,OFFSET)" 8 5 | |
3290 | .Ip "read(FILEHANDLE,SCALAR,LENGTH)" 8 5 | |
3291 | Attempts to read LENGTH bytes of data into variable SCALAR from the specified | |
3292 | FILEHANDLE. | |
3293 | Returns the number of bytes actually read, or undef if there was an error. | |
3294 | SCALAR will be grown or shrunk to the length actually read. | |
3295 | An OFFSET may be specified to place the read data at some other place | |
3296 | than the beginning of the string. | |
3297 | This call is actually implemented in terms of stdio's fread call. To get | |
3298 | a true read system call, see sysread. | |
3299 | .Ip "readdir(DIRHANDLE)" 8 3 | |
3300 | .Ip "readdir DIRHANDLE" 8 | |
3301 | Returns the next directory entry for a directory opened by opendir(). | |
3302 | If used in an array context, returns all the rest of the entries in the | |
3303 | directory. | |
3304 | If there are no more entries, returns an undefined value in a scalar context | |
3305 | or a null list in an array context. | |
3306 | .Ip "readlink(EXPR)" 8 6 | |
3307 | .Ip "readlink EXPR" 8 | |
3308 | Returns the value of a symbolic link, if symbolic links are implemented. | |
3309 | If not, gives a fatal error. | |
3310 | If there is some system error, returns the undefined value and sets $! (errno). | |
3311 | If EXPR is omitted, uses $_. | |
3312 | .Ip "recv(SOCKET,SCALAR,LEN,FLAGS)" 8 4 | |
3313 | Receives a message on a socket. | |
3314 | Attempts to receive LEN bytes of data into variable SCALAR from the specified | |
3315 | SOCKET filehandle. | |
3316 | Returns the address of the sender, or the undefined value if there's an error. | |
3317 | SCALAR will be grown or shrunk to the length actually read. | |
3318 | Takes the same flags as the system call of the same name. | |
3319 | .Ip "redo LABEL" 8 8 | |
3320 | .Ip "redo" 8 | |
3321 | The | |
3322 | .I redo | |
3323 | command restarts the loop block without evaluating the conditional again. | |
3324 | The | |
3325 | .I continue | |
3326 | block, if any, is not executed. | |
3327 | If the LABEL is omitted, the command refers to the innermost enclosing loop. | |
3328 | This command is normally used by programs that want to lie to themselves | |
3329 | about what was just input: | |
3330 | .nf | |
3331 | ||
3332 | .ne 16 | |
3333 | # a simpleminded Pascal comment stripper | |
3334 | # (warning: assumes no { or } in strings) | |
3335 | line: while (<STDIN>) { | |
3336 | while (s|\|({.*}.*\|){.*}|$1 \||) {} | |
3337 | s|{.*}| \||; | |
3338 | if (s|{.*| \||) { | |
3339 | $front = $_; | |
3340 | while (<STDIN>) { | |
3341 | if (\|/\|}/\|) { # end of comment? | |
3342 | s|^|$front{|; | |
3343 | redo line; | |
3344 | } | |
3345 | } | |
3346 | } | |
3347 | print; | |
3348 | } | |
3349 | ||
3350 | .fi | |
3351 | .Ip "rename(OLDNAME,NEWNAME)" 8 2 | |
3352 | Changes the name of a file. | |
3353 | Returns 1 for success, 0 otherwise. | |
3354 | Will not work across filesystem boundaries. | |
3355 | .Ip "require(EXPR)" 8 6 | |
3356 | .Ip "require EXPR" 8 | |
3357 | .Ip "require" 8 | |
3358 | Includes the library file specified by EXPR, or by $_ if EXPR is not supplied. | |
3359 | Has semantics similar to the following subroutine: | |
3360 | .nf | |
3361 | ||
3362 | sub require { | |
3363 | local($filename) = @_; | |
3364 | return 1 if $INC{$filename}; | |
3365 | local($realfilename,$result); | |
3366 | ITER: { | |
3367 | foreach $prefix (@INC) { | |
3368 | $realfilename = "$prefix/$filename"; | |
3369 | if (-f $realfilename) { | |
3370 | $result = do $realfilename; | |
3371 | last ITER; | |
3372 | } | |
3373 | } | |
3374 | die "Can't find $filename in \e@INC"; | |
3375 | } | |
3376 | die $@ if $@; | |
3377 | die "$filename did not return true value" unless $result; | |
3378 | $INC{$filename} = $realfilename; | |
3379 | $result; | |
3380 | } | |
3381 | ||
3382 | .fi | |
3383 | Note that the file will not be included twice under the same specified name. | |
3384 | The file must return true as the last statement to indicate successful | |
3385 | execution of any initialization code, so it's customary to end | |
3386 | such a file with \*(L"1;\*(R" unless you're sure it'll return true otherwise. | |
3387 | .Ip "reset(EXPR)" 8 6 | |
3388 | .Ip "reset EXPR" 8 | |
3389 | .Ip "reset" 8 | |
3390 | Generally used in a | |
3391 | .I continue | |
3392 | block at the end of a loop to clear variables and reset ?? searches | |
3393 | so that they work again. | |
3394 | The expression is interpreted as a list of single characters (hyphens allowed | |
3395 | for ranges). | |
3396 | All variables and arrays beginning with one of those letters are reset to | |
3397 | their pristine state. | |
3398 | If the expression is omitted, one-match searches (?pattern?) are reset to | |
3399 | match again. | |
3400 | Only resets variables or searches in the current package. | |
3401 | Always returns 1. | |
3402 | Examples: | |
3403 | .nf | |
3404 | ||
3405 | .ne 3 | |
3406 | reset \'X\'; \h'|2i'# reset all X variables | |
3407 | reset \'a\-z\';\h'|2i'# reset lower case variables | |
3408 | reset; \h'|2i'# just reset ?? searches | |
3409 | ||
3410 | .fi | |
3411 | Note: resetting \*(L"A\-Z\*(R" is not recommended since you'll wipe out your ARGV and ENV | |
3412 | arrays. | |
3413 | .Sp | |
3414 | The use of reset on dbm associative arrays does not change the dbm file. | |
3415 | (It does, however, flush any entries cached by perl, which may be useful if | |
3416 | you are sharing the dbm file. | |
3417 | Then again, maybe not.) | |
3418 | .Ip "return LIST" 8 3 | |
3419 | Returns from a subroutine with the value specified. | |
3420 | (Note that a subroutine can automatically return | |
3421 | the value of the last expression evaluated. | |
3422 | That's the preferred method\*(--use of an explicit | |
3423 | .I return | |
3424 | is a bit slower.) | |
3425 | .Ip "reverse(LIST)" 8 4 | |
3426 | .Ip "reverse LIST" 8 | |
3427 | In an array context, returns an array value consisting of the elements | |
3428 | of LIST in the opposite order. | |
3429 | In a scalar context, returns a string value consisting of the bytes of | |
3430 | the first element of LIST in the opposite order. | |
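The difference between the two contexts can be sketched as follows. The variable names are arbitrary, and the scalar case is given a single argument so that only the first-element behavior described above is relied on.

```perl
@list = ('a', 'b', 'c');

# Array context: the elements come back in the opposite order.
@rev = reverse(@list);

# Scalar context (forced by the scalar assignment): the bytes of the
# string come back in the opposite order.
$str = reverse('hello');

print join(',', @rev), " $str\n";
```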
3431 | .Ip "rewinddir(DIRHANDLE)" 8 5 | |
3432 | .Ip "rewinddir DIRHANDLE" 8 | |
3433 | Sets the current position to the beginning of the directory for the readdir() routine on DIRHANDLE. | |
3434 | .Ip "rindex(STR,SUBSTR,POSITION)" 8 6 | |
3435 | .Ip "rindex(STR,SUBSTR)" 8 4 | |
3436 | Works just like index except that it | |
3437 | returns the position of the LAST occurrence of SUBSTR in STR. | |
3438 | If POSITION is specified, returns the last occurrence at or before that | |
3439 | position. | |
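One plausible use, sketched here with a made-up path and assuming $[ is 0, is to pick the last component off a filename and then look for the separator before it:

```perl
$path = "/usr/local/bin/perl";

# Position of the last slash in the string.
$slash = rindex($path, "/");

# Everything after that slash.
$base = substr($path, $slash + 1);

# Last slash at or before the position just left of the one found above.
$prev = rindex($path, "/", $slash - 1);

print "$slash $base $prev\n";
```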
3440 | .Ip "rmdir(FILENAME)" 8 4 | |
3441 | .Ip "rmdir FILENAME" 8 | |
3442 | Deletes the directory specified by FILENAME if it is empty. | |
3443 | If it succeeds, it returns 1; otherwise it returns 0 and sets $! (errno). | |
3444 | If FILENAME is omitted, uses $_. | |
3445 | .Ip "s/PATTERN/REPLACEMENT/gieo" 8 3 | |
3446 | Searches a string for a pattern, and if found, replaces that pattern with the | |
3447 | replacement text and returns the number of substitutions made. | |
3448 | Otherwise it returns false (0). | |
3449 | The \*(L"g\*(R" is optional, and if present, indicates that all occurrences | |
3450 | of the pattern are to be replaced. | |
3451 | The \*(L"i\*(R" is also optional, and if present, indicates that matching | |
3452 | is to be done in a case-insensitive manner. | |
3453 | The \*(L"e\*(R" is likewise optional, and if present, indicates that | |
3454 | the replacement string is to be evaluated as an expression rather than just | |
3455 | as a double-quoted string. | |
3456 | Any non-alphanumeric delimiter may replace the slashes; | |
3457 | if single quotes are used, no | |
3458 | interpretation is done on the replacement string (the e modifier overrides | |
3459 | this, however); if backquotes are used, the replacement string is a command | |
3460 | to execute whose output will be used as the actual replacement text. | |
3461 | If the PATTERN is delimited by bracketing quotes, the REPLACEMENT | |
3462 | has its own pair of quotes, which may or may not be bracketing quotes, e.g. | |
3463 | s(foo)(bar) or s<foo>/bar/. | |
3464 | If no string is specified via the =~ or !~ operator, | |
3465 | the $_ string is searched and modified. | |
3466 | (The string specified with =~ must be a scalar variable, an array element, | |
3467 | or an assignment to one of those, i.e. an lvalue.) | |
3468 | If the pattern contains a $ that looks like a variable rather than an | |
3469 | end-of-string test, the variable will be interpolated into the pattern at | |
3470 | run-time. | |
3471 | If you only want the pattern compiled once the first time the variable is | |
3472 | interpolated, add an \*(L"o\*(R" at the end. | |
3473 | If the PATTERN evaluates to a null string, the most recent successful | |
3474 | regular expression is used instead. | |
3475 | See also the section on regular expressions. | |
3476 | Examples: | |
3477 | .nf | |
3478 | ||
3479 | s/\|\e\|bgreen\e\|b/mauve/g; # don't change wintergreen | |
3480 | ||
3481 | $path \|=~ \|s|\|/usr/bin|\|/usr/local/bin|; | |
3482 | ||
3483 | s/Login: $foo/Login: $bar/; # run-time pattern | |
3484 | ||
3485 | ($foo = $bar) =~ s/bar/foo/; | |
3486 | ||
3487 | $_ = \'abc123xyz\'; | |
3488 | s/\ed+/$&*2/e; # yields \*(L'abc246xyz\*(R' | |
3489 | s/\ed+/sprintf("%5d",$&)/e; # yields \*(L'abc  246xyz\*(R' | |
3490 | s/\ew/$& x 2/eg; # yields \*(L'aabbcc  224466xxyyzz\*(R' | |
3491 | ||
3492 | s/\|([^ \|]*\|) *\|([^ \|]*\|)\|/\|$2 $1/; # reverse 1st two fields | |
3493 | ||
3494 | .fi | |
3495 | (Note the use of $ instead of \|\e\| in the last example. See the section | |
3496 | on regular expressions.) | |
3497 | .Ip "scalar(EXPR)" 8 3 | |
3498 | Forces EXPR to be interpreted in a scalar context and returns the value | |
3499 | of EXPR. | |
3500 | .Ip "seek(FILEHANDLE,POSITION,WHENCE)" 8 3 | |
3501 | Randomly positions the file pointer for FILEHANDLE, just like the fseek() | |
3502 | call of stdio. | |
3503 | FILEHANDLE may be an expression whose value gives the name of the filehandle. | |
3504 | Returns 1 upon success, 0 otherwise. | |
3505 | .Ip "seekdir(DIRHANDLE,POS)" 8 3 | |
3506 | Sets the current position for the readdir() routine on DIRHANDLE. | |
3507 | POS must be a value returned by telldir(). | |
3508 | Has the same caveats about possible directory compaction as the corresponding | |
3509 | system library routine. | |
3510 | .Ip "select(FILEHANDLE)" 8 3 | |
3511 | .Ip "select" 8 3 | |
3512 | Returns the currently selected filehandle. | |
3513 | Sets the current default filehandle for output, if FILEHANDLE is supplied. | |
3514 | This has two effects: first, a | |
3515 | .I write | |
3516 | or a | |
3517 | .I print | |
3518 | without a filehandle will default to this FILEHANDLE. | |
3519 | Second, references to variables related to output will refer to this output | |
3520 | channel. | |
3521 | For example, if you have to set the top of form format for more than | |
3522 | one output channel, you might do the following: | |
3523 | .nf | |
3524 | ||
3525 | .ne 4 | |
3526 | select(REPORT1); | |
3527 | $^ = \'report1_top\'; | |
3528 | select(REPORT2); | |
3529 | $^ = \'report2_top\'; | |
3530 | ||
3531 | .fi | |
3532 | FILEHANDLE may be an expression whose value gives the name of the actual filehandle. | |
3533 | Thus: | |
3534 | .nf | |
3535 | ||
3536 | $oldfh = select(STDERR); $| = 1; select($oldfh); | |
3537 | ||
3538 | .fi | |
3539 | .Ip "select(RBITS,WBITS,EBITS,TIMEOUT)" 8 3 | |
3540 | This calls the select system call with the bitmasks specified, which can | |
3541 | be constructed using fileno() and vec(), along these lines: | |
3542 | .nf | |
3543 | ||
3544 | $rin = $win = $ein = ''; | |
3545 | vec($rin,fileno(STDIN),1) = 1; | |
3546 | vec($win,fileno(STDOUT),1) = 1; | |
3547 | $ein = $rin | $win; | |
3548 | ||
3549 | .fi | |
3550 | If you want to select on many filehandles you might wish to write a subroutine: | |
3551 | .nf | |
3552 | ||
3553 | sub fhbits { | |
3554 | local(@fhlist) = split(' ',$_[0]); | |
3555 | local($bits); | |
3556 | for (@fhlist) { | |
3557 | vec($bits,fileno($_),1) = 1; | |
3558 | } | |
3559 | $bits; | |
3560 | } | |
3561 | $rin = &fhbits('STDIN TTY SOCK'); | |
3562 | ||
3563 | .fi | |
3564 | The usual idiom is: | |
3565 | .nf | |
3566 | ||
3567 | ($nfound,$timeleft) = | |
3568 | select($rout=$rin, $wout=$win, $eout=$ein, $timeout); | |
3569 | ||
3570 | or to block until something becomes ready: | |
3571 | ||
3572 | .ie t \{\ | |
3573 | $nfound = select($rout=$rin, $wout=$win, $eout=$ein, undef); | |
3574 | 'br\} | |
3575 | .el \{\ | |
3576 | $nfound = select($rout=$rin, $wout=$win, | |
3577 | $eout=$ein, undef); | |
3578 | 'br\} | |
3579 | ||
3580 | .fi | |
3581 | Any of the bitmasks can also be undef. | |
3582 | The timeout, if specified, is in seconds, which may be fractional. | |
3583 | NOTE: not all implementations are capable of returning the $timeleft. | |
3584 | If not, they always return $timeleft equal to the supplied $timeout. | |
3585 | .Ip "semctl(ID,SEMNUM,CMD,ARG)" 8 4 | |
3586 | Calls the System V IPC function semctl. If CMD is &IPC_STAT or | |
3587 | &GETALL, then ARG must be a variable which will hold the returned | |
3588 | semid_ds structure or semaphore value array. Returns like ioctl: the | |
3589 | undefined value for error, "0 but true" for zero, or the actual return | |
3590 | value otherwise. | |
3591 | .Ip "semget(KEY,NSEMS,SIZE,FLAGS)" 8 4 | |
3592 | Calls the System V IPC function semget. Returns the semaphore id, or | |
3593 | the undefined value if there is an error. | |
3594 | .Ip "semop(KEY,OPSTRING)" 8 4 | |
3595 | Calls the System V IPC function semop to perform semaphore operations | |
3596 | such as signaling and waiting. OPSTRING must be a packed array of | |
3597 | semop structures. Each semop structure can be generated with | |
3598 | \&'pack("sss", $semnum, $semop, $semflag)'. The number of semaphore | |
3599 | operations is implied by the length of OPSTRING. Returns true if | |
3600 | successful, or false if there is an error. As an example, the | |
3601 | following code waits on semaphore $semnum of semaphore id $semid: | |
3602 | .nf | |
3603 | ||
3604 | $semop = pack("sss", $semnum, -1, 0); | |
3605 | die "Semaphore trouble: $!\en" unless semop($semid, $semop); | |
3606 | ||
3607 | .fi | |
3608 | To signal the semaphore, replace "-1" with "1". | |
3609 | .Ip "send(SOCKET,MSG,FLAGS,TO)" 8 4 | |
3610 | .Ip "send(SOCKET,MSG,FLAGS)" 8 | |
3611 | Sends a message on a socket. | |
3612 | Takes the same flags as the system call of the same name. | |
3613 | On unconnected sockets you must specify a destination to send TO. | |
3614 | Returns the number of characters sent, or the undefined value if | |
3615 | there is an error. | |
3616 | .Ip "setpgrp(PID,PGRP)" 8 4 | |
3617 | Sets the current process group for the specified PID, 0 for the current | |
3618 | process. | |
3619 | Will produce a fatal error if used on a machine that doesn't implement | |
3620 | setpgrp(2). | |
3621 | .Ip "setpriority(WHICH,WHO,PRIORITY)" 8 4 | |
3622 | Sets the current priority for a process, a process group, or a user. | |
3623 | (See setpriority(2).) | |
3624 | Will produce a fatal error if used on a machine that doesn't implement | |
3625 | setpriority(2). | |
3626 | .Ip "setsockopt(SOCKET,LEVEL,OPTNAME,OPTVAL)" 8 3 | |
3627 | Sets the socket option requested. | |
3628 | Returns undefined if there is an error. | |
3629 | OPTVAL may be specified as undef if you don't want to pass an argument. | |
3630 | .Ip "shift(ARRAY)" 8 6 | |
3631 | .Ip "shift ARRAY" 8 | |
3632 | .Ip "shift" 8 | |
3633 | Shifts the first value of the array off and returns it, | |
3634 | shortening the array by 1 and moving everything down. | |
3635 | If there are no elements in the array, returns the undefined value. | |
3636 | If ARRAY is omitted, shifts the @ARGV array in the main program, and the @_ | |
3637 | array in subroutines. | |
3638 | (This is determined lexically.) | |
3639 | See also unshift(), push() and pop(). | |
3640 | Shift() and unshift() do the same thing to the left end of an array that push() | |
3641 | and pop() do to the right end. | |
3642 | .Ip "shmctl(ID,CMD,ARG)" 8 4 | |
3643 | Calls the System V IPC function shmctl. If CMD is &IPC_STAT, then ARG | |
3644 | must be a variable which will hold the returned shmid_ds structure. | |
3645 | Returns like ioctl: the undefined value for error, "0 but true" for | |
3646 | zero, or the actual return value otherwise. | |
3647 | .Ip "shmget(KEY,SIZE,FLAGS)" 8 4 | |
3648 | Calls the System V IPC function shmget. Returns the shared memory | |
3649 | segment id, or the undefined value if there is an error. | |
3650 | .Ip "shmread(ID,VAR,POS,SIZE)" 8 4 | |
3651 | .Ip "shmwrite(ID,STRING,POS,SIZE)" 8 | |
3652 | Reads or writes the System V shared memory segment ID starting at | |
3653 | position POS for size SIZE by attaching to it, copying in/out, and | |
3654 | detaching from it. When reading, VAR must be a variable which | |
3655 | will hold the data read. When writing, if STRING is too long, | |
3656 | only SIZE bytes are used; if STRING is too short, nulls are | |
3657 | written to fill out SIZE bytes. Returns true if successful, or | |
3658 | false if there is an error. | |
3659 | .Ip "shutdown(SOCKET,HOW)" 8 3 | |
3660 | Shuts down a socket connection in the manner indicated by HOW, which has | |
3661 | the same interpretation as in the system call of the same name. | |
3662 | .Ip "sin(EXPR)" 8 4 | |
3663 | .Ip "sin EXPR" 8 | |
3664 | Returns the sine of EXPR (expressed in radians). | |
3665 | If EXPR is omitted, returns sine of $_. | |
3666 | .Ip "sleep(EXPR)" 8 6 | |
3667 | .Ip "sleep EXPR" 8 | |
3668 | .Ip "sleep" 8 | |
3669 | Causes the script to sleep for EXPR seconds, or forever if EXPR is omitted. | |
3670 | May be interrupted by sending the process a SIGALRM. | |
3671 | Returns the number of seconds actually slept. | |
3672 | You probably cannot mix alarm() and sleep() calls, since sleep() is | |
3673 | often implemented using alarm(). | |
3674 | .Ip "socket(SOCKET,DOMAIN,TYPE,PROTOCOL)" 8 3 | |
3675 | Opens a socket of the specified kind and attaches it to filehandle SOCKET. | |
3676 | DOMAIN, TYPE and PROTOCOL are specified the same as for the system call | |
3677 | of the same name. | |
3678 | You may need to run h2ph on sys/socket.h to get the proper values handy | |
3679 | in a perl library file. | |
3680 | Returns true if successful. | |
3681 | See the example in the section on Interprocess Communication. | |
3682 | .Ip "socketpair(SOCKET1,SOCKET2,DOMAIN,TYPE,PROTOCOL)" 8 3 | |
3683 | Creates an unnamed pair of sockets in the specified domain, of the specified | |
3684 | type. | |
3685 | DOMAIN, TYPE and PROTOCOL are specified the same as for the system call | |
3686 | of the same name. | |
3687 | If unimplemented, yields a fatal error. | |
3688 | Returns true if successful. | |
3689 | .Ip "sort(SUBROUTINE LIST)" 8 9 | |
3690 | .Ip "sort(LIST)" 8 | |
3691 | .Ip "sort SUBROUTINE LIST" 8 | |
3692 | .Ip "sort BLOCK LIST" 8 | |
3693 | .Ip "sort LIST" 8 | |
3694 | Sorts the LIST and returns the sorted array value. | |
3695 | Nonexistent values of arrays are stripped out. | |
3696 | If SUBROUTINE or BLOCK is omitted, sorts in standard string comparison order. | |
3697 | If SUBROUTINE is specified, gives the name of a subroutine that returns | |
3698 | an integer less than, equal to, or greater than 0, | |
3699 | depending on how the elements of the array are to be ordered. | |
3700 | (The <=> and cmp operators are extremely useful in such routines.) | |
3701 | SUBROUTINE may be a scalar variable name, in which case the value provides | |
3702 | the name of the subroutine to use. | |
3703 | In place of a SUBROUTINE name, you can provide a BLOCK as an anonymous, | |
3704 | in-line sort subroutine. | |
3705 | .Sp | |
3706 | In the interests of efficiency the normal calling code for subroutines | |
3707 | is bypassed, with the following effects: the subroutine may not be a recursive | |
3708 | subroutine, and the two elements to be compared are passed into the subroutine | |
3709 | not via @_ but as $a and $b (see example below). | |
3710 | They are passed by reference, so don't modify $a and $b. | |
3711 | .Sp | |
3712 | Examples: | |
3713 | .nf | |
3714 | ||
3715 | .ne 2 | |
3716 | # sort lexically | |
3717 | @articles = sort @files; | |
3718 | ||
3719 | .ne 2 | |
3720 | # same thing, but with explicit sort routine | |
3721 | @articles = sort {$a cmp $b} @files; | |
3722 | ||
3723 | .ne 2 | |
3724 | # same thing in reversed order | |
3725 | @articles = sort {$b cmp $a} @files; | |
3726 | ||
3727 | .ne 2 | |
3728 | # sort numerically ascending | |
3729 | @articles = sort {$a <=> $b} @files; | |
3730 | ||
3731 | .ne 2 | |
3732 | # sort numerically descending | |
3733 | @articles = sort {$b <=> $a} @files; | |
3734 | ||
3735 | .ne 5 | |
3736 | # sort using explicit subroutine name | |
3737 | sub byage { | |
3738 | $age{$a} <=> $age{$b}; # presuming integers | |
3739 | } | |
3740 | @sortedclass = sort byage @class; | |
3741 | ||
3742 | .ne 9 | |
3743 | sub reverse { $b cmp $a; } | |
3744 | @harry = (\'dog\',\'cat\',\'x\',\'Cain\',\'Abel\'); | |
3745 | @george = (\'gone\',\'chased\',\'yz\',\'Punished\',\'Axed\'); | |
3746 | print sort @harry; | |
3747 | # prints AbelCaincatdogx | |
3748 | print sort reverse @harry; | |
3749 | # prints xdogcatCainAbel | |
3750 | print sort @george, \'to\', @harry; | |
3751 | # prints AbelAxedCainPunishedcatchaseddoggonetoxyz | |
3752 | ||
3753 | .fi | |
3754 | .Ip "splice(ARRAY,OFFSET,LENGTH,LIST)" 8 8 | |
3755 | .Ip "splice(ARRAY,OFFSET,LENGTH)" 8 | |
3756 | .Ip "splice(ARRAY,OFFSET)" 8 | |
3757 | Removes the elements designated by OFFSET and LENGTH from an array, and | |
3758 | replaces them with the elements of LIST, if any. | |
3759 | Returns the elements removed from the array. | |
3760 | The array grows or shrinks as necessary. | |
3761 | If LENGTH is omitted, removes everything from OFFSET onward. | |
3762 | The following equivalencies hold (assuming $[ == 0): | |
3763 | .nf | |
3764 | ||
3765 | push(@a,$x,$y)\h'|3.5i'splice(@a,$#a+1,0,$x,$y) | |
3766 | pop(@a)\h'|3.5i'splice(@a,-1) | |
3767 | shift(@a)\h'|3.5i'splice(@a,0,1) | |
3768 | unshift(@a,$x,$y)\h'|3.5i'splice(@a,0,0,$x,$y) | |
3769 | $a[$x] = $y\h'|3.5i'splice(@a,$x,1,$y); | |
3770 | ||
3771 | Example, assuming array lengths are passed before arrays: | |
3772 | ||
3773 | sub aeq { # compare two array values | |
3774 | local(@a) = splice(@_,0,shift); | |
3775 | local(@b) = splice(@_,0,shift); | |
3776 | return 0 unless @a == @b; # same len? | |
3777 | while (@a) { | |
3778 | return 0 if pop(@a) ne pop(@b); | |
3779 | } | |
3780 | return 1; | |
3781 | } | |
3782 | if (&aeq($len,@foo[1..$len],0+@bar,@bar)) { ... } | |
3783 | ||
3784 | .fi | |
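To show what splice returns and what it leaves behind, here is a small sketch on a throwaway array (the values are arbitrary, and $[ is assumed to be 0):

```perl
@a = (10, 20, 30, 40, 50);

# Replace two elements starting at offset 1 with three new ones.
# The removed elements are returned and the array grows by one.
@removed = splice(@a, 1, 2, 'x', 'y', 'z');
print "@removed | @a\n";

# With LENGTH omitted, everything from OFFSET onward is removed.
@tail = splice(@a, 4);
print "@tail | @a\n";
```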
3785 | .Ip "split(/PATTERN/,EXPR,LIMIT)" 8 8 | |
3786 | .Ip "split(/PATTERN/,EXPR)" 8 8 | |
3787 | .Ip "split(/PATTERN/)" 8 | |
3788 | .Ip "split" 8 | |
3789 | Splits a string into an array of strings, and returns it. | |
3790 | (If not in an array context, returns the number of fields found and splits | |
3791 | into the @_ array. | |
3792 | (In an array context, you can force the split into @_ | |
3793 | by using ?? as the pattern delimiters, but it still returns the array value.)) | |
3794 | If EXPR is omitted, splits the $_ string. | |
3795 | If PATTERN is also omitted, splits on whitespace (/[\ \et\en]+/). | |
3796 | Anything matching PATTERN is taken to be a delimiter separating the fields. | |
3797 | (Note that the delimiter may be longer than one character.) | |
3798 | If LIMIT is specified, splits into no more than that many fields (though it | |
3799 | may split into fewer). | |
3800 | If LIMIT is unspecified, trailing null fields are stripped (which | |
3801 | potential users of pop() would do well to remember). | |
3802 | A pattern matching the null string (not to be confused with a null pattern //, | |
3803 | which is just one member of the set of patterns matching a null string) | |
3804 | will split the value of EXPR into separate characters at each point it | |
3805 | matches that way. | |
3806 | For example: | |
3807 | .nf | |
3808 | ||
3809 | print join(\':\', split(/ */, \'hi there\')); | |
3810 | ||
3811 | .fi | |
3812 | produces the output \*(L'h:i:t:h:e:r:e\*(R'. | |
3813 | .Sp | |
3814 | The LIMIT parameter can be used to partially split a line: | |
3815 | .nf | |
3816 | ||
3817 | ($login, $passwd, $remainder) = split(\|/\|:\|/\|, $_, 3); | |
3818 | ||
3819 | .fi | |
3820 | (When assigning to a list, if LIMIT is omitted, perl supplies a LIMIT one | |
3821 | larger than the number of variables in the list, to avoid unnecessary work. | |
3822 | For the list above LIMIT would have been 4 by default. | |
3823 | In time critical applications it behooves you not to split into | |
3824 | more fields than you really need.) | |
3825 | .Sp | |
3826 | If the PATTERN contains parentheses, additional array elements are created | |
3827 | from each matching substring in the delimiter. | |
3828 | .Sp | |
3829 | split(/([,-])/,"1-10,20"); | |
3830 | .Sp | |
3831 | produces the array value | |
3832 | .Sp | |
3833 | (1,'-',10,',',20) | |
3834 | .Sp | |
3835 | The pattern /PATTERN/ may be replaced with an expression to specify patterns | |
3836 | that vary at runtime. | |
3837 | (To do runtime compilation only once, use /$variable/o.) | |
3838 | As a special case, specifying a space (\'\ \') will split on white space | |
3839 | just as split with no arguments does, but leading white space does NOT | |
3840 | produce a null first field. | |
3841 | Thus, split(\'\ \') can be used to emulate | |
3842 | .IR awk 's | |
3843 | default behavior, whereas | |
3844 | split(/\ /) will give you as many null initial fields as there are | |
3845 | leading spaces. | |
3846 | .Sp | |
3847 | Example: | |
3848 | .nf | |
3849 | ||
3850 | .ne 5 | |
3851 | open(passwd, \'/etc/passwd\'); | |
3852 | while (<passwd>) { | |
3853 | .ie t \{\ | |
3854 | ($login, $passwd, $uid, $gid, $gcos, $home, $shell) = split(\|/\|:\|/\|); | |
3855 | 'br\} | |
3856 | .el \{\ | |
3857 | ($login, $passwd, $uid, $gid, $gcos, $home, $shell) | |
3858 | = split(\|/\|:\|/\|); | |
3859 | 'br\} | |
3860 | .\|.\|. | |
3861 | } | |
3862 | ||
3863 | .fi | |
3864 | (Note that $shell above will still have a newline on it. See chop().) | |
3865 | See also | |
3866 | .IR join . | |
3867 | .Ip "sprintf(FORMAT,LIST)" 8 4 | |
3868 | Returns a string formatted by the usual printf conventions. | |
3869 | The * character is not supported. | |
3870 | .Ip "sqrt(EXPR)" 8 4 | |
3871 | .Ip "sqrt EXPR" 8 | |
3872 | Returns the square root of EXPR. | |
3873 | If EXPR is omitted, returns square root of $_. | |
3874 | .Ip "srand(EXPR)" 8 4 | |
3875 | .Ip "srand EXPR" 8 | |
3876 | Sets the random number seed for the | |
3877 | .I rand | |
3878 | operator. | |
3879 | If EXPR is omitted, does srand(time). | |
3880 | .Ip "stat(FILEHANDLE)" 8 8 | |
3881 | .Ip "stat FILEHANDLE" 8 | |
3882 | .Ip "stat(EXPR)" 8 | |
3883 | .Ip "stat SCALARVARIABLE" 8 | |
3884 | Returns a 13-element array giving the statistics for a file, either the file | |
3885 | opened via FILEHANDLE, or named by EXPR. | |
3886 | Returns a null list if the stat fails. | |
3887 | Typically used as follows: | |
3888 | .nf | |
3889 | ||
3890 | .ne 3 | |
3891 | ($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size, | |
3892 | $atime,$mtime,$ctime,$blksize,$blocks) | |
3893 | = stat($filename); | |
3894 | ||
3895 | .fi | |
3896 | If stat is passed the special filehandle consisting of an underline, | |
3897 | no stat is done, but the current contents of the stat structure from | |
3898 | the last stat or filetest are returned. | |
3899 | Example: | |
3900 | .nf | |
3901 | ||
3902 | .ne 3 | |
3903 | if (-x $file && (($d) = stat(_)) && $d < 0) { | |
3904 | print "$file is executable NFS file\en"; | |
3905 | } | |
3906 | ||
3907 | .fi | |
3908 | (This only works on machines for which the device number is negative under NFS.) | |
3909 | .Ip "study(SCALAR)" 8 6 | |
3910 | .Ip "study SCALAR" 8 | |
3911 | .Ip "study" | |
3912 | Takes extra time to study SCALAR ($_ if unspecified) in anticipation of | |
3913 | doing many pattern matches on the string before it is next modified. | |
3914 | This may or may not save time, depending on the nature and number of patterns | |
3915 | you are searching on, and on the distribution of character frequencies in | |
3916 | the string to be searched\*(--you probably want to compare runtimes with and | |
3917 | without it to see which runs faster. | |
3918 | Those loops which scan for many short constant strings (including the constant | |
3919 | parts of more complex patterns) will benefit most. | |
3920 | You may have only one study active at a time\*(--if you study a different | |
3921 | scalar the first is \*(L"unstudied\*(R". | |
3922 | (The way study works is this: a linked list of every character in the string | |
3923 | to be searched is made, so we know, for example, where all the \*(L'k\*(R' characters | |
3924 | are. | |
3925 | From each search string, the rarest character is selected, based on some | |
3926 | static frequency tables constructed from some C programs and English text. | |
3927 | Only those places that contain this \*(L"rarest\*(R" character are examined.) | |
3928 | .Sp | |
3929 | For example, here is a loop which inserts index-producing entries before any line | |
3930 | containing a certain pattern: | |
3931 | .nf | |
3932 | ||
3933 | .ne 8 | |
3934 | while (<>) { | |
3935 | study; | |
3936 | print ".IX foo\en" if /\ebfoo\eb/; | |
3937 | print ".IX bar\en" if /\ebbar\eb/; | |
3938 | print ".IX blurfl\en" if /\ebblurfl\eb/; | |
3939 | .\|.\|. | |
3940 | print; | |
3941 | } | |
3942 | ||
3943 | .fi | |
3944 | In searching for /\ebfoo\eb/, only those locations in $_ that contain \*(L'f\*(R' | |
3945 | will be looked at, because \*(L'f\*(R' is rarer than \*(L'o\*(R'. | |
3946 | In general, this is a big win except in pathological cases. | |
3947 | The only question is whether it saves you more time than it took to build | |
3948 | the linked list in the first place. | |
3949 | .Sp | |
3950 | Note that if you have to look for strings that you don't know till runtime, | |
3951 | you can build an entire loop as a string and eval that to avoid recompiling | |
3952 | all your patterns all the time. | |
3953 | Together with undefining $/ to input entire files as one record, this can | |
3954 | be very fast, often faster than specialized programs like fgrep. | |
3955 | The following scans a list of files (@files) | |
3956 | for a list of words (@words), and prints out the names of those files that | |
3957 | contain a match: | |
3958 | .nf | |
3959 | ||
3960 | .ne 12 | |
3961 | $search = \'while (<>) { study;\'; | |
3962 | foreach $word (@words) { | |
3963 | $search .= "++\e$seen{\e$ARGV} if /\e\eb$word\e\eb/;\en"; | |
3964 | } | |
3965 | $search .= "}"; | |
3966 | @ARGV = @files; | |
3967 | undef $/; | |
3968 | eval $search; # this screams | |
3969 | $/ = "\en"; # put back to normal input delim | |
3970 | foreach $file (sort keys(%seen)) { | |
3971 | print $file, "\en"; | |
3972 | } | |
3973 | ||
3974 | .fi | |
3975 | .Ip "substr(EXPR,OFFSET,LEN)" 8 2 | |
3976 | .Ip "substr(EXPR,OFFSET)" 8 2 | |
3977 | Extracts a substring out of EXPR and returns it. | |
3978 | First character is at offset 0, or whatever you've set $[ to. | |
3979 | If OFFSET is negative, starts that far from the end of the string. | |
3980 | If LEN is omitted, returns everything to the end of the string. | |
3981 | You can use the substr() function as an lvalue, in which case EXPR must | |
3982 | be an lvalue. | |
3983 | If you assign something shorter than LEN, the string will shrink, and | |
3984 | if you assign something longer than LEN, the string will grow to accommodate it. | |
3985 | To keep the string the same length you may need to pad or chop your value using | |
3986 | sprintf(). | |
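.Sp
As a rough sketch of lvalue use (the variable name $s is purely illustrative,
and $[ is assumed to be 0), the comments show the value of $s after each
assignment:
.nf

	$s = "Hello, world";
	substr($s, 0, 5) = "Hi";	# shrinks: "Hi, world"
	substr($s, 0, 2) = "Greetings";	# grows: "Greetings, world"
	substr($s, 0, 9) = sprintf("%-9.9s", "Hey");	# same length: "Hey      , world"

.fi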
3987 | .Ip "symlink(OLDFILE,NEWFILE)" 8 2 | |
3988 | Creates a new filename symbolically linked to the old filename. | |
3989 | Returns 1 for success, 0 otherwise. | |
3990 | On systems that don't support symbolic links, produces a fatal error at | |
3991 | run time. | |
3992 | To check for that, use eval: | |
3993 | .nf | |
3994 | ||
3995 | $symlink_exists = (eval \'symlink("","");\', $@ eq \'\'); | |
3996 | ||
3997 | .fi | |
3998 | .Ip "syscall(LIST)" 8 6 | |
3999 | .Ip "syscall LIST" 8 | |
4000 | Calls the system call specified as the first element of the list, passing | |
4001 | the remaining elements as arguments to the system call. | |
4002 | If unimplemented, produces a fatal error. | |
4003 | The arguments are interpreted as follows: if a given argument is numeric, | |
4004 | the argument is passed as an int. | |
4005 | If not, the pointer to the string value is passed. | |
4006 | You are responsible for making sure a string is pre-extended long enough | |
4007 | to receive any result that might be written into a string. | |
4008 | If your integer arguments are not literals and have never been interpreted | |
4009 | in a numeric context, you may need to add 0 to them to force them to look | |
4010 | like numbers. | |
4011 | .nf | |
4012 | ||
4013 | require 'syscall.ph'; # may need to run h2ph | |
4014 | syscall(&SYS_write, fileno(STDOUT), "hi there\en", 9); | |
4015 | ||
4016 | .fi | |
4017 | .Ip "sysread(FILEHANDLE,SCALAR,LENGTH,OFFSET)" 8 5 | |
4018 | .Ip "sysread(FILEHANDLE,SCALAR,LENGTH)" 8 5 | |
4019 | Attempts to read LENGTH bytes of data into variable SCALAR from the specified | |
4020 | FILEHANDLE, using the system call read(2). | |
4021 | It bypasses stdio, so mixing this with other kinds of reads may cause | |
4022 | confusion. | |
4023 | Returns the number of bytes actually read, or undef if there was an error. | |
4024 | SCALAR will be grown or shrunk to the length actually read. | |
4025 | An OFFSET may be specified to place the read data at some other place | |
4026 | than the beginning of the string. | |
4027 | .Ip "system(LIST)" 8 6 | |
4028 | .Ip "system LIST" 8 | |
4029 | Does exactly the same thing as \*(L"exec LIST\*(R" except that a fork | |
4030 | is done first, and the parent process waits for the child process to complete. | |
4031 | Note that argument processing varies depending on the number of arguments. | |
4032 | The return value is the exit status of the program as returned by the wait() | |
4033 | call. | |
4034 | To get the actual exit value divide by 256. | |
4035 | See also | |
4036 | .IR exec . | |
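.Sp
For instance, the following sketch assumes a Bourne shell is available as
.I sh
(the command run is only an example, not something prescribed here):
.nf

	$status = system("sh", "-c", "exit 3");	# wait() status is 768
	$exit = $status / 256;	# actual exit value is 3

.fi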
4037 | .Ip "syswrite(FILEHANDLE,SCALAR,LENGTH,OFFSET)" 8 5 | |
4038 | .Ip "syswrite(FILEHANDLE,SCALAR,LENGTH)" 8 5 | |
4039 | Attempts to write LENGTH bytes of data from variable SCALAR to the specified | |
4040 | FILEHANDLE, using the system call write(2). | |
4041 | It bypasses stdio, so mixing this with prints may cause | |
4042 | confusion. | |
4043 | Returns the number of bytes actually written, or undef if there was an error. | |
4044 | An OFFSET may be specified to take the data to be written from some place | |
4045 | other than the beginning of the string. | |
4046 | .Ip "tell(FILEHANDLE)" 8 6 | |
4047 | .Ip "tell FILEHANDLE" 8 6 | |
4048 | .Ip "tell" 8 | |
4049 | Returns the current file position for FILEHANDLE. | |
4050 | FILEHANDLE may be an expression whose value gives the name of the actual | |
4051 | filehandle. | |
4052 | If FILEHANDLE is omitted, assumes the file last read. | |
4053 | .Ip "telldir(DIRHANDLE)" 8 5 | |
4054 | .Ip "telldir DIRHANDLE" 8 | |
4055 | Returns the current position of the readdir() routines on DIRHANDLE. | |
4056 | Value may be given to seekdir() to access a particular location in | |
4057 | a directory. | |
4058 | Has the same caveats about possible directory compaction as the corresponding | |
4059 | system library routine. | |
4060 | .Ip "time" 8 4 | |
4061 | Returns the number of non-leap seconds since 00:00:00 UTC, January 1, 1970. | |
4062 | Suitable for feeding to gmtime() and localtime(). | |
4063 | .Ip "times" 8 4 | |
4064 | Returns a four-element array giving the user and system times, in seconds, for this | |
4065 | process and the children of this process. | |
4066 | .Sp | |
4067 | ($user,$system,$cuser,$csystem) = times; | |
4068 | .Sp | |
4069 | .Ip "tr/SEARCHLIST/REPLACEMENTLIST/cds" 8 5 | |
4070 | .Ip "y/SEARCHLIST/REPLACEMENTLIST/cds" 8 | |
4071 | Translates each occurrence of a character found in the search list into | |
4072 | the corresponding character in the replacement list. | |
4073 | It returns the number of characters replaced or deleted. | |
4074 | If no string is specified via the =~ or !~ operator, | |
4075 | the $_ string is translated. | |
4076 | (The string specified with =~ must be a scalar variable, an array element, | |
4077 | or an assignment to one of those, i.e. an lvalue.) | |
4078 | For | |
4079 | .I sed | |
4080 | devotees, | |
4081 | .I y | |
4082 | is provided as a synonym for | |
4083 | .IR tr . | |
4084 | If the SEARCHLIST is delimited by bracketing quotes, the REPLACEMENTLIST | |
4085 | has its own pair of quotes, which may or may not be bracketing quotes, e.g. | |
4086 | tr[A-Z][a-z] or tr(+-*/)/ABCD/. | |
4087 | .Sp | |
4088 | If the c modifier is specified, the SEARCHLIST character set is complemented. | |
4089 | If the d modifier is specified, any characters specified by SEARCHLIST that | |
4090 | are not found in REPLACEMENTLIST are deleted. | |
4091 | (Note that this is slightly more flexible than the behavior of some | |
4092 | .I tr | |
4093 | programs, which delete anything they find in the SEARCHLIST, period.) | |
4094 | If the s modifier is specified, sequences of characters that were translated | |
4095 | to the same character are squashed down to 1 instance of the character. | |
4096 | .Sp | |
4097 | If the d modifier was used, the REPLACEMENTLIST is always interpreted exactly | |
4098 | as specified. | |
4099 | Otherwise, if the REPLACEMENTLIST is shorter than the SEARCHLIST, | |
4100 | the final character is replicated till it is long enough. | |
4101 | If the REPLACEMENTLIST is null, the SEARCHLIST is replicated. | |
4102 | This latter is useful for counting characters in a class, or for squashing | |
4103 | character sequences in a class. | |
4104 | .Sp | |
4105 | Examples: | |
4106 | .nf | |
4107 | ||
4108 | $ARGV[1] \|=~ \|y/A\-Z/a\-z/; \h'|3i'# canonicalize to lower case | |
4109 | ||
4110 | $cnt = tr/*/*/; \h'|3i'# count the stars in $_ | |
4111 | ||
4112 | $cnt = tr/0\-9//; \h'|3i'# count the digits in $_ | |
4113 | ||
4114 | tr/a\-zA\-Z//s; \h'|3i'# bookkeeper \-> bokeper | |
4115 | ||
4116 | ($HOST = $host) =~ tr/a\-z/A\-Z/; | |
4117 | ||
4118 | y/a\-zA\-Z/ /cs; \h'|3i'# change non-alphas to single space | |
4119 | ||
4120 | tr/\e200\-\e377/\e0\-\e177/;\h'|3i'# delete 8th bit | |
4121 | ||
4122 | .fi | |
4123 | .Ip "truncate(FILEHANDLE,LENGTH)" 8 4 | |
4124 | .Ip "truncate(EXPR,LENGTH)" 8 | |
4125 | Truncates the file opened on FILEHANDLE, or named by EXPR, to the specified | |
4126 | length. | |
4127 | Produces a fatal error if truncate isn't implemented on your system. | |
4128 | .Ip "umask(EXPR)" 8 4 | |
4129 | .Ip "umask EXPR" 8 | |
4130 | .Ip "umask" 8 | |
4131 | Sets the umask for the process and returns the old one. | |
4132 | If EXPR is omitted, merely returns current umask. | |
4133 | .Ip "undef(EXPR)" 8 6 | |
4134 | .Ip "undef EXPR" 8 | |
4135 | .Ip "undef" 8 | |
4136 | Undefines the value of EXPR, which must be an lvalue. | |
4137 | Use only on a scalar value, an entire array, or a subroutine name (using &). | |
4138 | (Undef will probably not do what you expect on most predefined variables or | |
4139 | dbm array values.) | |
4140 | Always returns the undefined value. | |
4141 | You can omit the EXPR, in which case nothing is undefined, but you still | |
4142 | get an undefined value that you could, for instance, return from a subroutine. | |
4143 | Examples: | |
4144 | .nf | |
4145 | ||
4146 | .ne 6 | |
4147 | undef $foo; | |
4148 | undef $bar{'blurfl'}; | |
4149 | undef @ary; | |
4150 | undef %assoc; | |
4151 | undef &mysub; | |
4152 | return (wantarray ? () : undef) if $they_blew_it; | |
4153 | ||
4154 | .fi | |
4155 | .Ip "unlink(LIST)" 8 4 | |
4156 | .Ip "unlink LIST" 8 | |
4157 | Deletes a list of files. | |
4158 | Returns the number of files successfully deleted. | |
4159 | .nf | |
4160 | ||
4161 | .ne 2 | |
4162 | $cnt = unlink \'a\', \'b\', \'c\'; | |
4163 | unlink @goners; | |
4164 | unlink <*.bak>; | |
4165 | ||
4166 | .fi | |
4167 | Note: unlink will not delete directories unless you are superuser and the | |
4168 | .B \-U | |
4169 | flag is supplied to | |
4170 | .IR perl . | |
4171 | Even if these conditions are met, be warned that unlinking a directory | |
4172 | can inflict damage on your filesystem. | |
4173 | Use rmdir instead. | |
4174 | .Ip "unpack(TEMPLATE,EXPR)" 8 4 | |
4175 | Unpack does the reverse of pack: it takes a string representing | |
4176 | a structure and expands it out into an array value, returning the array | |
4177 | value. | |
4178 | (In a scalar context, it merely returns the first value produced.) | |
4179 | The TEMPLATE has the same format as in the pack function. | |
4180 | Here's a subroutine that does substring: | |
4181 | .nf | |
4182 | ||
4183 | .ne 4 | |
4184 | sub substr { | |
4185 | local($what,$where,$howmuch) = @_; | |
4186 | unpack("x$where a$howmuch", $what); | |
4187 | } | |
4188 | ||
4189 | .ne 3 | |
4190 | and then there's | |
4191 | ||
4192 | sub ord { unpack("c",$_[0]); } | |
4193 | ||
4194 | .fi | |
4195 | In addition, you may prefix a field with a %<number> to indicate that | |
4196 | you want a <number>-bit checksum of the items instead of the items themselves. | |
4197 | Default is a 16-bit checksum. | |
4198 | For example, the following computes the same number as the System V sum program: | |
4199 | .nf | |
4200 | ||
4201 | .ne 4 | |
4202 | while (<>) { | |
4203 | $checksum += unpack("%16C*", $_); | |
4204 | } | |
4205 | $checksum %= 65536; | |
4206 | ||
4207 | .fi | |
4208 | .Ip "unshift(ARRAY,LIST)" 8 4 | |
4209 | Does the opposite of a | |
4210 | .IR shift . | |
4211 | Or the opposite of a | |
4212 | .IR push , | |
4213 | depending on how you look at it. | |
4214 | Prepends list to the front of the array, and returns the number of elements | |
4215 | in the new array. | |
4216 | .nf | |
4217 | ||
4218 | unshift(ARGV, \'\-e\') unless $ARGV[0] =~ /^\-/; | |
4219 | ||
4220 | .fi | |
4221 | .Ip "utime(LIST)" 8 2 | |
4222 | .Ip "utime LIST" 8 2 | |
4223 | Changes the access and modification times on each file of a list of files. | |
4224 | The first two elements of the list must be the NUMERICAL access and | |
4225 | modification times, in that order. | |
4226 | Returns the number of files successfully changed. | |
4227 | The inode modification time of each file is set to the current time. | |
4228 | Example of a \*(L"touch\*(R" command: | |
4229 | .nf | |
4230 | ||
4231 | .ne 3 | |
4232 | #!/usr/bin/perl | |
4233 | $now = time; | |
4234 | utime $now, $now, @ARGV; | |
4235 | ||
4236 | .fi | |
4237 | .Ip "values(ASSOC_ARRAY)" 8 6 | |
4238 | .Ip "values ASSOC_ARRAY" 8 | |
4239 | Returns a normal array consisting of all the values of the named associative | |
4240 | array. | |
4241 | The values are returned in an apparently random order, but it is the same order | |
4242 | as either the keys() or each() function would produce on the same array. | |
4243 | See also keys() and each(). | |
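.Sp
A short sketch, using a hypothetical %price array, that totals the values
without regard to the order in which they are returned:
.nf

	%price = ("apple", 3, "pear", 5);
	$total = 0;
	foreach $cost (values(%price)) {
	    $total += $cost;
	}
	# $total is now 8

.fi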
4244 | .Ip "vec(EXPR,OFFSET,BITS)" 8 2 | |
4245 | Treats a string as a vector of unsigned integers, and returns the value | |
4246 | of the bitfield specified. | |
4247 | May also be assigned to. | |
4248 | BITS must be a power of two from 1 to 32. | |
4249 | .Sp | |
4250 | Vectors created with vec() can also be manipulated with the logical operators | |
4251 | |, & and ^, | |
4252 | which will assume a bit vector operation is desired when both operands are | |
4253 | strings. | |
4254 | This interpretation is not enabled unless there is at least one vec() in | |
4255 | your program, to protect older programs. | |
4256 | .Sp | |
4257 | To transform a bit vector into a string or array of 0's and 1's, use these: | |
4258 | .nf | |
4259 | ||
4260 | $bits = unpack("b*", $vector); | |
4261 | @bits = split(//, unpack("b*", $vector)); | |
4262 | ||
4263 | .fi | |
4264 | If you know the exact length in bits, it can be used in place of the *. | |
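.Sp
As an illustrative sketch (the bit positions chosen are arbitrary), single
bits can be set by assignment and then inspected with unpack:
.nf

	$vector = "";
	vec($vector, 0, 1) = 1;	# set bit 0
	vec($vector, 3, 1) = 1;	# set bit 3
	$bits = unpack("b8", $vector);	# "10010000"

.fi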
4265 | .Ip "wait" 8 6 | |
4266 | Waits for a child process to terminate and returns the pid of the deceased | |
4267 | process, or -1 if there are no child processes. | |
4268 | The status is returned in $?. | |
4269 | .Ip "waitpid(PID,FLAGS)" 8 6 | |
4270 | Waits for a particular child process to terminate and returns the pid of the deceased | |
4271 | process, or -1 if there is no such child process. | |
4272 | The status is returned in $?. | |
4273 | If you say | |
4274 | .nf | |
4275 | ||
4276 | require "sys/wait.ph"; | |
4277 | .\|.\|. | |
4278 | waitpid(-1,&WNOHANG); | |
4279 | ||
4280 | .fi | |
4281 | then you can do a non-blocking wait for any process. Non-blocking wait | |
4282 | is only available on machines supporting either the | |
4283 | .I waitpid (2) | |
4284 | or | |
4285 | .I wait4 (2) | |
4286 | system calls. | |
4287 | However, waiting for a particular pid with FLAGS of 0 is implemented | |
4288 | everywhere. (Perl emulates the system call by remembering the status | |
4289 | values of processes that have exited but have not been harvested by the | |
4290 | Perl script yet.) | |
4291 | .Ip "wantarray" 8 4 | |
4292 | Returns true if the context of the currently executing subroutine | |
4293 | is looking for an array value. | |
4294 | Returns false if the context is looking for a scalar. | |
4295 | .nf | |
4296 | ||
4297 | return wantarray ? () : undef; | |
4298 | ||
4299 | .fi | |
4300 | .Ip "warn(LIST)" 8 4 | |
4301 | .Ip "warn LIST" 8 | |
4302 | Produces a message on STDERR just like \*(L"die\*(R", but doesn't exit. | |
4303 | .Ip "write(FILEHANDLE)" 8 6 | |
4304 | .Ip "write(EXPR)" 8 | |
4305 | .Ip "write" 8 | |
4306 | Writes a formatted record (possibly multi-line) to the specified file, | |
4307 | using the format associated with that file. | |
4308 | By default the format for a file is the one having the same name as the | |
4309 | filehandle, but the format for the current output channel (see | |
4310 | .IR select ) | |
4311 | may be set explicitly | |
4312 | by assigning the name of the format to the $~ variable. | |
4313 | .Sp | |
4314 | Top of form processing is handled automatically: | |
4315 | if there is insufficient room on the current page for the formatted | |
4316 | record, the page is advanced by writing a form feed, | |
4317 | a special top-of-page format is used | |
4318 | to format the new page header, and then the record is written. | |
4319 | By default the top-of-page format is the name of the filehandle with | |
4320 | \*(L"_TOP\*(R" appended, but it may be dynamically set to the | |
4321 | format of your choice by assigning the name to the $^ variable while | |
4322 | the filehandle is selected. | |
4323 | The number of lines remaining on the current page is in variable $-, which | |
4324 | can be set to 0 to force a new page. | |
4325 | .Sp | |
4326 | If FILEHANDLE is unspecified, output goes to the current default output channel, | |
4327 | which starts out as | |
4328 | .I STDOUT | |
4329 | but may be changed by the | |
4330 | .I select | |
4331 | operator. | |
4332 | If the FILEHANDLE is an EXPR, then the expression is evaluated and the | |
4333 | resulting string is used to look up the name of the FILEHANDLE at run time. | |
4334 | For more on formats, see the section on formats later on. | |
4335 | .Sp | |
4336 | Note that write is NOT the opposite of read. | |
4337 | .Sh "Precedence" | |
4338 | .I Perl | |
4339 | operators have the following associativity and precedence: | |
4340 | .nf | |
4341 | ||
4342 | nonassoc\h'|1i'print printf exec system sort reverse | |
4343 | \h'1.5i'chmod chown kill unlink utime die return | |
4344 | left\h'|1i', | |
4345 | right\h'|1i'= += \-= *= etc. | |
4346 | right\h'|1i'?: | |
4347 | nonassoc\h'|1i'.\|. | |
4348 | left\h'|1i'|| | |
4349 | left\h'|1i'&& | |
4350 | left\h'|1i'| ^ | |
4351 | left\h'|1i'& | |
4352 | nonassoc\h'|1i'== != <=> eq ne cmp | |
4353 | nonassoc\h'|1i'< > <= >= lt gt le ge | |
4354 | nonassoc\h'|1i'chdir exit eval reset sleep rand umask | |
4355 | nonassoc\h'|1i'\-r \-w \-x etc. | |
4356 | left\h'|1i'<< >> | |
4357 | left\h'|1i'+ \- . | |
4358 | left\h'|1i'* / % x | |
4359 | left\h'|1i'=~ !~ | |
4360 | right\h'|1i'! ~ and unary minus | |
4361 | right\h'|1i'** | |
4362 | nonassoc\h'|1i'++ \-\|\- | |
4363 | left\h'|1i'\*(L'(\*(R' | |
4364 | ||
4365 | .fi | |
4366 | As mentioned earlier, if any list operator (print, etc.) or | |
4367 | any unary operator (chdir, etc.) | |
4368 | is followed by a left parenthesis as the next token on the same line, | |
4369 | the operator and arguments within parentheses are taken to | |
4370 | be of highest precedence, just like a normal function call. | |
4371 | Examples: | |
4372 | .nf | |
4373 | ||
4374 | chdir $foo || die;\h'|3i'# (chdir $foo) || die | |
4375 | chdir($foo) || die;\h'|3i'# (chdir $foo) || die | |
4376 | chdir ($foo) || die;\h'|3i'# (chdir $foo) || die | |
4377 | chdir +($foo) || die;\h'|3i'# (chdir $foo) || die | |
4378 | ||
4379 | but, because * is higher precedence than ||: | |
4380 | ||
4381 | chdir $foo * 20;\h'|3i'# chdir ($foo * 20) | |
4382 | chdir($foo) * 20;\h'|3i'# (chdir $foo) * 20 | |
4383 | chdir ($foo) * 20;\h'|3i'# (chdir $foo) * 20 | |
4384 | chdir +($foo) * 20;\h'|3i'# chdir ($foo * 20) | |
4385 | ||
4386 | rand 10 * 20;\h'|3i'# rand (10 * 20) | |
4387 | rand(10) * 20;\h'|3i'# (rand 10) * 20 | |
4388 | rand (10) * 20;\h'|3i'# (rand 10) * 20 | |
4389 | rand +(10) * 20;\h'|3i'# rand (10 * 20) | |
4390 | ||
4391 | .fi | |
4392 | In the absence of parentheses, | |
4393 | the precedence of list operators such as print, sort or chmod is | |
4394 | either very high or very low depending on whether you look at the left | |
4395 | side of the operator or the right side of it. | |
4396 | For example, in | |
4397 | .nf | |
4398 | ||
4399 | @ary = (1, 3, sort 4, 2); | |
4400 | print @ary; # prints 1324 | |
4401 | ||
4402 | .fi | |
4403 | the commas on the right of the sort are evaluated before the sort, but | |
4404 | the commas on the left are evaluated after. | |
4405 | In other words, list operators tend to gobble up all the arguments that | |
4406 | follow them, and then act like a simple term with regard to the preceding | |
4407 | expression. | |
4408 | Note that you have to be careful with parens: | |
4409 | .nf | |
4410 | ||
4411 | .ne 3 | |
4412 | # These evaluate exit before doing the print: | |
4413 | print($foo, exit); # Obviously not what you want. | |
4414 | print $foo, exit; # Nor is this. | |
4415 | ||
4416 | .ne 4 | |
4417 | # These do the print before evaluating exit: | |
4418 | (print $foo), exit; # This is what you want. | |
4419 | print($foo), exit; # Or this. | |
4420 | print ($foo), exit; # Or even this. | |
4421 | ||
4422 | Also note that | |
4423 | ||
4424 | print ($foo & 255) + 1, "\en"; | |
4425 | ||
4426 | .fi | |
4427 | probably doesn't do what you expect at first glance. | |
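Because the left parenthesis immediately follows print, ($foo & 255) is taken
as the entire argument list, and the + 1 and the newline are applied to the
value returned by print rather than being printed.
If the sum was what you meant to print, one possible fix is to enclose
the whole list in parentheses:
.nf

	print(($foo & 255) + 1, "\en");

.fi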
4428 | .Sh "Subroutines" | |
4429 | A subroutine may be declared as follows: | |
4430 | .nf | |
4431 | ||
4432 | sub NAME BLOCK | |
4433 | ||
4434 | .fi | |
4435 | .PP | |
4436 | Any arguments passed to the routine come in as array @_, | |
4437 | that is ($_[0], $_[1], .\|.\|.). | |
4438 | The array @_ is a local array, but its values are references to the | |
4439 | actual scalar parameters. | |
4440 | The return value of the subroutine is the value of the last expression | |
4441 | evaluated, and can be either an array value or a scalar value. | |
4442 | Alternatively, a return statement may be used to specify the returned value and | |
4443 | exit the subroutine. | |
4444 | To create local variables see the | |
4445 | .I local | |
4446 | operator. | |
4447 | .PP | |
4448 | A subroutine is called using the | |
4449 | .I do | |
4450 | operator or the & operator. | |
4451 | .nf | |
4452 | ||
4453 | .ne 12 | |
4454 | Example: | |
4455 | ||
4456 | sub MAX { | |
4457 | local($max) = pop(@_); | |
4458 | foreach $foo (@_) { | |
4459 | $max = $foo \|if \|$max < $foo; | |
4460 | } | |
4461 | $max; | |
4462 | } | |
4463 | ||
4464 | .\|.\|. | |
4465 | $bestday = &MAX($mon,$tue,$wed,$thu,$fri); | |
4466 | ||
4467 | .ne 21 | |
4468 | Example: | |
4469 | ||
4470 | # get a line, combining continuation lines | |
4471 | # that start with whitespace | |
4472 | sub get_line { | |
4473 | $thisline = $lookahead; | |
4474 | line: while ($lookahead = <STDIN>) { | |
4475 | if ($lookahead \|=~ \|/\|^[ \^\e\|t]\|/\|) { | |
4476 | $thisline \|.= \|$lookahead; | |
4477 | } | |
4478 | else { | |
4479 | last line; | |
4480 | } | |
4481 | } | |
4482 | $thisline; | |
4483 | } | |
4484 | ||
4485 | $lookahead = <STDIN>; # get first line | |
4486 | while ($_ = do get_line(\|)) { | |
4487 | .\|.\|. | |
4488 | } | |
4489 | ||
4490 | .fi | |
4491 | .nf | |
4492 | .ne 6 | |
4493 | Use array assignment to a local list to name your formal arguments: | |
4494 | ||
4495 | sub maybeset { | |
4496 | local($key, $value) = @_; | |
4497 | $foo{$key} = $value unless $foo{$key}; | |
4498 | } | |
4499 | ||
4500 | .fi | |
4501 | This also has the effect of turning call-by-reference into call-by-value, | |
4502 | since the assignment copies the values. | |
4503 | .Sp | |
4504 | Subroutines may be called recursively. | |
4505 | If a subroutine is called using the & form, the argument list is optional. | |
4506 | If omitted, no @_ array is set up for the subroutine; the @_ array at the | |
4507 | time of the call is visible to the subroutine instead. | |
4508 | .nf | |
4509 | ||
4510 | do foo(1,2,3); # pass three arguments | |
4511 | &foo(1,2,3); # the same | |
4512 | ||
4513 | do foo(); # pass a null list | |
4514 | &foo(); # the same | |
4515 | &foo; # pass no arguments\*(--more efficient | |
4516 | ||
4517 | .fi | |
4518 | .Sh "Passing By Reference" | |
4519 | Sometimes you don't want to pass the value of an array to a subroutine but | |
4520 | rather the name of it, so that the subroutine can modify the global copy | |
4521 | of it rather than working with a local copy. | |
4522 | In perl you can refer to all the objects of a particular name by prefixing | |
4523 | the name with a star: *foo. | |
4524 | When evaluated, it produces a scalar value that represents all the objects | |
4525 | of that name, including any filehandle, format or subroutine. | |
4526 | When assigned to within a local() operation, it causes the name mentioned | |
4527 | to refer to whatever * value was assigned to it. | |
4528 | Example: | |
4529 | .nf | |
4530 | ||
4531 | sub doubleary { | |
4532 | local(*someary) = @_; | |
4533 | foreach $elem (@someary) { | |
4534 | $elem *= 2; | |
4535 | } | |
4536 | } | |
4537 | do doubleary(*foo); | |
4538 | do doubleary(*bar); | |
4539 | ||
4540 | .fi | |
4541 | Assignment to *name is currently recommended only inside a local(). | |
4542 | You can actually assign to *name anywhere, but the previous referent of | |
4543 | *name may be stranded forever. | |
4544 | This may or may not bother you. | |
4545 | .Sp | |
4546 | Note that scalars are already passed by reference, so you can modify scalar | |
4547 | arguments without using this mechanism by referring explicitly to the $_[nnn] | |
4548 | in question. | |
4549 | You can modify all the elements of an array by passing all the elements | |
4550 | as scalars, but you have to use the * mechanism to push, pop or change the | |
4551 | size of an array. | |
4552 | The * mechanism will probably be more efficient in any case. | |
4553 | .Sp | |
4554 | Since a *name value contains unprintable binary data, if it is used as | |
4555 | an argument in a print, or as a %s argument in a printf or sprintf, it | |
4556 | then has the value '*name', just so it prints out pretty. | |
4557 | .Sp | |
4558 | Even if you don't want to modify an array, this mechanism is useful for | |
4559 | passing multiple arrays in a single LIST, since normally the LIST mechanism | |
4560 | will merge all the array values so that you can't extract out the | |
4561 | individual arrays. | |
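A minimal sketch of passing two arrays in one LIST with the * mechanism follows. The subroutine and array names are hypothetical; the point is only that each array keeps its identity inside the subroutine and that the first one can change size.

```perl
# Hypothetical subroutine: appends the second array to the first.
sub append_all {
    # Each * value in @_ is bound to a local name, one per array.
    local(*dest, *src) = @_;
    push(@dest, @src);      # changes the size of the caller's array
}

@first  = (1, 2);
@second = (3, 4, 5);
&append_all(*first, *second);
# @first now holds all five values; @second is untouched.
```

Had the arrays been passed as (@first, @second), the subroutine would have received one merged list of five scalars with no way to tell where the first array ended.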
4562 | .Sh "Regular Expressions" | |
4563 | The patterns used in pattern matching are regular expressions such as | |
4564 | those supplied in the Version 8 regexp routines. | |
4565 | (In fact, the routines are derived from Henry Spencer's freely redistributable | |
4566 | reimplementation of the V8 routines.) | |
4567 | In addition, \ew matches an alphanumeric character (including \*(L"_\*(R") and \eW a nonalphanumeric. | |
4568 | Word boundaries may be matched by \eb, and non-boundaries by \eB. | |
4569 | A whitespace character is matched by \es, non-whitespace by \eS. | |
4570 | A numeric character is matched by \ed, non-numeric by \eD. | |
4571 | You may use \ew, \es and \ed within character classes. | |
4572 | Also, \en, \er, \ef, \et and \eNNN have their normal interpretations. | |
4573 | Within character classes \eb represents backspace rather than a word boundary. | |
4574 | Alternatives may be separated by |. | |
4575 | The bracketing construct \|(\ .\|.\|.\ \|) may also be used, in which case \e<digit> | |
4576 | matches the digit'th substring. | |
4577 | (Outside of the pattern, always use $ instead of \e in front of the digit. | |
4578 | The scope of $<digit> (and $\`, $& and $\') | |
4579 | extends to the end of the enclosing BLOCK or eval string, or to | |
4580 | the next pattern match with subexpressions. | |
4581 | The \e<digit> notation sometimes works outside the current pattern, but should | |
4582 | not be relied upon.) | |
4583 | You may have as many parentheses as you wish. If you have more than 9 | |
4584 | substrings, the variables $10, $11, ... refer to the corresponding | |
4585 | substring. Within the pattern, \e10, \e11, | |
4586 | etc. refer back to substrings if there have been at least that many left parens | |
4587 | before the backreference. Otherwise (for backward compatibility) \e10 | |
4588 | is the same as \e010, a backspace, | |
4589 | and \e11 the same as \e011, a tab. | |
4590 | And so on. | |
4591 | (\e1 through \e9 are always backreferences.) | |
4592 | .PP | |
4593 | $+ returns whatever the last bracket match matched. | |
4594 | $& returns the entire matched string. | |
4595 | ($0 used to return the same thing, but not any more.) | |
4596 | $\` returns everything before the matched string. | |
4597 | $\' returns everything after the matched string. | |
4598 | Examples: | |
4599 | .nf | |
4600 | ||
4601 | s/\|^\|([^ \|]*\|) \|*([^ \|]*\|)\|/\|$2 $1\|/; # swap first two words | |
4602 | ||
4603 | .ne 5 | |
4604 | if (/\|Time: \|(.\|.\|):\|(.\|.\|):\|(.\|.\|)\|/\|) { | |
4605 | $hours = $1; | |
4606 | $minutes = $2; | |
4607 | $seconds = $3; | |
4608 | } | |
4609 | ||
4610 | .fi | |
4611 | By default, the ^ character is only guaranteed to match at the beginning | |
4612 | of the string, | |
4613 | the $ character only at the end (or before the newline at the end) | |
4614 | and | |
4615 | .I perl | |
4616 | does certain optimizations with the assumption that the string contains | |
4617 | only one line. | |
4618 | The behavior of ^ and $ on embedded newlines will be inconsistent. | |
4619 | You may, however, wish to treat a string as a multi-line buffer, such that | |
4620 | the ^ will match after any newline within the string, and $ will match | |
4621 | before any newline. | |
4622 | At the cost of a little more overhead, you can do this by setting the variable | |
4623 | $* to 1. | |
4624 | Setting it back to 0 makes | |
4625 | .I perl | |
4626 | revert to its old behavior. | |
4627 | .PP | |
4628 | To facilitate multi-line substitutions, the . character never matches a newline | |
4629 | (even when $* is 0). | |
4630 | In particular, the following leaves a newline on the $_ string: | |
4631 | .nf | |
4632 | ||
4633 | $_ = <STDIN>; | |
4634 | s/.*(some_string).*/$1/; | |
4635 | ||
4636 | If the newline is unwanted, try one of | |
4637 | ||
4638 | s/.*(some_string).*\en/$1/; | |
4639 | s/.*(some_string)[^\e000]*/$1/; | |
4640 | s/.*(some_string)(.|\en)*/$1/; | |
4641 | chop; s/.*(some_string).*/$1/; | |
4642 | /(some_string)/ && ($_ = $1); | |
4643 | ||
4644 | .fi | |
4645 | Any item of a regular expression may be followed with digits in curly brackets | |
4646 | of the form {n,m}, where n gives the minimum number of times to match the item | |
4647 | and m gives the maximum. | |
4648 | The form {n} is equivalent to {n,n} and matches exactly n times. | |
4649 | The form {n,} matches n or more times. | |
4650 | (If a curly bracket occurs in any other context, it is treated as a regular | |
4651 | character.) | |
4652 | The * modifier is equivalent to {0,}, the + modifier to {1,} and the ? modifier | |
4653 | to {0,1}. | |
4654 | There is no limit to the size of n or m, but large numbers will chew up | |
4655 | more memory. | |
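The curly bracket forms can be tried out with a few throwaway matches. This is only a sketch; the strings and variable names are arbitrary.

```perl
# {n} matches exactly n times.
$exact    = ('aaa'     =~ /^a{3}$/)   ? 1 : 0;

# {n,m} matches between n and m times.
$in_range = ('aaaa'    =~ /^a{2,4}$/) ? 1 : 0;
$too_many = ('aaaaa'   =~ /^a{2,4}$/) ? 1 : 0;

# {n,} matches n or more times.
$open_end = ('aaaaaaa' =~ /^a{2,}$/)  ? 1 : 0;

# ? behaves like {0,1}: both patterns accept either spelling.
$with_q   = ('color'   =~ /^colou?r$/)     ? 1 : 0;
$with_01  = ('color'   =~ /^colou{0,1}r$/) ? 1 : 0;
```

Every variable except $too_many should end up as 1; five a's exceed the {2,4} maximum, and the anchors keep the match from succeeding on a shorter substring.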
4656 | .Sp | |
4657 | You will note that all backslashed metacharacters in | |
4658 | .I perl | |
4659 | are alphanumeric, | |
4660 | such as \eb, \ew, \en. | |
4661 | Unlike some other regular expression languages, there are no backslashed | |
4662 | symbols that aren't alphanumeric. | |
4663 | So anything that looks like \e\e, \e(, \e), \e<, \e>, \e{, or \e} is always | |
4664 | interpreted as a literal character, not a metacharacter. | |
4665 | This makes it simple to quote a string that you want to use for a pattern | |
4666 | but that you are afraid might contain metacharacters. | |
4667 | Simply quote all the non-alphanumeric characters: | |
4668 | .nf | |
4669 | ||
4670 | $pattern =~ s/(\eW)/\e\e$1/g; | |
4671 | ||
4672 | .fi | |
4673 | .Sh "Formats" | |
4674 | Output record formats for use with the | |
4675 | .I write | |
4676 | operator may be declared as follows: | |
4677 | .nf | |
4678 | ||
4679 | .ne 3 | |
4680 | format NAME = | |
4681 | FORMLIST | |
4682 | . | |
4683 | ||
4684 | .fi | |
4685 | If name is omitted, format \*(L"STDOUT\*(R" is defined. | |
4686 | FORMLIST consists of a sequence of lines, each of which may be of one of three | |
4687 | types: | |
4688 | .Ip 1. 4 | |
4689 | A comment. | |
4690 | .Ip 2. 4 | |
4691 | A \*(L"picture\*(R" line giving the format for one output line. | |
4692 | .Ip 3. 4 | |
4693 | An argument line supplying values to plug into a picture line. | |
4694 | .PP | |
4695 | Picture lines are printed exactly as they look, except for certain fields | |
4696 | that substitute values into the line. | |
4697 | Each picture field starts with either @ or ^. | |
4698 | The @ field (not to be confused with the array marker @) is the normal | |
4699 | case; ^ fields are used | |
4700 | to do rudimentary multi-line text block filling. | |
4701 | The length of the field is supplied by padding out the field | |
4702 | with multiple <, >, or | characters to specify, respectively, left justification, | |
4703 | right justification, or centering. | |
4704 | As an alternate form of right justification, | |
4705 | you may also use # characters (with an optional .) to specify a numeric field. | |
4706 | (Use of ^ instead of @ causes the field to be blanked if undefined.) | |
4707 | If any of the values supplied for these fields contains a newline, only | |
4708 | the text up to the newline is printed. | |
4709 | The special field @* can be used for printing multi-line values. | |
4710 | It should appear by itself on a line. | |
4711 | .PP | |
4712 | The values are specified on the following line, in the same order as | |
4713 | the picture fields. | |
4714 | The values should be separated by commas. | |
4715 | .PP | |
4716 | Picture fields that begin with ^ rather than @ are treated specially. | |
4717 | The value supplied must be a scalar variable name which contains a text | |
4718 | string. | |
4719 | .I Perl | |
4720 | puts as much text as it can into the field, and then chops off the front | |
4721 | of the string so that the next time the variable is referenced, | |
4722 | more of the text can be printed. | |
4723 | Normally you would use a sequence of fields in a vertical stack to print | |
4724 | out a block of text. | |
4725 | If you like, you can end the final field with .\|.\|., which will appear in the | |
4726 | output if the text was too long to appear in its entirety. | |
4727 | You can change which characters are legal to break on by changing the | |
4728 | variable $: to a list of the desired characters. | |
4729 | .PP | |
4730 | Since use of ^ fields can produce variable length records if the text to be | |
4731 | formatted is short, you can suppress blank lines by putting the tilde (~) | |
4732 | character anywhere in the line. | |
4733 | (Normally you should put it in the front if possible, for visibility.) | |
4734 | The tilde will be translated to a space upon output. | |
4735 | If you put a second tilde contiguous to the first, the line will be repeated | |
4736 | until all the fields on the line are exhausted. | |
4737 | (If you use a field of the @ variety, the expression you supply had better | |
4738 | not give the same value every time forever!) | |
4739 | .PP | |
4740 | Examples: | |
4741 | .nf | |
4742 | .lg 0 | |
4743 | .cs R 25 | |
4744 | .ft C | |
4745 | ||
4746 | .ne 10 | |
4747 | # a report on the /etc/passwd file | |
4748 | format STDOUT_TOP = | |
4749 | \& Passwd File | |
4750 | Name Login Office Uid Gid Home | |
4751 | ------------------------------------------------------------------ | |
4752 | \&. | |
4753 | format STDOUT = | |
4754 | @<<<<<<<<<<<<<<<<<< @||||||| @<<<<<<@>>>> @>>>> @<<<<<<<<<<<<<<<<< | |
4755 | $name, $login, $office,$uid,$gid, $home | |
4756 | \&. | |
4757 | ||
4758 | .ne 29 | |
4759 | # a report from a bug report form | |
4760 | format STDOUT_TOP = | |
4761 | \& Bug Reports | |
4762 | @<<<<<<<<<<<<<<<<<<<<<<< @||| @>>>>>>>>>>>>>>>>>>>>>>> | |
4763 | $system, $%, $date | |
4764 | ------------------------------------------------------------------ | |
4765 | \&. | |
4766 | format STDOUT = | |
4767 | Subject: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< | |
4768 | \& $subject | |
4769 | Index: @<<<<<<<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<< | |
4770 | \& $index, $description | |
4771 | Priority: @<<<<<<<<<< Date: @<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<< | |
4772 | \& $priority, $date, $description | |
4773 | From: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<< | |
4774 | \& $from, $description | |
4775 | Assigned to: @<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<< | |
4776 | \& $programmer, $description | |
4777 | \&~ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<< | |
4778 | \& $description | |
4779 | \&~ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<< | |
4780 | \& $description | |
4781 | \&~ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<< | |
4782 | \& $description | |
4783 | \&~ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<< | |
4784 | \& $description | |
4785 | \&~ ^<<<<<<<<<<<<<<<<<<<<<<<... | |
4786 | \& $description | |
4787 | \&. | |
4788 | ||
4789 | .ft R | |
4790 | .cs R | |
4791 | .lg | |
4792 | .fi | |
4793 | It is possible to intermix prints with writes on the same output channel, | |
4794 | but you'll have to handle $\- (lines left on the page) yourself. | |
4795 | .PP | |
4796 | If you are printing lots of fields that are usually blank, you should consider | |
4797 | using the reset operator between records. | |
4798 | Not only is it more efficient, but it can prevent the bug of adding another | |
4799 | field and forgetting to zero it. | |
4800 | .Sh "Interprocess Communication" | |
4801 | The IPC facilities of perl are built on the Berkeley socket mechanism. | |
4802 | If you don't have sockets, you can ignore this section. | |
4803 | The calls have the same names as the corresponding system calls, | |
4804 | but the arguments tend to differ, for two reasons. | |
4805 | First, perl file handles work differently than C file descriptors. | |
4806 | Second, perl already knows the length of its strings, so you don't need | |
4807 | to pass that information. | |
4808 | Here is a sample client (untested): | |
4809 | .nf | |
4810 | ||
4811 | ($them,$port) = @ARGV; | |
4812 | $port = 2345 unless $port; | |
4813 | $them = 'localhost' unless $them; | |
4814 | ||
4815 | $SIG{'INT'} = 'dokill'; | |
4816 | sub dokill { kill 9,$child if $child; } | |
4817 | ||
4818 | require 'sys/socket.ph'; | |
4819 | ||
4820 | $sockaddr = 'S n a4 x8'; | |
4821 | chop($hostname = `hostname`); | |
4822 | ||
4823 | ($name, $aliases, $proto) = getprotobyname('tcp'); | |
4824 | ($name, $aliases, $port) = getservbyname($port, 'tcp') | |
4825 | unless $port =~ /^\ed+$/; | |
4826 | .ie t \{\ | |
4827 | ($name, $aliases, $type, $len, $thisaddr) = gethostbyname($hostname); | |
4828 | 'br\} | |
4829 | .el \{\ | |
4830 | ($name, $aliases, $type, $len, $thisaddr) = | |
4831 | gethostbyname($hostname); | |
4832 | 'br\} | |
4833 | ($name, $aliases, $type, $len, $thataddr) = gethostbyname($them); | |
4834 | ||
4835 | $this = pack($sockaddr, &AF_INET, 0, $thisaddr); | |
4836 | $that = pack($sockaddr, &AF_INET, $port, $thataddr); | |
4837 | ||
4838 | socket(S, &PF_INET, &SOCK_STREAM, $proto) || die "socket: $!"; | |
4839 | bind(S, $this) || die "bind: $!"; | |
4840 | connect(S, $that) || die "connect: $!"; | |
4841 | ||
4842 | select(S); $| = 1; select(stdout); | |
4843 | ||
4844 | if ($child = fork) { | |
4845 | while (<>) { | |
4846 | print S; | |
4847 | } | |
4848 | sleep 3; | |
4849 | do dokill(); | |
4850 | } | |
4851 | else { | |
4852 | while (<S>) { | |
4853 | print; | |
4854 | } | |
4855 | } | |
4856 | ||
4857 | .fi | |
4858 | And here's a server: | |
4859 | .nf | |
4860 | ||
4861 | ($port) = @ARGV; | |
4862 | $port = 2345 unless $port; | |
4863 | ||
4864 | require 'sys/socket.ph'; | |
4865 | ||
4866 | $sockaddr = 'S n a4 x8'; | |
4867 | ||
4868 | ($name, $aliases, $proto) = getprotobyname('tcp'); | |
4869 | ($name, $aliases, $port) = getservbyname($port, 'tcp') | |
4870 | unless $port =~ /^\ed+$/; | |
4871 | ||
4872 | $this = pack($sockaddr, &AF_INET, $port, "\e0\e0\e0\e0"); | |
4873 | ||
4874 | select(NS); $| = 1; select(stdout); | |
4875 | ||
4876 | socket(S, &PF_INET, &SOCK_STREAM, $proto) || die "socket: $!"; | |
4877 | bind(S, $this) || die "bind: $!"; | |
4878 | listen(S, 5) || die "listen: $!"; | |
4879 | ||
4880 | select(S); $| = 1; select(stdout); | |
4881 | ||
4882 | for (;;) { | |
4883 | print "Listening again\en"; | |
4884 | ($addr = accept(NS,S)) || die $!; | |
4885 | print "accept ok\en"; | |
4886 | ||
4887 | ($af,$port,$inetaddr) = unpack($sockaddr,$addr); | |
4888 | @inetaddr = unpack('C4',$inetaddr); | |
4889 | print "$af $port @inetaddr\en"; | |
4890 | ||
4891 | while (<NS>) { | |
4892 | print; | |
4893 | print NS; | |
4894 | } | |
4895 | } | |
4896 | ||
4897 | .fi | |
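The $sockaddr template used by both samples can be exercised without opening a socket. The sketch below packs and unpacks an address by hand. The value 2 for AF_INET is an assumption that holds on many systems (the samples above get the real value from sys/socket.ph), and the port and address are arbitrary.

```perl
$sockaddr = 'S n a4 x8';   # family, port (network order), 4 address bytes, 8 pad bytes
$af_inet  = 2;             # ASSUMED value of AF_INET; check sys/socket.ph

# Build the 4-byte address for 127.0.0.1 and pack the whole structure.
$inetaddr = pack('C4', 127, 0, 0, 1);
$packed   = pack($sockaddr, $af_inet, 2345, $inetaddr);

# Take it apart again, the same way the server does after accept().
($af, $port, $raw) = unpack($sockaddr, $packed);
@octets = unpack('C4', $raw);
$dotted = join('.', @octets);
```

The packed string should be 16 bytes long (2 + 2 + 4 + 8), and $dotted should read 127.0.0.1 after the round trip.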
4898 | .Sh "Predefined Names" | |
4899 | The following names have special meaning to | |
4900 | .IR perl . | |
4901 | I could have used alphabetic symbols for some of these, but I didn't want | |
4902 | to take the chance that someone would say reset \*(L"a\-zA\-Z\*(R" and wipe them all | |
4903 | out. | |
4904 | You'll just have to suffer along with these silly symbols. | |
4905 | Most of them have reasonable mnemonics, or analogues in one of the shells. | |
4906 | .Ip $_ 8 | |
4907 | The default input and pattern-searching space. | |
4908 | The following pairs are equivalent: | |
4909 | .nf | |
4910 | ||
4911 | .ne 2 | |
4912 | while (<>) {\|.\|.\|. # only equivalent in while! | |
4913 | while ($_ = <>) {\|.\|.\|. | |
4914 | ||
4915 | .ne 2 | |
4916 | /\|^Subject:/ | |
4917 | $_ \|=~ \|/\|^Subject:/ | |
4918 | ||
4919 | .ne 2 | |
4920 | y/a\-z/A\-Z/ | |
4921 | $_ =~ y/a\-z/A\-Z/ | |
4922 | ||
4923 | .ne 2 | |
4924 | chop | |
4925 | chop($_) | |
4926 | ||
4927 | .fi | |
4928 | (Mnemonic: underline is understood in certain operations.) | |
4929 | .Ip $. 8 | |
4930 | The current input line number of the last filehandle that was read. | |
4931 | Readonly. | |
4932 | Remember that only an explicit close on the filehandle resets the line number. | |
4933 | Since <> never does an explicit close, line numbers increase across ARGV files | |
4934 | (but see examples under eof). | |
4935 | (Mnemonic: many programs use . to mean the current line number.) | |
4936 | .Ip $/ 8 | |
4937 | The input record separator, newline by default. | |
4938 | Works like | |
4939 | .IR awk 's | |
4940 | RS variable, including treating blank lines as delimiters | |
4941 | if set to the null string. | |
4942 | You may set it to a multi-character string to match a multi-character | |
4943 | delimiter. | |
4944 | Note that setting it to "\en\en" means something slightly different | |
4945 | than setting it to "", if the file contains consecutive blank lines. | |
4946 | Setting it to "" will treat two or more consecutive blank lines as a single | |
4947 | blank line. | |
4948 | Setting it to "\en\en" will blindly assume that the next input character | |
4949 | belongs to the next paragraph, even if it's a newline. | |
4950 | (Mnemonic: / is used to delimit line boundaries when quoting poetry.) | |
4951 | .Ip $, 8 | |
4952 | The output field separator for the print operator. | |
4953 | Ordinarily the print operator simply prints out the comma separated fields | |
4954 | you specify. | |
4955 | In order to get behavior more like | |
4956 | .IR awk , | |
4957 | set this variable as you would set | |
4958 | .IR awk 's | |
4959 | OFS variable to specify what is printed between fields. | |
4960 | (Mnemonic: what is printed when there is a , in your print statement.) | |
4961 | .Ip $"" 8 | |
4962 | This is like $, except that it applies to array values interpolated into | |
4963 | a double-quoted string (or similar interpreted string). | |
4964 | Default is a space. | |
4965 | (Mnemonic: obvious, I think.) | |
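A short sketch of the effect, using a made-up array; the separator is changed only inside a bare block so that the default returns afterwards.

```perl
@parts = ('usr', 'local', 'lib');

{
    local($") = '/';       # separator for interpolated arrays, this block only
    $path = "@parts";      # elements joined with /
}

$plain = "@parts";         # back to the default single space
```

After this runs, $path should hold usr/local/lib and $plain should hold the same words separated by spaces.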
4966 | .Ip $\e 8 | |
4967 | The output record separator for the print operator. | |
4968 | Ordinarily the print operator simply prints out the comma separated fields | |
4969 | you specify, with no trailing newline or record separator assumed. | |
4970 | In order to get behavior more like | |
4971 | .IR awk , | |
4972 | set this variable as you would set | |
4973 | .IR awk 's | |
4974 | ORS variable to specify what is printed at the end of the print. | |
4975 | (Mnemonic: you set $\e instead of adding \en at the end of the print. | |
4976 | Also, it's just like /, but it's what you get \*(L"back\*(R" from | |
4977 | .IR perl .) | |
4978 | .Ip $# 8 | |
4979 | The output format for printed numbers. | |
4980 | This variable is a half-hearted attempt to emulate | |
4981 | .IR awk 's | |
4982 | OFMT variable. | |
4983 | There are times, however, when | |
4984 | .I awk | |
4985 | and | |
4986 | .I perl | |
4987 | have differing notions of what | |
4988 | is in fact numeric. | |
4989 | Also, the initial value is %.20g rather than %.6g, so you need to set $# | |
4990 | explicitly to get | |
4991 | .IR awk 's | |
4992 | value. | |
4993 | (Mnemonic: # is the number sign.) | |
4994 | .Ip $% 8 | |
4995 | The current page number of the currently selected output channel. | |
4996 | (Mnemonic: % is page number in nroff.) | |
4997 | .Ip $= 8 | |
4998 | The current page length (printable lines) of the currently selected output | |
4999 | channel. | |
5000 | Default is 60. | |
5001 | (Mnemonic: = has horizontal lines.) | |
5002 | .Ip $\- 8 | |
5003 | The number of lines left on the page of the currently selected output channel. | |
5004 | (Mnemonic: lines_on_page \- lines_printed.) | |
5005 | .Ip $~ 8 | |
5006 | The name of the current report format for the currently selected output | |
5007 | channel. | |
5008 | Default is name of the filehandle. | |
5009 | (Mnemonic: brother to $^.) | |
5010 | .Ip $^ 8 | |
5011 | The name of the current top-of-page format for the currently selected output | |
5012 | channel. | |
5013 | Default is name of the filehandle with \*(L"_TOP\*(R" appended. | |
5014 | (Mnemonic: points to top of page.) | |
5015 | .Ip $| 8 | |
5016 | If set to nonzero, forces a flush after every write or print on the currently | |
5017 | selected output channel. | |
5018 | Default is 0. | |
5019 | Note that | |
5020 | .I STDOUT | |
5021 | will typically be line buffered if output is to the | |
5022 | terminal and block buffered otherwise. | |
5023 | Setting this variable is useful primarily when you are outputting to a pipe, | |
5024 | such as when you are running a | |
5025 | .I perl | |
5026 | script under rsh and want to see the | |
5027 | output as it's happening. | |
5028 | (Mnemonic: when you want your pipes to be piping hot.) | |
5029 | .Ip $$ 8 | |
5030 | The process number of the | |
5031 | .I perl | |
5032 | running this script. | |
5033 | (Mnemonic: same as shells.) | |
5034 | .Ip $? 8 | |
5035 | The status returned by the last pipe close, backtick (\`\`) command or | |
5036 | .I system | |
5037 | operator. | |
5038 | Note that this is the status word returned by the wait() system | |
5039 | call, so the exit value of the subprocess is actually ($? >> 8). | |
5040 | $? & 255 gives which signal, if any, the process died from, and whether | |
5041 | there was a core dump. | |
5042 | (Mnemonic: similar to sh and ksh.) | |
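As a sketch of pulling the status word apart, the following runs a child that exits with a known value. It assumes a Bourne shell named sh can be found on the PATH.

```perl
# Run a child that exits with status 3 (assumes sh is available).
system('sh', '-c', 'exit 3');

$status     = $?;             # the raw wait() status word
$exit_value = $status >> 8;   # high byte: the child's exit value
$low_byte   = $status & 255;  # low byte: signal number and core dump flag
```

For a child that exits normally, $low_byte should be 0 and $exit_value should be the value passed to exit, here 3.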
5043 | .Ip $& 8 4 | |
5044 | The string matched by the last successful pattern match | |
5045 | (not counting any matches hidden | |
5046 | within a BLOCK or eval enclosed by the current BLOCK). | |
5047 | (Mnemonic: like & in some editors.) | |
5048 | .Ip $\` 8 4 | |
5049 | The string preceding whatever was matched by the last successful pattern match | |
5050 | (not counting any matches hidden within a BLOCK or eval enclosed by the current | |
5051 | BLOCK). | |
5052 | (Mnemonic: \` often precedes a quoted string.) | |
5053 | .Ip $\' 8 4 | |
5054 | The string following whatever was matched by the last successful pattern match | |
5055 | (not counting any matches hidden within a BLOCK or eval enclosed by the current | |
5056 | BLOCK). | |
5057 | (Mnemonic: \' often follows a quoted string.) | |
5058 | Example: | |
5059 | .nf | |
5060 | ||
5061 | .ne 3 | |
5062 | $_ = \'abcdefghi\'; | |
5063 | /def/; | |
5064 | print "$\`:$&:$\'\en"; # prints abc:def:ghi | |
5065 | ||
5066 | .fi | |
5067 | .Ip $+ 8 4 | |
5068 | The last bracket matched by the last search pattern. | |
5069 | This is useful if you don't know which of a set of alternative patterns | |
5070 | matched. | |
5071 | For example: | |
5072 | .nf | |
5073 | ||
5074 | /Version: \|(.*\|)|Revision: \|(.*\|)\|/ \|&& \|($rev = $+); | |
5075 | ||
5076 | .fi | |
5077 | (Mnemonic: be positive and forward looking.) | |
5078 | .Ip $* 8 2 | |
5079 | Set to 1 to do multiline matching within a string, 0 to tell | |
5080 | .I perl | |
5081 | that it can assume that strings contain a single line, for the purpose | |
5082 | of optimizing pattern matches. | |
5083 | Pattern matches on strings containing multiple newlines can produce confusing | |
5084 | results when $* is 0. | |
5085 | Default is 0. | |
5086 | (Mnemonic: * matches multiple things.) | |
5087 | Note that this variable only influences the interpretation of ^ and $. | |
5088 | A literal newline can be searched for even when $* == 0. | |
5089 | .Ip $0 8 | |
5090 | Contains the name of the file containing the | |
5091 | .I perl | |
5092 | script being executed. | |
5093 | Assigning to $0 modifies the argument area that the ps(1) program sees. | |
5094 | (Mnemonic: same as sh and ksh.) | |
5095 | .Ip $<digit> 8 | |
5096 | Contains the subpattern from the corresponding set of parentheses in the last | |
5097 | pattern matched, not counting patterns matched in nested blocks that have | |
5098 | been exited already. | |
5099 | (Mnemonic: like \edigit.) | |
5100 | .Ip $[ 8 2 | |
5101 | The index of the first element in an array, and of the first character in | |
5102 | a substring. | |
5103 | Default is 0, but you could set it to 1 to make | |
5104 | .I perl | |
5105 | behave more like | |
5106 | .I awk | |
5107 | (or Fortran) | |
5108 | when subscripting and when evaluating the index() and substr() functions. | |
5109 | (Mnemonic: [ begins subscripts.) | |
5110 | .Ip $] 8 2 | |
5111 | The string printed out when you say \*(L"perl -v\*(R". | |
5112 | It can be used to determine at the beginning of a script whether the perl | |
5113 | interpreter executing the script is in the right range of versions. | |
5114 | If used in a numeric context, returns the version + patchlevel / 1000. | |
5115 | Example: | |
5116 | .nf | |
5117 | ||
5118 | .ne 8 | |
5119 | # see if getc is available | |
5120 | ($version,$patchlevel) = | |
5121 | $] =~ /(\ed+\e.\ed+).*\enPatch level: (\ed+)/; | |
5122 | print STDERR "(No filename completion available.)\en" | |
5123 | if $version * 1000 + $patchlevel < 2016; | |
5124 | ||
5125 | or, used numerically, | |
5126 | ||
5127 | warn "No checksumming!\en" if $] < 3.019; | |
5128 | ||
5129 | .fi | |
5130 | (Mnemonic: Is this version of perl in the right bracket?) | |
5131 | .Ip $; 8 2 | |
5132 | The subscript separator for multi-dimensional array emulation. | |
5133 | If you refer to an associative array element as | |
5134 | .nf | |
5135 | $foo{$a,$b,$c} | |
5136 | ||
5137 | it really means | |
5138 | ||
5139 | $foo{join($;, $a, $b, $c)} | |
5140 | ||
5141 | But don't put | |
5142 | ||
5143 | @foo{$a,$b,$c} # a slice\*(--note the @ | |
5144 | ||
5145 | which means | |
5146 | ||
5147 | ($foo{$a},$foo{$b},$foo{$c}) | |
5148 | ||
5149 | .fi | |
5150 | Default is "\e034", the same as SUBSEP in | |
5151 | .IR awk . | |
5152 | Note that if your keys contain binary data there might not be any safe | |
5153 | value for $;. | |
5154 | (Mnemonic: comma (the syntactic subscript separator) is a semi-semicolon. | |
5155 | Yeah, I know, it's pretty lame, but $, is already taken for something more | |
5156 | important.) | |
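The emulation can be checked directly, as in this sketch with an arbitrary array name. It assumes the subscripts themselves do not contain the $; character.

```perl
# Store under a two-part subscript.
$cell{2, 3} = 'x';

# The same element, reached through the joined key.
$key  = join($;, 2, 3);
$same = $cell{$key};

# The parts can be recovered by splitting on the separator.
($row, $col) = split(/$;/, $key);
```

$same should be x, and $row and $col should come back as 2 and 3.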
5157 | .Ip $! 8 2 | |
5158 | If used in a numeric context, yields the current value of errno, with all the | |
5159 | usual caveats. | |
5160 | (This means that you shouldn't depend on the value of $! to be anything | |
5161 | in particular unless you've gotten a specific error return indicating a | |
5162 | system error.) | |
5163 | If used in a string context, yields the corresponding system error string. | |
5164 | You can assign to $! in order to set errno | |
5165 | if, for instance, you want $! to return the string for error n, or you want | |
5166 | to set the exit value for the die operator. | |
5167 | (Mnemonic: What just went bang?) | |
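A brief sketch of the two contexts: assigning an error number and reading it back both ways. Error 2 is used only as an example; the text of the message depends on the system.

```perl
$! = 2;                # set errno by hand

$errno   = $! + 0;     # numeric context: the error number
$message = "$!";       # string context: the system's text for that error
```

On many systems the message for error 2 is "No such file or directory", but the exact wording is not guaranteed.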
5168 | .Ip $@ 8 2 | |
5169 | The perl syntax error message from the last eval command. | |
5170 | If null, the last eval parsed and executed correctly (although the operations | |
5171 | you invoked may have failed in the normal fashion). | |
5172 | (Mnemonic: Where was the syntax error \*(L"at\*(R"?) | |
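The following sketch evaluates one good string and one deliberately broken one, saving $@ after each. The variable names are arbitrary.

```perl
# A string that parses and runs: $@ is left null.
$sum        = eval '1 + 2';
$ok_message = $@;

# A string with a deliberate syntax error: $@ gets the message.
eval '1 + ;';
$bad_message = $@;
```

$ok_message should be null and $sum should be 3, while $bad_message should hold the text of the syntax error.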
5173 | .Ip $< 8 2 | |
5174 | The real uid of this process. | |
5175 | (Mnemonic: it's the uid you came FROM, if you're running setuid.) | |
5176 | .Ip $> 8 2 | |
5177 | The effective uid of this process. | |
5178 | Example: | |
5179 | .nf | |
5180 | ||
5181 | .ne 2 | |
5182 | $< = $>; # set real uid to the effective uid | |
5183 | ($<,$>) = ($>,$<); # swap real and effective uid | |
5184 | ||
5185 | .fi | |
5186 | (Mnemonic: it's the uid you went TO, if you're running setuid.) | |
5187 | Note: $< and $> can only be swapped on machines supporting setreuid(). | |
5188 | .Ip $( 8 2 | |
5189 | The real gid of this process. | |
5190 | If you are on a machine that supports membership in multiple groups | |
5191 | simultaneously, gives a space separated list of groups you are in. | |
5192 | The first number is the one returned by getgid(), and the subsequent ones | |
5193 | by getgroups(), one of which may be the same as the first number. | |
5194 | (Mnemonic: parentheses are used to GROUP things. | |
5195 | The real gid is the group you LEFT, if you're running setgid.) | |
5196 | .Ip $) 8 2 | |
5197 | The effective gid of this process. | |
5198 | If you are on a machine that supports membership in multiple groups | |
5199 | simultaneously, gives a space separated list of groups you are in. | |
5200 | The first number is the one returned by getegid(), and the subsequent ones | |
5201 | by getgroups(), one of which may be the same as the first number. | |
5202 | (Mnemonic: parentheses are used to GROUP things. | |
5203 | The effective gid is the group that's RIGHT for you, if you're running setgid.) | |
5204 | .Sp | |
5205 | Note: $<, $>, $( and $) can only be set on machines that support the | |
5206 | corresponding set[re][ug]id() routine. | |
5207 | $( and $) can only be swapped on machines supporting setregid(). | |
5208 | .Ip $: 8 2 | |
5209 | The current set of characters after which a string may be broken to | |
5210 | fill continuation fields (starting with ^) in a format. | |
5211 | Default is "\ \en-", to break on whitespace or hyphens. | |
5212 | (Mnemonic: a \*(L"colon\*(R" in poetry is a part of a line.) | |
5213 | .Ip $^D 8 2 | |
5214 | The current value of the debugging flags. | |
5215 | (Mnemonic: value of | |
5216 | .B \-D | |
5217 | switch.) | |
5218 | .Ip $^F 8 2 | |
5219 | The maximum system file descriptor, ordinarily 2. System file descriptors | |
5220 | are passed to subprocesses, while higher file descriptors are not. | |
5221 | During an open, system file descriptors are preserved even if the open | |
5222 | fails. Ordinary file descriptors are closed before the open is attempted. | |
5223 | .Ip $^I 8 2 | |
5224 | The current value of the inplace-edit extension. | |
5225 | Use undef to disable inplace editing. | |
5226 | (Mnemonic: value of | |
5227 | .B \-i | |
5228 | switch.) | |
5229 | .Ip $^L 8 2 | |
5230 | The string that formats output to perform a formfeed. Default is \ef. | |
5231 | .Ip $^P 8 2 | |
5232 | The internal flag that the debugger clears so that it doesn't | |
5233 | debug itself. You could conceivably disable debugging yourself | |
5234 | by clearing it. | |
5235 | .Ip $^T 8 2 | |
5236 | The time at which the script began running, in seconds since the epoch. | |
5237 | The values returned by the | |
5238 | .B \-M , | |
5239 | .B \-A | |
5240 | and | |
5241 | .B \-C | |
5242 | filetests are based on this value. | |
5243 | .Ip $^W 8 2 | |
5244 | The current value of the warning switch. | |
5245 | (Mnemonic: related to the | |
5246 | .B \-w | |
5247 | switch.) | |
5248 | .Ip $^X 8 2 | |
5249 | The name that Perl itself was executed as, from argv[0]. | |
5250 | .Ip $ARGV 8 3 | |
5251 | Contains the name of the current file when reading from <>. | |
5252 | .Ip @ARGV 8 3 | |
5253 | The array ARGV contains the command line arguments intended for the script. | |
5254 | Note that $#ARGV is generally the number of arguments minus one, since | |
5255 | $ARGV[0] is the first argument, NOT the command name. | |
5256 | See $0 for the command name. | |
5257 | .Ip @INC 8 3 | |
5258 | The array INC contains the list of places to look for | |
5259 | .I perl | |
5260 | scripts to be | |
5261 | evaluated by the \*(L"do EXPR\*(R" command or the \*(L"require\*(R" command. | |
5262 | It initially consists of the arguments to any | |
5263 | .B \-I | |
5264 | command line switches, followed | |
5265 | by the default | |
5266 | .I perl | |
5267 | library, probably \*(L"/usr/local/lib/perl\*(R", | |
5268 | followed by \*(L".\*(R", to represent the current directory. | |
5269 | .Ip %INC 8 3 | |
5270 | The associative array INC contains entries for each filename that has | |
5271 | been included via \*(L"do\*(R" or \*(L"require\*(R". | |
5272 | The key is the filename you specified, and the value is the location of | |
5273 | the file actually found. | |
5274 | The \*(L"require\*(R" command uses this array to determine whether | |
5275 | a given file has already been included. | |
5276 | .Ip $ENV{expr} 8 2 | |
5277 | The associative array ENV contains your current environment. | |
5278 | Setting a value in ENV changes the environment for child processes. | |
5279 | .Ip $SIG{expr} 8 2 | |
5280 | The associative array SIG is used to set signal handlers for various signals. | |
5281 | Example: | |
5282 | .nf | |
5283 | ||
5284 | .ne 12 | |
5285 | sub handler { # 1st argument is signal name | |
5286 | local($sig) = @_; | |
5287 | print "Caught a SIG$sig\-\|\-shutting down\en"; | |
5288 | close(LOG); | |
5289 | exit(0); | |
5290 | } | |
5291 | ||
5292 | $SIG{\'INT\'} = \'handler\'; | |
5293 | $SIG{\'QUIT\'} = \'handler\'; | |
5294 | .\|.\|. | |
5295 | $SIG{\'INT\'} = \'DEFAULT\'; # restore default action | |
5296 | $SIG{\'QUIT\'} = \'IGNORE\'; # ignore SIGQUIT | |
5297 | ||
5298 | .fi | |
5299 | The SIG array only contains values for the signals actually set within | |
5300 | the perl script. | |
5301 | .Sh "Packages" | |
5302 | Perl provides a mechanism for alternate namespaces to protect packages from | |
5303 | stomping on each other's variables. | |
5304 | By default, a perl script starts compiling into the package known as \*(L"main\*(R". | |
5305 | By use of the | |
5306 | .I package | |
5307 | declaration, you can switch namespaces. | |
5308 | The scope of the package declaration is from the declaration itself to the end | |
5309 | of the enclosing block (the same scope as the local() operator). | |
5310 | Typically it would be the first declaration in a file to be included by | |
5311 | the \*(L"require\*(R" operator. | |
5312 | You can switch into a package in more than one place; it merely influences | |
5313 | which symbol table is used by the compiler for the rest of that block. | |
5314 | You can refer to variables and filehandles in other packages by prefixing | |
5315 | the identifier with the package name and a single quote. | |
5316 | If the package name is null, the \*(L"main\*(R" package is assumed. | |
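 | .PP | |
 | As a minimal sketch of the quote-qualified form described above (the package | |
 | name \*(L"counter\*(R" and the variable $count are hypothetical, chosen only | |
 | for illustration): | |
 | .nf | |
 | ||
 | .ne 5 | |
 |     package counter; | |
 |     $count = 3;                     # lives in package counter | |
 | ||
 |     package main; | |
 |     print $counter'count, "\en";    # reach it from main by qualifying the name | |
 | ||
 | .fi | |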
5317 | .PP | |
5318 | Only identifiers starting with letters are stored in the package's symbol | |
5319 | table. | |
5320 | All other symbols are kept in package \*(L"main\*(R". | |
5321 | In addition, the identifiers STDIN, STDOUT, STDERR, ARGV, ARGVOUT, ENV, INC | |
5322 | and SIG are forced to be in package \*(L"main\*(R", even when used for | |
5323 | other purposes than their built-in one. | |
5324 | Note also that, if you have a package called \*(L"m\*(R", \*(L"s\*(R" | |
5325 | or \*(L"y\*(R", then you can't use the qualified form of an identifier since it | |
5326 | will be interpreted instead as a pattern match, a substitution | |
5327 | or a translation. | |
5328 | .PP | |
5329 | Eval'ed strings are compiled in the package in which the eval was | |
5330 | compiled. | |
5331 | (Assignments to $SIG{}, however, assume the signal handler specified is in the | |
5332 | main package. | |
5333 | Qualify the signal handler name if you wish to have a signal handler in | |
5334 | a package.) | |
5335 | For an example, examine perldb.pl in the perl library. | |
5336 | It initially switches to the DB package so that the debugger doesn't interfere | |
5337 | with variables in the script you are trying to debug. | |
5338 | At various points, however, it temporarily switches back to the main package | |
5339 | to evaluate various expressions in the context of the main package. | |
5340 | .PP | |
5341 | The symbol table for a package happens to be stored in the associative array | |
5342 | of that name prepended with an underscore. | |
5343 | The value in each entry of the associative array is | |
5344 | what you are referring to when you use the *name notation. | |
5345 | In fact, the following have the same effect (in package main, anyway), | |
5346 | though the first is more | |
5347 | efficient because it does the symbol table lookups at compile time: | |
5348 | .nf | |
5349 | ||
5350 | .ne 2 | |
5351 | local(*foo) = *bar; | |
5352 | local($_main{'foo'}) = $_main{'bar'}; | |
5353 | ||
5354 | .fi | |
5355 | You can use this to print out all the variables in a package, for instance. | |
5356 | Here is dumpvar.pl from the perl library: | |
5357 | .nf | |
5358 | .ne 11 | |
5359 | package dumpvar; | |
5360 | ||
5361 | sub main'dumpvar { | |
5362 | \& ($package) = @_; | |
5363 | \& local(*stab) = eval("*_$package"); | |
5364 | \& while (($key,$val) = each(%stab)) { | |
5365 | \& { | |
5366 | \& local(*entry) = $val; | |
5367 | \& if (defined $entry) { | |
5368 | \& print "\e$$key = '$entry'\en"; | |
5369 | \& } | |
5370 | .ne 7 | |
5371 | \& if (defined @entry) { | |
5372 | \& print "\e@$key = (\en"; | |
5373 | \& foreach $num ($[ .. $#entry) { | |
5374 | \& print " $num\et'",$entry[$num],"'\en"; | |
5375 | \& } | |
5376 | \& print ")\en"; | |
5377 | \& } | |
5378 | .ne 10 | |
5379 | \& if ($key ne "_$package" && defined %entry) { | |
5380 | \& print "\e%$key = (\en"; | |
5381 | \& foreach $key (sort keys(%entry)) { | |
5382 | \& print " $key\et'",$entry{$key},"'\en"; | |
5383 | \& } | |
5384 | \& print ")\en"; | |
5385 | \& } | |
5386 | \& } | |
5387 | \& } | |
5388 | } | |
5389 | ||
5390 | .fi | |
5391 | Note that, even though the subroutine is compiled in package dumpvar, the | |
5392 | name of the subroutine is qualified so that its name is inserted into package | |
5393 | \*(L"main\*(R". | |
5394 | .Sh "Style" | |
5395 | Each programmer will, of course, have his or her own preferences in regard | |
5396 | to formatting, but there are some general guidelines that will make your | |
5397 | programs easier to read. | |
5398 | .Ip 1. 4 4 | |
5399 | Just because you CAN do something a particular way doesn't mean that | |
5400 | you SHOULD do it that way. | |
5401 | .I Perl | |
5402 | is designed to give you several ways to do anything, so consider picking | |
5403 | the most readable one. | |
5404 | For instance | |
5405 | ||
5406 | open(FOO,$foo) || die "Can't open $foo: $!"; | |
5407 | ||
5408 | is better than | |
5409 | ||
5410 | die "Can't open $foo: $!" unless open(FOO,$foo); | |
5411 | ||
5412 | because the second way hides the main point of the statement in a | |
5413 | modifier. | |
5414 | On the other hand | |
5415 | ||
5416 | print "Starting analysis\en" if $verbose; | |
5417 | ||
5418 | is better than | |
5419 | ||
5420 | $verbose && print "Starting analysis\en"; | |
5421 | ||
5422 | since the main point isn't whether the user typed -v or not. | |
5423 | .Sp | |
5424 | Similarly, just because an operator lets you assume default arguments | |
5425 | doesn't mean that you have to make use of the defaults. | |
5426 | The defaults are there for lazy systems programmers writing one-shot | |
5427 | programs. | |
5428 | If you want your program to be readable, consider supplying the argument. | |
5429 | .Sp | |
5430 | Along the same lines, just because you | |
5431 | .I can | |
5432 | omit parentheses in many places doesn't mean that you ought to: | |
5433 | .nf | |
5434 | ||
5435 | return print reverse sort num values array; | |
5436 | return print(reverse(sort num (values(%array)))); | |
5437 | ||
5438 | .fi | |
5439 | When in doubt, parenthesize. | |
5440 | At the very least it will let some poor schmuck bounce on the % key in vi. | |
5441 | .Sp | |
5442 | Even if you aren't in doubt, consider the mental welfare of the person who | |
5443 | has to maintain the code after you, and who will probably put parens in | |
5444 | the wrong place. | |
5445 | .Ip 2. 4 4 | |
5446 | Don't go through silly contortions to exit a loop at the top or the | |
5447 | bottom, when | |
5448 | .I perl | |
5449 | provides the "last" operator so you can exit in the middle. | |
5450 | Just outdent it a little to make it more visible: | |
5451 | .nf | |
5452 | ||
5453 | .ne 7 | |
5454 | line: | |
5455 | for (;;) { | |
5456 | statements; | |
5457 | last line if $foo; | |
5458 | next line if /^#/; | |
5459 | statements; | |
5460 | } | |
5461 | ||
5462 | .fi | |
5463 | .Ip 3. 4 4 | |
5464 | Don't be afraid to use loop labels\*(--they're there to enhance readability as | |
5465 | well as to allow multi-level loop breaks. | |
5466 | See the previous example. | |
5467 | .Ip 4. 4 4 | |
5468 | For portability, when using features that may not be implemented on every | |
5469 | machine, test the construct in an eval to see if it fails. | |
5470 | If you know the version or patchlevel in which a particular feature was implemented, | |
5471 | you can test $] to see if it will be there. | |
5472 | .Ip 5. 4 4 | |
5473 | Choose mnemonic identifiers. | |
5474 | .Ip 6. 4 4 | |
5475 | Be consistent. | |
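 | .PP | |
 | Guideline 4 might look like the following minimal sketch, which probes for | |
 | symlink() inside an eval rather than assuming it is implemented. | |
 | The variable name $have_symlink is hypothetical. | |
 | .nf | |
 | ||
 | .ne 4 | |
 |     # Where symlink() is not implemented, the call is a fatal error, | |
 |     # which the eval traps and reports in $@. | |
 |     eval \'symlink("","");\'; | |
 |     $have_symlink = ($@ eq \'\'); | |
 |     print "symlinks are ", ($have_symlink ? "" : "not "), "available\en"; | |
 | ||
 | .fi | |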
5476 | .Sh "Debugging" | |
5477 | If you invoke | |
5478 | .I perl | |
5479 | with a | |
5480 | .B \-d | |
5481 | switch, your script will be run under a debugging monitor. | |
5482 | It will halt before the first executable statement and ask you for a | |
5483 | command, such as: | |
5484 | .Ip "h" 12 4 | |
5485 | Prints out a help message. | |
5486 | .Ip "T" 12 4 | |
5487 | Stack trace. | |
5488 | .Ip "s" 12 4 | |
5489 | Single step. | |
5490 | Executes until it reaches the beginning of another statement. | |
5491 | .Ip "n" 12 4 | |
5492 | Next. | |
5493 | Executes over subroutine calls, until it reaches the beginning of the | |
5494 | next statement. | |
5495 | .Ip "f" 12 4 | |
5496 | Finish. | |
5497 | Executes statements until it has finished the current subroutine. | |
5498 | .Ip "c" 12 4 | |
5499 | Continue. | |
5500 | Executes until the next breakpoint is reached. | |
5501 | .Ip "c line" 12 4 | |
5502 | Continue to the specified line. | |
5503 | Inserts a one-time-only breakpoint at the specified line. | |
5504 | .Ip "<CR>" 12 4 | |
5505 | Repeat last n or s. | |
5506 | .Ip "l min+incr" 12 4 | |
5507 | List incr+1 lines starting at min. | |
5508 | If min is omitted, starts where last listing left off. | |
5509 | If incr is omitted, previous value of incr is used. | |
5510 | .Ip "l min-max" 12 4 | |
5511 | List lines in the indicated range. | |
5512 | .Ip "l line" 12 4 | |
5513 | List just the indicated line. | |
5514 | .Ip "l" 12 4 | |
5515 | List next window. | |
5516 | .Ip "-" 12 4 | |
5517 | List previous window. | |
5518 | .Ip "w line" 12 4 | |
5519 | List window around line. | |
5520 | .Ip "l subname" 12 4 | |
5521 | List subroutine. | |
5522 | If it's a long subroutine it just lists the beginning. | |
5523 | Use \*(L"l\*(R" to list more. | |
5524 | .Ip "/pattern/" 12 4 | |
5525 | Regular expression search forward for pattern; the final / is optional. | |
5526 | .Ip "?pattern?" 12 4 | |
5527 | Regular expression search backward for pattern; the final ? is optional. | |
5528 | .Ip "L" 12 4 | |
5529 | List lines that have breakpoints or actions. | |
5530 | .Ip "S" 12 4 | |
5531 | Lists the names of all subroutines. | |
5532 | .Ip "t" 12 4 | |
5533 | Toggle trace mode on or off. | |
5534 | .Ip "b line condition" 12 4 | |
5535 | Set a breakpoint. | |
5536 | If line is omitted, sets a breakpoint on the | |
5537 | line that is about to be executed. | |
5538 | If a condition is specified, it is evaluated each time the statement is | |
5539 | reached and a breakpoint is taken only if the condition is true. | |
5540 | Breakpoints may only be set on lines that begin an executable statement. | |
5541 | .Ip "b subname condition" 12 4 | |
5542 | Set breakpoint at first executable line of subroutine. | |
5543 | .Ip "d line" 12 4 | |
5544 | Delete breakpoint. | |
5545 | If line is omitted, deletes the breakpoint on the | |
5546 | line that is about to be executed. | |
5547 | .Ip "D" 12 4 | |
5548 | Delete all breakpoints. | |
5549 | .Ip "a line command" 12 4 | |
5550 | Set an action for line. | |
5551 | A multi-line command may be entered by backslashing the newlines. | |
5552 | .Ip "A" 12 4 | |
5553 | Delete all line actions. | |
5554 | .Ip "< command" 12 4 | |
5555 | Set an action to happen before every debugger prompt. | |
5556 | A multi-line command may be entered by backslashing the newlines. | |
5557 | .Ip "> command" 12 4 | |
5558 | Set an action to happen after the prompt when you've just given a command | |
5559 | to return to executing the script. | |
5560 | A multi-line command may be entered by backslashing the newlines. | |
5561 | .Ip "V package" 12 4 | |
5562 | List all variables in package. | |
5563 | Default is main package. | |
5564 | .Ip "! number" 12 4 | |
5565 | Redo a debugging command. | |
5566 | If number is omitted, redoes the previous command. | |
5567 | .Ip "! -number" 12 4 | |
5568 | Redo the command that was that many commands ago. | |
5569 | .Ip "H -number" 12 4 | |
5570 | Display last n commands. | |
5571 | Only commands longer than one character are listed. | |
5572 | If number is omitted, lists them all. | |
5573 | .Ip "q or ^D" 12 4 | |
5574 | Quit. | |
5575 | .Ip "command" 12 4 | |
5576 | Execute command as a perl statement. | |
5577 | A missing semicolon will be supplied. | |
5578 | .Ip "p expr" 12 4 | |
5579 | Same as \*(L"print DB'OUT expr\*(R". | |
5580 | The DB'OUT filehandle is opened to /dev/tty, regardless of where STDOUT | |
5581 | may be redirected to. | |
5582 | .PP | |
5583 | If you want to modify the debugger, copy perldb.pl from the perl library | |
5584 | to your current directory and modify it as necessary. | |
5585 | (You'll also have to put -I. on your command line.) | |
5586 | You can do some customization by setting up a .perldb file which contains | |
5587 | initialization code. | |
5588 | For instance, you could make aliases like these: | |
5589 | .nf | |
5590 | ||
5591 | $DB'alias{'len'} = 's/^len(.*)/p length($1)/'; | |
5592 | $DB'alias{'stop'} = 's/^stop (at|in)/b/'; | |
5593 | $DB'alias{'.'} = | |
5594 | 's/^\e./p "\e$DB\e'sub(\e$DB\e'line):\et",\e$DB\e'line[\e$DB\e'line]/'; | |
5595 | ||
5596 | .fi | |
5597 | .Sh "Setuid Scripts" | |
5598 | .I Perl | |
5599 | is designed to make it easy to write secure setuid and setgid scripts. | |
5600 | Unlike shells, which are based on multiple substitution passes on each line | |
5601 | of the script, | |
5602 | .I perl | |
5603 | uses a more conventional evaluation scheme with fewer hidden \*(L"gotchas\*(R". | |
5604 | Additionally, since the language has more built-in functionality, it | |
5605 | has to rely less upon external (and possibly untrustworthy) programs to | |
5606 | accomplish its purposes. | |
5607 | .PP | |
5608 | In an unpatched 4.2 or 4.3bsd kernel, setuid scripts are intrinsically | |
5609 | insecure, but this kernel feature can be disabled. | |
5610 | If it is, | |
5611 | .I perl | |
5612 | can emulate the setuid and setgid mechanism when it notices the otherwise | |
5613 | useless setuid/gid bits on perl scripts. | |
5614 | If the kernel feature isn't disabled, | |
5615 | .I perl | |
5616 | will complain loudly that your setuid script is insecure. | |
5617 | You'll need to either disable the kernel setuid script feature, or put | |
5618 | a C wrapper around the script. | |
5619 | .PP | |
5620 | When perl is executing a setuid script, it takes special precautions to | |
5621 | prevent you from falling into any obvious traps. | |
5622 | (In some ways, a perl script is more secure than the corresponding | |
5623 | C program.) | |
5624 | Any command line argument, environment variable, or input is marked as | |
5625 | \*(L"tainted\*(R", and may not be used, directly or indirectly, in any | |
5626 | command that invokes a subshell, or in any command that modifies files, | |
5627 | directories or processes. | |
5628 | Any variable that is set within an expression that has previously referenced | |
5629 | a tainted value also becomes tainted (even if it is logically impossible | |
5630 | for the tainted value to influence the variable). | |
5631 | For example: | |
5632 | .nf | |
5633 | ||
5634 | .ne 5 | |
5635 | $foo = shift; # $foo is tainted | |
5636 | $bar = $foo,\'bar\'; # $bar is also tainted | |
5637 | $xxx = <>; # Tainted | |
5638 | $path = $ENV{\'PATH\'}; # Tainted, but see below | |
5639 | $abc = \'abc\'; # Not tainted | |
5640 | ||
5641 | .ne 4 | |
5642 | system "echo $foo"; # Insecure | |
5643 | system "/bin/echo", $foo; # Secure (doesn't use sh) | |
5644 | system "echo $bar"; # Insecure | |
5645 | system "echo $abc"; # Insecure until PATH set | |
5646 | ||
5647 | .ne 5 | |
5648 | $ENV{\'PATH\'} = \'/bin:/usr/bin\'; | |
5649 | $ENV{\'IFS\'} = \'\' if $ENV{\'IFS\'} ne \'\'; | |
5650 | ||
5651 | $path = $ENV{\'PATH\'}; # Not tainted | |
5652 | system "echo $abc"; # Is secure now! | |
5653 | ||
5654 | .ne 5 | |
5655 | open(FOO,"$foo"); # OK | |
5656 | open(FOO,">$foo"); # Not OK | |
5657 | ||
5658 | open(FOO,"echo $foo|"); # Not OK, but... | |
5659 | open(FOO,"-|") || exec \'echo\', $foo; # OK | |
5660 | ||
5661 | $zzz = `echo $foo`; # Insecure, zzz tainted | |
5662 | ||
5663 | unlink $abc,$foo; # Insecure | |
5664 | umask $foo; # Insecure | |
5665 | ||
5666 | .ne 3 | |
5667 | exec "echo $foo"; # Insecure | |
5668 | exec "echo", $foo; # Secure (doesn't use sh) | |
5669 | exec "sh", \'-c\', $foo; # Considered secure, alas | |
5670 | ||
5671 | .fi | |
5672 | The taintedness is associated with each scalar value, so some elements | |
5673 | of an array can be tainted, and others not. | |
5674 | .PP | |
5675 | If you try to do something insecure, you will get a fatal error saying | |
5676 | something like \*(L"Insecure dependency\*(R" or \*(L"Insecure PATH\*(R". | |
5677 | Note that you can still write an insecure system call or exec, | |
5678 | but only by explicitly doing something like the last example above. | |
5679 | You can also bypass the tainting mechanism by referencing | |
5680 | subpatterns\*(--\c | |
5681 | .I perl | |
5682 | presumes that if you reference a substring using $1, $2, etc., you knew | |
5683 | what you were doing when you wrote the pattern: | |
5684 | .nf | |
5685 | ||
5686 | $ARGV[0] =~ /^\-P(\ew+)$/; | |
5687 | $printer = $1; # Not tainted | |
5688 | ||
5689 | .fi | |
5690 | This is fairly secure since \ew+ doesn't match shell metacharacters. | |
5691 | Use of .+ would have been insecure, but | |
5692 | .I perl | |
5693 | doesn't check for that, so you must be careful with your patterns. | |
5694 | This is the ONLY mechanism for untainting user supplied filenames if you | |
5695 | want to do file operations on them (unless you make $> equal to $<). | |
5696 | .PP | |
5697 | It's also possible to get into trouble with other operations that don't care | |
5698 | whether they use tainted values. | |
5699 | Make judicious use of the file tests in dealing with any user-supplied | |
5700 | filenames. | |
5701 | When possible, do opens and such after setting $> = $<. | |
5702 | .I Perl | |
5703 | doesn't prevent you from opening tainted filenames for reading, so be | |
5704 | careful what you print out. | |
5705 | The tainting mechanism is intended to prevent stupid mistakes, not to remove | |
5706 | the need for thought. | |
5707 | .SH ENVIRONMENT | |
5708 | .Ip HOME 12 4 | |
5709 | Used if chdir has no argument. | |
5710 | .Ip LOGDIR 12 4 | |
5711 | Used if chdir has no argument and HOME is not set. | |
5712 | .Ip PATH 12 4 | |
5713 | Used in executing subprocesses, and in finding the script if \-S | |
5714 | is used. | |
5715 | .Ip PERLLIB 12 4 | |
5716 | A colon-separated list of directories in which to look for Perl library | |
5717 | files before looking in the standard library and the current directory. | |
5718 | .Ip PERLDB 12 4 | |
5719 | The command used to get the debugger code. If unset, uses | |
5720 | .br | |
5721 | ||
5722 | require 'perldb.pl' | |
5723 | ||
5724 | .PP | |
5725 | Apart from these, | |
5726 | .I perl | |
5727 | uses no other environment variables, except to make them available | |
5728 | to the script being executed, and to child processes. | |
5729 | However, scripts running setuid would do well to execute the following lines | |
5730 | before doing anything else, just to keep people honest: | |
5731 | .nf | |
5732 | ||
5733 | .ne 3 | |
5734 | $ENV{\'PATH\'} = \'/bin:/usr/bin\'; # or whatever you need | |
5735 | $ENV{\'SHELL\'} = \'/bin/sh\' if $ENV{\'SHELL\'} ne \'\'; | |
5736 | $ENV{\'IFS\'} = \'\' if $ENV{\'IFS\'} ne \'\'; | |
5737 | ||
5738 | .fi | |
5739 | .SH AUTHOR | |
5740 | Larry Wall <lwall@netlabs.com> | |
5741 | .br | |
5742 | MS-DOS port by Diomidis Spinellis <dds@cc.ic.ac.uk> | |
5743 | .SH FILES | |
5744 | /tmp/perl\-eXXXXXX temporary file for | |
5745 | .B \-e | |
5746 | commands. | |
5747 | .SH SEE ALSO | |
5748 | a2p awk to perl translator | |
5749 | .br | |
5750 | s2p sed to perl translator | |
5751 | .SH DIAGNOSTICS | |
5752 | Compilation errors will tell you the line number of the error, with an | |
5753 | indication of the next token or token type that was to be examined. | |
5754 | (In the case of a script passed to | |
5755 | .I perl | |
5756 | via | |
5757 | .B \-e | |
5758 | switches, each | |
5759 | .B \-e | |
5760 | is counted as one line.) | |
5761 | .PP | |
5762 | Setuid scripts have additional constraints that can produce error messages | |
5763 | such as \*(L"Insecure dependency\*(R". | |
5764 | See the section on setuid scripts. | |
5765 | .SH TRAPS | |
5766 | Accustomed | |
5767 | .IR awk | |
5768 | users should take special note of the following: | |
5769 | .Ip * 4 2 | |
5770 | Semicolons are required after all simple statements in | |
5771 | .I perl | |
5772 | (except at the end of a block). | |
5773 | Newline is not a statement delimiter. | |
5774 | .Ip * 4 2 | |
5775 | Curly brackets are required on ifs and whiles. | |
5776 | .Ip * 4 2 | |
5777 | Variables begin with $ or @ in | |
5778 | .IR perl . | |
5779 | .Ip * 4 2 | |
5780 | Arrays index from 0 unless you set $[. | |
5781 | Likewise string positions in substr() and index(). | |
5782 | .Ip * 4 2 | |
5783 | You have to decide whether your array has numeric or string indices. | |
5784 | .Ip * 4 2 | |
5785 | Associative array values do not spring into existence upon mere reference. | |
5786 | .Ip * 4 2 | |
5787 | You have to decide whether you want to use string or numeric comparisons. | |
5788 | .Ip * 4 2 | |
5789 | Reading an input line does not split it for you. You get to split it yourself | |
5790 | to an array. | |
5791 | And the | |
5792 | .I split | |
5793 | operator has different arguments. | |
5794 | .Ip * 4 2 | |
5795 | The current input line is normally in $_, not $0. | |
5796 | It generally does not have the newline stripped. | |
5797 | ($0 is the name of the program executed.) | |
5798 | .Ip * 4 2 | |
5799 | $<digit> does not refer to fields\*(--it refers to substrings matched by the last | |
5800 | match pattern. | |
5801 | .Ip * 4 2 | |
5802 | The | |
5803 | .I print | |
5804 | statement does not add field and record separators unless you set | |
5805 | $, and $\e. | |
5806 | .Ip * 4 2 | |
5807 | You must open your files before you print to them. | |
5808 | .Ip * 4 2 | |
5809 | The range operator is \*(L".\|.\*(R", not comma. | |
5810 | (The comma operator works as in C.) | |
5811 | .Ip * 4 2 | |
5812 | The match operator is \*(L"=~\*(R", not \*(L"~\*(R". | |
5813 | (\*(L"~\*(R" is the one's complement operator, as in C.) | |
5814 | .Ip * 4 2 | |
5815 | The exponentiation operator is \*(L"**\*(R", not \*(L"^\*(R". | |
5816 | (\*(L"^\*(R" is the XOR operator, as in C.) | |
5817 | .Ip * 4 2 | |
5818 | The concatenation operator is \*(L".\*(R", not the null string. | |
5819 | (Using the null string would render \*(L"/pat/ /pat/\*(R" unparsable, | |
5820 | since the third slash would be interpreted as a division operator\*(--the | |
5821 | tokener is in fact slightly context sensitive for operators like /, ?, and <. | |
5822 | And in fact, . itself can be the beginning of a number.) | |
5823 | .Ip * 4 2 | |
5824 | .IR Next , | |
5825 | .I exit | |
5826 | and | |
5827 | .I continue | |
5828 | work differently. | |
5829 | .Ip * 4 2 | |
5830 | The following variables work differently: | |
5831 | .nf | |
5832 | ||
5833 | Awk \h'|2.5i'Perl | |
5834 | ARGC \h'|2.5i'$#ARGV | |
5835 | ARGV[0] \h'|2.5i'$0 | |
5836 | FILENAME\h'|2.5i'$ARGV | |
5837 | FNR \h'|2.5i'$. \- something | |
5838 | FS \h'|2.5i'(whatever you like) | |
5839 | NF \h'|2.5i'$#Fld, or some such | |
5840 | NR \h'|2.5i'$. | |
5841 | OFMT \h'|2.5i'$# | |
5842 | OFS \h'|2.5i'$, | |
5843 | ORS \h'|2.5i'$\e | |
5844 | RLENGTH \h'|2.5i'length($&) | |
5845 | RS \h'|2.5i'$/ | |
5846 | RSTART \h'|2.5i'length($\`) | |
5847 | SUBSEP \h'|2.5i'$; | |
5848 | ||
5849 | .fi | |
5850 | .Ip * 4 2 | |
5851 | When in doubt, run the | |
5852 | .I awk | |
5853 | construct through a2p and see what it gives you. | |
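 | .PP | |
 | Several of the awk traps above show up together in this minimal sketch, | |
 | which prints the second field of each input line. | |
 | The array name @Fld is an arbitrary choice for illustration. | |
 | .nf | |
 | ||
 | .ne 6 | |
 |     while (<>) {                  # each line arrives in $_, not $0 | |
 |         chop;                     # remove the trailing newline yourself | |
 |         @Fld = split(\' \');        # split $_ on whitespace, as awk would | |
 |         print $Fld[1], "\en";      # indices start at 0; no ORS is added | |
 |     } | |
 | ||
 | .fi | |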
5854 | .PP | |
5855 | Cerebral C programmers should take note of the following: | |
5856 | .Ip * 4 2 | |
5857 | Curly brackets are required on ifs and whiles. | |
5858 | .Ip * 4 2 | |
5859 | You should use \*(L"elsif\*(R" rather than \*(L"else if\*(R". | |
5860 | .Ip * 4 2 | |
5861 | .I Break | |
5862 | and | |
5863 | .I continue | |
5864 | become | |
5865 | .I last | |
5866 | and | |
5867 | .IR next , | |
5868 | respectively. | |
5869 | .Ip * 4 2 | |
5870 | There's no switch statement. | |
5871 | .Ip * 4 2 | |
5872 | Variables begin with $ or @ in | |
5873 | .IR perl . | |
5874 | .Ip * 4 2 | |
5875 | Printf does not implement *. | |
5876 | .Ip * 4 2 | |
5877 | Comments begin with #, not /*. | |
5878 | .Ip * 4 2 | |
5879 | You can't take the address of anything. | |
5880 | .Ip * 4 2 | |
5881 | ARGV must be capitalized. | |
5882 | .Ip * 4 2 | |
5883 | The \*(L"system\*(R" calls link, unlink, rename, etc. return nonzero for success, not 0. | |
5884 | .Ip * 4 2 | |
5885 | Signal handlers deal with signal names, not numbers. | |
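 | .PP | |
 | A minimal sketch of how a few of these C habits translate. | |
 | The loop bounds and messages are made up for illustration. | |
 | .nf | |
 | ||
 | .ne 8 | |
 |     foreach $n (1 .. 10) { | |
 |         next if $n % 2;           # where C would say continue | |
 |         last if $n > 6;           # where C would say break | |
 |         if ($n == 2) {            # curly brackets are required | |
 |             print "two\en"; | |
 |         } | |
 |         elsif ($n == 4) {         # elsif, not else if | |
 |             print "four\en"; | |
 |         } | |
 |         else { | |
 |             print "$n\en"; | |
 |         } | |
 |     } | |
 | ||
 | .fi | |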
5886 | .PP | |
5887 | Seasoned | |
5888 | .I sed | |
5889 | programmers should take note of the following: | |
5890 | .Ip * 4 2 | |
5891 | Backreferences in substitutions use $ rather than \e. | |
5892 | .Ip * 4 2 | |
5893 | The pattern matching metacharacters (, ), and | do not have backslashes in front. | |
5894 | .Ip * 4 2 | |
5895 | The range operator is .\|. rather than comma. | |
5896 | .PP | |
5897 | Sharp shell programmers should take note of the following: | |
5898 | .Ip * 4 2 | |
5899 | The backtick operator does variable interpretation without regard to the | |
5900 | presence of single quotes in the command. | |
5901 | .Ip * 4 2 | |
5902 | The backtick operator does no translation of the return value, unlike csh. | |
5903 | .Ip * 4 2 | |
5904 | Shells (especially csh) do several levels of substitution on each command line. | |
5905 | .I Perl | |
5906 | does substitution only in certain constructs such as double quotes, | |
5907 | backticks, angle brackets and search patterns. | |
5908 | .Ip * 4 2 | |
5909 | Shells interpret scripts a little bit at a time. | |
5910 | .I Perl | |
5911 | compiles the whole program before executing it. | |
5912 | .Ip * 4 2 | |
5913 | The arguments are available via @ARGV, not $1, $2, etc. | |
5914 | .Ip * 4 2 | |
5915 | The environment is not automatically made available as variables. | |
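 | .PP | |
 | The last two points can be sketched as follows. | |
 | The variable names $first and $home are hypothetical. | |
 | .nf | |
 | ||
 | .ne 3 | |
 |     $first = $ARGV[0];            # what a shell script would call $1 | |
 |     $home = $ENV{\'HOME\'};         # $HOME is not set up for you | |
 |     print "first argument: $first, home: $home\en"; | |
 | ||
 | .fi | |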
5916 | .SH ERRATA\0AND\0ADDENDA | |
5917 | The Perl book, | |
5918 | .I Programming\0Perl , | |
5919 | has the following omissions and goofs. | |
5920 | .PP | |
5921 | On page 5, the examples which read | |
5922 | .nf | |
5923 | ||
5924 | eval "/usr/bin/perl | |
5925 | ||
5926 | should read | |
5927 | ||
5928 | eval "exec /usr/bin/perl | |
5929 | ||
5930 | .fi | |
5931 | .PP | |
5932 | On page 195, the equivalent to the System V sum program only works for | |
5933 | very small files. To do larger files, use | |
5934 | .nf | |
5935 | ||
5936 | undef $/; | |
5937 | $checksum = unpack("%32C*",<>) % 32767; | |
5938 | ||
5939 | .fi | |
5940 | .PP | |
5941 | The descriptions of alarm and sleep refer to signal SIGALARM. These | |
5942 | should refer to SIGALRM. | |
5943 | .PP | |
5944 | The | |
5945 | .B \-0 | |
5946 | switch to set the initial value of $/ was added to Perl after the book | |
5947 | went to press. | |
5948 | .PP | |
5949 | The | |
5950 | .B \-l | |
5951 | switch now does automatic line ending processing. | |
5952 | .PP | |
5953 | The qx// construct is now a synonym for backticks. | |
5954 | .PP | |
5955 | $0 may now be assigned to set the argument displayed by | |
5956 | .I ps (1). | |
5957 | .PP | |
5958 | The new @###.## format was omitted accidentally from the description | |
5959 | on formats. | |
5960 | .PP | |
5961 | It wasn't known at press time that s///ee caused multiple evaluations of | |
5962 | the replacement expression. This is to be construed as a feature. | |
5963 | .PP | |
5964 | (LIST) x $count now does array replication. | |
5965 | .PP | |
5966 | There is now no limit on the number of parentheses in a regular expression. | |
5967 | .PP | |
5968 | In double-quote context, more escapes are supported: \ee, \ea, \ex1b, \ec[, | |
5969 | \el, \eL, \eu, \eU, \eE. The latter five control upper/lower case translation. | |
5970 | .PP | |
5971 | The | |
5972 | .B $/ | |
5973 | variable may now be set to a multi-character delimiter. | |
5974 | .PP | |
5975 | There is now a g modifier on ordinary pattern matching that causes it | |
5976 | to iterate through a string finding multiple matches. | |
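 | .PP | |
 | A minimal sketch of the g modifier driving a loop. | |
 | The sample string and the names $str and @nums are made up for illustration. | |
 | .nf | |
 | ||
 | .ne 5 | |
 |     $str = "a1b22c333"; | |
 |     while ($str =~ /(\ed+)/g) {    # each pass resumes after the previous match | |
 |         push(@nums, $1);          # collect every run of digits | |
 |     } | |
 |     print join(",", @nums), "\en"; | |
 | ||
 | .fi | |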
5977 | .PP | |
5978 | All of the $^X variables are new except for $^T. | |
5979 | .PP | |
5980 | The default top-of-form format for FILEHANDLE is now FILEHANDLE_TOP rather | |
5981 | than top. | |
5982 | .PP | |
5983 | The eval {} and sort {} constructs were added in version 4.018. | |
5984 | .PP | |
5985 | The v and V (little-endian) template options for pack and unpack were | |
5986 | added in 4.019. | |
5987 | .SH BUGS | |
5988 | .PP | |
5989 | .I Perl | |
5990 | is at the mercy of your machine's definitions of various operations | |
5991 | such as type casting, atof() and sprintf(). | |
5992 | .PP | |
5993 | If your stdio requires a seek or eof between reads and writes on a particular | |
5994 | stream, so does | |
5995 | .IR perl . | |
5996 | (This doesn't apply to sysread() and syswrite().) | |
5997 | .PP | |
5998 | While none of the built-in data types have any arbitrary size limits (apart | |
5999 | from memory size), there are still a few arbitrary limits: | |
6000 | a given identifier may not be longer than 255 characters, | |
6001 | and no component of your PATH may be longer than 255 if you use \-S. | |
6002 | A regular expression may not compile to more than 32767 bytes internally. | |
6003 | .PP | |
6004 | .I Perl | |
6005 | actually stands for Pathologically Eclectic Rubbish Lister, but don't tell | |
6006 | anyone I said that. | |
6007 | .rn }` '' |