1.rn '' }`
2''' $RCSfile: perl.man,v $$Revision: 4.0.1.6 $$Date: 92/06/08 15:07:29 $
3'''
4''' $Log: perl.man,v $
5''' Revision 4.0.1.6 92/06/08 15:07:29 lwall
6''' patch20: documented that numbers may contain underline
7''' patch20: clarified that DATA may only be read from main script
8''' patch20: relaxed requirement for semicolon at the end of a block
9''' patch20: added ... as variant on ..
10''' patch20: documented need for 1; at the end of a required file
11''' patch20: extended bracket-style quotes to two-arg operators: s()() and tr()()
12''' patch20: paragraph mode now skips extra newlines automatically
13''' patch20: documented PERLLIB and PERLDB
14''' patch20: documented limit on size of regexp
15'''
16''' Revision 4.0.1.5 91/11/11 16:42:00 lwall
17''' patch19: added little-endian pack/unpack options
18'''
19''' Revision 4.0.1.4 91/11/05 18:11:05 lwall
20''' patch11: added sort {} LIST
21''' patch11: added eval {}
22''' patch11: documented meaning of scalar(%foo)
23''' patch11: sprintf() now supports any length of s field
24'''
25''' Revision 4.0.1.3 91/06/10 01:26:02 lwall
26''' patch10: documented some newer features in addenda
27'''
28''' Revision 4.0.1.2 91/06/07 11:41:23 lwall
29''' patch4: added global modifier for pattern matches
30''' patch4: default top-of-form format is now FILEHANDLE_TOP
31''' patch4: added $^P variable to control calling of perldb routines
32''' patch4: added $^F variable to specify maximum system fd, default 2
33''' patch4: changed old $^P to $^X
34'''
35''' Revision 4.0.1.1 91/04/11 17:50:44 lwall
36''' patch1: fixed some typos
37'''
38''' Revision 4.0 91/03/20 01:38:08 lwall
39''' 4.0 baseline.
40'''
41'''
42.de Sh
43.br
44.ne 5
45.PP
46\fB\\$1\fR
47.PP
48..
49.de Sp
50.if t .sp .5v
51.if n .sp
52..
53.de Ip
54.br
55.ie \\n(.$>=3 .ne \\$3
56.el .ne 3
57.IP "\\$1" \\$2
58..
59'''
60''' Set up \*(-- to give an unbreakable dash;
61''' string Tr holds user defined translation string.
62''' Bell System Logo is used as a dummy character.
63'''
64.tr \(*W-|\(bv\*(Tr
65.ie n \{\
66.ds -- \(*W-
67.if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
68.if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch
69.ds L" ""
70.ds R" ""
71.ds L' '
72.ds R' '
73'br\}
74.el\{\
75.ds -- \(em\|
76.tr \*(Tr
77.ds L" ``
78.ds R" ''
79.ds L' `
80.ds R' '
81'br\}
82.TH PERL 1 "\*(RP"
83.UC
84.SH NAME
85perl \- Practical Extraction and Report Language
86.SH SYNOPSIS
87.B perl
88[options] filename args
89.SH DESCRIPTION
90.I Perl
91is an interpreted language optimized for scanning arbitrary text files,
92extracting information from those text files, and printing reports based
93on that information.
94It's also a good language for many system management tasks.
95The language is intended to be practical (easy to use, efficient, complete)
96rather than beautiful (tiny, elegant, minimal).
97It combines (in the author's opinion, anyway) some of the best features of C,
98\fIsed\fR, \fIawk\fR, and \fIsh\fR,
99so people familiar with those languages should have little difficulty with it.
100(Language historians will also note some vestiges of \fIcsh\fR, Pascal, and
101even BASIC-PLUS.)
102Expression syntax corresponds quite closely to C expression syntax.
103Unlike most Unix utilities,
104.I perl
105does not arbitrarily limit the size of your data\*(--if you've got
106the memory,
107.I perl
108can slurp in your whole file as a single string.
109Recursion is of unlimited depth.
110And the hash tables used by associative arrays grow as necessary to prevent
111degraded performance.
112.I Perl
113uses sophisticated pattern matching techniques to scan large amounts of
114data very quickly.
115Although optimized for scanning text,
116.I perl
117can also deal with binary data, and can make dbm files look like associative
118arrays (where dbm is available).
119Setuid
120.I perl
121scripts are safer than C programs
122through a dataflow tracing mechanism which prevents many stupid security holes.
123If you have a problem that would ordinarily use \fIsed\fR
124or \fIawk\fR or \fIsh\fR, but it
125exceeds their capabilities or must run a little faster,
126and you don't want to write the silly thing in C, then
127.I perl
128may be for you.
129There are also translators to turn your
130.I sed
131and
132.I awk
133scripts into
134.I perl
135scripts.
136OK, enough hype.
137.PP
138Upon startup,
139.I perl
140looks for your script in one of the following places:
141.Ip 1. 4 2
142Specified line by line via
143.B \-e
144switches on the command line.
145.Ip 2. 4 2
146Contained in the file specified by the first filename on the command line.
147(Note that systems supporting the #! notation invoke interpreters this way.)
148.Ip 3. 4 2
149Passed in implicitly via standard input.
150This only works if there are no filename arguments\*(--to pass
151arguments to a
152.I stdin
153script you must explicitly specify a \- for the script name.
154.PP
155After locating your script,
156.I perl
157compiles it to an internal form.
158If the script is syntactically correct, it is executed.
159.Sh "Options"
160Note: on first reading this section may not make much sense to you. It's here
161at the front for easy reference.
162.PP
163A single-character option may be combined with the following option, if any.
164This is particularly useful when invoking a script using the #! construct which
165only allows one argument. Example:
166.nf
167
168.ne 2
169 #!/usr/bin/perl \-spi.bak # same as \-s \-p \-i.bak
170 .\|.\|.
171
172.fi
173Options include:
174.TP 5
175.BI \-0 digits
176specifies the record separator ($/) as an octal number.
177If there are no digits, the null character is the separator.
178Other switches may precede or follow the digits.
179For example, if you have a version of
180.I find
181which can print filenames terminated by the null character, you can say this:
182.nf
183
184 find . \-name '*.bak' \-print0 | perl \-n0e unlink
185
186.fi
187The special value 00 will cause Perl to slurp files in paragraph mode.
188The value 0777 will cause Perl to slurp files whole since there is no
189legal character with that value.
190.TP 5
191.B \-a
192turns on autosplit mode when used with a
193.B \-n
194or
195.BR \-p .
196An implicit split command to the @F array
197is done as the first thing inside the implicit while loop produced by
198the
199.B \-n
200or
201.BR \-p .
202.nf
203
204 perl \-ane \'print pop(@F), "\en";\'
205
206is equivalent to
207
    while (<>) {
        @F = split(\' \');
        print pop(@F), "\en";
    }
212
213.fi
214.TP 5
215.B \-c
216causes
217.I perl
218to check the syntax of the script and then exit without executing it.
219.TP 5
220.BI \-d
221runs the script under the perl debugger.
222See the section on Debugging.
223.TP 5
224.BI \-D number
225sets debugging flags.
226To watch how it executes your script, use
227.BR \-D14 .
228(This only works if debugging is compiled into your
229.IR perl .)
230Another nice value is \-D1024, which lists your compiled syntax tree.
231And \-D512 displays compiled regular expressions.
232.TP 5
233.BI \-e " commandline"
234may be used to enter one line of script.
235Multiple
236.B \-e
237commands may be given to build up a multi-line script.
238If
239.B \-e
240is given,
241.I perl
242will not look for a script filename in the argument list.
243.TP 5
244.BI \-i extension
245specifies that files processed by the <> construct are to be edited
246in-place.
247It does this by renaming the input file, opening the output file by the
248same name, and selecting that output file as the default for print statements.
249The extension, if supplied, is added to the name of the
250old file to make a backup copy.
251If no extension is supplied, no backup is made.
252Saying \*(L"perl \-p \-i.bak \-e "s/foo/bar/;" .\|.\|. \*(R" is the same as using
253the script:
254.nf
255
256.ne 2
257 #!/usr/bin/perl \-pi.bak
258 s/foo/bar/;
259
260which is equivalent to
261
262.ne 14
    #!/usr/bin/perl
    while (<>) {
        if ($ARGV ne $oldargv) {
            rename($ARGV, $ARGV . \'.bak\');
            open(ARGVOUT, ">$ARGV");
            select(ARGVOUT);
            $oldargv = $ARGV;
        }
        s/foo/bar/;
    }
    continue {
        print;    # this prints to original filename
    }
    select(STDOUT);
277
278.fi
279except that the
280.B \-i
281form doesn't need to compare $ARGV to $oldargv to know when
282the filename has changed.
283It does, however, use ARGVOUT for the selected filehandle.
284Note that
285.I STDOUT
286is restored as the default output filehandle after the loop.
287.Sp
288You can use eof to locate the end of each input file, in case you want
289to append to each file, or reset line numbering (see example under eof).
290.TP 5
291.BI \-I directory
292may be used in conjunction with
293.B \-P
294to tell the C preprocessor where to look for include files.
295By default /usr/include and /usr/lib/perl are searched.
296.TP 5
297.BI \-l octnum
298enables automatic line-ending processing. It has two effects:
299first, it automatically chops the line terminator when used with
300.B \-n
301or
.BR \-p ,
303and second, it assigns $\e to have the value of
304.I octnum
305so that any print statements will have that line terminator added back on. If
306.I octnum
307is omitted, sets $\e to the current value of $/.
308For instance, to trim lines to 80 columns:
309.nf
310
311 perl -lpe \'substr($_, 80) = ""\'
312
313.fi
314Note that the assignment $\e = $/ is done when the switch is processed,
so the input record separator can be different from the output record
316separator if the
317.B \-l
318switch is followed by a
319.B \-0
320switch:
321.nf
322
323 gnufind / -print0 | perl -ln0e 'print "found $_" if -p'
324
325.fi
326This sets $\e to newline and then sets $/ to the null character.
327.TP 5
328.B \-n
329causes
330.I perl
331to assume the following loop around your script, which makes it iterate
332over filename arguments somewhat like \*(L"sed \-n\*(R" or \fIawk\fR:
333.nf
334
335.ne 3
336 while (<>) {
337 .\|.\|. # your script goes here
338 }
339
340.fi
341Note that the lines are not printed by default.
342See
343.B \-p
344to have lines printed.
345Here is an efficient way to delete all files older than a week:
346.nf
347
348 find . \-mtime +7 \-print | perl \-nle \'unlink;\'
349
350.fi
351This is faster than using the \-exec switch of find because you don't have to
352start a process on every filename found.
353.TP 5
354.B \-p
355causes
356.I perl
357to assume the following loop around your script, which makes it iterate
358over filename arguments somewhat like \fIsed\fR:
359.nf
360
361.ne 5
362 while (<>) {
363 .\|.\|. # your script goes here
364 } continue {
365 print;
366 }
367
368.fi
369Note that the lines are printed automatically.
370To suppress printing use the
371.B \-n
372switch.
373A
374.B \-p
375overrides a
376.B \-n
377switch.
378.TP 5
379.B \-P
380causes your script to be run through the C preprocessor before
381compilation by
382.IR perl .
383(Since both comments and cpp directives begin with the # character,
384you should avoid starting comments with any words recognized
385by the C preprocessor such as \*(L"if\*(R", \*(L"else\*(R" or \*(L"define\*(R".)
386.TP 5
387.B \-s
388enables some rudimentary switch parsing for switches on the command line
389after the script name but before any filename arguments (or before a \-\|\-).
390Any switch found there is removed from @ARGV and sets the corresponding variable in the
391.I perl
392script.
393The following script prints \*(L"true\*(R" if and only if the script is
394invoked with a \-xyz switch.
395.nf
396
397.ne 2
398 #!/usr/bin/perl \-s
399 if ($xyz) { print "true\en"; }
400
401.fi
402.TP 5
403.B \-S
404makes
405.I perl
406use the PATH environment variable to search for the script
407(unless the name of the script starts with a slash).
408Typically this is used to emulate #! startup on machines that don't
409support #!, in the following manner:
410.nf
411
412 #!/usr/bin/perl
413 eval "exec /usr/bin/perl \-S $0 $*"
414 if $running_under_some_shell;
415
416.fi
417The system ignores the first line and feeds the script to /bin/sh,
418which proceeds to try to execute the
419.I perl
420script as a shell script.
421The shell executes the second line as a normal shell command, and thus
422starts up the
423.I perl
424interpreter.
425On some systems $0 doesn't always contain the full pathname,
426so the
427.B \-S
428tells
429.I perl
430to search for the script if necessary.
431After
432.I perl
433locates the script, it parses the lines and ignores them because
434the variable $running_under_some_shell is never true.
435A better construct than $* would be ${1+"$@"}, which handles embedded spaces
436and such in the filenames, but doesn't work if the script is being interpreted
437by csh.
438In order to start up sh rather than csh, some systems may have to replace the
439#! line with a line containing just
440a colon, which will be politely ignored by perl.
441Other systems can't control that, and need a totally devious construct that
442will work under any of csh, sh or perl, such as the following:
443.nf
444
445.ne 3
446 eval '(exit $?0)' && eval 'exec /usr/bin/perl -S $0 ${1+"$@"}'
447 & eval 'exec /usr/bin/perl -S $0 $argv:q'
448 if 0;
449
450.fi
451.TP 5
452.B \-u
453causes
454.I perl
455to dump core after compiling your script.
456You can then take this core dump and turn it into an executable file
457by using the undump program (not supplied).
458This speeds startup at the expense of some disk space (which you can
459minimize by stripping the executable).
460(Still, a "hello world" executable comes out to about 200K on my machine.)
461If you are going to run your executable as a set-id program then you
462should probably compile it using taintperl rather than normal perl.
463If you want to execute a portion of your script before dumping, use the
464dump operator instead.
Note: undump is platform specific and may not be available
for a particular port of perl.
467.TP 5
468.B \-U
469allows
470.I perl
471to do unsafe operations.
472Currently the only \*(L"unsafe\*(R" operations are the unlinking of directories while
473running as superuser, and running setuid programs with fatal taint checks
474turned into warnings.
475.TP 5
476.B \-v
477prints the version and patchlevel of your
478.I perl
479executable.
480.TP 5
481.B \-w
482prints warnings about identifiers that are mentioned only once, and scalar
483variables that are used before being set.
484Also warns about redefined subroutines, and references to undefined
485filehandles or filehandles opened readonly that you are attempting to
486write on.
487Also warns you if you use == on values that don't look like numbers, and if
488your subroutines recurse more than 100 deep.
489.TP 5
490.BI \-x directory
491tells
492.I perl
493that the script is embedded in a message.
494Leading garbage will be discarded until the first line that starts
495with #! and contains the string "perl".
496Any meaningful switches on that line will be applied (but only one
497group of switches, as with normal #! processing).
498If a directory name is specified, Perl will switch to that directory
499before running the script.
500The
501.B \-x
switch only controls the disposal of leading garbage.
503The script must be terminated with _\|_END_\|_ if there is trailing garbage
504to be ignored (the script can process any or all of the trailing garbage
505via the DATA filehandle if desired).
506.Sh "Data Types and Objects"
507.PP
508.I Perl
509has three data types: scalars, arrays of scalars, and
510associative arrays of scalars.
511Normal arrays are indexed by number, and associative arrays by string.
512.PP
513The interpretation of operations and values in perl sometimes
514depends on the requirements
515of the context around the operation or value.
516There are three major contexts: string, numeric and array.
517Certain operations return array values
518in contexts wanting an array, and scalar values otherwise.
519(If this is true of an operation it will be mentioned in the documentation
520for that operation.)
521Operations which return scalars don't care whether the context is looking
522for a string or a number, but
523scalar variables and values are interpreted as strings or numbers
524as appropriate to the context.
525A scalar is interpreted as TRUE in the boolean sense if it is not the null
526string or 0.
527Booleans returned by operators are 1 for true and 0 or \'\' (the null
528string) for false.
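.PP
As a minimal sketch of the context rules just described (the variable names
are hypothetical and chosen only for illustration), the following fragment
records what each kind of context yields:
.nf

```perl
# Hypothetical values, used only to illustrate the context rules.
@list = ('a', 'b', 'c');
$count = @list;                 # scalar context: the length of the array, 3
@copy = @list;                  # array context: the elements themselves
$sum = '3' + 4;                 # numeric context: the string '3' is used as 3
$n = 3;
$text = $n . 4;                 # string context: the numbers are joined as '34'
$zero_is_true = 0 ? 1 : 0;      # 0 is false
$null_is_true = '' ? 1 : 0;     # the null string is false
$word_is_true = 'a' ? 1 : 0;    # any other string is true
```

.fi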
529.PP
530There are actually two varieties of null string: defined and undefined.
531Undefined null strings are returned when there is no real value for something,
532such as when there was an error, or at end of file, or when you refer
533to an uninitialized variable or element of an array.
534An undefined null string may become defined the first time you access it, but
535prior to that you can use the defined() operator to determine whether the
536value is defined or not.
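.PP
The difference between the two varieties of null string can be sketched as
follows.
The associative array %age and its keys are hypothetical; only defined()
is taken from the text above:
.nf

```perl
# %age and its keys are hypothetical, used only for illustration.
%age = ('tom', 30, 'ann', '');
$known   = defined($age{'tom'}) ? 1 : 0;    # 1: has a real value
$empty   = defined($age{'ann'}) ? 1 : 0;    # 1: a defined null string
$missing = defined($age{'bob'}) ? 1 : 0;    # 0: no such element
$same    = ($age{'ann'} eq $age{'bob'}) ? 1 : 0;   # 1: both compare equal to ''
```

.fi
Both null strings compare equal as strings, so defined() is the only way
to tell them apart.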
537.PP
538References to scalar variables always begin with \*(L'$\*(R', even when referring
539to a scalar that is part of an array.
540Thus:
541.nf
542
543.ne 3
544 $days \h'|2i'# a simple scalar variable
545 $days[28] \h'|2i'# 29th element of array @days
546 $days{\'Feb\'}\h'|2i'# one value from an associative array
547 $#days \h'|2i'# last index of array @days
548
549but entire arrays or array slices are denoted by \*(L'@\*(R':
550
551 @days \h'|2i'# ($days[0], $days[1],\|.\|.\|. $days[n])
552 @days[3,4,5]\h'|2i'# same as @days[3.\|.5]
553 @days{'a','c'}\h'|2i'# same as ($days{'a'},$days{'c'})
554
555and entire associative arrays are denoted by \*(L'%\*(R':
556
557 %days \h'|2i'# (key1, val1, key2, val2 .\|.\|.)
558.fi
559.PP
560Any of these eight constructs may serve as an lvalue,
561that is, may be assigned to.
562(It also turns out that an assignment is itself an lvalue in
563certain contexts\*(--see examples under s, tr and chop.)
564Assignment to a scalar evaluates the righthand side in a scalar context,
565while assignment to an array or array slice evaluates the righthand side
566in an array context.
567.PP
568You may find the length of array @days by evaluating
569\*(L"$#days\*(R", as in
570.IR csh .
(Actually, it's not the length of the array; it's the subscript of the last
element, since there is (ordinarily) a 0th element.)
572Assigning to $#days changes the length of the array.
573Shortening an array by this method does not actually destroy any values.
574Lengthening an array that was previously shortened recovers the values that
575were in those elements.
576You can also gain some measure of efficiency by preextending an array that
577is going to get big.
578(You can also extend an array by assigning to an element that is off the
579end of the array.
580This differs from assigning to $#whatever in that intervening values
581are set to null rather than recovered.)
582You can truncate an array down to nothing by assigning the null list () to
583it.
The following are exactly equivalent:
585.nf
586
587 @whatever = ();
588 $#whatever = $[ \- 1;
589
590.fi
591.PP
592If you evaluate an array in a scalar context, it returns the length of
593the array.
594The following is always true:
595.nf
596
597 scalar(@whatever) == $#whatever \- $[ + 1;
598
599.fi
600If you evaluate an associative array in a scalar context, it returns
601a value which is true if and only if the array contains any elements.
602(If there are any elements, the value returned is a string consisting
603of the number of used buckets and the number of allocated buckets, separated
604by a slash.)
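.PP
The relationships among $#days, the array length and the truth value of an
associative array might be sketched like this.
The arrays are hypothetical, and $[ is assumed to have its default value of 0:
.nf

```perl
# @days and %seen are hypothetical; $[ is assumed to be 0.
@days = ('Mon', 'Tue', 'Wed');
$last = $#days;            # 2, the subscript of the last element
$length = @days;           # 3, the array evaluated in a scalar context
$days[5] = 'Sat';          # assigning past the end extends the array
$longer = @days;           # 6; elements 3 and 4 are null
@days = ();                # truncate the array to nothing
$emptied = $#days;         # -1
%seen = ();
$any = %seen ? 1 : 0;      # 0: an empty associative array is false
$seen{'Mon'} = 1;
$some = %seen ? 1 : 0;     # 1: it now contains an element
```

.fi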
605.PP
606Multi-dimensional arrays are not directly supported, but see the discussion
607of the $; variable later for a means of emulating multiple subscripts with
608an associative array.
609You could also write a subroutine to turn multiple subscripts into a single
610subscript.
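.PP
As a rough illustration of the $; technique, a two-dimensional grid can be
kept in a single associative array.
The names %grid, @cells, $row and $col are hypothetical:
.nf

```perl
# %grid and its subscripts are hypothetical, used only for illustration.
$grid{1,2} = 'x';                  # really $grid{join($;, 1, 2)}
$grid{3,4} = 'y';
$same = $grid{join($;, 1, 2)};     # 'x'
foreach $key (sort keys(%grid)) {
    ($row, $col) = split(/$;/, $key);    # recover the two subscripts
    push(@cells, "$row,$col=$grid{$key}");
}
```

.fi
After the loop @cells holds one string for each stored cell, with the
subscripts separated again.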
611.PP
612Every data type has its own namespace.
613You can, without fear of conflict, use the same name for a scalar variable,
614an array, an associative array, a filehandle, a subroutine name, and/or
615a label.
616Since variable and array references always start with \*(L'$\*(R', \*(L'@\*(R',
617or \*(L'%\*(R', the \*(L"reserved\*(R" words aren't in fact reserved
618with respect to variable names.
619(They ARE reserved with respect to labels and filehandles, however, which
620don't have an initial special character.
621Hint: you could say open(LOG,\'logfile\') rather than open(log,\'logfile\').
622Using uppercase filehandles also improves readability and protects you
623from conflict with future reserved words.)
624Case IS significant\*(--\*(L"FOO\*(R", \*(L"Foo\*(R" and \*(L"foo\*(R" are all
625different names.
626Names which start with a letter may also contain digits and underscores.
627Names which do not start with a letter are limited to one character,
628e.g. \*(L"$%\*(R" or \*(L"$$\*(R".
629(Most of the one character names have a predefined significance to
630.IR perl .
631More later.)
632.PP
633Numeric literals are specified in any of the usual floating point or
634integer formats:
635.nf
636
637.ne 6
638 12345
639 12345.67
640 .23E-10
641 0xffff # hex
642 0377 # octal
643 4_294_967_296
644
645.fi
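.PP
A short check of what these literal forms denote, with hypothetical
variable names:
.nf

```perl
# The variable names are hypothetical; the literals are the ones listed above.
$hex   = 0xffff;            # 65535
$octal = 0377;              # 255
$big   = 4_294_967_296;     # underlines are ignored: 4294967296
$small = .23E-10;           # 0.000000000023
$all_match = ($hex == 65535 && $octal == 255 && $big == 4294967296) ? 1 : 0;
```

.fi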
646String literals are delimited by either single or double quotes.
647They work much like shell quotes:
648double-quoted string literals are subject to backslash and variable
649substitution; single-quoted strings are not (except for \e\' and \e\e).
650The usual backslash rules apply for making characters such as newline, tab,
651etc., as well as some more exotic forms:
652.nf
653
654 \et tab
655 \en newline
656 \er return
657 \ef form feed
658 \eb backspace
659 \ea alarm (bell)
660 \ee escape
661 \e033 octal char
662 \ex1b hex char
663 \ec[ control char
664 \el lowercase next char
665 \eu uppercase next char
666 \eL lowercase till \eE
667 \eU uppercase till \eE
668 \eE end case modification
669
670.fi
671You can also embed newlines directly in your strings, i.e. they can end on
672a different line than they begin.
673This is nice, but if you forget your trailing quote, the error will not be
674reported until
675.I perl
676finds another line containing the quote character, which
677may be much further on in the script.
678Variable substitution inside strings is limited to scalar variables, normal
679array values, and array slices.
680(In other words, identifiers beginning with $ or @, followed by an optional
681bracketed expression as a subscript.)
682The following code segment prints out \*(L"The price is $100.\*(R"
683.nf
684
685.ne 2
686 $Price = \'$100\';\h'|3.5i'# not interpreted
687 print "The price is $Price.\e\|n";\h'|3.5i'# interpreted
688
689.fi
690Note that you can put curly brackets around the identifier to delimit it
691from following alphanumerics.
692Also note that a single quoted string must be separated from a preceding
693word by a space, since single quote is a valid character in an identifier
694(see Packages).
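.PP
The effect of the curly brackets can be sketched as follows.
Here $fruit is a hypothetical variable, and $fruits is deliberately left
unset:
.nf

```perl
# $Price is from the example above; $fruit is a hypothetical variable.
$Price = '$100';                      # single quotes: not interpreted
$fruit = 'apple';
$plain  = "The price is $Price.";     # The price is $100.
$plural = "Two ${fruit}s";            # Two apples
$wrong  = "Two $fruits";              # looks up $fruits, which is unset
```

.fi
Without the brackets the trailing s is taken as part of the variable name,
so $wrong ends right after the space.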
695.PP
696Two special literals are _\|_LINE_\|_ and _\|_FILE_\|_, which represent the current
697line number and filename at that point in your program.
698They may only be used as separate tokens; they will not be interpolated
699into strings.
700In addition, the token _\|_END_\|_ may be used to indicate the logical end of the
701script before the actual end of file.
702Any following text is ignored, but may be read via the DATA filehandle.
703(The DATA filehandle may read data only from the main script, but not from
704any required file or evaluated string.)
705The two control characters ^D and ^Z are synonyms for _\|_END_\|_.
706.PP
707A word that doesn't have any other interpretation in the grammar will be
708treated as if it had single quotes around it.
709For this purpose, a word consists only of alphanumeric characters and underline,
710and must start with an alphabetic character.
711As with filehandles and labels, a bare word that consists entirely of
712lowercase letters risks conflict with future reserved words, and if you
713use the
714.B \-w
715switch, Perl will warn you about any such words.
716.PP
717Array values are interpolated into double-quoted strings by joining all the
718elements of the array with the delimiter specified in the $" variable,
719space by default.
720(Since in versions of perl prior to 3.0 the @ character was not a metacharacter
721in double-quoted strings, the interpolation of @array, $array[EXPR],
722@array[LIST], $array{EXPR}, or @array{LIST} only happens if array is
723referenced elsewhere in the program or is predefined.)
724The following are equivalent:
725.nf
726
727.ne 4
728 $temp = join($",@ARGV);
729 system "echo $temp";
730
731 system "echo @ARGV";
732
733.fi
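.PP
A small sketch of how $" affects interpolation.
The array @list is hypothetical, and the original separator is restored at
the end:
.nf

```perl
# @list is hypothetical; the separator is restored afterwards.
@list = ('a', 'b', 'c');
$default = "@list";         # a b c
$saved = $";
$" = '-';
$dashed = "@list";          # a-b-c
$same = join($", @list);    # a-b-c
$" = $saved;
```

.fi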
734Within search patterns (which also undergo double-quotish substitution)
735there is a bad ambiguity: Is /$foo[bar]/ to be
736interpreted as /${foo}[bar]/ (where [bar] is a character class for the
737regular expression) or as /${foo[bar]}/ (where [bar] is the subscript to
738array @foo)?
739If @foo doesn't otherwise exist, then it's obviously a character class.
740If @foo exists, perl takes a good guess about [bar], and is almost always right.
741If it does guess wrong, or if you're just plain paranoid,
742you can force the correct interpretation with curly brackets as above.
743.PP
744A line-oriented form of quoting is based on the shell here-is syntax.
745Following a << you specify a string to terminate the quoted material, and all lines
746following the current line down to the terminating string are the value
747of the item.
748The terminating string may be either an identifier (a word), or some
749quoted text.
750If quoted, the type of quotes you use determines the treatment of the text,
751just as in regular quoting.
752An unquoted identifier works like double quotes.
753There must be no space between the << and the identifier.
754(If you put a space it will be treated as a null identifier, which is
755valid, and matches the first blank line\*(--see Merry Christmas example below.)
756The terminating string must appear by itself (unquoted and with no surrounding
757whitespace) on the terminating line.
758.nf
759
760 print <<EOF; # same as above
761The price is $Price.
762EOF
763
764 print <<"EOF"; # same as above
765The price is $Price.
766EOF
767
768 print << x 10; # null identifier is delimiter
769Merry Christmas!
770
771 print <<`EOC`; # execute commands
772echo hi there
773echo lo there
774EOC
775
776 print <<foo, <<bar; # you can stack them
777I said foo.
778foo
779I said bar.
780bar
781
782.fi
783Array literals are denoted by separating individual values by commas, and
784enclosing the list in parentheses:
785.nf
786
787 (LIST)
788
789.fi
790In a context not requiring an array value, the value of the array literal
791is the value of the final element, as in the C comma operator.
792For example,
793.nf
794
795.ne 4
796 @foo = (\'cc\', \'\-E\', $bar);
797
798assigns the entire array value to array foo, but
799
800 $foo = (\'cc\', \'\-E\', $bar);
801
802.fi
803assigns the value of variable bar to variable foo.
804Note that the value of an actual array in a scalar context is the length
805of the array; the following assigns to $foo the value 3:
806.nf
807
808.ne 2
809 @foo = (\'cc\', \'\-E\', $bar);
810 $foo = @foo; # $foo gets 3
811
812.fi
813You may have an optional comma before the closing parenthesis of an
814array literal, so that you can say:
815.nf
816
817 @foo = (
818 1,
819 2,
820 3,
821 );
822
823.fi
824When a LIST is evaluated, each element of the list is evaluated in
825an array context, and the resulting array value is interpolated into LIST
826just as if each individual element were a member of LIST. Thus arrays
827lose their identity in a LIST\*(--the list
828
829 (@foo,@bar,&SomeSub)
830
831contains all the elements of @foo followed by all the elements of @bar,
832followed by all the elements returned by the subroutine named SomeSub.
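.PP
The flattening described above might look like this in practice.
The arrays and the subroutine named pair are hypothetical:
.nf

```perl
# @foo, @bar and the subroutine pair are hypothetical.
sub pair { ('x', 'y'); }
@foo = (1, 2);
@bar = (3);
@all = (@foo, @bar, &pair());    # five elements: 1 2 3 x y
$count = @all;                   # 5
```

.fi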
833.PP
834A list value may also be subscripted like a normal array.
835Examples:
836.nf
837
838 $time = (stat($file))[8]; # stat returns array value
839 $digit = ('a','b','c','d','e','f')[$digit-10];
840 return (pop(@foo),pop(@foo))[0];
841
842.fi
843.PP
844Array lists may be assigned to if and only if each element of the list
845is an lvalue:
846.nf
847
848 ($a, $b, $c) = (1, 2, 3);
849
850 ($map{\'red\'}, $map{\'blue\'}, $map{\'green\'}) = (0x00f, 0x0f0, 0xf00);
851
852The final element may be an array or an associative array:
853
854 ($a, $b, @rest) = split;
855 local($a, $b, %rest) = @_;
856
857.fi
858You can actually put an array anywhere in the list, but the first array
859in the list will soak up all the values, and anything after it will get
860a null value.
861This may be useful in a local().
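.PP
To illustrate how the first array in the list soaks up the remaining
values, here is a sketch with hypothetical variable names:
.nf

```perl
# The variable names are hypothetical.
($first, @rest) = (1, 2, 3);            # $first is 1, @rest is (2, 3)
($head, @middle, $tail) = (1, 2, 3);    # @middle takes (2, 3); $tail gets no value
$tail_set = defined($tail) ? 1 : 0;     # 0
```

.fi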
862.PP
863An associative array literal contains pairs of values to be interpreted
864as a key and a value:
865.nf
866
867.ne 2
868 # same as map assignment above
869 %map = ('red',0x00f,'blue',0x0f0,'green',0xf00);
870
871.fi
872Array assignment in a scalar context returns the number of elements
873produced by the expression on the right side of the assignment:
874.nf
875
876 $x = (($foo,$bar) = (3,2,1)); # set $x to 3, not 2
877
878.fi
879.PP
880There are several other pseudo-literals that you should know about.
881If a string is enclosed by backticks (grave accents), it first undergoes
882variable substitution just like a double quoted string.
883It is then interpreted as a command, and the output of that command
is the value of the pseudo-literal, as in a shell.
885In a scalar context, a single string consisting of all the output is
886returned.
887In an array context, an array of values is returned, one for each line
888of output.
889(You can set $/ to use a different line terminator.)
890The command is executed each time the pseudo-literal is evaluated.
891The status value of the command is returned in $? (see Predefined Names
892for the interpretation of $?).
893Unlike in \f2csh\f1, no translation is done on the return
894data\*(--newlines remain newlines.
895Unlike in any of the shells, single quotes do not hide variable names
896in the command from interpretation.
897To pass a $ through to the shell you need to hide it with a backslash.
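.PP
The two contexts can be compared with a sketch like the following, which
assumes a Unix-like shell providing an echo command.
The variable names are hypothetical:
.nf

```perl
# Assumes a Unix-like shell with an echo command; the names are hypothetical.
$word = 'there';
$whole = `echo hi $word; echo lo $word`;    # scalar context: one string
@lines = `echo hi $word; echo lo $word`;    # array context: one element per line
$status = $?;                                # 0 if the last command succeeded
$count = @lines;                             # 2
```

.fi
Note that $word is substituted by perl before the shell ever sees the
command.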
898.PP
899Evaluating a filehandle in angle brackets yields the next line
900from that file (newline included, so it's never false until EOF, at
901which time an undefined value is returned).
902Ordinarily you must assign that value to a variable,
903but there is one situation where an automatic assignment happens.
904If (and only if) the input symbol is the only thing inside the conditional of a
905.I while
906loop, the value is
907automatically assigned to the variable \*(L"$_\*(R".
908(This may seem like an odd thing to you, but you'll use the construct
909in almost every
910.I perl
911script you write.)
912Anyway, the following lines are equivalent to each other:
913.nf
914
915.ne 5
916 while ($_ = <STDIN>) { print; }
917 while (<STDIN>) { print; }
918 for (\|;\|<STDIN>;\|) { print; }
919 print while $_ = <STDIN>;
920 print while <STDIN>;
921
922.fi
923The filehandles
924.IR STDIN ,
925.I STDOUT
926and
927.I STDERR
928are predefined.
929(The filehandles
930.IR stdin ,
931.I stdout
932and
933.I stderr
934will also work except in packages, where they would be interpreted as
935local identifiers rather than global.)
936Additional filehandles may be created with the
937.I open
938function.
939.PP
940If a <FILEHANDLE> is used in a context that is looking for an array, an array
941consisting of all the input lines is returned, one line per array element.
942It's easy to make a LARGE data space this way, so use with care.
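.PP
The three ways of reading from a filehandle can be sketched together.
The scratch file name is hypothetical, and /tmp is assumed to exist and to
be writable:
.nf

```perl
# The scratch file name is hypothetical; /tmp is assumed to be writable.
$tmp = "/tmp/perlman.$$";
open(OUT, ">$tmp") || die "cannot create $tmp";
print OUT <<EOF;
one
two
three
EOF
close(OUT);

open(IN, $tmp) || die "cannot open $tmp";
$first = <IN>;          # scalar context: the next line, newline included
@rest = <IN>;           # array context: all of the remaining lines
close(IN);

open(IN, $tmp) || die "cannot open $tmp";
$seen = 0;
while (<IN>) {          # each line is assigned to $_ automatically
    $seen++;
}
close(IN);
unlink($tmp);
```

.fi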
943.PP
944The null filehandle <> is special and can be used to emulate the behavior of
945\fIsed\fR and \fIawk\fR.
946Input from <> comes either from standard input, or from each file listed on
947the command line.
948Here's how it works: the first time <> is evaluated, the ARGV array is checked,
and if it is null, $ARGV[0] is set to \'\-\', which when opened gives you standard
950input.
951The ARGV array is then processed as a list of filenames.
952The loop
953.nf
954
955.ne 3
956 while (<>) {
957 .\|.\|. # code for each line
958 }
959
960.ne 10
961is equivalent to the following Perl-like pseudo code:
962
963 unshift(@ARGV, \'\-\') \|if \|$#ARGV < $[;
964 while ($ARGV = shift) {
965 open(ARGV, $ARGV);
966 while (<ARGV>) {
967 .\|.\|. # code for each line
968 }
969 }
970
971.fi
972except that it isn't as cumbersome to say, and will actually work.
973It really does shift array ARGV and put the current filename into
974variable ARGV.
975It also uses filehandle ARGV internally\*(--<> is just a synonym for
976<ARGV>, which is magical.
977(The pseudo code above doesn't work because it treats <ARGV> as non-magical.)
978.PP
979You can modify @ARGV before the first <> as long as the array ends up
980containing the list of filenames you really want.
981Line numbers ($.) continue as if the input was one big happy file.
982(But see example under eof for how to reset line numbers on each file.)
983.PP
984.ne 5
985If you want to set @ARGV to your own list of files, go right ahead.
986If you want to pass switches into your script, you can
987put a loop on the front like this:
988.nf
989
990.ne 10
991 while ($_ = $ARGV[0], /\|^\-/\|) {
992 shift;
993 last if /\|^\-\|\-$\|/\|;
994 /\|^\-D\|(.*\|)/ \|&& \|($debug = $1);
995 /\|^\-v\|/ \|&& \|$verbose++;
996 .\|.\|. # other switches
997 }
998 while (<>) {
999 .\|.\|. # code for each line
1000 }
1001
1002.fi
1003The <> symbol will return FALSE only once.
1004If you call it again after this it will assume you are processing another
1005@ARGV list, and if you haven't set @ARGV, will input from
1006.IR STDIN .
1007.PP
1008If the string inside the angle brackets is a reference to a scalar variable
1009(e.g. <$foo>),
1010then that variable contains the name of the filehandle to input from.
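.PP
A minimal sketch of indirect input, assuming a variable named $fh chosen
for the example:
.nf

.ne 3
	$fh = \'STDIN\';
	$line = <$fh>;	# same as $line = <STDIN>;
	print "got: $line";

.fi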
1011.PP
1012If the string inside angle brackets is not a filehandle, it is interpreted
1013as a filename pattern to be globbed, and either an array of filenames or the
1014next filename in the list is returned, depending on context.
1015One level of $ interpretation is done first, but you can't say <$foo>
1016because that's an indirect filehandle as explained in the previous
1017paragraph.
1018You could insert curly brackets to force interpretation as a
1019filename glob: <${foo}>.
1020Example:
1021.nf
1022
1023.ne 3
1024 while (<*.c>) {
1025 chmod 0644, $_;
1026 }
1027
1028is equivalent to
1029
1030.ne 5
1031 open(foo, "echo *.c | tr \-s \' \et\er\ef\' \'\e\e012\e\e012\e\e012\e\e012\'|");
1032 while (<foo>) {
1033 chop;
1034 chmod 0644, $_;
1035 }
1036
1037.fi
1038In fact, it's currently implemented that way.
1039(Which means it will not work on filenames with spaces in them unless
1040you have /bin/csh on your machine.)
1041Of course, the shortest way to do the above is:
1042.nf
1043
1044 chmod 0644, <*.c>;
1045
1046.fi
1047.Sh "Syntax"
1048.PP
1049A
1050.I perl
1051script consists of a sequence of declarations and commands.
1052The only things that need to be declared in
1053.I perl
1054are report formats and subroutines.
1055See the sections below for more information on those declarations.
1056All uninitialized user-created objects are assumed to
1057start with a null or 0 value until they
1058are defined by some explicit operation such as assignment.
1059The sequence of commands is executed just once, unlike in
1060.I sed
1061and
1062.I awk
1063scripts, where the sequence of commands is executed for each input line.
1064While this means that you must explicitly loop over the lines of your input file
1065(or files), it also means you have much more control over which files and which
1066lines you look at.
1067(Actually, I'm lying\*(--it is possible to do an implicit loop with either the
1068.B \-n
1069or
1070.B \-p
1071switch.)
1072.PP
1073A declaration can be put anywhere a command can, but has no effect on the
1074execution of the primary sequence of commands\*(--declarations all take effect
1075at compile time.
1076Typically all the declarations are put at the beginning or the end of the script.
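.PP
Because declarations take effect at compile time, a subroutine can be called
above the point where it is declared.
A small sketch (the subroutine name greet is made up for the example):
.nf

.ne 6
	&greet(\'world\');	# works even though greet is declared below

	sub greet {
	    local($who) = @_;
	    print "Hello, $who\en";
	}

.fi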
1077.PP
1078.I Perl
1079is, for the most part, a free-form language.
1080(The only exception to this is format declarations, for fairly obvious reasons.)
1081Comments are indicated by the # character, and extend to the end of the line.
If you attempt to use /* */ C comments, they will be interpreted either as
1083division or pattern matching, depending on the context.
1084So don't do that.
1085.Sh "Compound statements"
1086In
1087.IR perl ,
1088a sequence of commands may be treated as one command by enclosing it
1089in curly brackets.
1090We will call this a BLOCK.
1091.PP
1092The following compound commands may be used to control flow:
1093.nf
1094
1095.ne 4
1096 if (EXPR) BLOCK
1097 if (EXPR) BLOCK else BLOCK
1098 if (EXPR) BLOCK elsif (EXPR) BLOCK .\|.\|. else BLOCK
1099 LABEL while (EXPR) BLOCK
1100 LABEL while (EXPR) BLOCK continue BLOCK
1101 LABEL for (EXPR; EXPR; EXPR) BLOCK
1102 LABEL foreach VAR (ARRAY) BLOCK
1103 LABEL BLOCK continue BLOCK
1104
1105.fi
1106Note that, unlike C and Pascal, these are defined in terms of BLOCKs, not
1107statements.
1108This means that the curly brackets are \fIrequired\fR\*(--no dangling statements allowed.
1109If you want to write conditionals without curly brackets there are several
1110other ways to do it.
1111The following all do the same thing:
1112.nf
1113
1114.ne 5
1115 if (!open(foo)) { die "Can't open $foo: $!"; }
1116 die "Can't open $foo: $!" unless open(foo);
1117 open(foo) || die "Can't open $foo: $!"; # foo or bust!
1118 open(foo) ? \'hi mom\' : die "Can't open $foo: $!";
1119 # a bit exotic, that last one
1120
1121.fi
1122.PP
1123The
1124.I if
1125statement is straightforward.
1126Since BLOCKs are always bounded by curly brackets, there is never any
1127ambiguity about which
1128.I if
1129an
1130.I else
1131goes with.
1132If you use
1133.I unless
1134in place of
1135.IR if ,
1136the sense of the test is reversed.
1137.PP
1138The
1139.I while
1140statement executes the block as long as the expression is true
1141(does not evaluate to the null string or 0).
1142The LABEL is optional, and if present, consists of an identifier followed by
1143a colon.
1144The LABEL identifies the loop for the loop control statements
1145.IR next ,
1146.IR last ,
1147and
1148.I redo
1149(see below).
1150If there is a
1151.I continue
BLOCK, it is always executed just before
the conditional is evaluated again, like the third part
1154of a
1155.I for
1156loop in C.
1157Thus it can be used to increment a loop variable, even when the loop has
1158been continued via the
1159.I next
1160statement (similar to the C \*(L"continue\*(R" statement).
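.PP
A short sketch of that behavior (the variable $i is arbitrary):
.nf

.ne 7
	$i = 0;
	while ($i < 10) {
	    next if $i % 2;	# odd values skip the print
	    print "$i\en";
	} continue {
	    $i++;	# still runs after a next
	}

.fi
This prints the even numbers from 0 through 8.
If the increment were instead the last statement of the loop body, the
.I next
would skip it and the loop would never terminate.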
1161.PP
1162If the word
1163.I while
1164is replaced by the word
1165.IR until ,
1166the sense of the test is reversed, but the conditional is still tested before
1167the first iteration.
1168.PP
1169In either the
1170.I if
1171or the
1172.I while
1173statement, you may replace \*(L"(EXPR)\*(R" with a BLOCK, and the conditional
1174is true if the value of the last command in that block is true.
1175.PP
1176The
1177.I for
1178loop works exactly like the corresponding
1179.I while
1180loop:
1181.nf
1182
1183.ne 12
1184 for ($i = 1; $i < 10; $i++) {
1185 .\|.\|.
1186 }
1187
1188is the same as
1189
1190 $i = 1;
1191 while ($i < 10) {
1192 .\|.\|.
1193 } continue {
1194 $i++;
1195 }
1196.fi
1197.PP
1198The foreach loop iterates over a normal array value and sets the variable
1199VAR to be each element of the array in turn.
1200The variable is implicitly local to the loop, and regains its former value
1201upon exiting the loop.
1202The \*(L"foreach\*(R" keyword is actually identical to the \*(L"for\*(R" keyword,
1203so you can use \*(L"foreach\*(R" for readability or \*(L"for\*(R" for brevity.
1204If VAR is omitted, $_ is set to each value.
1205If ARRAY is an actual array (as opposed to an expression returning an array
1206value), you can modify each element of the array
1207by modifying VAR inside the loop.
1208Examples:
1209.nf
1210
1211.ne 5
1212 for (@ary) { s/foo/bar/; }
1213
1214 foreach $elem (@elements) {
1215 $elem *= 2;
1216 }
1217
1218.ne 3
1219 for ((10,9,8,7,6,5,4,3,2,1,\'BOOM\')) {
1220 print $_, "\en"; sleep(1);
1221 }
1222
1223 for (1..15) { print "Merry Christmas\en"; }
1224
1225.ne 3
1226 foreach $item (split(/:[\e\e\en:]*/, $ENV{\'TERMCAP\'})) {
1227 print "Item: $item\en";
1228 }
1229
1230.fi
1231.PP
1232The BLOCK by itself (labeled or not) is equivalent to a loop that executes
1233once.
1234Thus you can use any of the loop control statements in it to leave or
1235restart the block.
1236The
1237.I continue
1238block is optional.
1239This construct is particularly nice for doing case structures.
1240.nf
1241
1242.ne 6
1243 foo: {
1244 if (/^abc/) { $abc = 1; last foo; }
1245 if (/^def/) { $def = 1; last foo; }
1246 if (/^xyz/) { $xyz = 1; last foo; }
1247 $nothing = 1;
1248 }
1249
1250.fi
1251There is no official switch statement in perl, because there
1252are already several ways to write the equivalent.
1253In addition to the above, you could write
1254.nf
1255
1256.ne 6
1257 foo: {
1258 $abc = 1, last foo if /^abc/;
1259 $def = 1, last foo if /^def/;
1260 $xyz = 1, last foo if /^xyz/;
1261 $nothing = 1;
1262 }
1263
1264or
1265
1266.ne 6
1267 foo: {
1268 /^abc/ && do { $abc = 1; last foo; };
1269 /^def/ && do { $def = 1; last foo; };
1270 /^xyz/ && do { $xyz = 1; last foo; };
1271 $nothing = 1;
1272 }
1273
1274or
1275
1276.ne 6
1277 foo: {
1278 /^abc/ && ($abc = 1, last foo);
1279 /^def/ && ($def = 1, last foo);
1280 /^xyz/ && ($xyz = 1, last foo);
1281 $nothing = 1;
1282 }
1283
1284or even
1285
1286.ne 8
1287 if (/^abc/)
1288 { $abc = 1; }
1289 elsif (/^def/)
1290 { $def = 1; }
1291 elsif (/^xyz/)
1292 { $xyz = 1; }
1293 else
1294 {$nothing = 1;}
1295
1296.fi
1297As it happens, these are all optimized internally to a switch structure,
1298so perl jumps directly to the desired statement, and you needn't worry
1299about perl executing a lot of unnecessary statements when you have a string
1300of 50 elsifs, as long as you are testing the same simple scalar variable
1301using ==, eq, or pattern matching as above.
1302(If you're curious as to whether the optimizer has done this for a particular
1303case statement, you can use the \-D1024 switch to list the syntax tree
1304before execution.)
1305.Sh "Simple statements"
1306The only kind of simple statement is an expression evaluated for its side
1307effects.
1308Every simple statement must be terminated with a semicolon, unless it is the
1309final statement in a block, in which case the semicolon is optional.
(A semicolon is still encouraged there if the block takes up more than one line.)
1311.PP
1312Any simple statement may optionally be followed by a
1313single modifier, just before the terminating semicolon.
1314The possible modifiers are:
1315.nf
1316
1317.ne 4
1318 if EXPR
1319 unless EXPR
1320 while EXPR
1321 until EXPR
1322
1323.fi
1324The
1325.I if
1326and
1327.I unless
1328modifiers have the expected semantics.
1329The
1330.I while
1331and
1332.I until
1333modifiers also have the expected semantics (conditional evaluated first),
1334except when applied to a do-BLOCK or a do-SUBROUTINE command,
1335in which case the block executes once before the conditional is evaluated.
1336This is so that you can write loops like:
1337.nf
1338
1339.ne 4
1340 do {
1341 $_ = <STDIN>;
1342 .\|.\|.
1343 } until $_ \|eq \|".\|\e\|n";
1344
1345.fi
1346(See the
1347.I do
1348operator below. Note also that the loop control commands described later will
1349NOT work in this construct, since modifiers don't take loop labels.
1350Sorry.)
1351.Sh "Expressions"
1352Since
1353.I perl
1354expressions work almost exactly like C expressions, only the differences
1355will be mentioned here.
1356.PP
1357Here's what
1358.I perl
1359has that C doesn't:
1360.Ip ** 8 2
1361The exponentiation operator.
1362.Ip **= 8
1363The exponentiation assignment operator.
1364.Ip (\|) 8 3
1365The null list, used to initialize an array to null.
1366.Ip . 8
1367Concatenation of two strings.
1368.Ip .= 8
1369The concatenation assignment operator.
1370.Ip eq 8
1371String equality (== is numeric equality).
1372For a mnemonic just think of \*(L"eq\*(R" as a string.
1373(If you are used to the
1374.I awk
1375behavior of using == for either string or numeric equality
1376based on the current form of the comparands, beware!
1377You must be explicit here.)
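.Sp
A sketch of the difference, using made-up values:
.nf

.ne 4
	$x = \'1.0\';
	$y = \'1\';
	print "same number\en" if $x == $y;	# true: 1.0 equals 1
	print "same string\en" if $x eq $y;	# false: the strings differ

.fi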
1378.Ip ne 8
1379String inequality (!= is numeric inequality).
1380.Ip lt 8
1381String less than.
1382.Ip gt 8
1383String greater than.
1384.Ip le 8
1385String less than or equal.
1386.Ip ge 8
1387String greater than or equal.
1388.Ip cmp 8
1389String comparison, returning -1, 0, or 1.
1390.Ip <=> 8
1391Numeric comparison, returning -1, 0, or 1.
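.Sp
The comparison operators are handy inside a sort block.
A sketch with made-up data:
.nf

.ne 4
	@nums = (10, 9, 100);
	print join(\' \', sort @nums), "\en";	# string order: 10 100 9
	print join(\' \', sort { $a <=> $b } @nums), "\en";	# numeric: 9 10 100
	print join(\' \', sort { $b cmp $a } @nums), "\en";	# reversed strings: 9 100 10

.fi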
1392.Ip =~ 8 2
1393Certain operations search or modify the string \*(L"$_\*(R" by default.
1394This operator makes that kind of operation work on some other string.
1395The right argument is a search pattern, substitution, or translation.
1396The left argument is what is supposed to be searched, substituted, or
1397translated instead of the default \*(L"$_\*(R".
1398The return value indicates the success of the operation.
1399(If the right argument is an expression other than a search pattern,
1400substitution, or translation, it is interpreted as a search pattern
1401at run time.
1402This is less efficient than an explicit search, since the pattern must
1403be compiled every time the expression is evaluated.)
1404The precedence of this operator is lower than unary minus and autoincrement/decrement, but higher than everything else.
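.Sp
A sketch of the three kinds of right argument (the variable names and the
sample string are assumptions for the example):
.nf

.ne 4
	$line = \'From: lwall\';
	print "sender is $1\en" if $line =~ /^From: (.*)/;	# search
	($name = $line) =~ s/^From: //;	# substitute in a copy
	($upper = $name) =~ tr/a\-z/A\-Z/;	# translate a copy

.fi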
1405.Ip !~ 8
1406Just like =~ except the return value is negated.
1407.Ip x 8
1408The repetition operator.
1409Returns a string consisting of the left operand repeated the
1410number of times specified by the right operand.
1411In an array context, if the left operand is a list in parens, it repeats
1412the list.
1413.nf
1414
1415 print \'\-\' x 80; # print row of dashes
1416 print \'\-\' x80; # illegal, x80 is identifier
1417
1418 print "\et" x ($tab/8), \' \' x ($tab%8); # tab over
1419
1420 @ones = (1) x 80; # an array of 80 1's
1421 @ones = (5) x @ones; # set all elements to 5
1422
1423.fi
1424.Ip x= 8
1425The repetition assignment operator.
1426Only works on scalars.
1427.Ip .\|. 8
1428The range operator, which is really two different operators depending
1429on the context.
1430In an array context, returns an array of values counting (by ones)
1431from the left value to the right value.
1432This is useful for writing \*(L"for (1..10)\*(R" loops and for doing
1433slice operations on arrays.
1434.Sp
1435In a scalar context, .\|. returns a boolean value.
1436The operator is bistable, like a flip-flop, and
1437emulates the line-range (comma) operator of sed, awk, and various editors.
1438Each .\|. operator maintains its own boolean state.
1439It is false as long as its left operand is false.
1440Once the left operand is true, the range operator stays true
1441until the right operand is true,
1442AFTER which the range operator becomes false again.
1443(It doesn't become false till the next time the range operator is evaluated.
1444It can test the right operand and become false on the
1445same evaluation it became true (as in awk), but it still returns true once.
1446If you don't want it to test the right operand till the next
1447evaluation (as in sed), use three dots (.\|.\|.) instead of two.)
1448The right operand is not evaluated while the operator is in the \*(L"false\*(R" state,
1449and the left operand is not evaluated while the operator is in the \*(L"true\*(R" state.
1450The precedence is a little lower than || and &&.
1451The value returned is either the null string for false, or a sequence number
1452(beginning with 1) for true.
1453The sequence number is reset for each range encountered.
1454The final sequence number in a range has the string \'E0\' appended to it, which
1455doesn't affect its numeric value, but gives you something to search for if you
1456want to exclude the endpoint.
1457You can exclude the beginning point by waiting for the sequence number to be
1458greater than 1.
1459If either operand of scalar .\|. is static, that operand is implicitly compared
1460to the $. variable, the current line number.
1461Examples:
1462.nf
1463
1464.ne 6
1465As a scalar operator:
1466 if (101 .\|. 200) { print; } # print 2nd hundred lines
1467
1468 next line if (1 .\|. /^$/); # skip header lines
1469
1470 s/^/> / if (/^$/ .\|. eof()); # quote body
1471
1472.ne 4
1473As an array operator:
1474 for (101 .\|. 200) { print; } # print $_ 100 times
1475
1476 @foo = @foo[$[ .\|. $#foo]; # an expensive no-op
1477 @foo = @foo[$#foo-4 .\|. $#foo]; # slice last 5 items
1478
1479.fi
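.Sp
The sequence number makes it possible to print only the lines strictly
between two markers.
This is a sketch; the BEGIN and END marker patterns are invented for the
example:
.nf

.ne 7
	while (<>) {
	    if ($seq = (/^BEGIN/ .\|. /^END/)) {
		next if $seq == 1;	# skip the opening line
		next if $seq =~ /E0$/;	# skip the closing line
		print;
	    }
	}

.fi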
1480.Ip \-x 8
1481A file test.
1482This unary operator takes one argument, either a filename or a filehandle,
1483and tests the associated file to see if something is true about it.
1484If the argument is omitted, tests $_, except for \-t, which tests
1485.IR STDIN .
1486It returns 1 for true and \'\' for false, or the undefined value if the
1487file doesn't exist.
1488Precedence is higher than logical and relational operators, but lower than
1489arithmetic operators.
1490The operator may be any of:
1491.nf
1492 \-r File is readable by effective uid/gid.
1493 \-w File is writable by effective uid/gid.
1494 \-x File is executable by effective uid/gid.
1495 \-o File is owned by effective uid.
1496 \-R File is readable by real uid/gid.
1497 \-W File is writable by real uid/gid.
1498 \-X File is executable by real uid/gid.
1499 \-O File is owned by real uid.
1500 \-e File exists.
1501 \-z File has zero size.
1502 \-s File has non-zero size (returns size).
1503 \-f File is a plain file.
1504 \-d File is a directory.
1505 \-l File is a symbolic link.
1506 \-p File is a named pipe (FIFO).
1507 \-S File is a socket.
1508 \-b File is a block special file.
1509 \-c File is a character special file.
1510 \-u File has setuid bit set.
1511 \-g File has setgid bit set.
1512 \-k File has sticky bit set.
1513 \-t Filehandle is opened to a tty.
1514 \-T File is a text file.
1515 \-B File is a binary file (opposite of \-T).
1516 \-M Age of file in days when script started.
1517 \-A Same for access time.
1518 \-C Same for inode change time.
1519
1520.fi
1521The interpretation of the file permission operators \-r, \-R, \-w, \-W, \-x and \-X
1522is based solely on the mode of the file and the uids and gids of the user.
1523There may be other reasons you can't actually read, write or execute the file.
1524Also note that, for the superuser, \-r, \-R, \-w and \-W always return 1, and
1525\-x and \-X return 1 if any execute bit is set in the mode.
1526Scripts run by the superuser may thus need to do a stat() in order to determine
1527the actual mode of the file, or temporarily set the uid to something else.
1528.Sp
1529Example:
1530.nf
1531.ne 7
1532
1533 while (<>) {
1534 chop;
1535 next unless \-f $_; # ignore specials
1536 .\|.\|.
1537 }
1538
1539.fi
1540Note that \-s/a/b/ does not do a negated substitution.
1541Saying \-exp($foo) still works as expected, however\*(--only single letters
1542following a minus are interpreted as file tests.
1543.Sp
1544The \-T and \-B switches work as follows.
1545The first block or so of the file is examined for odd characters such as
1546strange control codes or metacharacters.
If too many odd characters (>10%) are found, it's a \-B file; otherwise it's a \-T file.
1548Also, any file containing null in the first block is considered a binary file.
1549If \-T or \-B is used on a filehandle, the current stdio buffer is examined
1550rather than the first block.
1551Both \-T and \-B return TRUE on a null file, or a file at EOF when testing
1552a filehandle.
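.Sp
As a sketch of several tests used together (the report format is an
assumption for the example, and the filenames are taken from the command line):
.nf

.ne 6
	foreach $file (@ARGV) {
	    next unless \-f $file;	# plain files only
	    $kind = (\-T $file) ? \'text\' : \'binary\';
	    printf "%s: %s, %d bytes, modified %.1f days ago\en",
		$file, $kind, \-s $file, \-M $file;
	}

.fi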
1553.PP
1554If any of the file tests (or either stat operator) are given the special
1555filehandle consisting of a solitary underline, then the stat structure
1556of the previous file test (or stat operator) is used, saving a system
1557call.
(This doesn't work with \-t, and you need to remember that lstat and \-l
1559will leave values in the stat structure for the symbolic link, not the
1560real file.)
1561Example:
1562.nf
1563
1564 print "Can do.\en" if -r $a || -w _ || -x _;
1565
1566.ne 9
1567 stat($filename);
1568 print "Readable\en" if -r _;
1569 print "Writable\en" if -w _;
1570 print "Executable\en" if -x _;
1571 print "Setuid\en" if -u _;
1572 print "Setgid\en" if -g _;
1573 print "Sticky\en" if -k _;
1574 print "Text\en" if -T _;
1575 print "Binary\en" if -B _;
1576
1577.fi
1578.PP
1579Here is what C has that
1580.I perl
1581doesn't:
1582.Ip "unary &" 12
1583Address-of operator.
1584.Ip "unary *" 12
1585Dereference-address operator.
1586.Ip "(TYPE)" 12
1587Type casting operator.
1588.PP
1589Like C,
1590.I perl
1591does a certain amount of expression evaluation at compile time, whenever
1592it determines that all of the arguments to an operator are static and have
1593no side effects.
1594In particular, string concatenation happens at compile time between literals that don't do variable substitution.
1595Backslash interpretation also happens at compile time.
1596You can say
1597.nf
1598
1599.ne 2
1600 \'Now is the time for all\' . "\|\e\|n" .
1601 \'good men to come to.\'
1602
1603.fi
1604and this all reduces to one string internally.
1605.PP
1606The autoincrement operator has a little extra built-in magic to it.
1607If you increment a variable that is numeric, or that has ever been used in
1608a numeric context, you get a normal increment.
1609If, however, the variable has only been used in string contexts since it
1610was set, and has a value that is not null and matches the
1611pattern /^[a\-zA\-Z]*[0\-9]*$/, the increment is done
1612as a string, preserving each character within its range, with carry:
1613.nf
1614
1615 print ++($foo = \'99\'); # prints \*(L'100\*(R'
1616 print ++($foo = \'a0\'); # prints \*(L'a1\*(R'
1617 print ++($foo = \'Az\'); # prints \*(L'Ba\*(R'
1618 print ++($foo = \'zz\'); # prints \*(L'aaa\*(R'
1619
1620.fi
1621The autodecrement is not magical.
1622.PP
1623The range operator (in an array context) makes use of the magical
1624autoincrement algorithm if the minimum and maximum are strings.
1625You can say
1626
1627 @alphabet = (\'A\' .. \'Z\');
1628
1629to get all the letters of the alphabet, or
1630
1631 $hexdigit = (0 .. 9, \'a\' .. \'f\')[$num & 15];
1632
1633to get a hexadecimal digit, or
1634
1635 @z2 = (\'01\' .. \'31\'); print @z2[$mday];
1636
1637to get dates with leading zeros.
1638(If the final value specified is not in the sequence that the magical increment
1639would produce, the sequence goes until the next value would be longer than
1640the final value specified.)
1641.PP
1642The || and && operators differ from C's in that, rather than returning 0 or 1,
1643they return the last value evaluated.
1644Thus, a portable way to find out the home directory might be:
1645.nf
1646
1647 $home = $ENV{'HOME'} || $ENV{'LOGDIR'} ||
1648 (getpwuid($<))[7] || die "You're homeless!\en";
1649
1650.fi
1651.PP
1652Along with the literals and variables mentioned earlier,
1653the operations in the following section can serve as terms in an expression.
1654Some of these operations take a LIST as an argument.
1655Such a list can consist of any combination of scalar arguments or array values;
1656the array values will be included in the list as if each individual element were
1657interpolated at that point in the list, forming a longer single-dimensional
1658array value.
1659Elements of the LIST should be separated by commas.
1660If an operation is listed both with and without parentheses around its
1661arguments, it means you can either use it as a unary operator or
1662as a function call.
1663To use it as a function call, the next token on the same line must
1664be a left parenthesis.
1665(There may be intervening white space.)
1666Such a function then has highest precedence, as you would expect from
1667a function.
1668If any token other than a left parenthesis follows, then it is a
1669unary operator, with a precedence depending only on whether it is a LIST
1670operator or not.
1671LIST operators have lowest precedence.
1672All other unary operators have a precedence greater than relational operators
1673but less than arithmetic operators.
1674See the section on Precedence.
1675.PP
1676For operators that can be used in either a scalar or array context,
1677failure is generally indicated in a scalar context by returning
1678the undefined value, and in an array context by returning the null list.
1679Remember though that
1680THERE IS NO GENERAL RULE FOR CONVERTING A LIST INTO A SCALAR.
1681Each operator decides which sort of scalar it would be most
1682appropriate to return.
1683Some operators return the length of the list
1684that would have been returned in an array context.
1685Some operators return the first value in the list.
1686Some operators return the last value in the list.
1687Some operators return a count of successful operations.
1688In general, they do what you want, unless you want consistency.
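.PP
A few sketches of how differently a scalar can be produced (the array
contents are made up):
.nf

.ne 4
	@ary = (10, 20, 30);
	$len = @ary;	# 3: an array yields its length
	$last = (10, 20, 30);	# 30: a list literal yields its final element
	$hits = grep(/0$/, @ary);	# 3: grep yields a count of matches

.fi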
1689.Ip "/PATTERN/" 8 4
1690See m/PATTERN/.
1691.Ip "?PATTERN?" 8 4
1692This is just like the /pattern/ search, except that it matches only once between
1693calls to the
1694.I reset
1695operator.
1696This is a useful optimization when you only want to see the first occurrence of
1697something in each file of a set of files, for instance.
1698Only ?? patterns local to the current package are reset.
1699.Ip "accept(NEWSOCKET,GENERICSOCKET)" 8 2
1700Does the same thing that the accept system call does.
1701Returns true if it succeeded, false otherwise.
1702See example in section on Interprocess Communication.
1703.Ip "alarm(SECONDS)" 8 4
1704.Ip "alarm SECONDS" 8
1705Arranges to have a SIGALRM delivered to this process after the specified number
1706of seconds (minus 1, actually) have elapsed. Thus, alarm(15) will cause
1707a SIGALRM at some point more than 14 seconds in the future.
1708Only one timer may be counting at once. Each call disables the previous
1709timer, and an argument of 0 may be supplied to cancel the previous timer
1710without starting a new one.
1711The returned value is the amount of time remaining on the previous timer.
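.Sp
A sketch of a timed prompt, assuming a handler subroutine named timeout
installed for SIGALRM:
.nf

.ne 6
	sub timeout { die "No answer within 15 seconds\en"; }

	$SIG{\'ALRM\'} = \'timeout\';
	alarm(15);	# start the timer
	$answer = <STDIN>;
	alarm(0);	# input arrived, cancel the timer

.fi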
1712.Ip "atan2(Y,X)" 8 2
1713Returns the arctangent of Y/X in the range
1714.if t \-\(*p to \(*p.
1715.if n \-PI to PI.
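.Sp
One common use, shown here as a sketch, is to compute pi:
.nf

	$pi = 4 * atan2(1, 1);	# atan2(1,1) is pi/4

.fi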
1716.Ip "bind(SOCKET,NAME)" 8 2
1717Does the same thing that the bind system call does.
1718Returns true if it succeeded, false otherwise.
1719NAME should be a packed address of the proper type for the socket.
1720See example in section on Interprocess Communication.
1721.Ip "binmode(FILEHANDLE)" 8 4
1722.Ip "binmode FILEHANDLE" 8 4
1723Arranges for the file to be read in \*(L"binary\*(R" mode in operating systems
1724that distinguish between binary and text files.
1725Files that are not read in binary mode have CR LF sequences translated
1726to LF on input and LF translated to CR LF on output.
1727Binmode has no effect under Unix.
1728If FILEHANDLE is an expression, the value is taken as the name of
1729the filehandle.
1730.Ip "caller(EXPR)"
1731.Ip "caller"
1732Returns the context of the current subroutine call:
1733.nf
1734
1735 ($package,$filename,$line) = caller;
1736
1737.fi
1738With EXPR, returns some extra information that the debugger uses to print
1739a stack trace. The value of EXPR indicates how many call frames to go
1740back before the current one.
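.Sp
A sketch of a simple stack trace built on this.
The subroutine name trace is made up, and the fourth return value is
assumed to be the name of the called subroutine:
.nf

.ne 7
	sub trace {
	    local($i, $pack, $file, $line, $sub);
	    $i = 0;
	    while (($pack, $file, $line, $sub) = caller($i++)) {
		print "$sub called from $file line $line\en";
	    }
	}

.fi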
1741.Ip "chdir(EXPR)" 8 2
1742.Ip "chdir EXPR" 8 2
1743Changes the working directory to EXPR, if possible.
1744If EXPR is omitted, changes to home directory.
1745Returns 1 upon success, 0 otherwise.
1746See example under
1747.IR die .
1748.Ip "chmod(LIST)" 8 2
1749.Ip "chmod LIST" 8 2
1750Changes the permissions of a list of files.
1751The first element of the list must be the numerical mode.
1752Returns the number of files successfully changed.
1753.nf
1754
1755.ne 2
1756 $cnt = chmod 0755, \'foo\', \'bar\';
1757 chmod 0755, @executables;
1758
1759.fi
1760.Ip "chop(LIST)" 8 7
1761.Ip "chop(VARIABLE)" 8
1762.Ip "chop VARIABLE" 8
1763.Ip "chop" 8
1764Chops off the last character of a string and returns the character chopped.
1765It's used primarily to remove the newline from the end of an input record,
1766but is much more efficient than s/\en// because it neither scans nor copies
1767the string.
1768If VARIABLE is omitted, chops $_.
1769Example:
1770.nf
1771
1772.ne 5
1773 while (<>) {
1774 chop; # avoid \en on last field
1775 @array = split(/:/);
1776 .\|.\|.
1777 }
1778
1779.fi
1780You can actually chop anything that's an lvalue, including an assignment:
1781.nf
1782
1783 chop($cwd = \`pwd\`);
1784 chop($answer = <STDIN>);
1785
1786.fi
1787If you chop a list, each element is chopped.
1788Only the value of the last chop is returned.
1789.Ip "chown(LIST)" 8 2
1790.Ip "chown LIST" 8 2
1791Changes the owner (and group) of a list of files.
1792The first two elements of the list must be the NUMERICAL uid and gid,
1793in that order.
1794Returns the number of files successfully changed.
1795.nf
1796
1797.ne 2
1798 $cnt = chown $uid, $gid, \'foo\', \'bar\';
1799 chown $uid, $gid, @filenames;
1800
1801.fi
1802.ne 23
1803Here's an example that looks up non-numeric uids in the passwd file:
1804.nf
1805
1806 print "User: ";
1807 $user = <STDIN>;
1808 chop($user);
 print "Files: ";
1810 $pattern = <STDIN>;
1811 chop($pattern);
1812.ie t \{\
1813 open(pass, \'/etc/passwd\') || die "Can't open passwd: $!\en";
1814'br\}
1815.el \{\
1816 open(pass, \'/etc/passwd\')
1817 || die "Can't open passwd: $!\en";
1818'br\}
1819 while (<pass>) {
1820 ($login,$pass,$uid,$gid) = split(/:/);
1821 $uid{$login} = $uid;
1822 $gid{$login} = $gid;
1823 }
1824 @ary = <${pattern}>; # get filenames
1825 if ($uid{$user} eq \'\') {
1826 die "$user not in passwd file";
1827 }
1828 else {
1829 chown $uid{$user}, $gid{$user}, @ary;
1830 }
1831
1832.fi
1833.Ip "chroot(FILENAME)" 8 5
1834.Ip "chroot FILENAME" 8
1835Does the same as the system call of that name.
1836If you don't know what it does, don't worry about it.
1837If FILENAME is omitted, does chroot to $_.
1838.Ip "close(FILEHANDLE)" 8 5
1839.Ip "close FILEHANDLE" 8
1840Closes the file or pipe associated with the file handle.
1841You don't have to close FILEHANDLE if you are immediately going to
1842do another open on it, since open will close it for you.
1843(See
1844.IR open .)
1845However, an explicit close on an input file resets the line counter ($.), while
1846the implicit close done by
1847.I open
1848does not.
1849Also, closing a pipe will wait for the process executing on the pipe to complete,
1850in case you want to look at the output of the pipe afterwards.
1851Closing a pipe explicitly also puts the status value of the command into $?.
1852Example:
1853.nf
1854
1855.ne 4
1856 open(OUTPUT, \'|sort >foo\'); # pipe to sort
1857 .\|.\|. # print stuff to output
1858 close OUTPUT; # wait for sort to finish
1859 open(INPUT, \'foo\'); # get sort's results
1860
1861.fi
1862FILEHANDLE may be an expression whose value gives the real filehandle name.
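.Sp
A sketch of checking a command's status after an explicit close (the
command and the filehandle name WHO are assumptions for the example):
.nf

.ne 4
	open(WHO, \'who |\') || die "Can't run who: $!\en";
	while (<WHO>) { print; }
	close(WHO);	# waits for who and sets $?
	print "who exited with status ", $? >> 8, "\en" if $?;

.fi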
1863.Ip "closedir(DIRHANDLE)" 8 5
1864.Ip "closedir DIRHANDLE" 8
1865Closes a directory opened by opendir().
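.Sp
A sketch of a complete directory scan (the handle name DIR is arbitrary):
.nf

.ne 4
	opendir(DIR, \'.\') || die "Can't open current directory: $!\en";
	@names = grep(!/^\e.\e.?$/, readdir(DIR));	# drop . and ..
	closedir(DIR);
	print join("\en", sort @names), "\en";

.fi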
1866.Ip "connect(SOCKET,NAME)" 8 2
1867Does the same thing that the connect system call does.
1868Returns true if it succeeded, false otherwise.
NAME should be a packed address of the proper type for the socket.
1870See example in section on Interprocess Communication.
1871.Ip "cos(EXPR)" 8 6
1872.Ip "cos EXPR" 8 6
1873Returns the cosine of EXPR (expressed in radians).
1874If EXPR is omitted takes cosine of $_.
1875.Ip "crypt(PLAINTEXT,SALT)" 8 6
1876Encrypts a string exactly like the crypt() function in the C library.
1877Useful for checking the password file for lousy passwords.
1878Only the guys wearing white hats should do this.
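.Sp
A minimal sketch of such a check, assuming $user and $guess are set elsewhere
and that the encrypted passwords are readable through getpwnam:
.nf

.ne 6
	($name, $passwd) = getpwnam($user);
	# the stored string doubles as the salt; crypt looks at its first two characters
	if ($passwd ne '' && crypt($guess, $passwd) eq $passwd) {
	    print "$user has a lousy password\en";
	}

.fi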
1879.Ip "dbmclose(ASSOC_ARRAY)" 8 6
1880.Ip "dbmclose ASSOC_ARRAY" 8
1881Breaks the binding between a dbm file and an associative array.
1882The values remaining in the associative array are meaningless unless
1883you happen to want to know what was in the cache for the dbm file.
1884This function is only useful if you have ndbm.
1885.Ip "dbmopen(ASSOC,DBNAME,MODE)" 8 6
1886This binds a dbm or ndbm file to an associative array.
1887ASSOC is the name of the associative array.
1888(Unlike normal open, the first argument is NOT a filehandle, even though
1889it looks like one).
1890DBNAME is the name of the database (without the .dir or .pag extension).
1891If the database does not exist, it is created with protection specified
1892by MODE (as modified by the umask).
1893If your system only supports the older dbm functions, you may perform only one
1894dbmopen in your program.
1895If your system has neither dbm nor ndbm, calling dbmopen produces a fatal
1896error.
1897.Sp
1898Values assigned to the associative array prior to the dbmopen are lost.
1899A certain number of values from the dbm file are cached in memory.
1900By default this number is 64, but you can increase it by preallocating
1901that number of garbage entries in the associative array before the dbmopen.
1902You can flush the cache if necessary with the reset command.
1903.Sp
1904If you don't have write access to the dbm file, you can only read
1905associative array variables, not set them.
1906If you want to test whether you can write, either use file tests or
1907try setting a dummy array entry inside an eval, which will trap the error.
1908.Sp
1909Note that functions such as keys() and values() may return huge array values
1910when used on large dbm files.
1911You may prefer to use the each() function to iterate over large dbm files.
1912Example:
1913.nf
1914
1915.ne 6
1916 # print out history file offsets
1917 dbmopen(HIST,'/usr/lib/news/history',0666);
1918 while (($key,$val) = each %HIST) {
1919 print $key, ' = ', unpack('L',$val), "\en";
1920 }
1921 dbmclose(HIST);
1922
1923.fi
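.Sp
The write test suggested above might look roughly like this.
The database name /tmp/scratch and the key are hypothetical:
.nf

.ne 5
	dbmopen(SCRATCH, '/tmp/scratch', 0666);
	# a failed store is trapped by the eval instead of killing the script
	eval '$SCRATCH{"dummy"} = 1; delete $SCRATCH{"dummy"};';
	print "can't write to /tmp/scratch: $@" if $@;
	dbmclose(SCRATCH);

.fi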
1924.Ip "defined(EXPR)" 8 6
1925.Ip "defined EXPR" 8
1926Returns a boolean value saying whether the lvalue EXPR has a real value
1927or not.
1928Many operations return the undefined value under exceptional conditions,
1929such as end of file, uninitialized variable, system error and such.
1930This function allows you to distinguish between an undefined null string
1931and a defined null string with operations that might return a real null
1932string, in particular referencing elements of an array.
1933You may also check to see if arrays or subroutines exist.
1934Use on predefined variables is not guaranteed to produce intuitive results.
1935Examples:
1936.nf
1937
1938.ne 7
1939 print if defined $switch{'D'};
1940 print "$val\en" while defined($val = pop(@ary));
1941 die "Can't readlink $sym: $!"
1942 unless defined($value = readlink $sym);
1943 eval '@foo = ()' if defined(@foo);
1944 die "No XYZ package defined" unless defined %_XYZ;
1945 sub foo { defined &$bar ? &$bar(@_) : die "No bar"; }
1946
1947.fi
1948See also undef.
1949.Ip "delete $ASSOC{KEY}" 8 6
1950Deletes the specified value from the specified associative array.
1951Returns the deleted value, or the undefined value if nothing was deleted.
1952Deleting from $ENV{} modifies the environment.
1953Deleting from an array bound to a dbm file deletes the entry from the dbm
1954file.
1955.Sp
1956The following deletes all the values of an associative array:
1957.nf
1958
1959.ne 3
1960 foreach $key (keys %ARRAY) {
1961 delete $ARRAY{$key};
1962 }
1963
1964.fi
1965(But it would be faster to use the
1966.I reset
1967command.
1968Saying undef %ARRAY is faster yet.)
1969.Ip "die(LIST)" 8
1970.Ip "die LIST" 8
1971Outside of an eval, prints the value of LIST to
1972.I STDERR
1973and exits with the current value of $!
1974(errno).
1975If $! is 0, exits with the value of ($? >> 8) (\`command\` status).
1976If ($? >> 8) is 0, exits with 255.
1977Inside an eval, the error message is stuffed into $@ and the eval is terminated
1978with the undefined value.
1979.Sp
1980Equivalent examples:
1981.nf
1982
1983.ne 3
1984.ie t \{\
1985 die "Can't cd to spool: $!\en" unless chdir \'/usr/spool/news\';
1986'br\}
1987.el \{\
1988 die "Can't cd to spool: $!\en"
1989 unless chdir \'/usr/spool/news\';
1990'br\}
1991
1992 chdir \'/usr/spool/news\' || die "Can't cd to spool: $!\en"
1993
1994.fi
1995.Sp
If the value of LIST does not end in a newline, the current script line
1997number and input line number (if any) are also printed, and a newline is
1998supplied.
1999Hint: sometimes appending \*(L", stopped\*(R" to your message will cause it to make
2000better sense when the string \*(L"at foo line 123\*(R" is appended.
2001Suppose you are running script \*(L"canasta\*(R".
2002.nf
2003
2004.ne 7
2005 die "/etc/games is no good";
2006 die "/etc/games is no good, stopped";
2007
2008produce, respectively
2009
2010 /etc/games is no good at canasta line 123.
2011 /etc/games is no good, stopped at canasta line 123.
2012
2013.fi
2014See also
2015.IR exit .
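.Sp
To illustrate the eval case, here is a sketch in which the spool check is only
a stand-in for real work:
.nf

.ne 5
	eval {
	    die "no spool directory\en" unless -d '/usr/spool/news';
	};
	print "trapped: $@" if $@;	# the script itself keeps running

.fi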
2016.Ip "do BLOCK" 8 4
2017Returns the value of the last command in the sequence of commands indicated
2018by BLOCK.
2019When modified by a loop modifier, executes the BLOCK once before testing the
2020loop condition.
2021(On other statements the loop modifiers test the conditional first.)
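.Sp
A small sketch of the difference (the variable $i is arbitrary):
.nf

.ne 8
	$i = 10;
	do {
	    print "in the block, i = $i\en";	# runs once
	    $i++;
	} until $i > 3;

	print "never printed\en" until $i > 3;	# condition tested first

.fi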
2022.Ip "do SUBROUTINE (LIST)" 8 3
2023Executes a SUBROUTINE declared by a
2024.I sub
2025declaration, and returns the value
2026of the last expression evaluated in SUBROUTINE.
2027If there is no subroutine by that name, produces a fatal error.
2028(You may use the \*(L"defined\*(R" operator to determine if a subroutine
2029exists.)
2030If you pass arrays as part of LIST you may wish to pass the length
2031of the array in front of each array.
2032(See the section on subroutines later on.)
2033The parentheses are required to avoid confusion with the \*(L"do EXPR\*(R"
2034form.
2035.Sp
2036SUBROUTINE may also be a single scalar variable, in which case
2037the name of the subroutine to execute is taken from the variable.
2038.Sp
2039As an alternate (and preferred) form,
2040you may call a subroutine by prefixing the name with
2041an ampersand: &foo(@args).
2042If you aren't passing any arguments, you don't have to use parentheses.
2043If you omit the parentheses, no @_ array is passed to the subroutine.
2044The & form is also used to specify subroutines to the defined and undef
2045operators:
2046.nf
2047
2048 if (defined &$var) { &$var($parm); undef &$var; }
2049
2050.fi
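.Sp
One way to pass an array length in front of an array, sketched with made-up
names (show2, @a and @b are not part of perl):
.nf

.ne 11
	sub show2 {
	    local($n, @rest) = @_;	# $n is the length of the first array
	    local(@first) = splice(@rest, 0, $n);
	    print "first: ", join(' ', @first), "\en";
	    print "second: ", join(' ', @rest), "\en";
	}

	@a = (1, 2, 3);
	@b = ('x', 'y');
	do show2($#a + 1, @a, @b);	# assumes $[ is 0

.fi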
2051.Ip "do EXPR" 8 3
2052Uses the value of EXPR as a filename and executes the contents of the file
2053as a
2054.I perl
2055script.
2056Its primary use is to include subroutines from a
2057.I perl
2058subroutine library.
2059.nf
2060
2061 do \'stat.pl\';
2062
2063is just like
2064
2065 eval \`cat stat.pl\`;
2066
2067.fi
2068except that it's more efficient, more concise, keeps track of the current
2069filename for error messages, and searches all the
2070.B \-I
2071libraries if the file
2072isn't in the current directory (see also the @INC array in Predefined Names).
2073It's the same, however, in that it does reparse the file every time you
2074call it, so if you are going to use the file inside a loop you might prefer
2075to use \-P and #include, at the expense of a little more startup time.
2076(The main problem with #include is that cpp doesn't grok # comments\*(--a
2077workaround is to use \*(L";#\*(R" for standalone comments.)
2078Note that the following are NOT equivalent:
2079.nf
2080
2081.ne 2
2082 do $foo; # eval a file
2083 do $foo(); # call a subroutine
2084
2085.fi
2086Note that inclusion of library routines is better done with
2087the \*(L"require\*(R" operator.
2088.Ip "dump LABEL" 8 6
2089This causes an immediate core dump.
2090Primarily this is so that you can use the undump program to turn your
2091core dump into an executable binary after having initialized all your
2092variables at the beginning of the program.
When the new binary is executed it will begin by executing a \*(L"goto LABEL\*(R"
2094(with all the restrictions that goto suffers).
2095Think of it as a goto with an intervening core dump and reincarnation.
2096If LABEL is omitted, restarts the program from the top.
2097WARNING: any files opened at the time of the dump will NOT be open any more
2098when the program is reincarnated, with possible resulting confusion on the part
2099of perl.
2100See also \-u.
2101.Sp
2102Example:
2103.nf
2104
2105.ne 16
2106 #!/usr/bin/perl
2107 require 'getopt.pl';
2108 require 'stat.pl';
2109 %days = (
2110 'Sun',1,
2111 'Mon',2,
2112 'Tue',3,
2113 'Wed',4,
2114 'Thu',5,
2115 'Fri',6,
2116 'Sat',7);
2117
2118 dump QUICKSTART if $ARGV[0] eq '-d';
2119
2120 QUICKSTART:
2121 do Getopt('f');
2122
2123.fi
2124.Ip "each(ASSOC_ARRAY)" 8 6
2125.Ip "each ASSOC_ARRAY" 8
2126Returns a 2 element array consisting of the key and value for the next
2127value of an associative array, so that you can iterate over it.
2128Entries are returned in an apparently random order.
2129When the array is entirely read, a null array is returned (which when
2130assigned produces a FALSE (0) value).
2131The next call to each() after that will start iterating again.
2132The iterator can be reset only by reading all the elements from the array.
2133You must not modify the array while iterating over it.
2134There is a single iterator for each associative array, shared by all
2135each(), keys() and values() function calls in the program.
2136The following prints out your environment like the printenv program, only
2137in a different order:
2138.nf
2139
2140.ne 3
2141 while (($key,$value) = each %ENV) {
2142 print "$key=$value\en";
2143 }
2144
2145.fi
2146See also keys() and values().
2147.Ip "eof(FILEHANDLE)" 8 8
2148.Ip "eof()" 8
2149.Ip "eof" 8
2150Returns 1 if the next read on FILEHANDLE will return end of file, or if
2151FILEHANDLE is not open.
2152FILEHANDLE may be an expression whose value gives the real filehandle name.
2153(Note that this function actually reads a character and then ungetc's it,
2154so it is not very useful in an interactive context.)
2155An eof without an argument returns the eof status for the last file read.
2156Empty parentheses () may be used to indicate the pseudo file formed of the
2157files listed on the command line, i.e. eof() is reasonable to use inside
2158a while (<>) loop to detect the end of only the last file.
2159Use eof(ARGV) or eof without the parentheses to test EACH file in a while (<>) loop.
2160Examples:
2161.nf
2162
2163.ne 7
2164 # insert dashes just before last line of last file
2165 while (<>) {
2166 if (eof()) {
2167 print "\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\en";
2168 }
2169 print;
2170 }
2171
2172.ne 7
2173 # reset line numbering on each input file
2174 while (<>) {
2175 print "$.\et$_";
2176 if (eof) { # Not eof().
2177 close(ARGV);
2178 }
2179 }
2180
2181.fi
2182.Ip "eval(EXPR)" 8 6
2183.Ip "eval EXPR" 8 6
2184.Ip "eval BLOCK" 8 6
2185EXPR is parsed and executed as if it were a little
2186.I perl
2187program.
2188It is executed in the context of the current
2189.I perl
2190program, so that
2191any variable settings, subroutine or format definitions remain afterwards.
2192The value returned is the value of the last expression evaluated, just
2193as with subroutines.
2194If there is a syntax error or runtime error, or a die statement is
2195executed, an undefined value is returned by
2196eval, and $@ is set to the error message.
2197If there was no error, $@ is guaranteed to be a null string.
2198If EXPR is omitted, evaluates $_.
2199The final semicolon, if any, may be omitted from the expression.
2200.Sp
2201Note that, since eval traps otherwise-fatal errors, it is useful for
2202determining whether a particular feature
2203(such as dbmopen or symlink) is implemented.
2204It is also Perl's exception trapping mechanism, where the die operator is
2205used to raise exceptions.
2206.Sp
2207If the code to be executed doesn't vary, you may use
2208the eval-BLOCK form to trap run-time errors without incurring
2209the penalty of recompiling each time.
2210The error, if any, is still returned in $@.
2211Evaluating a single-quoted string (as EXPR) has the same effect, except that
2212the eval-EXPR form reports syntax errors at run time via $@, whereas the
2213eval-BLOCK form reports syntax errors at compile time. The eval-EXPR form
2214is optimized to eval-BLOCK the first time it succeeds. (Since the replacement
2215side of a substitution is considered a single-quoted string when you
2216use the e modifier, the same optimization occurs there.) Examples:
2217.nf
2218
2219.ne 11
2220 # make divide-by-zero non-fatal
2221 eval { $answer = $a / $b; }; warn $@ if $@;
2222
2223 # optimized to same thing after first use
2224 eval '$answer = $a / $b'; warn $@ if $@;
2225
2226 # a compile-time error
2227 eval { $answer = };
2228
2229 # a run-time error
2230 eval '$answer ='; # sets $@
2231
2232.fi
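.Sp
As a sketch of the feature test mentioned above, the following checks for
symlink without risking a fatal error on systems that lack it (the variable
name is arbitrary):
.nf

.ne 3
	$symlink_exists = (eval 'symlink("","");', $@ eq '');
	print "no symbolic links here\en" unless $symlink_exists;

.fi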
2233.Ip "exec(LIST)" 8 8
2234.Ip "exec LIST" 8 6
2235If there is more than one argument in LIST, or if LIST is an array with
2236more than one value,
2237calls execvp() with the arguments in LIST.
2238If there is only one scalar argument, the argument is checked for shell metacharacters.
2239If there are any, the entire argument is passed to \*(L"/bin/sh \-c\*(R" for parsing.
2240If there are none, the argument is split into words and passed directly to
2241execvp(), which is more efficient.
2242Note: exec (and system) do not flush your output buffer, so you may need to
2243set $| to avoid lost output.
2244Examples:
2245.nf
2246
2247 exec \'/bin/echo\', \'Your arguments are: \', @ARGV;
2248 exec "sort $outfile | uniq";
2249
2250.fi
2251.Sp
2252If you don't really want to execute the first argument, but want to lie
2253to the program you are executing about its own name, you can specify
2254the program you actually want to run by assigning that to a variable and
2255putting the name of the variable in front of the LIST without a comma.
2256(This always forces interpretation of the LIST as a multi-valued list, even
2257if there is only a single scalar in the list.)
2258Example:
2259.nf
2260
2261.ne 2
2262 $shell = '/bin/csh';
2263 exec $shell '-sh'; # pretend it's a login shell
2264
2265.fi
2266.Ip "exit(EXPR)" 8 6
2267.Ip "exit EXPR" 8
2268Evaluates EXPR and exits immediately with that value.
2269Example:
2270.nf
2271
2272.ne 2
2273 $ans = <STDIN>;
2274 exit 0 \|if \|$ans \|=~ \|/\|^[Xx]\|/\|;
2275
2276.fi
2277See also
2278.IR die .
2279If EXPR is omitted, exits with 0 status.
2280.Ip "exp(EXPR)" 8 3
2281.Ip "exp EXPR" 8
2282Returns
2283.I e
2284to the power of EXPR.
2285If EXPR is omitted, gives exp($_).
2286.Ip "fcntl(FILEHANDLE,FUNCTION,SCALAR)" 8 4
2287Implements the fcntl(2) function.
2288You'll probably have to say
2289.nf
2290
2291 require "fcntl.ph"; # probably /usr/local/lib/perl/fcntl.ph
2292
2293.fi
2294first to get the correct function definitions.
2295If fcntl.ph doesn't exist or doesn't have the correct definitions
2296you'll have to roll
2297your own, based on your C header files such as <sys/fcntl.h>.
2298(There is a perl script called h2ph that comes with the perl kit
2299which may help you in this.)
Argument processing and value return work just like ioctl below.
2301Note that fcntl will produce a fatal error if used on a machine that doesn't implement
2302fcntl(2).
2303.Ip "fileno(FILEHANDLE)" 8 4
2304.Ip "fileno FILEHANDLE" 8 4
2305Returns the file descriptor for a filehandle.
2306Useful for constructing bitmaps for select().
2307If FILEHANDLE is an expression, the value is taken as the name of
2308the filehandle.
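.Sp
A sketch of such a bitmap, here waiting up to five seconds for input on STDIN
(the variable names are arbitrary):
.nf

.ne 5
	$rin = '';
	vec($rin, fileno(STDIN), 1) = 1;	# set the bit for STDIN's descriptor
	$nfound = select($rout = $rin, undef, undef, 5);
	print "STDIN is readable\en"
	    if $nfound > 0 && vec($rout, fileno(STDIN), 1);

.fi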
2309.Ip "flock(FILEHANDLE,OPERATION)" 8 4
2310Calls flock(2) on FILEHANDLE.
See the manual page for flock(2) for the definition of OPERATION.
2312Returns true for success, false on failure.
2313Will produce a fatal error if used on a machine that doesn't implement
2314flock(2).
2315Here's a mailbox appender for BSD systems.
2316.nf
2317
2318.ne 20
2319 $LOCK_SH = 1;
2320 $LOCK_EX = 2;
2321 $LOCK_NB = 4;
2322 $LOCK_UN = 8;
2323
2324 sub lock {
2325 flock(MBOX,$LOCK_EX);
2326 # and, in case someone appended
2327 # while we were waiting...
2328 seek(MBOX, 0, 2);
2329 }
2330
2331 sub unlock {
2332 flock(MBOX,$LOCK_UN);
2333 }
2334
2335 open(MBOX, ">>/usr/spool/mail/$ENV{'USER'}")
2336 || die "Can't open mailbox: $!";
2337
2338 do lock();
2339 print MBOX $msg,"\en\en";
2340 do unlock();
2341
2342.fi
2343.Ip "fork" 8 4
2344Does a fork() call.
2345Returns the child pid to the parent process and 0 to the child process.
2346Note: unflushed buffers remain unflushed in both processes, which means
2347you may need to set $| to avoid duplicate output.
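.Sp
A sketch of the usual pattern follows.
It assumes that fork returns the undefined value on failure and that
/bin/echo exists; both are assumptions about your system:
.nf

.ne 13
	$| = 1;				# flush output before forking
	print "parent $$ forking\en";
	if ($pid = fork) {
	    wait;			# parent: reap the child
	    print "child $pid exited with status ", $? >> 8, "\en";
	}
	elsif (defined $pid) {
	    exec '/bin/echo', 'hello from the child';
	    die "Can't exec echo: $!";
	}
	else {
	    die "Can't fork: $!";
	}

.fi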
2348.Ip "getc(FILEHANDLE)" 8 4
2349.Ip "getc FILEHANDLE" 8
2350.Ip "getc" 8
2351Returns the next character from the input file attached to FILEHANDLE, or
2352a null string at EOF.
2353If FILEHANDLE is omitted, reads from STDIN.
2354.Ip "getlogin" 8 3
2355Returns the current login from /etc/utmp, if any.
If the result is null, fall back on getpwuid:
.nf

	$login = getlogin || (getpwuid($<))[0] || "Somebody";

.fi
2360.Ip "getpeername(SOCKET)" 8 3
Returns the packed sockaddr address of the other end of the SOCKET connection.
2362.nf
2363
2364.ne 4
2365 # An internet sockaddr
2366 $sockaddr = 'S n a4 x8';
2367 $hersockaddr = getpeername(S);
2368.ie t \{\
2369 ($family, $port, $heraddr) = unpack($sockaddr,$hersockaddr);
2370'br\}
2371.el \{\
2372 ($family, $port, $heraddr) =
2373 unpack($sockaddr,$hersockaddr);
2374'br\}
2375
2376.fi
2377.Ip "getpgrp(PID)" 8 4
2378.Ip "getpgrp PID" 8
2379Returns the current process group for the specified PID, 0 for the current
2380process.
2381Will produce a fatal error if used on a machine that doesn't implement
2382getpgrp(2).
If PID is omitted, returns the process group of the current process.
2384.Ip "getppid" 8 4
2385Returns the process id of the parent process.
2386.Ip "getpriority(WHICH,WHO)" 8 4
2387Returns the current priority for a process, a process group, or a user.
2388(See getpriority(2).)
2389Will produce a fatal error if used on a machine that doesn't implement
2390getpriority(2).
2391.Ip "getpwnam(NAME)" 8
2392.Ip "getgrnam(NAME)" 8
2393.Ip "gethostbyname(NAME)" 8
2394.Ip "getnetbyname(NAME)" 8
2395.Ip "getprotobyname(NAME)" 8
2396.Ip "getpwuid(UID)" 8
2397.Ip "getgrgid(GID)" 8
2398.Ip "getservbyname(NAME,PROTO)" 8
2399.Ip "gethostbyaddr(ADDR,ADDRTYPE)" 8
2400.Ip "getnetbyaddr(ADDR,ADDRTYPE)" 8
2401.Ip "getprotobynumber(NUMBER)" 8
2402.Ip "getservbyport(PORT,PROTO)" 8
2403.Ip "getpwent" 8
2404.Ip "getgrent" 8
2405.Ip "gethostent" 8
2406.Ip "getnetent" 8
2407.Ip "getprotoent" 8
2408.Ip "getservent" 8
2409.Ip "setpwent" 8
2410.Ip "setgrent" 8
2411.Ip "sethostent(STAYOPEN)" 8
2412.Ip "setnetent(STAYOPEN)" 8
2413.Ip "setprotoent(STAYOPEN)" 8
2414.Ip "setservent(STAYOPEN)" 8
2415.Ip "endpwent" 8
2416.Ip "endgrent" 8
2417.Ip "endhostent" 8
2418.Ip "endnetent" 8
2419.Ip "endprotoent" 8
2420.Ip "endservent" 8
2421These routines perform the same functions as their counterparts in the
2422system library.
2423Within an array context,
2424the return values from the various get routines are as follows:
2425.nf
2426
2427 ($name,$passwd,$uid,$gid,
2428 $quota,$comment,$gcos,$dir,$shell) = getpw.\|.\|.
2429 ($name,$passwd,$gid,$members) = getgr.\|.\|.
2430 ($name,$aliases,$addrtype,$length,@addrs) = gethost.\|.\|.
2431 ($name,$aliases,$addrtype,$net) = getnet.\|.\|.
2432 ($name,$aliases,$proto) = getproto.\|.\|.
2433 ($name,$aliases,$port,$proto) = getserv.\|.\|.
2434
2435.fi
2436(If the entry doesn't exist you get a null list.)
2437.Sp
2438Within a scalar context, you get the name, unless the function was a
2439lookup by name, in which case you get the other thing, whatever it is.
2440(If the entry doesn't exist you get the undefined value.)
2441For example:
2442.nf
2443
2444 $uid = getpwnam
2445 $name = getpwuid
2446 $name = getpwent
2447 $gid = getgrnam
2448 $name = getgrgid
2449 $name = getgrent
2450 etc.
2451
2452.fi
2453The $members value returned by getgr.\|.\|. is a space separated list
2454of the login names of the members of the group.
2455.Sp
2456For the gethost.\|.\|. functions, if the h_errno variable is supported in C,
2457it will be returned to you via $? if the function call fails.
2458The @addrs value returned by a successful call is a list of the
2459raw addresses returned by the corresponding system library call.
2460In the Internet domain, each address is four bytes long and you can unpack
2461it by saying something like:
2462.nf
2463
	($a,$b,$c,$d) = unpack('C4',$addrs[0]);
2465
2466.fi
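.Sp
Putting these pieces together, a sketch that prints every address of one host
in dotted form (the host name localhost is only an example):
.nf

.ne 6
	($name, $aliases, $addrtype, $length, @addrs) =
	    gethostbyname('localhost');
	die "lookup failed\en" unless defined $name;	# null list on failure
	foreach $addr (@addrs) {
	    print join('.', unpack('C4', $addr)), "\en";
	}

.fi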
2467.Ip "getsockname(SOCKET)" 8 3
2468Returns the packed sockaddr address of this end of the SOCKET connection.
2469.nf
2470
2471.ne 4
2472 # An internet sockaddr
2473 $sockaddr = 'S n a4 x8';
2474 $mysockaddr = getsockname(S);
2475.ie t \{\
2476 ($family, $port, $myaddr) = unpack($sockaddr,$mysockaddr);
2477'br\}
2478.el \{\
2479 ($family, $port, $myaddr) =
2480 unpack($sockaddr,$mysockaddr);
2481'br\}
2482
2483.fi
2484.Ip "getsockopt(SOCKET,LEVEL,OPTNAME)" 8 3
2485Returns the socket option requested, or undefined if there is an error.
2486.Ip "gmtime(EXPR)" 8 4
2487.Ip "gmtime EXPR" 8
2488Converts a time as returned by the time function to a 9-element array with
2489the time analyzed for the Greenwich timezone.
2490Typically used as follows:
2491.nf
2492
2493.ne 3
2494.ie t \{\
2495 ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = gmtime(time);
2496'br\}
2497.el \{\
2498 ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) =
2499 gmtime(time);
2500'br\}
2501
2502.fi
2503All array elements are numeric, and come straight out of a struct tm.
2504In particular this means that $mon has the range 0.\|.11 and $wday has the
2505range 0.\|.6.
2506If EXPR is omitted, does gmtime(time).
2507.Ip "goto LABEL" 8 6
2508Finds the statement labeled with LABEL and resumes execution there.
2509Currently you may only go to statements in the main body of the program
2510that are not nested inside a do {} construct.
2511This statement is not implemented very efficiently, and is here only to make
2512the
2513.IR sed -to- perl
2514translator easier.
2515I may change its semantics at any time, consistent with support for translated
2516.I sed
2517scripts.
2518Use it at your own risk.
2519Better yet, don't use it at all.
2520.Ip "grep(EXPR,LIST)" 8 4
2521Evaluates EXPR for each element of LIST (locally setting $_ to each element)
2522and returns the array value consisting of those elements for which the
2523expression evaluated to true.
2524In a scalar context, returns the number of times the expression was true.
2525.nf
2526
2527 @foo = grep(!/^#/, @bar); # weed out comments
2528
2529.fi
2530Note that, since $_ is a reference into the array value, it can be
2531used to modify the elements of the array.
2532While this is useful and supported, it can cause bizarre results if
2533the LIST is not a named array.
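.Sp
Both uses can be sketched as follows, with @lines standing for any named array:
.nf

.ne 4
	@lines = ("# comment\en", "code  \en");
	$comments = grep(/^#/, @lines);	# scalar context: a count
	grep(s/\es+$//, @lines);	# strips trailing white space in place

.fi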
2534.Ip "hex(EXPR)" 8 4
2535.Ip "hex EXPR" 8
Returns the decimal value of EXPR interpreted as a hex string.
2537(To interpret strings that might start with 0 or 0x see oct().)
2538If EXPR is omitted, uses $_.
2539.Ip "index(STR,SUBSTR,POSITION)" 8 4
2540.Ip "index(STR,SUBSTR)" 8 4
2541Returns the position of the first occurrence of SUBSTR in STR at or after
2542POSITION.
2543If POSITION is omitted, starts searching from the beginning of the string.
2544The return value is based at 0, or whatever you've
2545set the $[ variable to.
2546If the substring is not found, returns one less than the base, ordinarily \-1.
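.Sp
To find every occurrence, search again from just past the previous hit.
A sketch, assuming $[ is still 0:
.nf

.ne 6
	$str = 'banana';
	$pos = -1;
	while (($pos = index($str, 'an', $pos + 1)) >= 0) {
	    print "found at $pos\en";	# prints 1, then 3
	}

.fi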
2547.Ip "int(EXPR)" 8 4
2548.Ip "int EXPR" 8
2549Returns the integer portion of EXPR.
2550If EXPR is omitted, uses $_.
2551.Ip "ioctl(FILEHANDLE,FUNCTION,SCALAR)" 8 4
2552Implements the ioctl(2) function.
2553You'll probably have to say
2554.nf
2555
2556 require "ioctl.ph"; # probably /usr/local/lib/perl/ioctl.ph
2557
2558.fi
2559first to get the correct function definitions.
2560If ioctl.ph doesn't exist or doesn't have the correct definitions
2561you'll have to roll
2562your own, based on your C header files such as <sys/ioctl.h>.
2563(There is a perl script called h2ph that comes with the perl kit
2564which may help you in this.)
2565SCALAR will be read and/or written depending on the FUNCTION\*(--a pointer
2566to the string value of SCALAR will be passed as the third argument of
2567the actual ioctl call.
2568(If SCALAR has no string value but does have a numeric value, that value
2569will be passed rather than a pointer to the string value.
2570To guarantee this to be true, add a 0 to the scalar before using it.)
2571The pack() and unpack() functions are useful for manipulating the values
2572of structures used by ioctl().
2573The following example sets the erase character to DEL.
2574.nf
2575
2576.ne 9
2577 require 'ioctl.ph';
2578 $sgttyb_t = "ccccs"; # 4 chars and a short
2579 if (ioctl(STDIN,$TIOCGETP,$sgttyb)) {
2580 @ary = unpack($sgttyb_t,$sgttyb);
2581 $ary[2] = 127;
2582 $sgttyb = pack($sgttyb_t,@ary);
2583 ioctl(STDIN,$TIOCSETP,$sgttyb)
2584 || die "Can't ioctl: $!";
2585 }
2586
2587.fi
2588The return value of ioctl (and fcntl) is as follows:
2589.nf
2590
2591.ne 4
2592 if OS returns:\h'|3i'perl returns:
2593 -1\h'|3i' undefined value
2594 0\h'|3i' string "0 but true"
2595 anything else\h'|3i' that number
2596
2597.fi
2598Thus perl returns true on success and false on failure, yet you can still
2599easily determine the actual value returned by the operating system:
2600.nf
2601
2602 ($retval = ioctl(...)) || ($retval = -1);
2603 printf "System returned %d\en", $retval;
2604.fi
2605.Ip "join(EXPR,LIST)" 8 8
2606.Ip "join(EXPR,ARRAY)" 8
2607Joins the separate strings of LIST or ARRAY into a single string with fields
2608separated by the value of EXPR, and returns the string.
2609Example:
2610.nf
2611
2612.ie t \{\
2613 $_ = join(\|\':\', $login,$passwd,$uid,$gid,$gcos,$home,$shell);
2614'br\}
2615.el \{\
2616 $_ = join(\|\':\',
2617 $login,$passwd,$uid,$gid,$gcos,$home,$shell);
2618'br\}
2619
2620.fi
2621See
2622.IR split .
2623.Ip "keys(ASSOC_ARRAY)" 8 6
2624.Ip "keys ASSOC_ARRAY" 8
2625Returns a normal array consisting of all the keys of the named associative
2626array.
2627The keys are returned in an apparently random order, but it is the same order
2628as either the values() or each() function produces (given that the associative array
2629has not been modified).
2630Here is yet another way to print your environment:
2631.nf
2632
2633.ne 5
2634 @keys = keys %ENV;
2635 @values = values %ENV;
2636 while ($#keys >= 0) {
2637 print pop(@keys), \'=\', pop(@values), "\en";
2638 }
2639
2640or how about sorted by key:
2641
2642.ne 3
2643 foreach $key (sort(keys %ENV)) {
2644 print $key, \'=\', $ENV{$key}, "\en";
2645 }
2646
2647.fi
2648.Ip "kill(LIST)" 8 8
2649.Ip "kill LIST" 8 2
2650Sends a signal to a list of processes.
2651The first element of the list must be the signal to send.
2652Returns the number of processes successfully signaled.
2653.nf
2654
2655 $cnt = kill 1, $child1, $child2;
2656 kill 9, @goners;
2657
2658.fi
2659If the signal is negative, kills process groups instead of processes.
2660(On System V, a negative \fIprocess\fR number will also kill process groups,
2661but that's not portable.)
2662You may use a signal name in quotes.
2663.Ip "last LABEL" 8 8
2664.Ip "last" 8
2665The
2666.I last
2667command is like the
2668.I break
2669statement in C (as used in loops); it immediately exits the loop in question.
2670If the LABEL is omitted, the command refers to the innermost enclosing loop.
2671The
2672.I continue
2673block, if any, is not executed:
2674.nf
2675
2676.ne 4
2677 line: while (<STDIN>) {
2678 last line if /\|^$/; # exit when done with header
2679 .\|.\|.
2680 }
2681
2682.fi
2683.Ip "length(EXPR)" 8 4
2684.Ip "length EXPR" 8
2685Returns the length in characters of the value of EXPR.
2686If EXPR is omitted, returns length of $_.
2687.Ip "link(OLDFILE,NEWFILE)" 8 2
2688Creates a new filename linked to the old filename.
2689Returns 1 for success, 0 otherwise.
2690.Ip "listen(SOCKET,QUEUESIZE)" 8 2
2691Does the same thing that the listen system call does.
2692Returns true if it succeeded, false otherwise.
2693See example in section on Interprocess Communication.
2694.Ip "local(LIST)" 8 4
2695Declares the listed variables to be local to the enclosing block,
2696subroutine, eval or \*(L"do\*(R".
2697All the listed elements must be legal lvalues.
2698This operator works by saving the current values of those variables in LIST
2699on a hidden stack and restoring them upon exiting the block, subroutine or eval.
2700This means that called subroutines can also reference the local variable,
2701but not the global one.
2702The LIST may be assigned to if desired, which allows you to initialize
2703your local variables.
2704(If no initializer is given for a particular variable, it is created with
2705an undefined value.)
2706Commonly this is used to name the parameters to a subroutine.
2707Examples:
2708.nf
2709
2710.ne 13
2711 sub RANGEVAL {
2712 local($min, $max, $thunk) = @_;
2713 local($result) = \'\';
2714 local($i);
2715
2716 # Presumably $thunk makes reference to $i
2717
2718 for ($i = $min; $i < $max; $i++) {
2719 $result .= eval $thunk;
2720 }
2721
2722 $result;
2723 }
2724
2725.ne 6
2726 if ($sw eq \'-v\') {
2727 # init local array with global array
2728 local(@ARGV) = @ARGV;
2729 unshift(@ARGV,\'echo\');
2730 system @ARGV;
2731 }
2732 # @ARGV restored
2733
2734.ne 6
2735 # temporarily add to digits associative array
2736 if ($base12) {
2737 # (NOTE: not claiming this is efficient!)
2738 local(%digits) = (%digits,'t',10,'e',11);
2739 do parse_num();
2740 }
2741
2742.fi
2743Note that local() is a run-time command, and so gets executed every time
2744through a loop, using up more stack storage each time until it's all
2745released at once when the loop is exited.
2746.Ip "localtime(EXPR)" 8 4
2747.Ip "localtime EXPR" 8
2748Converts a time as returned by the time function to a 9-element array with
2749the time analyzed for the local timezone.
2750Typically used as follows:
2751.nf
2752
2753.ne 3
2754.ie t \{\
2755 ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);
2756'br\}
2757.el \{\
2758 ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) =
2759 localtime(time);
2760'br\}
2761
2762.fi
2763All array elements are numeric, and come straight out of a struct tm.
2764In particular this means that $mon has the range 0.\|.11 and $wday has the
2765range 0.\|.6.
2766If EXPR is omitted, does localtime(time).
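.Sp
Because $mon and $wday count from 0, they can index arrays of names directly.
A sketch, assuming $[ is 0 (the name arrays are defined here, not supplied
by perl):
.nf

.ne 8
	@days = ('Sun','Mon','Tue','Wed','Thu','Fri','Sat');
	@months = ('Jan','Feb','Mar','Apr','May','Jun',
		   'Jul','Aug','Sep','Oct','Nov','Dec');
	($sec,$min,$hour,$mday,$mon,$year,$wday) = localtime(time);
	# struct tm counts years from 1900
	printf "%s %s %d %02d:%02d:%02d %d\en",
	    $days[$wday], $months[$mon], $mday, $hour, $min, $sec, 1900 + $year;

.fi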
2767.Ip "log(EXPR)" 8 4
2768.Ip "log EXPR" 8
2769Returns logarithm (base
2770.IR e )
2771of EXPR.
2772If EXPR is omitted, returns log of $_.
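.Sp
Logarithms to other bases follow from dividing by the log of the base.
A sketch, with log10 as a made-up helper name:
.nf

.ne 3
	sub log10 { log($_[0]) / log(10); }	# log base 10 of the argument
	print &log10(1000), "\en";

.fi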
2773.Ip "lstat(FILEHANDLE)" 8 6
2774.Ip "lstat FILEHANDLE" 8
2775.Ip "lstat(EXPR)" 8
2776.Ip "lstat SCALARVARIABLE" 8
2777Does the same thing as the stat() function, but stats a symbolic link
2778instead of the file the symbolic link points to.
2779If symbolic links are unimplemented on your system, a normal stat is done.
2780.Ip "m/PATTERN/gio" 8 4
2781.Ip "/PATTERN/gio" 8
2782Searches a string for a pattern match, and returns true (1) or false (\'\').
2783If no string is specified via the =~ or !~ operator,
2784the $_ string is searched.
2785(The string specified with =~ need not be an lvalue\*(--it may be the result of an expression evaluation, but remember the =~ binds rather tightly.)
2786See also the section on regular expressions.
2787.Sp
2788If / is the delimiter then the initial \*(L'm\*(R' is optional.
2789With the \*(L'm\*(R' you can use any pair of non-alphanumeric characters
2790as delimiters.
2791This is particularly useful for matching Unix path names that contain \*(L'/\*(R'.
2792If the final delimiter is followed by the optional letter \*(L'i\*(R', the matching is
2793done in a case-insensitive manner.
2794PATTERN may contain references to scalar variables, which will be interpolated
2795(and the pattern recompiled) every time the pattern search is evaluated.
2796(Note that $) and $| may not be interpolated because they look like end-of-string tests.)
2797If you want such a pattern to be compiled only once, add an \*(L"o\*(R" after
2798the trailing delimiter.
2799This avoids expensive run-time recompilations, and
2800is useful when the value you are interpolating won't change over the
2801life of the script.
2802If the PATTERN evaluates to a null string, the most recent successful
2803regular expression is used instead.
2804.Sp
2805If used in a context that requires an array value, a pattern match returns an
2806array consisting of the subexpressions matched by the parentheses in the
2807pattern,
2808i.e. ($1, $2, $3.\|.\|.).
2809It does NOT actually set $1, $2, etc. in this case, nor does it set $+, $`, $&
2810or $'.
2811If the match fails, a null array is returned.
2812If the match succeeds, but there were no parentheses, an array value of (1)
2813is returned.
2814.Sp
2815Examples:
2816.nf
2817
2818.ne 4
2819 open(tty, \'/dev/tty\');
2820 <tty> \|=~ \|/\|^y\|/i \|&& \|do foo(\|); # do foo if desired
2821
2822 if (/Version: \|*\|([0\-9.]*\|)\|/\|) { $version = $1; }
2823
2824 next if m#^/usr/spool/uucp#;
2825
2826.ne 5
2827 # poor man's grep
2828 $arg = shift;
2829 while (<>) {
2830 print if /$arg/o; # compile only once
2831 }
2832
2833 if (($F1, $F2, $Etc) = ($foo =~ /^(\eS+)\es+(\eS+)\es*(.*)/))
2834
2835.fi
2836This last example splits $foo into the first two words and the remainder
2837of the line, and assigns those three fields to $F1, $F2 and $Etc.
2838The conditional is true if any variables were assigned, i.e. if the pattern
2839matched.
2840.Sp
2841The \*(L"g\*(R" modifier specifies global pattern matching\*(--that is,
2842matching as many times as possible within the string. How it behaves
2843depends on the context. In an array context, it returns a list of
2844all the substrings matched by all the parentheses in the regular expression.
2845If there are no parentheses, it returns a list of all the matched strings,
2846as if there were parentheses around the whole pattern. In a scalar context,
2847it iterates through the string, returning TRUE each time it matches, and
2848FALSE when it eventually runs out of matches. (In other words, it remembers
2849where it left off last time and restarts the search at that point.) It
2850presumes that you have not modified the string since the last match.
2851Modifying the string between matches may result in undefined behavior.
2852(You can actually get away with in-place modifications via substr()
2853that do not change the length of the entire string. In general, however,
2854you should be using s///g for such modifications.) Examples:
2855.nf
2856
2857 # array context
2858 ($one,$five,$fifteen) = (\`uptime\` =~ /(\ed+\e.\ed+)/g);
2859
2860 # scalar context
2861 $/ = ""; $* = 1;
2862 while ($paragraph = <>) {
2863 while ($paragraph =~ /[a-z][\'")]*[.!?]+[\'")]*\es/g) {
2864 $sentences++;
2865 }
2866 }
2867 print "$sentences\en";
2868
2869.fi
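.Sp
The three calling styles described above can be put side by side.
This is only an illustrative sketch; the sample string and variable names
are made up, and character classes are used in place of backslash escapes
to keep the example short.
.nf

```perl
$line = 'load: 0.42 1.07 2.50';

# Array context without "g": one element per parenthesized subexpression.
($label, $first) = ($line =~ /^([a-z]+): *([0-9.]+)/);

# Array context with "g": every match of the parenthesized part.
@loads = ($line =~ /([0-9]+[.][0-9]+)/g);

# Scalar context with "g": each evaluation resumes after the previous match.
$count = 0;
$count++ while $line =~ /[0-9]+[.][0-9]+/g;
```

.fi
Here $label and $first receive the two subexpressions, @loads receives the
three numbers, and $count ends up as 3.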
2870.Ip "mkdir(FILENAME,MODE)" 8 3
2871Creates the directory specified by FILENAME, with permissions specified by
2872MODE (as modified by umask).
2873If it succeeds it returns 1, otherwise it returns 0 and sets $! (errno).
2874.Ip "msgctl(ID,CMD,ARG)" 8 4
2875Calls the System V IPC function msgctl. If CMD is &IPC_STAT, then ARG
2876must be a variable which will hold the returned msqid_ds structure.
2877Returns like ioctl: the undefined value for error, "0 but true" for
2878zero, or the actual return value otherwise.
2879.Ip "msgget(KEY,FLAGS)" 8 4
2880Calls the System V IPC function msgget. Returns the message queue id,
2881or the undefined value if there is an error.
2882.Ip "msgsnd(ID,MSG,FLAGS)" 8 4
2883Calls the System V IPC function msgsnd to send the message MSG to the
2884message queue ID. MSG must begin with the long integer message type,
2885which may be created with pack("L", $type). Returns true if
2886successful, or false if there is an error.
2887.Ip "msgrcv(ID,VAR,SIZE,TYPE,FLAGS)" 8 4
2888Calls the System V IPC function msgrcv to receive a message from
2889message queue ID into variable VAR with a maximum message size of
2890SIZE. Note that if a message is received, the message type will be
2891the first thing in VAR, and the maximum length of VAR is SIZE plus the
2892size of the message type. Returns true if successful, or false if
2893there is an error.
2894.Ip "next LABEL" 8 8
2895.Ip "next" 8
2896The
2897.I next
2898command is like the
2899.I continue
2900statement in C; it starts the next iteration of the loop:
2901.nf
2902
2903.ne 4
2904 line: while (<STDIN>) {
2905 next line if /\|^#/; # discard comments
2906 .\|.\|.
2907 }
2908
2909.fi
2910Note that if there were a
2911.I continue
2912block on the above, it would get executed even on discarded lines.
2913If the LABEL is omitted, the command refers to the innermost enclosing loop.
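.Sp
The interaction with a
.I continue
block matters most when the block advances the loop.
In the sketch below (the array and counter names are hypothetical), the
index is incremented in the
.I continue
block, so
.I next
cannot skip the increment:
.nf

```perl
@lines = ('# header', 'alpha', '# note', 'beta');
$i = $seen = $kept = 0;

line: while ($i <= $#lines) {
    $_ = $lines[$i];
    next line if /^#/;      # discard comments
    $kept++;
}
continue {
    # Runs after every iteration, including those cut short by "next",
    # so the index still advances past the discarded lines.
    $i++;
    $seen++;
}
```

.fi
Had the increment been placed at the bottom of the loop body instead, the
first comment line would have been retried forever.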
2914.Ip "oct(EXPR)" 8 4
2915.Ip "oct EXPR" 8
2916Returns the decimal value of EXPR interpreted as an octal string.
2917(If EXPR happens to start off with 0x, interprets it as a hex string instead.)
2918The following will handle decimal, octal and hex in the standard notation:
2919.nf
2920
2921 $val = oct($val) if $val =~ /^0/;
2922
2923.fi
2924If EXPR is omitted, uses $_.
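.Sp
The one-line idiom above can be wrapped in a subroutine.
A minimal sketch, with the hypothetical name tonum:
.nf

```perl
# Hypothetical helper: accepts '42', '0755' or '0x1f' style strings.
sub tonum {
    local($val) = @_;
    $val = oct($val) if $val =~ /^0/;
    $val + 0;
}

$dec  = &tonum('42');      # 42
$perm = &tonum('0755');    # 493
$mask = &tonum('0x1f');    # 31
```

.fi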
2925.Ip "open(FILEHANDLE,EXPR)" 8 8
2926.Ip "open(FILEHANDLE)" 8
2927.Ip "open FILEHANDLE" 8
2928Opens the file whose filename is given by EXPR, and associates it with
2929FILEHANDLE.
2930If FILEHANDLE is an expression, its value is used as the name of the
2931real filehandle wanted.
2932If EXPR is omitted, the scalar variable of the same name as the FILEHANDLE
2933contains the filename.
2934If the filename begins with \*(L"<\*(R" or nothing, the file is opened for
2935input.
2936If the filename begins with \*(L">\*(R", the file is opened for output.
2937If the filename begins with \*(L">>\*(R", the file is opened for appending.
2938(You can put a \'+\' in front of the \'>\' or \'<\' to indicate that you
2939want both read and write access to the file.)
2940If the filename begins with \*(L"|\*(R", the filename is interpreted
2941as a command to which output is to be piped, and if the filename ends
with a \*(L"|\*(R", the filename is interpreted as a command which pipes
2943input to us.
2944(You may not have a command that pipes both in and out.)
2945Opening \'\-\' opens
2946.I STDIN
2947and opening \'>\-\' opens
2948.IR STDOUT .
2949Open returns non-zero upon success, the undefined value otherwise.
2950If the open involved a pipe, the return value happens to be the pid
2951of the subprocess.
2952Examples:
2953.nf
2954
2955.ne 3
2956 $article = 100;
2957 open article || die "Can't find article $article: $!\en";
2958 while (<article>) {\|.\|.\|.
2959
2960.ie t \{\
2961 open(LOG, \'>>/usr/spool/news/twitlog\'\|); # (log is reserved)
2962'br\}
2963.el \{\
2964 open(LOG, \'>>/usr/spool/news/twitlog\'\|);
2965 # (log is reserved)
2966'br\}
2967
2968.ie t \{\
2969 open(article, "caesar <$article |"\|); # decrypt article
2970'br\}
2971.el \{\
2972 open(article, "caesar <$article |"\|);
2973 # decrypt article
2974'br\}
2975
2976.ie t \{\
2977 open(extract, "|sort >/tmp/Tmp$$"\|); # $$ is our process#
2978'br\}
2979.el \{\
2980 open(extract, "|sort >/tmp/Tmp$$"\|);
2981 # $$ is our process#
2982'br\}
2983
2984.ne 7
2985 # process argument list of files along with any includes
2986
2987 foreach $file (@ARGV) {
2988 do process($file, \'fh00\'); # no pun intended
2989 }
2990
2991 sub process {
2992 local($filename, $input) = @_;
2993 $input++; # this is a string increment
2994 unless (open($input, $filename)) {
2995 print STDERR "Can't open $filename: $!\en";
2996 return;
2997 }
2998.ie t \{\
2999 while (<$input>) { # note the use of indirection
3000'br\}
3001.el \{\
3002 while (<$input>) { # note use of indirection
3003'br\}
3004 if (/^#include "(.*)"/) {
3005 do process($1, $input);
3006 next;
3007 }
3008 .\|.\|. # whatever
3009 }
3010 }
3011
3012.fi
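.Sp
As a further illustration of the mode prefixes, the sketch below creates,
appends to, and then reads back a scratch file.
The file name is hypothetical and assumes a writable /tmp directory.
.nf

```perl
$file = "/tmp/openex$$";        # hypothetical scratch file name

open(OUT, ">$file")  || die "Can't create $file: $!";
print OUT 'first ';
close(OUT);

open(OUT, ">>$file") || die "Can't append to $file: $!";
print OUT 'second';
close(OUT);

open(IN, "<$file")   || die "Can't read $file: $!";
$contents = <IN>;               # 'first second'
close(IN);
unlink($file);
```

.fi
.Sp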
3013You may also, in the Bourne shell tradition, specify an EXPR beginning
3014with \*(L">&\*(R", in which case the rest of the string
3015is interpreted as the name of a filehandle
3016(or file descriptor, if numeric) which is to be duped and opened.
3017You may use & after >, >>, <, +>, +>> and +<.
3018The mode you specify should match the mode of the original filehandle.
3019Here is a script that saves, redirects, and restores
3020.I STDOUT
3021and
3022.IR STDERR :
3023.nf
3024
3025.ne 21
3026 #!/usr/bin/perl
3027 open(SAVEOUT, ">&STDOUT");
3028 open(SAVEERR, ">&STDERR");
3029
3030 open(STDOUT, ">foo.out") || die "Can't redirect stdout";
3031 open(STDERR, ">&STDOUT") || die "Can't dup stdout";
3032
3033 select(STDERR); $| = 1; # make unbuffered
3034 select(STDOUT); $| = 1; # make unbuffered
3035
3036 print STDOUT "stdout 1\en"; # this works for
3037 print STDERR "stderr 1\en"; # subprocesses too
3038
3039 close(STDOUT);
3040 close(STDERR);
3041
3042 open(STDOUT, ">&SAVEOUT");
3043 open(STDERR, ">&SAVEERR");
3044
3045 print STDOUT "stdout 2\en";
3046 print STDERR "stderr 2\en";
3047
3048.fi
3049If you open a pipe on the command \*(L"\-\*(R", i.e. either \*(L"|\-\*(R" or \*(L"\-|\*(R",
3050then there is an implicit fork done, and the return value of open
3051is the pid of the child within the parent process, and 0 within the child
3052process.
3053(Use defined($pid) to determine if the open was successful.)
3054The filehandle behaves normally for the parent, but i/o to that
3055filehandle is piped from/to the
3056.IR STDOUT / STDIN
3057of the child process.
3058In the child process the filehandle isn't opened\*(--i/o happens from/to
3059the new
3060.I STDOUT
3061or
3062.IR STDIN .
3063Typically this is used like the normal piped open when you want to exercise
3064more control over just how the pipe command gets executed, such as when
3065you are running setuid, and don't want to have to scan shell commands
3066for metacharacters.
3067The following pairs are more or less equivalent:
3068.nf
3069
3070.ne 5
3071 open(FOO, "|tr \'[a\-z]\' \'[A\-Z]\'");
3072 open(FOO, "|\-") || exec \'tr\', \'[a\-z]\', \'[A\-Z]\';
3073
3074 open(FOO, "cat \-n '$file'|");
3075 open(FOO, "\-|") || exec \'cat\', \'\-n\', $file;
3076
3077.fi
3078Explicitly closing any piped filehandle causes the parent process to wait for the
3079child to finish, and returns the status value in $?.
3080Note: on any operation which may do a fork,
3081unflushed buffers remain unflushed in both
3082processes, which means you may need to set $| to
3083avoid duplicate output.
3084.Sp
3085The filename that is passed to open will have leading and trailing
3086whitespace deleted.
3087In order to open a file with arbitrary weird characters in it, it's necessary
3088to protect any leading and trailing whitespace thusly:
3089.nf
3090
3091.ne 2
3092 $file =~ s#^(\es)#./$1#;
3093 open(FOO, "< $file\e0");
3094
3095.fi
3096.Ip "opendir(DIRHANDLE,EXPR)" 8 3
3097Opens a directory named EXPR for processing by readdir(), telldir(), seekdir(),
3098rewinddir() and closedir().
3099Returns true if successful.
3100DIRHANDLEs have their own namespace separate from FILEHANDLEs.
3101.Ip "ord(EXPR)" 8 4
3102.Ip "ord EXPR" 8
3103Returns the numeric ascii value of the first character of EXPR.
3104If EXPR is omitted, uses $_.
3105''' Comments on f & d by gnb@melba.bby.oz.au 22/11/89
3106.Ip "pack(TEMPLATE,LIST)" 8 4
3107Takes an array or list of values and packs it into a binary structure,
3108returning the string containing the structure.
3109The TEMPLATE is a sequence of characters that give the order and type
3110of values, as follows:
3111.nf
3112
3113 A An ascii string, will be space padded.
3114 a An ascii string, will be null padded.
3115 c A signed char value.
3116 C An unsigned char value.
3117 s A signed short value.
3118 S An unsigned short value.
3119 i A signed integer value.
3120 I An unsigned integer value.
3121 l A signed long value.
3122 L An unsigned long value.
3123 n A short in \*(L"network\*(R" order.
3124 N A long in \*(L"network\*(R" order.
3125 f A single-precision float in the native format.
3126 d A double-precision float in the native format.
3127 p A pointer to a string.
3128 v A short in \*(L"VAX\*(R" (little-endian) order.
3129 V A long in \*(L"VAX\*(R" (little-endian) order.
3130 x A null byte.
3131 X Back up a byte.
3132 @ Null fill to absolute position.
3133 u A uuencoded string.
3134 b A bit string (ascending bit order, like vec()).
3135 B A bit string (descending bit order).
3136 h A hex string (low nybble first).
3137 H A hex string (high nybble first).
3138
3139.fi
3140Each letter may optionally be followed by a number which gives a repeat
3141count.
3142With all types except "a", "A", "b", "B", "h" and "H",
3143the pack function will gobble up that many values
3144from the LIST.
3145A * for the repeat count means to use however many items are left.
3146The "a" and "A" types gobble just one value, but pack it as a string of length
3147count,
3148padding with nulls or spaces as necessary.
3149(When unpacking, "A" strips trailing spaces and nulls, but "a" does not.)
3150Likewise, the "b" and "B" fields pack a string that many bits long.
3151The "h" and "H" fields pack a string that many nybbles long.
3152Real numbers (floats and doubles) are in the native machine format
3153only; due to the multiplicity of floating formats around, and the lack
3154of a standard \*(L"network\*(R" representation, no facility for
3155interchange has been made.
3156This means that packed floating point data
3157written on one machine may not be readable on another - even if both
3158use IEEE floating point arithmetic (as the endian-ness of the memory
3159representation is not part of the IEEE spec).
3160Note that perl uses
3161doubles internally for all numeric calculation, and converting from
3162double -> float -> double will lose precision (i.e. unpack("f",
3163pack("f", $foo)) will not in general equal $foo).
3164.br
3165Examples:
3166.nf
3167
3168 $foo = pack("cccc",65,66,67,68);
3169 # foo eq "ABCD"
3170 $foo = pack("c4",65,66,67,68);
3171 # same thing
3172
3173 $foo = pack("ccxxcc",65,66,67,68);
3174 # foo eq "AB\e0\e0CD"
3175
3176 $foo = pack("s2",1,2);
3177 # "\e1\e0\e2\e0" on little-endian
3178 # "\e0\e1\e0\e2" on big-endian
3179
3180 $foo = pack("a4","abcd","x","y","z");
3181 # "abcd"
3182
3183 $foo = pack("aaaa","abcd","x","y","z");
3184 # "axyz"
3185
3186 $foo = pack("a14","abcdefg");
3187 # "abcdefg\e0\e0\e0\e0\e0\e0\e0"
3188
3189 $foo = pack("i9pl", gmtime);
3190 # a real struct tm (on my system anyway)
3191
3192 sub bintodec {
3193 unpack("N", pack("B32", substr("0" x 32 . shift, -32)));
3194 }
3195.fi
3196The same template may generally also be used in the unpack function.
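.Sp
For instance, a fixed-width record can be packed and unpacked with one
template.
The record layout below is invented for illustration:
.nf

```perl
# Hypothetical record layout: 8-byte name, 16-bit id, 32-bit size.
$rec = pack('A8nN', 'perl', 4, 36);
$len = length($rec);                      # 8 + 2 + 4 = 14

($name, $id, $size) = unpack('A8nN', $rec);
# "A" strips the space padding again, so $name is 'perl'.

$hex = unpack('H*', pack('n', 258));      # '0102': high byte first
```

.fi
Using "n" and "N" rather than "s" and "l" keeps the byte order the same
on every machine.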
3197.Ip "pipe(READHANDLE,WRITEHANDLE)" 8 3
3198Opens a pair of connected pipes like the corresponding system call.
3199Note that if you set up a loop of piped processes, deadlock can occur
3200unless you are very careful.
3201In addition, note that perl's pipes use stdio buffering, so you may need
3202to set $| to flush your WRITEHANDLE after each command, depending on
3203the application.
3204[Requires version 3.0 patchlevel 9.]
.Ip "pop(ARRAY)" 8 6
.Ip "pop ARRAY" 8
3207Pops and returns the last value of the array, shortening the array by 1.
3208Has the same effect as
3209.nf
3210
3211 $tmp = $ARRAY[$#ARRAY\-\|\-];
3212
3213.fi
3214If there are no elements in the array, returns the undefined value.
3215.Ip "print(FILEHANDLE LIST)" 8 10
3216.Ip "print(LIST)" 8
3217.Ip "print FILEHANDLE LIST" 8
3218.Ip "print LIST" 8
3219.Ip "print" 8
3220Prints a string or a comma-separated list of strings.
3221Returns non-zero if successful.
3222FILEHANDLE may be a scalar variable name, in which case the variable contains
3223the name of the filehandle, thus introducing one level of indirection.
3224(NOTE: If FILEHANDLE is a variable and the next token is a term, it may be
3225misinterpreted as an operator unless you interpose a + or put parens around
3226the arguments.)
3227If FILEHANDLE is omitted, prints by default to standard output (or to the
3228last selected output channel\*(--see select()).
3229If LIST is also omitted, prints $_ to
3230.IR STDOUT .
3231To set the default output channel to something other than
3232.I STDOUT
3233use the select operation.
3234Note that, because print takes a LIST, anything in the LIST is evaluated
3235in an array context, and any subroutine that you call will have one or more
3236of its expressions evaluated in an array context.
3237Also be careful not to follow the print keyword with a left parenthesis
3238unless you want the corresponding right parenthesis to terminate the
3239arguments to the print\*(--interpose a + or put parens around all the arguments.
3240.Ip "printf(FILEHANDLE LIST)" 8 10
3241.Ip "printf(LIST)" 8
3242.Ip "printf FILEHANDLE LIST" 8
3243.Ip "printf LIST" 8
3244Equivalent to a \*(L"print FILEHANDLE sprintf(LIST)\*(R".
3245.Ip "push(ARRAY,LIST)" 8 7
3246Treats ARRAY (@ is optional) as a stack, and pushes the values of LIST
3247onto the end of ARRAY.
3248The length of ARRAY increases by the length of LIST.
3249Has the same effect as
3250.nf
3251
3252 for $value (LIST) {
3253 $ARRAY[++$#ARRAY] = $value;
3254 }
3255
3256.fi
3257but is more efficient.
3258.Ip "q/STRING/" 8 5
3259.Ip "qq/STRING/" 8
3260.Ip "qx/STRING/" 8
3261These are not really functions, but simply syntactic sugar to let you
3262avoid putting too many backslashes into quoted strings.
3263The q operator is a generalized single quote, and the qq operator a
3264generalized double quote.
3265The qx operator is a generalized backquote.
3266Any non-alphanumeric delimiter can be used in place of /, including newline.
3267If the delimiter is an opening bracket or parenthesis, the final delimiter
3268will be the corresponding closing bracket or parenthesis.
3269(Embedded occurrences of the closing bracket need to be backslashed as usual.)
3270Examples:
3271.nf
3272
3273.ne 5
3274 $foo = q!I said, "You said, \'She said it.\'"!;
3275 $bar = q(\'This is it.\');
3276 $today = qx{ date };
3277 $_ .= qq
3278*** The previous line contains the naughty word "$&".\en
3279 if /(ibm|apple|awk)/; # :-)
3280
3281.fi
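.Sp
The difference between q and qq shows most clearly when the same text is
quoted both ways.
A short sketch with made-up variable names:
.nf

```perl
$name   = 'Larry';
$plain  = q(He said "$name");     # single-quote rules: $name stays literal
$interp = qq{He said "$name"};    # double-quote rules: $name is replaced
$same   = q!He said "$name"!;     # any non-alphanumeric delimiter works
```

.fi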
3282.Ip "rand(EXPR)" 8 8
3283.Ip "rand EXPR" 8
3284.Ip "rand" 8
3285Returns a random fractional number between 0 and the value of EXPR.
3286(EXPR should be positive.)
3287If EXPR is omitted, returns a value between 0 and 1.
3288See also srand().
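.Sp
A common use is picking a random array element.
The sketch below assumes the result is always strictly less than EXPR, so
that the truncated value is a valid index; the array name is hypothetical.
.nf

```perl
srand(42);    # fixed seed so runs repeat; scripts usually seed from time
@faces = (1, 2, 3, 4, 5, 6);
$roll = $faces[int(rand($#faces + 1))];
```

.fi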
3289.Ip "read(FILEHANDLE,SCALAR,LENGTH,OFFSET)" 8 5
3290.Ip "read(FILEHANDLE,SCALAR,LENGTH)" 8 5
3291Attempts to read LENGTH bytes of data into variable SCALAR from the specified
3292FILEHANDLE.
3293Returns the number of bytes actually read, or undef if there was an error.
3294SCALAR will be grown or shrunk to the length actually read.
3295An OFFSET may be specified to place the read data at some other place
3296than the beginning of the string.
3297This call is actually implemented in terms of stdio's fread call. To get
3298a true read system call, see sysread.
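.Sp
Passing the current length of SCALAR as OFFSET appends each chunk to what
has already been read.
A sketch under the assumption that /tmp is writable; the scratch file name
and chunk size are arbitrary:
.nf

```perl
$file = "/tmp/readex$$";            # hypothetical scratch file
open(TMP, ">$file") || die "Can't create $file: $!";
print TMP 'abcdefghij';
close(TMP);

open(TMP, "<$file") || die "Can't read $file: $!";
$buf = '';
$chunks = 0;
# Read 4 bytes at a time; read() returns 0 at end of file.
while ($got = read(TMP, $buf, 4, length($buf))) {
    $chunks++;
}
close(TMP);
unlink($file);
```

.fi
The ten bytes arrive as chunks of 4, 4 and 2, leaving $buf holding the
whole file.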
3299.Ip "readdir(DIRHANDLE)" 8 3
3300.Ip "readdir DIRHANDLE" 8
3301Returns the next directory entry for a directory opened by opendir().
3302If used in an array context, returns all the rest of the entries in the
3303directory.
3304If there are no more entries, returns an undefined value in a scalar context
3305or a null list in an array context.
3306.Ip "readlink(EXPR)" 8 6
3307.Ip "readlink EXPR" 8
3308Returns the value of a symbolic link, if symbolic links are implemented.
3309If not, gives a fatal error.
3310If there is some system error, returns the undefined value and sets $! (errno).
3311If EXPR is omitted, uses $_.
3312.Ip "recv(SOCKET,SCALAR,LEN,FLAGS)" 8 4
3313Receives a message on a socket.
Attempts to receive LEN bytes of data into variable SCALAR from the specified
3315SOCKET filehandle.
3316Returns the address of the sender, or the undefined value if there's an error.
3317SCALAR will be grown or shrunk to the length actually read.
3318Takes the same flags as the system call of the same name.
3319.Ip "redo LABEL" 8 8
3320.Ip "redo" 8
3321The
3322.I redo
3323command restarts the loop block without evaluating the conditional again.
3324The
3325.I continue
3326block, if any, is not executed.
3327If the LABEL is omitted, the command refers to the innermost enclosing loop.
3328This command is normally used by programs that want to lie to themselves
3329about what was just input:
3330.nf
3331
3332.ne 16
3333 # a simpleminded Pascal comment stripper
3334 # (warning: assumes no { or } in strings)
3335 line: while (<STDIN>) {
3336 while (s|\|({.*}.*\|){.*}|$1 \||) {}
3337 s|{.*}| \||;
3338 if (s|{.*| \||) {
3339 $front = $_;
3340 while (<STDIN>) {
3341 if (\|/\|}/\|) { # end of comment?
3342 s|^|$front{|;
3343 redo line;
3344 }
3345 }
3346 }
3347 print;
3348 }
3349
3350.fi
3351.Ip "rename(OLDNAME,NEWNAME)" 8 2
3352Changes the name of a file.
3353Returns 1 for success, 0 otherwise.
3354Will not work across filesystem boundaries.
3355.Ip "require(EXPR)" 8 6
3356.Ip "require EXPR" 8
3357.Ip "require" 8
3358Includes the library file specified by EXPR, or by $_ if EXPR is not supplied.
3359Has semantics similar to the following subroutine:
3360.nf
3361
3362 sub require {
3363 local($filename) = @_;
3364 return 1 if $INC{$filename};
3365 local($realfilename,$result);
3366 ITER: {
3367 foreach $prefix (@INC) {
3368 $realfilename = "$prefix/$filename";
3369 if (-f $realfilename) {
3370 $result = do $realfilename;
3371 last ITER;
3372 }
3373 }
3374 die "Can't find $filename in \e@INC";
3375 }
3376 die $@ if $@;
3377 die "$filename did not return true value" unless $result;
3378 $INC{$filename} = $realfilename;
3379 $result;
3380 }
3381
3382.fi
3383Note that the file will not be included twice under the same specified name.
3384The file must return true as the last statement to indicate successful
3385execution of any initialization code, so it's customary to end
3386such a file with \*(L"1;\*(R" unless you're sure it'll return true otherwise.
3387.Ip "reset(EXPR)" 8 6
3388.Ip "reset EXPR" 8
3389.Ip "reset" 8
3390Generally used in a
3391.I continue
3392block at the end of a loop to clear variables and reset ?? searches
3393so that they work again.
3394The expression is interpreted as a list of single characters (hyphens allowed
3395for ranges).
3396All variables and arrays beginning with one of those letters are reset to
3397their pristine state.
3398If the expression is omitted, one-match searches (?pattern?) are reset to
3399match again.
3400Only resets variables or searches in the current package.
3401Always returns 1.
3402Examples:
3403.nf
3404
3405.ne 3
3406 reset \'X\'; \h'|2i'# reset all X variables
3407 reset \'a\-z\';\h'|2i'# reset lower case variables
3408 reset; \h'|2i'# just reset ?? searches
3409
3410.fi
3411Note: resetting \*(L"A\-Z\*(R" is not recommended since you'll wipe out your ARGV and ENV
3412arrays.
3413.Sp
3414The use of reset on dbm associative arrays does not change the dbm file.
3415(It does, however, flush any entries cached by perl, which may be useful if
3416you are sharing the dbm file.
3417Then again, maybe not.)
3418.Ip "return LIST" 8 3
3419Returns from a subroutine with the value specified.
3420(Note that a subroutine can automatically return
3421the value of the last expression evaluated.
3422That's the preferred method\*(--use of an explicit
3423.I return
3424is a bit slower.)
3425.Ip "reverse(LIST)" 8 4
3426.Ip "reverse LIST" 8
3427In an array context, returns an array value consisting of the elements
3428of LIST in the opposite order.
3429In a scalar context, returns a string value consisting of the bytes of
3430the first element of LIST in the opposite order.
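.Sp
The context therefore decides what a single string argument turns into.
A brief sketch with illustrative variable names:
.nf

```perl
@list  = reverse('a', 'b', 'c');     # ('c', 'b', 'a')
$word  = reverse('hello');           # scalar context: 'olleh'
@words = reverse('hello');           # array context: still ('hello')
```

.fi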
3431.Ip "rewinddir(DIRHANDLE)" 8 5
3432.Ip "rewinddir DIRHANDLE" 8
3433Sets the current position to the beginning of the directory for the readdir() routine on DIRHANDLE.
3434.Ip "rindex(STR,SUBSTR,POSITION)" 8 6
3435.Ip "rindex(STR,SUBSTR)" 8 4
3436Works just like index except that it
3437returns the position of the LAST occurrence of SUBSTR in STR.
3438If POSITION is specified, returns the last occurrence at or before that
3439position.
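.Sp
A typical use is splitting a path name at its last slash.
The sketch below assumes the default array base ($[ is 0); the path is
only an example.
.nf

```perl
$path  = '/usr/local/bin/perl';
$slash = rindex($path, '/');               # 14
$base  = substr($path, $slash + 1);        # 'perl'
$dir   = substr($path, 0, $slash);         # '/usr/local/bin'
$prev  = rindex($path, '/', $slash - 1);   # 10, the slash before 'bin'
```

.fi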
3440.Ip "rmdir(FILENAME)" 8 4
3441.Ip "rmdir FILENAME" 8
3442Deletes the directory specified by FILENAME if it is empty.
3443If it succeeds it returns 1, otherwise it returns 0 and sets $! (errno).
3444If FILENAME is omitted, uses $_.
3445.Ip "s/PATTERN/REPLACEMENT/gieo" 8 3
3446Searches a string for a pattern, and if found, replaces that pattern with the
3447replacement text and returns the number of substitutions made.
3448Otherwise it returns false (0).
3449The \*(L"g\*(R" is optional, and if present, indicates that all occurrences
3450of the pattern are to be replaced.
3451The \*(L"i\*(R" is also optional, and if present, indicates that matching
3452is to be done in a case-insensitive manner.
3453The \*(L"e\*(R" is likewise optional, and if present, indicates that
3454the replacement string is to be evaluated as an expression rather than just
3455as a double-quoted string.
3456Any non-alphanumeric delimiter may replace the slashes;
3457if single quotes are used, no
3458interpretation is done on the replacement string (the e modifier overrides
3459this, however); if backquotes are used, the replacement string is a command
3460to execute whose output will be used as the actual replacement text.
3461If the PATTERN is delimited by bracketing quotes, the REPLACEMENT
3462has its own pair of quotes, which may or may not be bracketing quotes, e.g.
3463s(foo)(bar) or s<foo>/bar/.
3464If no string is specified via the =~ or !~ operator,
3465the $_ string is searched and modified.
3466(The string specified with =~ must be a scalar variable, an array element,
3467or an assignment to one of those, i.e. an lvalue.)
3468If the pattern contains a $ that looks like a variable rather than an
3469end-of-string test, the variable will be interpolated into the pattern at
3470run-time.
3471If you only want the pattern compiled once the first time the variable is
3472interpolated, add an \*(L"o\*(R" at the end.
3473If the PATTERN evaluates to a null string, the most recent successful
3474regular expression is used instead.
3475See also the section on regular expressions.
3476Examples:
3477.nf
3478
3479 s/\|\e\|bgreen\e\|b/mauve/g; # don't change wintergreen
3480
3481 $path \|=~ \|s|\|/usr/bin|\|/usr/local/bin|;
3482
3483 s/Login: $foo/Login: $bar/; # run-time pattern
3484
3485 ($foo = $bar) =~ s/bar/foo/;
3486
3487 $_ = \'abc123xyz\';
3488 s/\ed+/$&*2/e; # yields \*(L'abc246xyz\*(R'
 s/\ed+/sprintf("%5d",$&)/e; # yields \*(L'abc  246xyz\*(R'
 s/\ew/$& x 2/eg; # yields \*(L'aabbcc  224466xxyyzz\*(R'
3491
3492 s/\|([^ \|]*\|) *\|([^ \|]*\|)\|/\|$2 $1/; # reverse 1st two fields
3493
3494.fi
3495(Note the use of $ instead of \|\e\| in the last example. See section
3496on regular expressions.)
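.Sp
The return value and the copy-then-modify idiom can be combined as in the
following sketch, which uses made-up sample text:
.nf

```perl
$text = 'width=10 height=4';

# Copy first, then edit the copy; "e" evaluates the replacement as code.
($copy = $text) =~ s/[0-9]+/$& * 2/eg;     # 'width=20 height=8'

# The return value is the number of substitutions made.
$n = ($text =~ s/=/: /g);                  # 2; $text is 'width: 10 height: 4'
```

.fi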
3497.Ip "scalar(EXPR)" 8 3
3498Forces EXPR to be interpreted in a scalar context and returns the value
3499of EXPR.
3500.Ip "seek(FILEHANDLE,POSITION,WHENCE)" 8 3
3501Randomly positions the file pointer for FILEHANDLE, just like the fseek()
3502call of stdio.
3503FILEHANDLE may be an expression whose value gives the name of the filehandle.
3504Returns 1 upon success, 0 otherwise.
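.Sp
The sketch below writes a scratch file opened for both reading and writing
and then reads back two slices of it.
It assumes the usual fseek() convention for WHENCE (0 is the start of the
file, 1 the current position, 2 the end) and a writable /tmp; the file name
is hypothetical.
.nf

```perl
$file = "/tmp/seekex$$";            # hypothetical scratch file
open(TMP, "+>$file") || die "Can't open $file: $!";
print TMP 'abcdefghij';

seek(TMP, 2, 0) || die "Can't seek: $!";   # 2 bytes from the start
read(TMP, $buf, 3);                        # 'cde'

seek(TMP, -4, 2) || die "Can't seek: $!";  # 4 bytes before the end
read(TMP, $tail, 4);                       # 'ghij'

close(TMP);
unlink($file);
```

.fi
As with stdio, a seek is needed when switching between writing and reading
on the same filehandle.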
3505.Ip "seekdir(DIRHANDLE,POS)" 8 3
3506Sets the current position for the readdir() routine on DIRHANDLE.
3507POS must be a value returned by telldir().
3508Has the same caveats about possible directory compaction as the corresponding
3509system library routine.
3510.Ip "select(FILEHANDLE)" 8 3
3511.Ip "select" 8 3
3512Returns the currently selected filehandle.
3513Sets the current default filehandle for output, if FILEHANDLE is supplied.
3514This has two effects: first, a
3515.I write
3516or a
3517.I print
3518without a filehandle will default to this FILEHANDLE.
3519Second, references to variables related to output will refer to this output
3520channel.
3521For example, if you have to set the top of form format for more than
3522one output channel, you might do the following:
3523.nf
3524
3525.ne 4
3526 select(REPORT1);
3527 $^ = \'report1_top\';
3528 select(REPORT2);
3529 $^ = \'report2_top\';
3530
3531.fi
3532FILEHANDLE may be an expression whose value gives the name of the actual filehandle.
3533Thus:
3534.nf
3535
3536 $oldfh = select(STDERR); $| = 1; select($oldfh);
3537
3538.fi
3539.Ip "select(RBITS,WBITS,EBITS,TIMEOUT)" 8 3
3540This calls the select system call with the bitmasks specified, which can
3541be constructed using fileno() and vec(), along these lines:
3542.nf
3543
3544 $rin = $win = $ein = '';
3545 vec($rin,fileno(STDIN),1) = 1;
3546 vec($win,fileno(STDOUT),1) = 1;
3547 $ein = $rin | $win;
3548
3549.fi
3550If you want to select on many filehandles you might wish to write a subroutine:
3551.nf
3552
3553 sub fhbits {
3554 local(@fhlist) = split(' ',$_[0]);
3555 local($bits);
3556 for (@fhlist) {
3557 vec($bits,fileno($_),1) = 1;
3558 }
3559 $bits;
3560 }
3561 $rin = &fhbits('STDIN TTY SOCK');
3562
3563.fi
3564The usual idiom is:
3565.nf
3566
3567 ($nfound,$timeleft) =
3568 select($rout=$rin, $wout=$win, $eout=$ein, $timeout);
3569
3570or to block until something becomes ready:
3571
3572.ie t \{\
3573 $nfound = select($rout=$rin, $wout=$win, $eout=$ein, undef);
3574'br\}
3575.el \{\
3576 $nfound = select($rout=$rin, $wout=$win,
3577 $eout=$ein, undef);
3578'br\}
3579
3580.fi
3581Any of the bitmasks can also be undef.
3582The timeout, if specified, is in seconds, which may be fractional.
3583NOTE: not all implementations are capable of returning the $timeleft.
3584If not, they always return $timeleft equal to the supplied $timeout.
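.Sp
Two small points can be sketched without any open handles: how vec()
lays out the mask, and how undefined masks turn select into a sub-second
pause.
The descriptor numbers 3 and 5 are arbitrary stand-ins for fileno() results.
.nf

```perl
# Mark descriptors 3 and 5 in a mask, using plain numbers in place of
# fileno(HANDLE) so the sketch does not depend on any open handles.
$bits = '';
vec($bits, 3, 1) = 1;
vec($bits, 5, 1) = 1;
$shown = unpack('b*', $bits);      # '00010100': bit N stands for fd N

# With all three masks undef, select just waits for the timeout.
$nfound = select(undef, undef, undef, 0.25);
```

.fi
Since nothing is being watched, $nfound is 0 when the timeout expires.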
3585.Ip "semctl(ID,SEMNUM,CMD,ARG)" 8 4
3586Calls the System V IPC function semctl. If CMD is &IPC_STAT or
3587&GETALL, then ARG must be a variable which will hold the returned
3588semid_ds structure or semaphore value array. Returns like ioctl: the
3589undefined value for error, "0 but true" for zero, or the actual return
3590value otherwise.
.Ip "semget(KEY,NSEMS,FLAGS)" 8 4
3592Calls the System V IPC function semget. Returns the semaphore id, or
3593the undefined value if there is an error.
3594.Ip "semop(KEY,OPSTRING)" 8 4
3595Calls the System V IPC function semop to perform semaphore operations
3596such as signaling and waiting. OPSTRING must be a packed array of
3597semop structures. Each semop structure can be generated with
3598\&'pack("sss", $semnum, $semop, $semflag)'. The number of semaphore
3599operations is implied by the length of OPSTRING. Returns true if
3600successful, or false if there is an error. As an example, the
3601following code waits on semaphore $semnum of semaphore id $semid:
3602.nf
3603
3604 $semop = pack("sss", $semnum, -1, 0);
3605 die "Semaphore trouble: $!\en" unless semop($semid, $semop);
3606
3607.fi
3608To signal the semaphore, replace "-1" with "1".
3609.Ip "send(SOCKET,MSG,FLAGS,TO)" 8 4
3610.Ip "send(SOCKET,MSG,FLAGS)" 8
3611Sends a message on a socket.
3612Takes the same flags as the system call of the same name.
3613On unconnected sockets you must specify a destination to send TO.
3614Returns the number of characters sent, or the undefined value if
3615there is an error.
3616.Ip "setpgrp(PID,PGRP)" 8 4
3617Sets the current process group for the specified PID, 0 for the current
3618process.
3619Will produce a fatal error if used on a machine that doesn't implement
3620setpgrp(2).
3621.Ip "setpriority(WHICH,WHO,PRIORITY)" 8 4
3622Sets the current priority for a process, a process group, or a user.
3623(See setpriority(2).)
3624Will produce a fatal error if used on a machine that doesn't implement
3625setpriority(2).
3626.Ip "setsockopt(SOCKET,LEVEL,OPTNAME,OPTVAL)" 8 3
3627Sets the socket option requested.
Returns the undefined value if there is an error.
3629OPTVAL may be specified as undef if you don't want to pass an argument.
3630.Ip "shift(ARRAY)" 8 6
3631.Ip "shift ARRAY" 8
3632.Ip "shift" 8
3633Shifts the first value of the array off and returns it,
3634shortening the array by 1 and moving everything down.
3635If there are no elements in the array, returns the undefined value.
3636If ARRAY is omitted, shifts the @ARGV array in the main program, and the @_
3637array in subroutines.
3638(This is determined lexically.)
3639See also unshift(), push() and pop().
3640Shift() and unshift() do the same thing to the left end of an array that push()
3641and pop() do to the right end.
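.Sp
As a minimal sketch of both forms (the names @queue and first_arg are
illustrative only):
.nf

	@queue = ("a", "b", "c");
	$first = shift(@queue);		# $first is "a", @queue is ("b", "c")

	sub first_arg { shift; }	# no ARRAY given, so this shifts @_
	$got = &first_arg("x", "y");	# $got is "x"

.fi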
3642.Ip "shmctl(ID,CMD,ARG)" 8 4
3643Calls the System V IPC function shmctl. If CMD is &IPC_STAT, then ARG
3644must be a variable which will hold the returned shmid_ds structure.
3645Returns like ioctl: the undefined value for error, "0 but true" for
3646zero, or the actual return value otherwise.
3647.Ip "shmget(KEY,SIZE,FLAGS)" 8 4
3648Calls the System V IPC function shmget. Returns the shared memory
3649segment id, or the undefined value if there is an error.
3650.Ip "shmread(ID,VAR,POS,SIZE)" 8 4
3651.Ip "shmwrite(ID,STRING,POS,SIZE)" 8
3652Reads or writes the System V shared memory segment ID starting at
3653position POS for size SIZE by attaching to it, copying in/out, and
3654detaching from it. When reading, VAR must be a variable which
3655will hold the data read. When writing, if STRING is too long,
3656only SIZE bytes are used; if STRING is too short, nulls are
written to fill out SIZE bytes. Returns true if successful, or
3658false if there is an error.
3659.Ip "shutdown(SOCKET,HOW)" 8 3
3660Shuts down a socket connection in the manner indicated by HOW, which has
3661the same interpretation as in the system call of the same name.
3662.Ip "sin(EXPR)" 8 4
3663.Ip "sin EXPR" 8
3664Returns the sine of EXPR (expressed in radians).
3665If EXPR is omitted, returns sine of $_.
3666.Ip "sleep(EXPR)" 8 6
3667.Ip "sleep EXPR" 8
3668.Ip "sleep" 8
3669Causes the script to sleep for EXPR seconds, or forever if no EXPR.
3670May be interrupted by sending the process a SIGALRM.
3671Returns the number of seconds actually slept.
3672You probably cannot mix alarm() and sleep() calls, since sleep() is
3673often implemented using alarm().
3674.Ip "socket(SOCKET,DOMAIN,TYPE,PROTOCOL)" 8 3
3675Opens a socket of the specified kind and attaches it to filehandle SOCKET.
3676DOMAIN, TYPE and PROTOCOL are specified the same as for the system call
3677of the same name.
3678You may need to run h2ph on sys/socket.h to get the proper values handy
3679in a perl library file.
Returns true if successful.
3681See the example in the section on Interprocess Communication.
3682.Ip "socketpair(SOCKET1,SOCKET2,DOMAIN,TYPE,PROTOCOL)" 8 3
3683Creates an unnamed pair of sockets in the specified domain, of the specified
3684type.
3685DOMAIN, TYPE and PROTOCOL are specified the same as for the system call
3686of the same name.
3687If unimplemented, yields a fatal error.
Returns true if successful.
3689.Ip "sort(SUBROUTINE LIST)" 8 9
3690.Ip "sort(LIST)" 8
3691.Ip "sort SUBROUTINE LIST" 8
3692.Ip "sort BLOCK LIST" 8
3693.Ip "sort LIST" 8
3694Sorts the LIST and returns the sorted array value.
3695Nonexistent values of arrays are stripped out.
3696If SUBROUTINE or BLOCK is omitted, sorts in standard string comparison order.
3697If SUBROUTINE is specified, gives the name of a subroutine that returns
3698an integer less than, equal to, or greater than 0,
3699depending on how the elements of the array are to be ordered.
3700(The <=> and cmp operators are extremely useful in such routines.)
3701SUBROUTINE may be a scalar variable name, in which case the value provides
3702the name of the subroutine to use.
3703In place of a SUBROUTINE name, you can provide a BLOCK as an anonymous,
3704in-line sort subroutine.
3705.Sp
3706In the interests of efficiency the normal calling code for subroutines
3707is bypassed, with the following effects: the subroutine may not be a recursive
3708subroutine, and the two elements to be compared are passed into the subroutine
3709not via @_ but as $a and $b (see example below).
They are passed by reference, so don't modify $a and $b.
3711.Sp
3712Examples:
3713.nf
3714
3715.ne 2
3716 # sort lexically
3717 @articles = sort @files;
3718
3719.ne 2
3720 # same thing, but with explicit sort routine
3721 @articles = sort {$a cmp $b} @files;
3722
3723.ne 2
3724 # same thing in reversed order
3725 @articles = sort {$b cmp $a} @files;
3726
3727.ne 2
3728 # sort numerically ascending
3729 @articles = sort {$a <=> $b} @files;
3730
3731.ne 2
3732 # sort numerically descending
3733 @articles = sort {$b <=> $a} @files;
3734
3735.ne 5
3736 # sort using explicit subroutine name
3737 sub byage {
3738 $age{$a} <=> $age{$b}; # presuming integers
3739 }
3740 @sortedclass = sort byage @class;
3741
3742.ne 9
3743 sub reverse { $b cmp $a; }
3744 @harry = (\'dog\',\'cat\',\'x\',\'Cain\',\'Abel\');
3745 @george = (\'gone\',\'chased\',\'yz\',\'Punished\',\'Axed\');
3746 print sort @harry;
3747 # prints AbelCaincatdogx
3748 print sort reverse @harry;
3749 # prints xdogcatCainAbel
3750 print sort @george, \'to\', @harry;
3751 # prints AbelAxedCainPunishedcatchaseddoggonetoxyz
3752
3753.fi
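Because SUBROUTINE may be a scalar variable, the ordering can be chosen at
run time.
A small sketch, in which bynum, bynumrev, $descending and @nums are
made-up names:
.nf

	sub bynum { $a <=> $b; }
	sub bynumrev { $b <=> $a; }
	@nums = (10, 9, 100, 1);
	$descending = 0;
	$order = $descending ? "bynumrev" : "bynum";
	@sorted = sort $order @nums;	# (1, 9, 10, 100)

.fi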
3754.Ip "splice(ARRAY,OFFSET,LENGTH,LIST)" 8 8
3755.Ip "splice(ARRAY,OFFSET,LENGTH)" 8
3756.Ip "splice(ARRAY,OFFSET)" 8
3757Removes the elements designated by OFFSET and LENGTH from an array, and
3758replaces them with the elements of LIST, if any.
3759Returns the elements removed from the array.
3760The array grows or shrinks as necessary.
3761If LENGTH is omitted, removes everything from OFFSET onward.
3762The following equivalencies hold (assuming $[ == 0):
3763.nf
3764
3765 push(@a,$x,$y)\h'|3.5i'splice(@a,$#a+1,0,$x,$y)
3766 pop(@a)\h'|3.5i'splice(@a,-1)
3767 shift(@a)\h'|3.5i'splice(@a,0,1)
3768 unshift(@a,$x,$y)\h'|3.5i'splice(@a,0,0,$x,$y)
3769 $a[$x] = $y\h'|3.5i'splice(@a,$x,1,$y);
3770
3771Example, assuming array lengths are passed before arrays:
3772
3773 sub aeq { # compare two array values
3774 local(@a) = splice(@_,0,shift);
3775 local(@b) = splice(@_,0,shift);
3776 return 0 unless @a == @b; # same len?
3777 while (@a) {
3778 return 0 if pop(@a) ne pop(@b);
3779 }
3780 return 1;
3781 }
3782 if (&aeq($len,@foo[1..$len],0+@bar,@bar)) { ... }
3783
3784.fi
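A further sketch, replacing elements in the middle of an array (@a and
@removed are illustrative names, and $[ is assumed to be 0):
.nf

	@a = ("a", "b", "c", "d", "e");
	@removed = splice(@a, 1, 2, "X", "Y", "Z");
	# @removed is ("b", "c")
	# @a is now ("a", "X", "Y", "Z", "d", "e")

.fi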
3785.Ip "split(/PATTERN/,EXPR,LIMIT)" 8 8
3786.Ip "split(/PATTERN/,EXPR)" 8 8
3787.Ip "split(/PATTERN/)" 8
3788.Ip "split" 8
3789Splits a string into an array of strings, and returns it.
3790(If not in an array context, returns the number of fields found and splits
3791into the @_ array.
3792(In an array context, you can force the split into @_
3793by using ?? as the pattern delimiters, but it still returns the array value.))
3794If EXPR is omitted, splits the $_ string.
3795If PATTERN is also omitted, splits on whitespace (/[\ \et\en]+/).
3796Anything matching PATTERN is taken to be a delimiter separating the fields.
3797(Note that the delimiter may be longer than one character.)
3798If LIMIT is specified, splits into no more than that many fields (though it
3799may split into fewer).
3800If LIMIT is unspecified, trailing null fields are stripped (which
3801potential users of pop() would do well to remember).
3802A pattern matching the null string (not to be confused with a null pattern //,
3803which is just one member of the set of patterns matching a null string)
3804will split the value of EXPR into separate characters at each point it
3805matches that way.
3806For example:
3807.nf
3808
3809 print join(\':\', split(/ */, \'hi there\'));
3810
3811.fi
3812produces the output \*(L'h:i:t:h:e:r:e\*(R'.
3813.Sp
The LIMIT parameter can be used to partially split a line:
3815.nf
3816
3817 ($login, $passwd, $remainder) = split(\|/\|:\|/\|, $_, 3);
3818
3819.fi
3820(When assigning to a list, if LIMIT is omitted, perl supplies a LIMIT one
3821larger than the number of variables in the list, to avoid unnecessary work.
3822For the list above LIMIT would have been 4 by default.
3823In time critical applications it behooves you not to split into
3824more fields than you really need.)
3825.Sp
3826If the PATTERN contains parentheses, additional array elements are created
3827from each matching substring in the delimiter.
3828.Sp
3829 split(/([,-])/,"1-10,20");
3830.Sp
3831produces the array value
3832.Sp
3833 (1,'-',10,',',20)
3834.Sp
3835The pattern /PATTERN/ may be replaced with an expression to specify patterns
3836that vary at runtime.
3837(To do runtime compilation only once, use /$variable/o.)
3838As a special case, specifying a space (\'\ \') will split on white space
3839just as split with no arguments does, but leading white space does NOT
3840produce a null first field.
3841Thus, split(\'\ \') can be used to emulate
3842.IR awk 's
3843default behavior, whereas
3844split(/\ /) will give you as many null initial fields as there are
3845leading spaces.
3846.Sp
3847Example:
3848.nf
3849
3850.ne 5
3851 open(passwd, \'/etc/passwd\');
3852 while (<passwd>) {
3853.ie t \{\
3854 ($login, $passwd, $uid, $gid, $gcos, $home, $shell) = split(\|/\|:\|/\|);
3855'br\}
3856.el \{\
3857 ($login, $passwd, $uid, $gid, $gcos, $home, $shell)
3858 = split(\|/\|:\|/\|);
3859'br\}
3860 .\|.\|.
3861 }
3862
3863.fi
3864(Note that $shell above will still have a newline on it. See chop().)
3865See also
3866.IR join .
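.Sp
The rules for LIMIT, trailing null fields and the single-space special case
can be compared side by side.
This is only a sketch; the variable names are arbitrary:
.nf

	@f = split(/:/, "a:b:c:::");	# ("a", "b", "c"), trailing nulls stripped
	@g = split(/:/, "a:b:c:::", 2);	# ("a", "b:c:::")
	@w = split(' ', "  hi   there");	# ("hi", "there")
	@s = split(/ /, "  hi there");	# ("", "", "hi", "there")

.fi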
3867.Ip "sprintf(FORMAT,LIST)" 8 4
3868Returns a string formatted by the usual printf conventions.
3869The * character is not supported.
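.Sp
Since * is not supported, a field width that is only known at run time
can be interpolated into the FORMAT string instead.
A sketch, with $width, $line and $padded as made-up names:
.nf

	$line = sprintf("%-8s|%5.2f|%03d", "abc", 3.14159, 7);
	# $line is "abc     | 3.14|007"

	$width = 6;
	$padded = sprintf("%${width}s", "ab");	# "    ab"

.fi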
3870.Ip "sqrt(EXPR)" 8 4
3871.Ip "sqrt EXPR" 8
Returns the square root of EXPR.
3873If EXPR is omitted, returns square root of $_.
3874.Ip "srand(EXPR)" 8 4
3875.Ip "srand EXPR" 8
3876Sets the random number seed for the
3877.I rand
3878operator.
3879If EXPR is omitted, does srand(time).
3880.Ip "stat(FILEHANDLE)" 8 8
3881.Ip "stat FILEHANDLE" 8
3882.Ip "stat(EXPR)" 8
3883.Ip "stat SCALARVARIABLE" 8
3884Returns a 13-element array giving the statistics for a file, either the file
3885opened via FILEHANDLE, or named by EXPR.
3886Returns a null list if the stat fails.
3887Typically used as follows:
3888.nf
3889
3890.ne 3
3891 ($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size,
3892 $atime,$mtime,$ctime,$blksize,$blocks)
3893 = stat($filename);
3894
3895.fi
3896If stat is passed the special filehandle consisting of an underline,
3897no stat is done, but the current contents of the stat structure from
3898the last stat or filetest are returned.
3899Example:
3900.nf
3901
3902.ne 3
3903 if (-x $file && (($d) = stat(_)) && $d < 0) {
3904 print "$file is executable NFS file\en";
3905 }
3906
3907.fi
3908(This only works on machines for which the device number is negative under NFS.)
3909.Ip "study(SCALAR)" 8 6
3910.Ip "study SCALAR" 8
.Ip "study" 8
3912Takes extra time to study SCALAR ($_ if unspecified) in anticipation of
3913doing many pattern matches on the string before it is next modified.
3914This may or may not save time, depending on the nature and number of patterns
3915you are searching on, and on the distribution of character frequencies in
3916the string to be searched\*(--you probably want to compare runtimes with and
3917without it to see which runs faster.
3918Those loops which scan for many short constant strings (including the constant
3919parts of more complex patterns) will benefit most.
3920You may have only one study active at a time\*(--if you study a different
3921scalar the first is \*(L"unstudied\*(R".
3922(The way study works is this: a linked list of every character in the string
3923to be searched is made, so we know, for example, where all the \*(L'k\*(R' characters
3924are.
3925From each search string, the rarest character is selected, based on some
3926static frequency tables constructed from some C programs and English text.
3927Only those places that contain this \*(L"rarest\*(R" character are examined.)
3928.Sp
For example, here is a loop which inserts index-producing entries before any line
3930containing a certain pattern:
3931.nf
3932
3933.ne 8
3934 while (<>) {
3935 study;
3936 print ".IX foo\en" if /\ebfoo\eb/;
3937 print ".IX bar\en" if /\ebbar\eb/;
3938 print ".IX blurfl\en" if /\ebblurfl\eb/;
3939 .\|.\|.
3940 print;
3941 }
3942
3943.fi
3944In searching for /\ebfoo\eb/, only those locations in $_ that contain \*(L'f\*(R'
3945will be looked at, because \*(L'f\*(R' is rarer than \*(L'o\*(R'.
3946In general, this is a big win except in pathological cases.
3947The only question is whether it saves you more time than it took to build
3948the linked list in the first place.
3949.Sp
3950Note that if you have to look for strings that you don't know till runtime,
3951you can build an entire loop as a string and eval that to avoid recompiling
3952all your patterns all the time.
3953Together with undefining $/ to input entire files as one record, this can
3954be very fast, often faster than specialized programs like fgrep.
3955The following scans a list of files (@files)
3956for a list of words (@words), and prints out the names of those files that
3957contain a match:
3958.nf
3959
3960.ne 12
3961 $search = \'while (<>) { study;\';
3962 foreach $word (@words) {
3963 $search .= "++\e$seen{\e$ARGV} if /\e\eb$word\e\eb/;\en";
3964 }
3965 $search .= "}";
3966 @ARGV = @files;
3967 undef $/;
3968 eval $search; # this screams
3969 $/ = "\en"; # put back to normal input delim
3970 foreach $file (sort keys(%seen)) {
3971 print $file, "\en";
3972 }
3973
3974.fi
3975.Ip "substr(EXPR,OFFSET,LEN)" 8 2
3976.Ip "substr(EXPR,OFFSET)" 8 2
3977Extracts a substring out of EXPR and returns it.
3978First character is at offset 0, or whatever you've set $[ to.
3979If OFFSET is negative, starts that far from the end of the string.
3980If LEN is omitted, returns everything to the end of the string.
3981You can use the substr() function as an lvalue, in which case EXPR must
3982be an lvalue.
3983If you assign something shorter than LEN, the string will shrink, and
3984if you assign something longer than LEN, the string will grow to accommodate it.
3985To keep the string the same length you may need to pad or chop your value using
3986sprintf().
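.Sp
A short sketch of substr() as both rvalue and lvalue, assuming $[ is 0
(the variable names are illustrative):
.nf

	$s = "Hello, world";
	$tail = substr($s, -5);		# "world"
	substr($s, 0, 5) = "Goodbye";	# $s grows to "Goodbye, world"

	# pad with sprintf() to keep the length unchanged
	substr($s, 0, 7) = sprintf("%-7.7s", "Hi");
	# $s is now "Hi     , world"

.fi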
3987.Ip "symlink(OLDFILE,NEWFILE)" 8 2
3988Creates a new filename symbolically linked to the old filename.
3989Returns 1 for success, 0 otherwise.
3990On systems that don't support symbolic links, produces a fatal error at
3991run time.
3992To check for that, use eval:
3993.nf
3994
3995 $symlink_exists = (eval \'symlink("","");\', $@ eq \'\');
3996
3997.fi
3998.Ip "syscall(LIST)" 8 6
3999.Ip "syscall LIST" 8
4000Calls the system call specified as the first element of the list, passing
4001the remaining elements as arguments to the system call.
4002If unimplemented, produces a fatal error.
4003The arguments are interpreted as follows: if a given argument is numeric,
4004the argument is passed as an int.
4005If not, the pointer to the string value is passed.
You are responsible for making sure a string is pre-extended long enough
4007to receive any result that might be written into a string.
4008If your integer arguments are not literals and have never been interpreted
4009in a numeric context, you may need to add 0 to them to force them to look
4010like numbers.
4011.nf
4012
4013 require 'syscall.ph'; # may need to run h2ph
4014 syscall(&SYS_write, fileno(STDOUT), "hi there\en", 9);
4015
4016.fi
4017.Ip "sysread(FILEHANDLE,SCALAR,LENGTH,OFFSET)" 8 5
4018.Ip "sysread(FILEHANDLE,SCALAR,LENGTH)" 8 5
4019Attempts to read LENGTH bytes of data into variable SCALAR from the specified
4020FILEHANDLE, using the system call read(2).
4021It bypasses stdio, so mixing this with other kinds of reads may cause
4022confusion.
4023Returns the number of bytes actually read, or undef if there was an error.
4024SCALAR will be grown or shrunk to the length actually read.
4025An OFFSET may be specified to place the read data at some other place
4026than the beginning of the string.
4027.Ip "system(LIST)" 8 6
4028.Ip "system LIST" 8
4029Does exactly the same thing as \*(L"exec LIST\*(R" except that a fork
4030is done first, and the parent process waits for the child process to complete.
4031Note that argument processing varies depending on the number of arguments.
4032The return value is the exit status of the program as returned by the wait()
4033call.
4034To get the actual exit value divide by 256.
4035See also
4036.IR exec .
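.Sp
As a sketch of recovering the exit value, assuming a Bourne shell is
available as sh on the search path:
.nf

	$status = system("sh", "-c", "exit 3");
	$exit = $status >> 8;	# same as int($status / 256); 3 here

.fi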
4037.Ip "syswrite(FILEHANDLE,SCALAR,LENGTH,OFFSET)" 8 5
4038.Ip "syswrite(FILEHANDLE,SCALAR,LENGTH)" 8 5
4039Attempts to write LENGTH bytes of data from variable SCALAR to the specified
4040FILEHANDLE, using the system call write(2).
4041It bypasses stdio, so mixing this with prints may cause
4042confusion.
4043Returns the number of bytes actually written, or undef if there was an error.
An OFFSET may be specified to take the data to be written from some other place
than the beginning of the string.
4046.Ip "tell(FILEHANDLE)" 8 6
4047.Ip "tell FILEHANDLE" 8 6
4048.Ip "tell" 8
4049Returns the current file position for FILEHANDLE.
4050FILEHANDLE may be an expression whose value gives the name of the actual
4051filehandle.
4052If FILEHANDLE is omitted, assumes the file last read.
4053.Ip "telldir(DIRHANDLE)" 8 5
4054.Ip "telldir DIRHANDLE" 8
4055Returns the current position of the readdir() routines on DIRHANDLE.
4056Value may be given to seekdir() to access a particular location in
4057a directory.
4058Has the same caveats about possible directory compaction as the corresponding
4059system library routine.
4060.Ip "time" 8 4
4061Returns the number of non-leap seconds since 00:00:00 UTC, January 1, 1970.
4062Suitable for feeding to gmtime() and localtime().
4063.Ip "times" 8 4
4064Returns a four-element array giving the user and system times, in seconds, for this
4065process and the children of this process.
4066.Sp
4067 ($user,$system,$cuser,$csystem) = times;
4068.Sp
4069.Ip "tr/SEARCHLIST/REPLACEMENTLIST/cds" 8 5
4070.Ip "y/SEARCHLIST/REPLACEMENTLIST/cds" 8
Translates all occurrences of the characters found in the search list into
the corresponding character in the replacement list.
4073It returns the number of characters replaced or deleted.
4074If no string is specified via the =~ or !~ operator,
4075the $_ string is translated.
4076(The string specified with =~ must be a scalar variable, an array element,
4077or an assignment to one of those, i.e. an lvalue.)
4078For
4079.I sed
4080devotees,
4081.I y
4082is provided as a synonym for
4083.IR tr .
4084If the SEARCHLIST is delimited by bracketing quotes, the REPLACEMENTLIST
4085has its own pair of quotes, which may or may not be bracketing quotes, e.g.
4086tr[A-Z][a-z] or tr(+-*/)/ABCD/.
4087.Sp
4088If the c modifier is specified, the SEARCHLIST character set is complemented.
4089If the d modifier is specified, any characters specified by SEARCHLIST that
4090are not found in REPLACEMENTLIST are deleted.
4091(Note that this is slightly more flexible than the behavior of some
4092.I tr
4093programs, which delete anything they find in the SEARCHLIST, period.)
4094If the s modifier is specified, sequences of characters that were translated
4095to the same character are squashed down to 1 instance of the character.
4096.Sp
4097If the d modifier was used, the REPLACEMENTLIST is always interpreted exactly
4098as specified.
4099Otherwise, if the REPLACEMENTLIST is shorter than the SEARCHLIST,
4100the final character is replicated till it is long enough.
4101If the REPLACEMENTLIST is null, the SEARCHLIST is replicated.
4102This latter is useful for counting characters in a class, or for squashing
4103character sequences in a class.
4104.Sp
4105Examples:
4106.nf
4107
4108 $ARGV[1] \|=~ \|y/A\-Z/a\-z/; \h'|3i'# canonicalize to lower case
4109
4110 $cnt = tr/*/*/; \h'|3i'# count the stars in $_
4111
4112 $cnt = tr/0\-9//; \h'|3i'# count the digits in $_
4113
4114 tr/a\-zA\-Z//s; \h'|3i'# bookkeeper \-> bokeper
4115
4116 ($HOST = $host) =~ tr/a\-z/A\-Z/;
4117
4118 y/a\-zA\-Z/ /cs; \h'|3i'# change non-alphas to single space
4119
4120 tr/\e200\-\e377/\e0\-\e177/;\h'|3i'# delete 8th bit
4121
4122.fi
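The c and d modifiers can be combined to delete everything outside a class.
A sketch, using made-up variable names:
.nf

	$phone = "(555) 123-4567";
	($digits = $phone) =~ tr/0-9//cd;	# $digits is "5551234567"
	$cnt = ($phone =~ tr/0-9//);	# $cnt is 10, $phone is unchanged

.fi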
4123.Ip "truncate(FILEHANDLE,LENGTH)" 8 4
4124.Ip "truncate(EXPR,LENGTH)" 8
4125Truncates the file opened on FILEHANDLE, or named by EXPR, to the specified
4126length.
4127Produces a fatal error if truncate isn't implemented on your system.
4128.Ip "umask(EXPR)" 8 4
4129.Ip "umask EXPR" 8
4130.Ip "umask" 8
4131Sets the umask for the process and returns the old one.
4132If EXPR is omitted, merely returns current umask.
4133.Ip "undef(EXPR)" 8 6
4134.Ip "undef EXPR" 8
4135.Ip "undef" 8
4136Undefines the value of EXPR, which must be an lvalue.
4137Use only on a scalar value, an entire array, or a subroutine name (using &).
4138(Undef will probably not do what you expect on most predefined variables or
4139dbm array values.)
4140Always returns the undefined value.
4141You can omit the EXPR, in which case nothing is undefined, but you still
4142get an undefined value that you could, for instance, return from a subroutine.
4143Examples:
4144.nf
4145
4146.ne 6
4147 undef $foo;
4148 undef $bar{'blurfl'};
4149 undef @ary;
4150 undef %assoc;
4151 undef &mysub;
4152 return (wantarray ? () : undef) if $they_blew_it;
4153
4154.fi
4155.Ip "unlink(LIST)" 8 4
4156.Ip "unlink LIST" 8
4157Deletes a list of files.
4158Returns the number of files successfully deleted.
4159.nf
4160
4161.ne 2
4162 $cnt = unlink \'a\', \'b\', \'c\';
4163 unlink @goners;
4164 unlink <*.bak>;
4165
4166.fi
4167Note: unlink will not delete directories unless you are superuser and the
4168.B \-U
4169flag is supplied to
4170.IR perl .
4171Even if these conditions are met, be warned that unlinking a directory
4172can inflict damage on your filesystem.
4173Use rmdir instead.
4174.Ip "unpack(TEMPLATE,EXPR)" 8 4
4175Unpack does the reverse of pack: it takes a string representing
4176a structure and expands it out into an array value, returning the array
4177value.
4178(In a scalar context, it merely returns the first value produced.)
4179The TEMPLATE has the same format as in the pack function.
4180Here's a subroutine that does substring:
4181.nf
4182
4183.ne 4
4184 sub substr {
4185 local($what,$where,$howmuch) = @_;
4186 unpack("x$where a$howmuch", $what);
4187 }
4188
4189.ne 3
4190and then there's
4191
4192 sub ord { unpack("c",$_[0]); }
4193
4194.fi
4195In addition, you may prefix a field with a %<number> to indicate that
4196you want a <number>-bit checksum of the items instead of the items themselves.
4197Default is a 16-bit checksum.
4198For example, the following computes the same number as the System V sum program:
4199.nf
4200
4201.ne 4
4202 while (<>) {
4203 $checksum += unpack("%16C*", $_);
4204 }
4205 $checksum %= 65536;
4206
4207.fi
4208.Ip "unshift(ARRAY,LIST)" 8 4
4209Does the opposite of a
4210.IR shift .
4211Or the opposite of a
4212.IR push ,
4213depending on how you look at it.
4214Prepends list to the front of the array, and returns the number of elements
4215in the new array.
4216.nf
4217
4218 unshift(ARGV, \'\-e\') unless $ARGV[0] =~ /^\-/;
4219
4220.fi
4221.Ip "utime(LIST)" 8 2
4222.Ip "utime LIST" 8 2
4223Changes the access and modification times on each file of a list of files.
4224The first two elements of the list must be the NUMERICAL access and
4225modification times, in that order.
4226Returns the number of files successfully changed.
4227The inode modification time of each file is set to the current time.
4228Example of a \*(L"touch\*(R" command:
4229.nf
4230
4231.ne 3
4232 #!/usr/bin/perl
4233 $now = time;
4234 utime $now, $now, @ARGV;
4235
4236.fi
4237.Ip "values(ASSOC_ARRAY)" 8 6
4238.Ip "values ASSOC_ARRAY" 8
4239Returns a normal array consisting of all the values of the named associative
4240array.
4241The values are returned in an apparently random order, but it is the same order
4242as either the keys() or each() function would produce on the same array.
4243See also keys() and each().
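.Sp
A sketch showing that the two orders agree, provided the array is not
modified in between (the names are illustrative only):
.nf

	%age = ("tom", 30, "ann", 25, "bob", 41);
	@names = keys(%age);
	@ages = values(%age);
	$ok = 1;
	for ($i = 0; $i < @names; $i++) {
		$ok = 0 unless $age{$names[$i]} == $ages[$i];
	}
	# $ok is still 1: $ages[$i] always belongs to $names[$i]

.fi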
4244.Ip "vec(EXPR,OFFSET,BITS)" 8 2
4245Treats a string as a vector of unsigned integers, and returns the value
4246of the bitfield specified.
4247May also be assigned to.
4248BITS must be a power of two from 1 to 32.
4249.Sp
4250Vectors created with vec() can also be manipulated with the logical operators
4251|, & and ^,
4252which will assume a bit vector operation is desired when both operands are
4253strings.
4254This interpretation is not enabled unless there is at least one vec() in
4255your program, to protect older programs.
4256.Sp
4257To transform a bit vector into a string or array of 0's and 1's, use these:
4258.nf
4259
4260 $bits = unpack("b*", $vector);
4261 @bits = split(//, unpack("b*", $vector));
4262
4263.fi
4264If you know the exact length in bits, it can be used in place of the *.
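.Sp
A sketch of setting and testing single bits (the variable names are
arbitrary):
.nf

	$vector = "";
	vec($vector, 0, 1) = 1;	# set bit 0
	vec($vector, 3, 1) = 1;	# set bit 3
	$bits = unpack("b*", $vector);	# "10010000"
	$set = vec($vector, 3, 1);	# 1
	$clear = vec($vector, 2, 1);	# 0

.fi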
4265.Ip "wait" 8 6
4266Waits for a child process to terminate and returns the pid of the deceased
4267process, or -1 if there are no child processes.
4268The status is returned in $?.
4269.Ip "waitpid(PID,FLAGS)" 8 6
4270Waits for a particular child process to terminate and returns the pid of the deceased
4271process, or -1 if there is no such child process.
4272The status is returned in $?.
4273If you say
4274.nf
4275
	require "sys/wait.ph";
4277 .\|.\|.
4278 waitpid(-1,&WNOHANG);
4279
4280.fi
4281then you can do a non-blocking wait for any process. Non-blocking wait
4282is only available on machines supporting either the
.IR waitpid (2)
or
.IR wait4 (2)
4286system calls.
4287However, waiting for a particular pid with FLAGS of 0 is implemented
4288everywhere. (Perl emulates the system call by remembering the status
4289values of processes that have exited but have not been harvested by the
4290Perl script yet.)
4291.Ip "wantarray" 8 4
4292Returns true if the context of the currently executing subroutine
4293is looking for an array value.
4294Returns false if the context is looking for a scalar.
4295.nf
4296
4297 return wantarray ? () : undef;
4298
4299.fi
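A sketch of a subroutine that reports its own calling context (context
is a made-up name):
.nf

	sub context { wantarray ? "array" : "scalar"; }
	@r = &context();	# $r[0] is "array"
	$r = &context();	# $r is "scalar"

.fi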
4300.Ip "warn(LIST)" 8 4
4301.Ip "warn LIST" 8
4302Produces a message on STDERR just like \*(L"die\*(R", but doesn't exit.
4303.Ip "write(FILEHANDLE)" 8 6
4304.Ip "write(EXPR)" 8
4305.Ip "write" 8
4306Writes a formatted record (possibly multi-line) to the specified file,
4307using the format associated with that file.
By default the format for a file is the one having the same name as the
4309filehandle, but the format for the current output channel (see
4310.IR select )
4311may be set explicitly
4312by assigning the name of the format to the $~ variable.
4313.Sp
4314Top of form processing is handled automatically:
4315if there is insufficient room on the current page for the formatted
4316record, the page is advanced by writing a form feed,
4317a special top-of-page format is used
4318to format the new page header, and then the record is written.
4319By default the top-of-page format is the name of the filehandle with
\*(L"_TOP\*(R" appended, but it may be dynamically set to the
4321format of your choice by assigning the name to the $^ variable while
4322the filehandle is selected.
4323The number of lines remaining on the current page is in variable $-, which
4324can be set to 0 to force a new page.
4325.Sp
4326If FILEHANDLE is unspecified, output goes to the current default output channel,
4327which starts out as
4328.I STDOUT
4329but may be changed by the
4330.I select
4331operator.
4332If the FILEHANDLE is an EXPR, then the expression is evaluated and the
4333resulting string is used to look up the name of the FILEHANDLE at run time.
4334For more on formats, see the section on formats later on.
4335.Sp
4336Note that write is NOT the opposite of read.
4337.Sh "Precedence"
4338.I Perl
4339operators have the following associativity and precedence:
4340.nf
4341
4342nonassoc\h'|1i'print printf exec system sort reverse
4343\h'1.5i'chmod chown kill unlink utime die return
4344left\h'|1i',
4345right\h'|1i'= += \-= *= etc.
4346right\h'|1i'?:
4347nonassoc\h'|1i'.\|.
4348left\h'|1i'||
4349left\h'|1i'&&
4350left\h'|1i'| ^
4351left\h'|1i'&
4352nonassoc\h'|1i'== != <=> eq ne cmp
4353nonassoc\h'|1i'< > <= >= lt gt le ge
4354nonassoc\h'|1i'chdir exit eval reset sleep rand umask
4355nonassoc\h'|1i'\-r \-w \-x etc.
4356left\h'|1i'<< >>
4357left\h'|1i'+ \- .
4358left\h'|1i'* / % x
4359left\h'|1i'=~ !~
4360right\h'|1i'! ~ and unary minus
4361right\h'|1i'**
4362nonassoc\h'|1i'++ \-\|\-
4363left\h'|1i'\*(L'(\*(R'
4364
4365.fi
4366As mentioned earlier, if any list operator (print, etc.) or
4367any unary operator (chdir, etc.)
4368is followed by a left parenthesis as the next token on the same line,
4369the operator and arguments within parentheses are taken to
4370be of highest precedence, just like a normal function call.
4371Examples:
4372.nf
4373
4374 chdir $foo || die;\h'|3i'# (chdir $foo) || die
4375 chdir($foo) || die;\h'|3i'# (chdir $foo) || die
4376 chdir ($foo) || die;\h'|3i'# (chdir $foo) || die
4377 chdir +($foo) || die;\h'|3i'# (chdir $foo) || die
4378
but, because * has higher precedence than named unary operators such as chdir:
4380
4381 chdir $foo * 20;\h'|3i'# chdir ($foo * 20)
4382 chdir($foo) * 20;\h'|3i'# (chdir $foo) * 20
4383 chdir ($foo) * 20;\h'|3i'# (chdir $foo) * 20
4384 chdir +($foo) * 20;\h'|3i'# chdir ($foo * 20)
4385
4386 rand 10 * 20;\h'|3i'# rand (10 * 20)
4387 rand(10) * 20;\h'|3i'# (rand 10) * 20
4388 rand (10) * 20;\h'|3i'# (rand 10) * 20
4389 rand +(10) * 20;\h'|3i'# rand (10 * 20)
4390
4391.fi
4392In the absence of parentheses,
4393the precedence of list operators such as print, sort or chmod is
4394either very high or very low depending on whether you look at the left
side of the operator or the right side of it.
4396For example, in
4397.nf
4398
4399 @ary = (1, 3, sort 4, 2);
4400 print @ary; # prints 1324
4401
4402.fi
4403the commas on the right of the sort are evaluated before the sort, but
4404the commas on the left are evaluated after.
4405In other words, list operators tend to gobble up all the arguments that
4406follow them, and then act like a simple term with regard to the preceding
4407expression.
4408Note that you have to be careful with parens:
4409.nf
4410
4411.ne 3
4412 # These evaluate exit before doing the print:
4413 print($foo, exit); # Obviously not what you want.
4414 print $foo, exit; # Nor is this.
4415
4416.ne 4
4417 # These do the print before evaluating exit:
4418 (print $foo), exit; # This is what you want.
4419 print($foo), exit; # Or this.
4420 print ($foo), exit; # Or even this.
4421
4422Also note that
4423
4424 print ($foo & 255) + 1, "\en";
4425
4426.fi
4427probably doesn't do what you expect at first glance.
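The reason is that the left parenthesis directly after print encloses the
whole argument list, so only $foo & 255 is printed; the addition and the
newline are evaluated afterwards and thrown away.
A sketch of one way to write what was presumably meant ($foo is given an
arbitrary value here):
.nf

	$foo = 511;
	print(($foo & 255) + 1, "\en");	# prints 256 and a newline

.fi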
4428.Sh "Subroutines"
4429A subroutine may be declared as follows:
4430.nf
4431
4432 sub NAME BLOCK
4433
4434.fi
4435.PP
4436Any arguments passed to the routine come in as array @_,
4437that is ($_[0], $_[1], .\|.\|.).
4438The array @_ is a local array, but its values are references to the
4439actual scalar parameters.
4440The return value of the subroutine is the value of the last expression
4441evaluated, and can be either an array value or a scalar value.
4442Alternately, a return statement may be used to specify the returned value and
4443exit the subroutine.
4444To create local variables see the
4445.I local
4446operator.
4447.PP
4448A subroutine is called using the
4449.I do
4450operator or the & operator.
4451.nf
4452
4453.ne 12
4454Example:
4455
4456 sub MAX {
4457 local($max) = pop(@_);
4458 foreach $foo (@_) {
4459 $max = $foo \|if \|$max < $foo;
4460 }
4461 $max;
4462 }
4463
4464 .\|.\|.
4465 $bestday = &MAX($mon,$tue,$wed,$thu,$fri);
4466
4467.ne 21
4468Example:
4469
4470 # get a line, combining continuation lines
4471 # that start with whitespace
4472 sub get_line {
4473 $thisline = $lookahead;
4474 line: while ($lookahead = <STDIN>) {
4475 if ($lookahead \|=~ \|/\|^[ \^\e\|t]\|/\|) {
4476 $thisline \|.= \|$lookahead;
4477 }
4478 else {
4479 last line;
4480 }
4481 }
4482 $thisline;
4483 }
4484
4485 $lookahead = <STDIN>; # get first line
4486 while ($_ = do get_line(\|)) {
4487 .\|.\|.
4488 }
4489
4490.fi
4491.nf
4492.ne 6
4493Use array assignment to a local list to name your formal arguments:
4494
4495 sub maybeset {
4496 local($key, $value) = @_;
4497 $foo{$key} = $value unless $foo{$key};
4498 }
4499
4500.fi
4501This also has the effect of turning call-by-reference into call-by-value,
4502since the assignment copies the values.
4503.Sp
4504Subroutines may be called recursively.
4505If a subroutine is called using the & form, the argument list is optional.
4506If omitted, no @_ array is set up for the subroutine; the @_ array at the
time of the call is visible to the subroutine instead.
4508.nf
4509
4510 do foo(1,2,3); # pass three arguments
4511 &foo(1,2,3); # the same
4512
4513 do foo(); # pass a null list
4514 &foo(); # the same
4515 &foo; # pass no arguments\*(--more efficient
4516
4517.fi
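.PP
The calling conventions described above can be sketched as follows.
This is only an illustration: the subroutine names (incr_first, incr_copy,
count_args and outer) and the variables are invented for the example.
.nf

```perl
sub incr_first {
    $_[0]++;                 # $_[0] refers to the caller's own variable
}

sub incr_copy {
    local($n) = @_;          # the assignment copies the value
    $n++;                    # so the caller's variable is untouched
    $n;                      # value of the last expression is returned
}

sub count_args {
    scalar(@_);              # number of elements in @_
}

sub outer {
    &count_args;             # no argument list: count_args sees outer's @_
}

$x = 5;
&incr_first($x);             # $x is now 6
$y = &incr_copy($x);         # $y is 7, $x is still 6
$seen = &outer(1, 2, 3);     # $seen is 3
```

.fi
The first call changes $x because the elements of @_ refer to the actual
parameters; the second call works on a copy made by the local() assignment.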
4518.Sh "Passing By Reference"
4519Sometimes you don't want to pass the value of an array to a subroutine but
4520rather the name of it, so that the subroutine can modify the global copy
4521of it rather than working with a local copy.
4522In perl you can refer to all the objects of a particular name by prefixing
4523the name with a star: *foo.
4524When evaluated, it produces a scalar value that represents all the objects
4525of that name, including any filehandle, format or subroutine.
4526When assigned to within a local() operation, it causes the name mentioned
4527to refer to whatever * value was assigned to it.
4528Example:
4529.nf
4530
4531 sub doubleary {
4532 local(*someary) = @_;
4533 foreach $elem (@someary) {
4534 $elem *= 2;
4535 }
4536 }
4537 do doubleary(*foo);
4538 do doubleary(*bar);
4539
4540.fi
4541Assignment to *name is currently recommended only inside a local().
4542You can actually assign to *name anywhere, but the previous referent of
4543*name may be stranded forever.
4544This may or may not bother you.
4545.Sp
4546Note that scalars are already passed by reference, so you can modify scalar
4547arguments without using this mechanism by referring explicitly to the $_[nnn]
4548in question.
4549You can modify all the elements of an array by passing all the elements
4550as scalars, but you have to use the * mechanism to push, pop or change the
4551size of an array.
4552The * mechanism will probably be more efficient in any case.
4553.Sp
4554Since a *name value contains unprintable binary data, if it is used as
4555an argument in a print, or as a %s argument in a printf or sprintf, it
4556then has the value '*name', just so it prints out pretty.
4557.Sp
4558Even if you don't want to modify an array, this mechanism is useful for
4559passing multiple arrays in a single LIST, since normally the LIST mechanism
4560will merge all the array values so that you can't extract out the
4561individual arrays.
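.PP
As a sketch of passing two arrays in one LIST, assuming a made-up subroutine
named append_to and made-up array names, the * mechanism keeps the arrays
separate and lets the subroutine change the size of the first one:
.nf

```perl
# Hypothetical subroutine: append the second array onto the first, in place.
sub append_to {
    local(*dst, *src) = @_;      # @dst and @src now name the caller's arrays
    push(@dst, @src);            # grows the caller's first array
    scalar(@dst);                # return the new length
}

@first  = (1, 2);
@second = (3, 4, 5);
$len = &append_to(*first, *second);   # $len is 5; @first is (1,2,3,4,5)
```

.fi
Had the arrays been passed as (@first, @second), the subroutine would have
seen one merged list of five values and could not have told where the first
array ended.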
4562.Sh "Regular Expressions"
4563The patterns used in pattern matching are regular expressions such as
4564those supplied in the Version 8 regexp routines.
4565(In fact, the routines are derived from Henry Spencer's freely redistributable
4566reimplementation of the V8 routines.)
4567In addition, \ew matches an alphanumeric character (including \*(L"_\*(R") and \eW a nonalphanumeric.
4568Word boundaries may be matched by \eb, and non-boundaries by \eB.
4569A whitespace character is matched by \es, non-whitespace by \eS.
4570A numeric character is matched by \ed, non-numeric by \eD.
4571You may use \ew, \es and \ed within character classes.
4572Also, \en, \er, \ef, \et and \eNNN have their normal interpretations.
4573Within character classes \eb represents backspace rather than a word boundary.
4574Alternatives may be separated by |.
4575The bracketing construct \|(\ .\|.\|.\ \|) may also be used, in which case \e<digit>
4576matches the digit'th substring.
4577(Outside of the pattern, always use $ instead of \e in front of the digit.
4578The scope of $<digit> (and $\`, $& and $\')
4579extends to the end of the enclosing BLOCK or eval string, or to
4580the next pattern match with subexpressions.
4581The \e<digit> notation sometimes works outside the current pattern, but should
4582not be relied upon.)
4583You may have as many parentheses as you wish. If you have more than 9
4584substrings, the variables $10, $11, ... refer to the corresponding
4585substring. Within the pattern, \e10, \e11,
4586etc. refer back to substrings if there have been at least that many left parens
before the backreference. Otherwise (for backward compatibility) \e10
4588is the same as \e010, a backspace,
4589and \e11 the same as \e011, a tab.
4590And so on.
4591(\e1 through \e9 are always backreferences.)
4592.PP
4593$+ returns whatever the last bracket match matched.
4594$& returns the entire matched string.
4595($0 used to return the same thing, but not any more.)
4596$\` returns everything before the matched string.
4597$\' returns everything after the matched string.
4598Examples:
4599.nf
4600
4601 s/\|^\|([^ \|]*\|) \|*([^ \|]*\|)\|/\|$2 $1\|/; # swap first two words
4602
4603.ne 5
4604 if (/\|Time: \|(.\|.\|):\|(.\|.\|):\|(.\|.\|)\|/\|) {
4605 $hours = $1;
4606 $minutes = $2;
4607 $seconds = $3;
4608 }
4609
4610.fi
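.PP
The match variables described above can be seen together in a short sketch.
The sample string is invented, and digit classes are written as [0-9] simply
to keep the example short:
.nf

```perl
$_ = 'Time: 12:34:56 GMT';
if (/([0-9][0-9]):([0-9][0-9]):([0-9][0-9])/) {
    ($hours, $minutes, $seconds) = ($1, $2, $3);
    $before = $`;      # 'Time: '
    $whole  = $&;      # '12:34:56'
    $after  = $';      # ' GMT'
    $last   = $+;      # '56', the last bracket matched
}
```

.fi
The values are copied into ordinary variables inside the block because the
scope of $<digit> and its relatives ends with the enclosing BLOCK or at the
next pattern match with subexpressions.
.PP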
4611By default, the ^ character is only guaranteed to match at the beginning
4612of the string,
4613the $ character only at the end (or before the newline at the end)
4614and
4615.I perl
4616does certain optimizations with the assumption that the string contains
4617only one line.
4618The behavior of ^ and $ on embedded newlines will be inconsistent.
4619You may, however, wish to treat a string as a multi-line buffer, such that
4620the ^ will match after any newline within the string, and $ will match
4621before any newline.
4622At the cost of a little more overhead, you can do this by setting the variable
4623$* to 1.
4624Setting it back to 0 makes
4625.I perl
4626revert to its old behavior.
4627.PP
4628To facilitate multi-line substitutions, the . character never matches a newline
4629(even when $* is 0).
4630In particular, the following leaves a newline on the $_ string:
4631.nf
4632
4633 $_ = <STDIN>;
4634 s/.*(some_string).*/$1/;
4635
4636If the newline is unwanted, try one of
4637
4638 s/.*(some_string).*\en/$1/;
4639 s/.*(some_string)[^\e000]*/$1/;
4640 s/.*(some_string)(.|\en)*/$1/;
4641 chop; s/.*(some_string).*/$1/;
4642 /(some_string)/ && ($_ = $1);
4643
4644.fi
4645Any item of a regular expression may be followed with digits in curly brackets
4646of the form {n,m}, where n gives the minimum number of times to match the item
4647and m gives the maximum.
4648The form {n} is equivalent to {n,n} and matches exactly n times.
4649The form {n,} matches n or more times.
4650(If a curly bracket occurs in any other context, it is treated as a regular
4651character.)
4652The * modifier is equivalent to {0,}, the + modifier to {1,} and the ? modifier
4653to {0,1}.
4654There is no limit to the size of n or m, but large numbers will chew up
4655more memory.
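.PP
A few made-up test strings show how the curly bracket forms behave; each
variable below ends up 1 if the pattern matched and 0 if it did not:
.nf

```perl
$two_to_four = ('abc'   =~ /^[a-z]{2,4}$/) ? 1 : 0;   # 1: three letters fit
$too_many    = ('abcde' =~ /^[a-z]{2,4}$/) ? 1 : 0;   # 0: five is over the maximum
$exactly     = ('1992'  =~ /^[0-9]{4}$/)   ? 1 : 0;   # 1: {4} means {4,4}
$at_least    = ('a'     =~ /^a{2,}$/)      ? 1 : 0;   # 0: fewer than two
$many        = ('aaaaa' =~ /^a{2,}$/)      ? 1 : 0;   # 1: no upper limit
```

.fi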
4656.Sp
4657You will note that all backslashed metacharacters in
4658.I perl
4659are alphanumeric,
4660such as \eb, \ew, \en.
4661Unlike some other regular expression languages, there are no backslashed
4662symbols that aren't alphanumeric.
4663So anything that looks like \e\e, \e(, \e), \e<, \e>, \e{, or \e} is always
4664interpreted as a literal character, not a metacharacter.
4665This makes it simple to quote a string that you want to use for a pattern
4666but that you are afraid might contain metacharacters.
4667Simply quote all the non-alphanumeric characters:
4668.nf
4669
4670 $pattern =~ s/(\eW)/\e\e$1/g;
4671
4672.fi
4673.Sh "Formats"
4674Output record formats for use with the
4675.I write
operator may be declared as follows:
4677.nf
4678
4679.ne 3
4680 format NAME =
4681 FORMLIST
4682 .
4683
4684.fi
If NAME is omitted, format \*(L"STDOUT\*(R" is defined.
4686FORMLIST consists of a sequence of lines, each of which may be of one of three
4687types:
4688.Ip 1. 4
4689A comment.
4690.Ip 2. 4
4691A \*(L"picture\*(R" line giving the format for one output line.
4692.Ip 3. 4
4693An argument line supplying values to plug into a picture line.
4694.PP
4695Picture lines are printed exactly as they look, except for certain fields
4696that substitute values into the line.
4697Each picture field starts with either @ or ^.
4698The @ field (not to be confused with the array marker @) is the normal
4699case; ^ fields are used
4700to do rudimentary multi-line text block filling.
4701The length of the field is supplied by padding out the field
4702with multiple <, >, or | characters to specify, respectively, left justification,
4703right justification, or centering.
4704As an alternate form of right justification,
4705you may also use # characters (with an optional .) to specify a numeric field.
4706(Use of ^ instead of @ causes the field to be blanked if undefined.)
4707If any of the values supplied for these fields contains a newline, only
4708the text up to the newline is printed.
4709The special field @* can be used for printing multi-line values.
4710It should appear by itself on a line.
4711.PP
4712The values are specified on the following line, in the same order as
4713the picture fields.
4714The values should be separated by commas.
4715.PP
4716Picture fields that begin with ^ rather than @ are treated specially.
4717The value supplied must be a scalar variable name which contains a text
4718string.
4719.I Perl
4720puts as much text as it can into the field, and then chops off the front
4721of the string so that the next time the variable is referenced,
4722more of the text can be printed.
4723Normally you would use a sequence of fields in a vertical stack to print
4724out a block of text.
4725If you like, you can end the final field with .\|.\|., which will appear in the
4726output if the text was too long to appear in its entirety.
4727You can change which characters are legal to break on by changing the
4728variable $: to a list of the desired characters.
4729.PP
4730Since use of ^ fields can produce variable length records if the text to be
4731formatted is short, you can suppress blank lines by putting the tilde (~)
4732character anywhere in the line.
4733(Normally you should put it in the front if possible, for visibility.)
4734The tilde will be translated to a space upon output.
4735If you put a second tilde contiguous to the first, the line will be repeated
4736until all the fields on the line are exhausted.
4737(If you use a field of the @ variety, the expression you supply had better
4738not give the same value every time forever!)
4739.PP
4740Examples:
4741.nf
4742.lg 0
4743.cs R 25
4744.ft C
4745
4746.ne 10
4747# a report on the /etc/passwd file
4748format STDOUT_TOP =
4749\& Passwd File
4750Name Login Office Uid Gid Home
4751------------------------------------------------------------------
4752\&.
4753format STDOUT =
4754@<<<<<<<<<<<<<<<<<< @||||||| @<<<<<<@>>>> @>>>> @<<<<<<<<<<<<<<<<<
4755$name, $login, $office,$uid,$gid, $home
4756\&.
4757
4758.ne 29
4759# a report from a bug report form
4760format STDOUT_TOP =
4761\& Bug Reports
4762@<<<<<<<<<<<<<<<<<<<<<<< @||| @>>>>>>>>>>>>>>>>>>>>>>>
4763$system, $%, $date
4764------------------------------------------------------------------
4765\&.
4766format STDOUT =
4767Subject: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4768\& $subject
4769Index: @<<<<<<<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4770\& $index, $description
4771Priority: @<<<<<<<<<< Date: @<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4772\& $priority, $date, $description
4773From: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4774\& $from, $description
4775Assigned to: @<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4776\& $programmer, $description
4777\&~ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4778\& $description
4779\&~ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4780\& $description
4781\&~ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4782\& $description
4783\&~ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4784\& $description
4785\&~ ^<<<<<<<<<<<<<<<<<<<<<<<...
4786\& $description
4787\&.
4788
4789.ft R
4790.cs R
4791.lg
4792.fi
4793It is possible to intermix prints with writes on the same output channel,
4794but you'll have to handle $\- (lines left on the page) yourself.
4795.PP
4796If you are printing lots of fields that are usually blank, you should consider
4797using the reset operator between records.
4798Not only is it more efficient, but it can prevent the bug of adding another
4799field and forgetting to zero it.
4800.Sh "Interprocess Communication"
4801The IPC facilities of perl are built on the Berkeley socket mechanism.
4802If you don't have sockets, you can ignore this section.
4803The calls have the same names as the corresponding system calls,
4804but the arguments tend to differ, for two reasons.
4805First, perl file handles work differently than C file descriptors.
4806Second, perl already knows the length of its strings, so you don't need
4807to pass that information.
4808Here is a sample client (untested):
4809.nf
4810
4811 ($them,$port) = @ARGV;
4812 $port = 2345 unless $port;
4813 $them = 'localhost' unless $them;
4814
4815 $SIG{'INT'} = 'dokill';
4816 sub dokill { kill 9,$child if $child; }
4817
4818 require 'sys/socket.ph';
4819
4820 $sockaddr = 'S n a4 x8';
4821 chop($hostname = `hostname`);
4822
4823 ($name, $aliases, $proto) = getprotobyname('tcp');
4824 ($name, $aliases, $port) = getservbyname($port, 'tcp')
4825 unless $port =~ /^\ed+$/;
4826.ie t \{\
4827 ($name, $aliases, $type, $len, $thisaddr) = gethostbyname($hostname);
4828'br\}
4829.el \{\
4830 ($name, $aliases, $type, $len, $thisaddr) =
4831 gethostbyname($hostname);
4832'br\}
4833 ($name, $aliases, $type, $len, $thataddr) = gethostbyname($them);
4834
4835 $this = pack($sockaddr, &AF_INET, 0, $thisaddr);
4836 $that = pack($sockaddr, &AF_INET, $port, $thataddr);
4837
4838 socket(S, &PF_INET, &SOCK_STREAM, $proto) || die "socket: $!";
4839 bind(S, $this) || die "bind: $!";
4840 connect(S, $that) || die "connect: $!";
4841
4842 select(S); $| = 1; select(stdout);
4843
4844 if ($child = fork) {
4845 while (<>) {
4846 print S;
4847 }
4848 sleep 3;
4849 do dokill();
4850 }
4851 else {
4852 while (<S>) {
4853 print;
4854 }
4855 }
4856
4857.fi
4858And here's a server:
4859.nf
4860
4861 ($port) = @ARGV;
4862 $port = 2345 unless $port;
4863
4864 require 'sys/socket.ph';
4865
4866 $sockaddr = 'S n a4 x8';
4867
4868 ($name, $aliases, $proto) = getprotobyname('tcp');
4869 ($name, $aliases, $port) = getservbyname($port, 'tcp')
4870 unless $port =~ /^\ed+$/;
4871
4872 $this = pack($sockaddr, &AF_INET, $port, "\e0\e0\e0\e0");
4873
4874 select(NS); $| = 1; select(stdout);
4875
4876 socket(S, &PF_INET, &SOCK_STREAM, $proto) || die "socket: $!";
4877 bind(S, $this) || die "bind: $!";
	listen(S, 5) || die "listen: $!";
4879
4880 select(S); $| = 1; select(stdout);
4881
4882 for (;;) {
4883 print "Listening again\en";
4884 ($addr = accept(NS,S)) || die $!;
4885 print "accept ok\en";
4886
4887 ($af,$port,$inetaddr) = unpack($sockaddr,$addr);
4888 @inetaddr = unpack('C4',$inetaddr);
4889 print "$af $port @inetaddr\en";
4890
4891 while (<NS>) {
4892 print;
4893 print NS;
4894 }
4895 }
4896
4897.fi
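.PP
The address packing used by both programs can be tried without opening a
socket.
In this sketch the value 2 merely stands in for &AF_INET (an assumption; take
the real value from sys/socket.ph on your system), and the port and address
are invented:
.nf

```perl
$sockaddr = 'S n a4 x8';                 # same template as the examples above
$af_inet  = 2;                           # assumption: stands in for &AF_INET
$ip       = pack('C4', 127, 0, 0, 1);    # four address bytes for 127.0.0.1

$addr = pack($sockaddr, $af_inet, 2345, $ip);
$size = length($addr);                   # 2 + 2 + 4 + 8 = 16 bytes

($af, $port, $inetaddr) = unpack($sockaddr, $addr);
@inetaddr = unpack('C4', $inetaddr);
$dotted = join('.', @inetaddr);          # '127.0.0.1'
```

.fi
The n in the template stores the port in network byte order, and the x8 pads
the structure out with eight null bytes.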
4898.Sh "Predefined Names"
4899The following names have special meaning to
4900.IR perl .
4901I could have used alphabetic symbols for some of these, but I didn't want
4902to take the chance that someone would say reset \*(L"a\-zA\-Z\*(R" and wipe them all
4903out.
4904You'll just have to suffer along with these silly symbols.
4905Most of them have reasonable mnemonics, or analogues in one of the shells.
4906.Ip $_ 8
4907The default input and pattern-searching space.
4908The following pairs are equivalent:
4909.nf
4910
4911.ne 2
4912 while (<>) {\|.\|.\|. # only equivalent in while!
4913 while ($_ = <>) {\|.\|.\|.
4914
4915.ne 2
4916 /\|^Subject:/
4917 $_ \|=~ \|/\|^Subject:/
4918
4919.ne 2
4920 y/a\-z/A\-Z/
4921 $_ =~ y/a\-z/A\-Z/
4922
4923.ne 2
4924 chop
4925 chop($_)
4926
4927.fi
4928(Mnemonic: underline is understood in certain operations.)
4929.Ip $. 8
4930The current input line number of the last filehandle that was read.
4931Readonly.
4932Remember that only an explicit close on the filehandle resets the line number.
4933Since <> never does an explicit close, line numbers increase across ARGV files
4934(but see examples under eof).
4935(Mnemonic: many programs use . to mean the current line number.)
4936.Ip $/ 8
4937The input record separator, newline by default.
4938Works like
4939.IR awk 's
4940RS variable, including treating blank lines as delimiters
4941if set to the null string.
4942You may set it to a multicharacter string to match a multi-character
4943delimiter.
4944Note that setting it to "\en\en" means something slightly different
4945than setting it to "", if the file contains consecutive blank lines.
4946Setting it to "" will treat two or more consecutive blank lines as a single
4947blank line.
4948Setting it to "\en\en" will blindly assume that the next input character
4949belongs to the next paragraph, even if it's a newline.
4950(Mnemonic: / is used to delimit line boundaries when quoting poetry.)
4951.Ip $, 8
4952The output field separator for the print operator.
4953Ordinarily the print operator simply prints out the comma separated fields
4954you specify.
4955In order to get behavior more like
4956.IR awk ,
4957set this variable as you would set
4958.IR awk 's
4959OFS variable to specify what is printed between fields.
4960(Mnemonic: what is printed when there is a , in your print statement.)
4961.Ip $"" 8
4962This is like $, except that it applies to array values interpolated into
4963a double-quoted string (or similar interpreted string).
4964Default is a space.
4965(Mnemonic: obvious, I think.)
4966.Ip $\e 8
4967The output record separator for the print operator.
4968Ordinarily the print operator simply prints out the comma separated fields
4969you specify, with no trailing newline or record separator assumed.
4970In order to get behavior more like
4971.IR awk ,
4972set this variable as you would set
4973.IR awk 's
4974ORS variable to specify what is printed at the end of the print.
4975(Mnemonic: you set $\e instead of adding \en at the end of the print.
4976Also, it's just like /, but it's what you get \*(L"back\*(R" from
4977.IR perl .)
4978.Ip $# 8
4979The output format for printed numbers.
4980This variable is a half-hearted attempt to emulate
4981.IR awk 's
4982OFMT variable.
4983There are times, however, when
4984.I awk
4985and
4986.I perl
4987have differing notions of what
4988is in fact numeric.
4989Also, the initial value is %.20g rather than %.6g, so you need to set $#
4990explicitly to get
4991.IR awk 's
4992value.
4993(Mnemonic: # is the number sign.)
4994.Ip $% 8
4995The current page number of the currently selected output channel.
4996(Mnemonic: % is page number in nroff.)
4997.Ip $= 8
4998The current page length (printable lines) of the currently selected output
4999channel.
5000Default is 60.
5001(Mnemonic: = has horizontal lines.)
5002.Ip $\- 8
5003The number of lines left on the page of the currently selected output channel.
5004(Mnemonic: lines_on_page \- lines_printed.)
5005.Ip $~ 8
5006The name of the current report format for the currently selected output
5007channel.
5008Default is name of the filehandle.
5009(Mnemonic: brother to $^.)
5010.Ip $^ 8
5011The name of the current top-of-page format for the currently selected output
5012channel.
5013Default is name of the filehandle with \*(L"_TOP\*(R" appended.
5014(Mnemonic: points to top of page.)
5015.Ip $| 8
5016If set to nonzero, forces a flush after every write or print on the currently
5017selected output channel.
5018Default is 0.
5019Note that
5020.I STDOUT
5021will typically be line buffered if output is to the
5022terminal and block buffered otherwise.
5023Setting this variable is useful primarily when you are outputting to a pipe,
5024such as when you are running a
5025.I perl
5026script under rsh and want to see the
5027output as it's happening.
5028(Mnemonic: when you want your pipes to be piping hot.)
5029.Ip $$ 8
5030The process number of the
5031.I perl
5032running this script.
5033(Mnemonic: same as shells.)
5034.Ip $? 8
5035The status returned by the last pipe close, backtick (\`\`) command or
5036.I system
5037operator.
5038Note that this is the status word returned by the wait() system
5039call, so the exit value of the subprocess is actually ($? >> 8).
5040$? & 255 gives which signal, if any, the process died from, and whether
5041there was a core dump.
5042(Mnemonic: similar to sh and ksh.)
5043.Ip $& 8 4
5044The string matched by the last successful pattern match
5045(not counting any matches hidden
5046within a BLOCK or eval enclosed by the current BLOCK).
5047(Mnemonic: like & in some editors.)
5048.Ip $\` 8 4
5049The string preceding whatever was matched by the last successful pattern match
5050(not counting any matches hidden within a BLOCK or eval enclosed by the current
5051BLOCK).
5052(Mnemonic: \` often precedes a quoted string.)
5053.Ip $\' 8 4
5054The string following whatever was matched by the last successful pattern match
5055(not counting any matches hidden within a BLOCK or eval enclosed by the current
5056BLOCK).
5057(Mnemonic: \' often follows a quoted string.)
5058Example:
5059.nf
5060
5061.ne 3
5062 $_ = \'abcdefghi\';
5063 /def/;
5064 print "$\`:$&:$\'\en"; # prints abc:def:ghi
5065
5066.fi
5067.Ip $+ 8 4
5068The last bracket matched by the last search pattern.
5069This is useful if you don't know which of a set of alternative patterns
5070matched.
5071For example:
5072.nf
5073
5074 /Version: \|(.*\|)|Revision: \|(.*\|)\|/ \|&& \|($rev = $+);
5075
5076.fi
5077(Mnemonic: be positive and forward looking.)
5078.Ip $* 8 2
5079Set to 1 to do multiline matching within a string, 0 to tell
5080.I perl
5081that it can assume that strings contain a single line, for the purpose
5082of optimizing pattern matches.
5083Pattern matches on strings containing multiple newlines can produce confusing
5084results when $* is 0.
5085Default is 0.
5086(Mnemonic: * matches multiple things.)
5087Note that this variable only influences the interpretation of ^ and $.
5088A literal newline can be searched for even when $* == 0.
5089.Ip $0 8
5090Contains the name of the file containing the
5091.I perl
5092script being executed.
5093Assigning to $0 modifies the argument area that the ps(1) program sees.
5094(Mnemonic: same as sh and ksh.)
5095.Ip $<digit> 8
5096Contains the subpattern from the corresponding set of parentheses in the last
5097pattern matched, not counting patterns matched in nested blocks that have
5098been exited already.
5099(Mnemonic: like \edigit.)
5100.Ip $[ 8 2
5101The index of the first element in an array, and of the first character in
5102a substring.
5103Default is 0, but you could set it to 1 to make
5104.I perl
5105behave more like
5106.I awk
5107(or Fortran)
5108when subscripting and when evaluating the index() and substr() functions.
5109(Mnemonic: [ begins subscripts.)
5110.Ip $] 8 2
5111The string printed out when you say \*(L"perl -v\*(R".
5112It can be used to determine at the beginning of a script whether the perl
5113interpreter executing the script is in the right range of versions.
5114If used in a numeric context, returns the version + patchlevel / 1000.
5115Example:
5116.nf
5117
5118.ne 8
5119 # see if getc is available
5120 ($version,$patchlevel) =
5121 $] =~ /(\ed+\e.\ed+).*\enPatch level: (\ed+)/;
5122 print STDERR "(No filename completion available.)\en"
5123 if $version * 1000 + $patchlevel < 2016;
5124
5125or, used numerically,
5126
5127 warn "No checksumming!\en" if $] < 3.019;
5128
5129.fi
5130(Mnemonic: Is this version of perl in the right bracket?)
5131.Ip $; 8 2
5132The subscript separator for multi-dimensional array emulation.
5133If you refer to an associative array element as
5134.nf
5135 $foo{$a,$b,$c}
5136
5137it really means
5138
5139 $foo{join($;, $a, $b, $c)}
5140
5141But don't put
5142
5143 @foo{$a,$b,$c} # a slice\*(--note the @
5144
5145which means
5146
5147 ($foo{$a},$foo{$b},$foo{$c})
5148
5149.fi
5150Default is "\e034", the same as SUBSEP in
5151.IR awk .
5152Note that if your keys contain binary data there might not be any safe
5153value for $;.
5154(Mnemonic: comma (the syntactic subscript separator) is a semi-semicolon.
5155Yeah, I know, it's pretty lame, but $, is already taken for something more
5156important.)
5157.Ip $! 8 2
5158If used in a numeric context, yields the current value of errno, with all the
5159usual caveats.
5160(This means that you shouldn't depend on the value of $! to be anything
5161in particular unless you've gotten a specific error return indicating a
5162system error.)
5163If used in a string context, yields the corresponding system error string.
5164You can assign to $! in order to set errno
5165if, for instance, you want $! to return the string for error n, or you want
5166to set the exit value for the die operator.
5167(Mnemonic: What just went bang?)
5168.Ip $@ 8 2
5169The perl syntax error message from the last eval command.
5170If null, the last eval parsed and executed correctly (although the operations
5171you invoked may have failed in the normal fashion).
5172(Mnemonic: Where was the syntax error \*(L"at\*(R"?)
5173.Ip $< 8 2
5174The real uid of this process.
5175(Mnemonic: it's the uid you came FROM, if you're running setuid.)
5176.Ip $> 8 2
5177The effective uid of this process.
5178Example:
5179.nf
5180
5181.ne 2
5182 $< = $>; # set real uid to the effective uid
5183 ($<,$>) = ($>,$<); # swap real and effective uid
5184
5185.fi
5186(Mnemonic: it's the uid you went TO, if you're running setuid.)
5187Note: $< and $> can only be swapped on machines supporting setreuid().
5188.Ip $( 8 2
5189The real gid of this process.
5190If you are on a machine that supports membership in multiple groups
5191simultaneously, gives a space separated list of groups you are in.
5192The first number is the one returned by getgid(), and the subsequent ones
5193by getgroups(), one of which may be the same as the first number.
5194(Mnemonic: parentheses are used to GROUP things.
5195The real gid is the group you LEFT, if you're running setgid.)
5196.Ip $) 8 2
5197The effective gid of this process.
5198If you are on a machine that supports membership in multiple groups
5199simultaneously, gives a space separated list of groups you are in.
5200The first number is the one returned by getegid(), and the subsequent ones
5201by getgroups(), one of which may be the same as the first number.
5202(Mnemonic: parentheses are used to GROUP things.
5203The effective gid is the group that's RIGHT for you, if you're running setgid.)
5204.Sp
5205Note: $<, $>, $( and $) can only be set on machines that support the
5206corresponding set[re][ug]id() routine.
5207$( and $) can only be swapped on machines supporting setregid().
5208.Ip $: 8 2
5209The current set of characters after which a string may be broken to
5210fill continuation fields (starting with ^) in a format.
5211Default is "\ \en-", to break on whitespace or hyphens.
5212(Mnemonic: a \*(L"colon\*(R" in poetry is a part of a line.)
5213.Ip $^D 8 2
5214The current value of the debugging flags.
5215(Mnemonic: value of
5216.B \-D
5217switch.)
5218.Ip $^F 8 2
5219The maximum system file descriptor, ordinarily 2. System file descriptors
5220are passed to subprocesses, while higher file descriptors are not.
5221During an open, system file descriptors are preserved even if the open
5222fails. Ordinary file descriptors are closed before the open is attempted.
5223.Ip $^I 8 2
5224The current value of the inplace-edit extension.
5225Use undef to disable inplace editing.
5226(Mnemonic: value of
5227.B \-i
5228switch.)
5229.Ip $^L 8 2
5230What formats output to perform a formfeed. Default is \ef.
5231.Ip $^P 8 2
5232The internal flag that the debugger clears so that it doesn't
debug itself. You could conceivably disable debugging yourself
5234by clearing it.
5235.Ip $^T 8 2
5236The time at which the script began running, in seconds since the epoch.
5237The values returned by the
5238.B \-M ,
5239.B \-A
5240and
5241.B \-C
5242filetests are based on this value.
5243.Ip $^W 8 2
5244The current value of the warning switch.
5245(Mnemonic: related to the
5246.B \-w
5247switch.)
5248.Ip $^X 8 2
5249The name that Perl itself was executed as, from argv[0].
5250.Ip $ARGV 8 3
Contains the name of the current file when reading from <>.
5252.Ip @ARGV 8 3
5253The array ARGV contains the command line arguments intended for the script.
Note that $#ARGV is generally the number of arguments minus one, since
5255$ARGV[0] is the first argument, NOT the command name.
5256See $0 for the command name.
5257.Ip @INC 8 3
5258The array INC contains the list of places to look for
5259.I perl
5260scripts to be
5261evaluated by the \*(L"do EXPR\*(R" command or the \*(L"require\*(R" command.
5262It initially consists of the arguments to any
5263.B \-I
5264command line switches, followed
5265by the default
5266.I perl
5267library, probably \*(L"/usr/local/lib/perl\*(R",
5268followed by \*(L".\*(R", to represent the current directory.
5269.Ip %INC 8 3
5270The associative array INC contains entries for each filename that has
5271been included via \*(L"do\*(R" or \*(L"require\*(R".
5272The key is the filename you specified, and the value is the location of
5273the file actually found.
5274The \*(L"require\*(R" command uses this array to determine whether
5275a given file has already been included.
5276.Ip $ENV{expr} 8 2
5277The associative array ENV contains your current environment.
5278Setting a value in ENV changes the environment for child processes.
5279.Ip $SIG{expr} 8 2
5280The associative array SIG is used to set signal handlers for various signals.
5281Example:
5282.nf
5283
5284.ne 12
5285 sub handler { # 1st argument is signal name
5286 local($sig) = @_;
5287 print "Caught a SIG$sig\-\|\-shutting down\en";
5288 close(LOG);
5289 exit(0);
5290 }
5291
5292 $SIG{\'INT\'} = \'handler\';
5293 $SIG{\'QUIT\'} = \'handler\';
5294 .\|.\|.
5295 $SIG{\'INT\'} = \'DEFAULT\'; # restore default action
5296 $SIG{\'QUIT\'} = \'IGNORE\'; # ignore SIGQUIT
5297
5298.fi
5299The SIG array only contains values for the signals actually set within
5300the perl script.
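.PP
Three of the variables above lend themselves to a short sketch.
The array, the keys and the status word are invented for the example, and the
status word is built by hand rather than taken from a real subprocess:
.nf

```perl
# $" separates array elements interpolated into a double-quoted string.
@days = ('mon', 'tue', 'wed');
$spaced = "@days";            # 'mon tue wed' with the default of a space
{
    local($") = '-';
    $dashed = "@days";        # 'mon-tue-wed'
}

# $; joins the subscripts of an emulated multi-dimensional array.
$grid{1,2} = 'x';
$key  = join($;, 1, 2);
$same = $grid{$key};          # 'x'

# Take apart a wait() status word of the kind found in $?.
$status  = (3 << 8) | 0;      # hypothetical: exit value 3, no signal
$exit    = $status >> 8;      # 3
$sigcore = $status & 255;     # 0: no signal and no core dump
```

.fi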
5301.Sh "Packages"
5302Perl provides a mechanism for alternate namespaces to protect packages from
stomping on each other's variables.
5304By default, a perl script starts compiling into the package known as \*(L"main\*(R".
5305By use of the
5306.I package
5307declaration, you can switch namespaces.
5308The scope of the package declaration is from the declaration itself to the end
5309of the enclosing block (the same scope as the local() operator).
5310Typically it would be the first declaration in a file to be included by
5311the \*(L"require\*(R" operator.
5312You can switch into a package in more than one place; it merely influences
5313which symbol table is used by the compiler for the rest of that block.
5314You can refer to variables and filehandles in other packages by prefixing
5315the identifier with the package name and a single quote.
If the package name is null, the \*(L"main\*(R" package is assumed.
5317.PP
Only identifiers starting with letters are stored in the package's symbol
5319table.
5320All other symbols are kept in package \*(L"main\*(R".
5321In addition, the identifiers STDIN, STDOUT, STDERR, ARGV, ARGVOUT, ENV, INC
5322and SIG are forced to be in package \*(L"main\*(R", even when used for
5323other purposes than their built-in one.
5324Note also that, if you have a package called \*(L"m\*(R", \*(L"s\*(R"
or \*(L"y\*(R", then you can't use the qualified form of an identifier since it
5326will be interpreted instead as a pattern match, a substitution
5327or a translation.
5328.PP
Eval'ed strings are compiled in the package in which the eval was compiled.
5331(Assignments to $SIG{}, however, assume the signal handler specified is in the
5332main package.
5333Qualify the signal handler name if you wish to have a signal handler in
5334a package.)
5335For an example, examine perldb.pl in the perl library.
5336It initially switches to the DB package so that the debugger doesn't interfere
5337with variables in the script you are trying to debug.
5338At various points, however, it temporarily switches back to the main package
5339to evaluate various expressions in the context of the main package.
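.PP
For instance, a handler defined inside a package has to be named with its
package prefix when it is installed.
The following is only a sketch; the package
.I sigs
and the subroutine
.I catch
are hypothetical names.
Because SIG is forced into package main, the assignment reaches the real
signal table even though it is written inside the package:
.nf

.ne 8
    package sigs;

    sub catch {
        local($sig) = @_;           # the handler is passed the signal name
        $caught = $sig;             # sets $sigs'caught
    }

    $SIG{'USR1'} = "sigs'catch";    # a bare "catch" would mean main'catch

.fi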
5340.PP
5341The symbol table for a package happens to be stored in the associative array
5342of that name prepended with an underscore.
5343The value in each entry of the associative array is
5344what you are referring to when you use the *name notation.
5345In fact, the following have the same effect (in package main, anyway),
5346though the first is more
5347efficient because it does the symbol table lookups at compile time:
5348.nf
5349
5350.ne 2
5351 local(*foo) = *bar;
5352 local($_main{'foo'}) = $_main{'bar'};
5353
5354.fi
5355You can use this to print out all the variables in a package, for instance.
5356Here is dumpvar.pl from the perl library:
5357.nf
5358.ne 11
5359 package dumpvar;
5360
5361 sub main'dumpvar {
5362 \& ($package) = @_;
5363 \& local(*stab) = eval("*_$package");
5364 \& while (($key,$val) = each(%stab)) {
5365 \& {
5366 \& local(*entry) = $val;
5367 \& if (defined $entry) {
5368 \& print "\e$$key = '$entry'\en";
5369 \& }
5370.ne 7
5371 \& if (defined @entry) {
5372 \& print "\e@$key = (\en";
5373 \& foreach $num ($[ .. $#entry) {
5374 \& print " $num\et'",$entry[$num],"'\en";
5375 \& }
5376 \& print ")\en";
5377 \& }
5378.ne 10
5379 \& if ($key ne "_$package" && defined %entry) {
5380 \& print "\e%$key = (\en";
5381 \& foreach $key (sort keys(%entry)) {
5382 \& print " $key\et'",$entry{$key},"'\en";
5383 \& }
5384 \& print ")\en";
5385 \& }
5386 \& }
5387 \& }
5388 }
5389
5390.fi
5391Note that, even though the subroutine is compiled in package dumpvar, the
5392name of the subroutine is qualified so that its name is inserted into package
5393\*(L"main\*(R".
5394.Sh "Style"
Each programmer will, of course, have his or her own preferences with regard
5396to formatting, but there are some general guidelines that will make your
5397programs easier to read.
5398.Ip 1. 4 4
5399Just because you CAN do something a particular way doesn't mean that
5400you SHOULD do it that way.
5401.I Perl
5402is designed to give you several ways to do anything, so consider picking
5403the most readable one.
5404For instance
5405
5406 open(FOO,$foo) || die "Can't open $foo: $!";
5407
5408is better than
5409
5410 die "Can't open $foo: $!" unless open(FOO,$foo);
5411
5412because the second way hides the main point of the statement in a
5413modifier.
5414On the other hand
5415
5416 print "Starting analysis\en" if $verbose;
5417
5418is better than
5419
5420 $verbose && print "Starting analysis\en";
5421
5422since the main point isn't whether the user typed -v or not.
5423.Sp
5424Similarly, just because an operator lets you assume default arguments
5425doesn't mean that you have to make use of the defaults.
5426The defaults are there for lazy systems programmers writing one-shot
5427programs.
5428If you want your program to be readable, consider supplying the argument.
5429.Sp
5430Along the same lines, just because you
5431.I can
5432omit parentheses in many places doesn't mean that you ought to:
5433.nf
5434
	return print reverse sort num values %array;
	return print(reverse(sort num (values(%array))));
5437
5438.fi
5439When in doubt, parenthesize.
5440At the very least it will let some poor schmuck bounce on the % key in vi.
5441.Sp
5442Even if you aren't in doubt, consider the mental welfare of the person who
5443has to maintain the code after you, and who will probably put parens in
5444the wrong place.
5445.Ip 2. 4 4
5446Don't go through silly contortions to exit a loop at the top or the
5447bottom, when
5448.I perl
5449provides the "last" operator so you can exit in the middle.
5450Just outdent it a little to make it more visible:
5451.nf
5452
5453.ne 7
5454 line:
5455 for (;;) {
5456 statements;
5457 last line if $foo;
5458 next line if /^#/;
5459 statements;
5460 }
5461
5462.fi
5463.Ip 3. 4 4
5464Don't be afraid to use loop labels\*(--they're there to enhance readability as
5465well as to allow multi-level loop breaks.
See the previous example.
5467.Ip 4. 4 4
5468For portability, when using features that may not be implemented on every
5469machine, test the construct in an eval to see if it fails.
If you know the version or patchlevel in which a particular feature was implemented,
5471you can test $] to see if it will be there.
5472.Ip 5. 4 4
5473Choose mnemonic identifiers.
5474.Ip 6. 4 4
5475Be consistent.
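.PP
To illustrate guideline 4, here is one hedged sketch of both kinds of test.
The array
.I @nums
and the subroutine
.I bynum
are illustrative names; 4.018 is the version in which sort {} was added.
.nf

.ne 12
    # Probe a feature at run time: $@ is empty if the eval did not fail.
    $symlink_exists = (eval 'symlink("","");', $@ eq '');

    # Choose by version: use sort {} where it exists, else a named sub.
    sub bynum { $a <=> $b; }
    @nums = (10, 9, 100);
    if ($] >= 4.018) {
        @sorted = eval 'sort { $a <=> $b } @nums';
    }
    else {
        @sorted = sort bynum @nums;
    }

.fi
The newer construct is kept inside a string so that an older
.I perl
never has to compile it.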
5476.Sh "Debugging"
5477If you invoke
5478.I perl
5479with a
5480.B \-d
5481switch, your script will be run under a debugging monitor.
5482It will halt before the first executable statement and ask you for a
5483command, such as:
5484.Ip "h" 12 4
5485Prints out a help message.
5486.Ip "T" 12 4
5487Stack trace.
5488.Ip "s" 12 4
5489Single step.
5490Executes until it reaches the beginning of another statement.
5491.Ip "n" 12 4
5492Next.
5493Executes over subroutine calls, until it reaches the beginning of the
5494next statement.
5495.Ip "f" 12 4
5496Finish.
5497Executes statements until it has finished the current subroutine.
5498.Ip "c" 12 4
5499Continue.
5500Executes until the next breakpoint is reached.
5501.Ip "c line" 12 4
5502Continue to the specified line.
5503Inserts a one-time-only breakpoint at the specified line.
5504.Ip "<CR>" 12 4
5505Repeat last n or s.
5506.Ip "l min+incr" 12 4
5507List incr+1 lines starting at min.
5508If min is omitted, starts where last listing left off.
5509If incr is omitted, previous value of incr is used.
5510.Ip "l min-max" 12 4
5511List lines in the indicated range.
5512.Ip "l line" 12 4
5513List just the indicated line.
5514.Ip "l" 12 4
5515List next window.
5516.Ip "-" 12 4
5517List previous window.
5518.Ip "w line" 12 4
5519List window around line.
5520.Ip "l subname" 12 4
5521List subroutine.
5522If it's a long subroutine it just lists the beginning.
5523Use \*(L"l\*(R" to list more.
5524.Ip "/pattern/" 12 4
5525Regular expression search forward for pattern; the final / is optional.
5526.Ip "?pattern?" 12 4
5527Regular expression search backward for pattern; the final ? is optional.
5528.Ip "L" 12 4
5529List lines that have breakpoints or actions.
5530.Ip "S" 12 4
5531Lists the names of all subroutines.
5532.Ip "t" 12 4
5533Toggle trace mode on or off.
5534.Ip "b line condition" 12 4
5535Set a breakpoint.
5536If line is omitted, sets a breakpoint on the
5537line that is about to be executed.
5538If a condition is specified, it is evaluated each time the statement is
5539reached and a breakpoint is taken only if the condition is true.
5540Breakpoints may only be set on lines that begin an executable statement.
5541.Ip "b subname condition" 12 4
5542Set breakpoint at first executable line of subroutine.
5543.Ip "d line" 12 4
5544Delete breakpoint.
5545If line is omitted, deletes the breakpoint on the
5546line that is about to be executed.
5547.Ip "D" 12 4
5548Delete all breakpoints.
5549.Ip "a line command" 12 4
5550Set an action for line.
5551A multi-line command may be entered by backslashing the newlines.
5552.Ip "A" 12 4
5553Delete all line actions.
5554.Ip "< command" 12 4
5555Set an action to happen before every debugger prompt.
5556A multi-line command may be entered by backslashing the newlines.
5557.Ip "> command" 12 4
5558Set an action to happen after the prompt when you've just given a command
5559to return to executing the script.
5560A multi-line command may be entered by backslashing the newlines.
5561.Ip "V package" 12 4
5562List all variables in package.
5563Default is main package.
5564.Ip "! number" 12 4
5565Redo a debugging command.
5566If number is omitted, redoes the previous command.
5567.Ip "! -number" 12 4
5568Redo the command that was that many commands ago.
5569.Ip "H -number" 12 4
5570Display last n commands.
5571Only commands longer than one character are listed.
5572If number is omitted, lists them all.
5573.Ip "q or ^D" 12 4
5574Quit.
5575.Ip "command" 12 4
5576Execute command as a perl statement.
5577A missing semicolon will be supplied.
5578.Ip "p expr" 12 4
5579Same as \*(L"print DB'OUT expr\*(R".
5580The DB'OUT filehandle is opened to /dev/tty, regardless of where STDOUT
5581may be redirected to.
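.PP
Putting a few of these commands together, a short session might run along
the following lines.
This is only a sketch: the script name, the line number and the variable are
hypothetical, and the debugger's own output is omitted.
.nf

.ne 6
    perl \-d myscript        (start the script under the debugger)
    b 42 $count > 10        (break at line 42 only when $count exceeds 10)
    c                       (continue until that breakpoint is taken)
    p $count                (print the value to /dev/tty)
    T                       (show the stack trace)
    q                       (quit)

.fi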
5582.PP
5583If you want to modify the debugger, copy perldb.pl from the perl library
5584to your current directory and modify it as necessary.
5585(You'll also have to put -I. on your command line.)
5586You can do some customization by setting up a .perldb file which contains
5587initialization code.
5588For instance, you could make aliases like these:
5589.nf
5590
5591 $DB'alias{'len'} = 's/^len(.*)/p length($1)/';
5592 $DB'alias{'stop'} = 's/^stop (at|in)/b/';
5593 $DB'alias{'.'} =
5594 's/^\e./p "\e$DB\e'sub(\e$DB\e'line):\et",\e$DB\e'line[\e$DB\e'line]/';
5595
5596.fi
5597.Sh "Setuid Scripts"
5598.I Perl
5599is designed to make it easy to write secure setuid and setgid scripts.
5600Unlike shells, which are based on multiple substitution passes on each line
5601of the script,
5602.I perl
5603uses a more conventional evaluation scheme with fewer hidden \*(L"gotchas\*(R".
5604Additionally, since the language has more built-in functionality, it
5605has to rely less upon external (and possibly untrustworthy) programs to
5606accomplish its purposes.
5607.PP
5608In an unpatched 4.2 or 4.3bsd kernel, setuid scripts are intrinsically
5609insecure, but this kernel feature can be disabled.
5610If it is,
5611.I perl
5612can emulate the setuid and setgid mechanism when it notices the otherwise
5613useless setuid/gid bits on perl scripts.
5614If the kernel feature isn't disabled,
5615.I perl
5616will complain loudly that your setuid script is insecure.
5617You'll need to either disable the kernel setuid script feature, or put
5618a C wrapper around the script.
5619.PP
5620When perl is executing a setuid script, it takes special precautions to
5621prevent you from falling into any obvious traps.
5622(In some ways, a perl script is more secure than the corresponding
5623C program.)
5624Any command line argument, environment variable, or input is marked as
5625\*(L"tainted\*(R", and may not be used, directly or indirectly, in any
5626command that invokes a subshell, or in any command that modifies files,
5627directories or processes.
5628Any variable that is set within an expression that has previously referenced
5629a tainted value also becomes tainted (even if it is logically impossible
5630for the tainted value to influence the variable).
5631For example:
5632.nf
5633
5634.ne 5
5635 $foo = shift; # $foo is tainted
5636 $bar = $foo,\'bar\'; # $bar is also tainted
5637 $xxx = <>; # Tainted
5638 $path = $ENV{\'PATH\'}; # Tainted, but see below
5639 $abc = \'abc\'; # Not tainted
5640
5641.ne 4
5642 system "echo $foo"; # Insecure
5643 system "/bin/echo", $foo; # Secure (doesn't use sh)
5644 system "echo $bar"; # Insecure
5645 system "echo $abc"; # Insecure until PATH set
5646
5647.ne 5
5648 $ENV{\'PATH\'} = \'/bin:/usr/bin\';
5649 $ENV{\'IFS\'} = \'\' if $ENV{\'IFS\'} ne \'\';
5650
5651 $path = $ENV{\'PATH\'}; # Not tainted
5652 system "echo $abc"; # Is secure now!
5653
5654.ne 5
5655 open(FOO,"$foo"); # OK
5656 open(FOO,">$foo"); # Not OK
5657
5658 open(FOO,"echo $foo|"); # Not OK, but...
5659 open(FOO,"-|") || exec \'echo\', $foo; # OK
5660
5661 $zzz = `echo $foo`; # Insecure, zzz tainted
5662
5663 unlink $abc,$foo; # Insecure
5664 umask $foo; # Insecure
5665
5666.ne 3
5667 exec "echo $foo"; # Insecure
5668 exec "echo", $foo; # Secure (doesn't use sh)
5669 exec "sh", \'-c\', $foo; # Considered secure, alas
5670
5671.fi
5672The taintedness is associated with each scalar value, so some elements
5673of an array can be tainted, and others not.
5674.PP
5675If you try to do something insecure, you will get a fatal error saying
5676something like \*(L"Insecure dependency\*(R" or \*(L"Insecure PATH\*(R".
5677Note that you can still write an insecure system call or exec,
5678but only by explicitly doing something like the last example above.
5679You can also bypass the tainting mechanism by referencing
5680subpatterns\*(--\c
5681.I perl
presumes that if you reference a substring using $1, $2, etc., you knew
5683what you were doing when you wrote the pattern:
5684.nf
5685
5686 $ARGV[0] =~ /^\-P(\ew+)$/;
5687 $printer = $1; # Not tainted
5688
5689.fi
5690This is fairly secure since \ew+ doesn't match shell metacharacters.
5691Use of .+ would have been insecure, but
5692.I perl
5693doesn't check for that, so you must be careful with your patterns.
5694This is the ONLY mechanism for untainting user supplied filenames if you
5695want to do file operations on them (unless you make $> equal to $<).
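.PP
The example above takes $1 without checking that the match succeeded; if
the match fails, $1 holds nothing taken from this argument.
A slightly more defensive sketch follows.
The subroutine name
.I clean_name
and the set of permitted characters are assumptions chosen for
illustration, not a recommendation for every script:
.nf

.ne 10
    sub clean_name {
        local($name) = @_;
        return $1 if $name =~ /^([A-Za-z0-9_][A-Za-z0-9_.]*)$/;
        '';                         # empty string: the name was rejected
    }

    if (@ARGV) {
        $file = &clean_name($ARGV[0]);
        die "Unacceptable filename" if $file eq '';
    }

.fi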
5696.PP
5697It's also possible to get into trouble with other operations that don't care
5698whether they use tainted values.
5699Make judicious use of the file tests in dealing with any user-supplied
5700filenames.
5701When possible, do opens and such after setting $> = $<.
5702.I Perl
5703doesn't prevent you from opening tainted filenames for reading, so be
5704careful what you print out.
5705The tainting mechanism is intended to prevent stupid mistakes, not to remove
5706the need for thought.
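.PP
As a sketch of the advice about $> and $< (the filehandle is illustrative,
and /dev/null merely stands in for a name supplied by the user):
.nf

.ne 3
    $file = '/dev/null';        # stands in for a user-supplied name
    $> = $<;                    # effective uid becomes the real uid
    open(IN, $file) || die "Can't open $file: $!";

.fi
With the effective uid reset, the open is checked against the invoking
user's own uid rather than that of the script's owner.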
5707.SH ENVIRONMENT
5708.Ip HOME 12 4
5709Used if chdir has no argument.
5710.Ip LOGDIR 12 4
5711Used if chdir has no argument and HOME is not set.
5712.Ip PATH 12 4
5713Used in executing subprocesses, and in finding the script if \-S
5714is used.
5715.Ip PERLLIB 12 4
5716A colon-separated list of directories in which to look for Perl library
5717files before looking in the standard library and the current directory.
5718.Ip PERLDB 12 4
5719The command used to get the debugger code. If unset, uses
5720.br
5721
5722 require 'perldb.pl'
5723
5724.PP
5725Apart from these,
5726.I perl
5727uses no other environment variables, except to make them available
5728to the script being executed, and to child processes.
5729However, scripts running setuid would do well to execute the following lines
5730before doing anything else, just to keep people honest:
5731.nf
5732
5733.ne 3
5734 $ENV{\'PATH\'} = \'/bin:/usr/bin\'; # or whatever you need
5735 $ENV{\'SHELL\'} = \'/bin/sh\' if $ENV{\'SHELL\'} ne \'\';
5736 $ENV{\'IFS\'} = \'\' if $ENV{\'IFS\'} ne \'\';
5737
5738.fi
5739.SH AUTHOR
5740Larry Wall <lwall@netlabs.com>
5741.br
5742MS-DOS port by Diomidis Spinellis <dds@cc.ic.ac.uk>
5743.SH FILES
5744/tmp/perl\-eXXXXXX temporary file for
5745.B \-e
5746commands.
5747.SH SEE ALSO
5748a2p awk to perl translator
5749.br
5750s2p sed to perl translator
5751.SH DIAGNOSTICS
5752Compilation errors will tell you the line number of the error, with an
5753indication of the next token or token type that was to be examined.
5754(In the case of a script passed to
5755.I perl
5756via
5757.B \-e
5758switches, each
5759.B \-e
5760is counted as one line.)
5761.PP
5762Setuid scripts have additional constraints that can produce error messages
5763such as \*(L"Insecure dependency\*(R".
5764See the section on setuid scripts.
5765.SH TRAPS
5766Accustomed
.I awk
5768users should take special note of the following:
5769.Ip * 4 2
5770Semicolons are required after all simple statements in
5771.I perl
5772(except at the end of a block).
5773Newline is not a statement delimiter.
5774.Ip * 4 2
5775Curly brackets are required on ifs and whiles.
5776.Ip * 4 2
5777Variables begin with $ or @ in
5778.IR perl .
5779.Ip * 4 2
5780Arrays index from 0 unless you set $[.
5781Likewise string positions in substr() and index().
5782.Ip * 4 2
5783You have to decide whether your array has numeric or string indices.
5784.Ip * 4 2
5785Associative array values do not spring into existence upon mere reference.
5786.Ip * 4 2
5787You have to decide whether you want to use string or numeric comparisons.
5788.Ip * 4 2
Reading an input line does not split it for you. You get to split it yourself
into an array.
5791And the
5792.I split
5793operator has different arguments.
5794.Ip * 4 2
5795The current input line is normally in $_, not $0.
5796It generally does not have the newline stripped.
5797($0 is the name of the program executed.)
5798.Ip * 4 2
5799$<digit> does not refer to fields\*(--it refers to substrings matched by the last
5800match pattern.
5801.Ip * 4 2
5802The
5803.I print
5804statement does not add field and record separators unless you set
5805$, and $\e.
5806.Ip * 4 2
5807You must open your files before you print to them.
5808.Ip * 4 2
5809The range operator is \*(L".\|.\*(R", not comma.
5810(The comma operator works as in C.)
5811.Ip * 4 2
5812The match operator is \*(L"=~\*(R", not \*(L"~\*(R".
5813(\*(L"~\*(R" is the one's complement operator, as in C.)
5814.Ip * 4 2
5815The exponentiation operator is \*(L"**\*(R", not \*(L"^\*(R".
5816(\*(L"^\*(R" is the XOR operator, as in C.)
5817.Ip * 4 2
5818The concatenation operator is \*(L".\*(R", not the null string.
5819(Using the null string would render \*(L"/pat/ /pat/\*(R" unparsable,
5820since the third slash would be interpreted as a division operator\*(--the
5821tokener is in fact slightly context sensitive for operators like /, ?, and <.
5822And in fact, . itself can be the beginning of a number.)
5823.Ip * 4 2
5824.IR Next ,
5825.I exit
5826and
5827.I continue
5828work differently.
5829.Ip * 4 2
5830The following variables work differently
5831.nf
5832
5833 Awk \h'|2.5i'Perl
5834 ARGC \h'|2.5i'$#ARGV
5835 ARGV[0] \h'|2.5i'$0
5836 FILENAME\h'|2.5i'$ARGV
5837 FNR \h'|2.5i'$. \- something
5838 FS \h'|2.5i'(whatever you like)
5839 NF \h'|2.5i'$#Fld, or some such
5840 NR \h'|2.5i'$.
5841 OFMT \h'|2.5i'$#
5842 OFS \h'|2.5i'$,
5843 ORS \h'|2.5i'$\e
5844 RLENGTH \h'|2.5i'length($&)
5845 RS \h'|2.5i'$/
5846 RSTART \h'|2.5i'length($\`)
5847 SUBSEP \h'|2.5i'$;
5848
5849.fi
5850.Ip * 4 2
5851When in doubt, run the
5852.I awk
5853construct through a2p and see what it gives you.
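.PP
Several of these points show up even in a tiny fragment.
As a rough sketch, here is one way to get at what awk calls $2 and NF;
the sample line is made up, and @Fld is simply the array name used in the
table above:
.nf

.ne 4
    $_ = 'alpha beta gamma';    # stands in for a line read with <>
    @Fld = split(' ', $_);      # fields are not split for you
    $second = $Fld[1];          # awk's $2, since arrays index from 0
    $nf = $#Fld + 1;            # awk's NF, as long as $[ is left at 0

.fi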
5854.PP
5855Cerebral C programmers should take note of the following:
5856.Ip * 4 2
5857Curly brackets are required on ifs and whiles.
5858.Ip * 4 2
You should use \*(L"elsif\*(R" rather than \*(L"else if\*(R".
5860.Ip * 4 2
5861.I Break
5862and
5863.I continue
5864become
5865.I last
5866and
5867.IR next ,
5868respectively.
5869.Ip * 4 2
5870There's no switch statement.
5871.Ip * 4 2
5872Variables begin with $ or @ in
5873.IR perl .
5874.Ip * 4 2
5875Printf does not implement *.
5876.Ip * 4 2
5877Comments begin with #, not /*.
5878.Ip * 4 2
5879You can't take the address of anything.
5880.Ip * 4 2
5881ARGV must be capitalized.
5882.Ip * 4 2
5883The \*(L"system\*(R" calls link, unlink, rename, etc. return nonzero for success, not 0.
5884.Ip * 4 2
5885Signal handlers deal with signal names, not numbers.
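.PP
A short sketch of the loop-control and elsif points, using made-up
variable names:
.nf

.ne 14
    foreach $n (1 .. 10) {
        next if $n % 2;             # C would say continue
        last if $n > 6;             # C would say break
        if ($n == 2) {
            $name = 'two';
        }
        elsif ($n == 4) {           # elsif, not else if
            $name = 'four';
        }
        else {
            $name = 'other';
        }
        push(@seen, "$n=$name");    # ends up as 2=two, 4=four, 6=other
    }

.fi
A chain of elsif clauses like this one is also a common stand-in for the
missing switch statement.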
5886.PP
5887Seasoned
5888.I sed
5889programmers should take note of the following:
5890.Ip * 4 2
5891Backreferences in substitutions use $ rather than \e.
5892.Ip * 4 2
5893The pattern matching metacharacters (, ), and | do not have backslashes in front.
5894.Ip * 4 2
5895The range operator is .\|. rather than comma.
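.PP
For example, swapping two words needs no backslashes on the parentheses,
and the replacement refers to the subpatterns with $ (the sample string is
arbitrary):
.nf

.ne 2
    $_ = 'hello world';
    s/([a-z]+) ([a-z]+)/$2 $1/;     # $_ is now 'world hello'

.fi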
5896.PP
5897Sharp shell programmers should take note of the following:
5898.Ip * 4 2
5899The backtick operator does variable interpretation without regard to the
5900presence of single quotes in the command.
5901.Ip * 4 2
5902The backtick operator does no translation of the return value, unlike csh.
5903.Ip * 4 2
5904Shells (especially csh) do several levels of substitution on each command line.
5905.I Perl
5906does substitution only in certain constructs such as double quotes,
5907backticks, angle brackets and search patterns.
5908.Ip * 4 2
5909Shells interpret scripts a little bit at a time.
5910.I Perl
5911compiles the whole program before executing it.
5912.Ip * 4 2
5913The arguments are available via @ARGV, not $1, $2, etc.
5914.Ip * 4 2
5915The environment is not automatically made available as variables.
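.PP
A brief sketch of the backtick, argument and environment points; the
variable names are illustrative, and echo is assumed to be on your PATH:
.nf

.ne 5
    $word = 'perl';
    $out = `echo '$word'`;      # perl fills in $word despite the single quotes
    chop($out);                 # the trailing newline stays until you remove it
    $first = $ARGV[0];          # what sh calls $1
    $home = $ENV{'HOME'};       # what sh calls $HOME

.fi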
5916.SH ERRATA\0AND\0ADDENDA
5917The Perl book,
.IR Programming\0Perl ,
5919has the following omissions and goofs.
5920.PP
5921On page 5, the examples which read
5922.nf
5923
5924 eval "/usr/bin/perl
5925
5926should read
5927
5928 eval "exec /usr/bin/perl
5929
5930.fi
5931.PP
5932On page 195, the equivalent to the System V sum program only works for
5933very small files. To do larger files, use
5934.nf
5935
5936 undef $/;
5937 $checksum = unpack("%32C*",<>) % 32767;
5938
5939.fi
5940.PP
5941The descriptions of alarm and sleep refer to signal SIGALARM. These
5942should refer to SIGALRM.
5943.PP
5944The
5945.B \-0
5946switch to set the initial value of $/ was added to Perl after the book
5947went to press.
5948.PP
5949The
5950.B \-l
5951switch now does automatic line ending processing.
5952.PP
5953The qx// construct is now a synonym for backticks.
5954.PP
5955$0 may now be assigned to set the argument displayed by
.IR ps (1).
5957.PP
5958The new @###.## format was omitted accidentally from the description
5959on formats.
5960.PP
5961It wasn't known at press time that s///ee caused multiple evaluations of
5962the replacement expression. This is to be construed as a feature.
5963.PP
5964(LIST) x $count now does array replication.
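.PP
For instance (the variable names are arbitrary):
.nf

.ne 3
    @ones = (1) x 5;            # (1, 1, 1, 1, 1)
    @abab = ('a', 'b') x 2;     # ('a', 'b', 'a', 'b')
    $rule = '-' x 5;            # without the parentheses, a string: '-----'

.fi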
5965.PP
5966There is now no limit on the number of parentheses in a regular expression.
5967.PP
5968In double-quote context, more escapes are supported: \ee, \ea, \ex1b, \ec[,
\el, \eL, \eu, \eU, \eE. The latter five control upper/lower case translation.
5970.PP
5971The
5972.B $/
5973variable may now be set to a multi-character delimiter.
5974.PP
5975There is now a g modifier on ordinary pattern matching that causes it
5976to iterate through a string finding multiple matches.
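.PP
A small sketch of both ways to use it, with a made-up string:
.nf

.ne 6
    $str = 'a1b22c333';
    @all = ($str =~ /([0-9]+)/g);   # array context: ('1', '22', '333')

    while ($str =~ /([0-9]+)/g) {   # scalar context: one match per pass
        $sum += $1;                 # 1 + 22 + 333
    }

.fi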
5977.PP
5978All of the $^X variables are new except for $^T.
5979.PP
5980The default top-of-form format for FILEHANDLE is now FILEHANDLE_TOP rather
5981than top.
5982.PP
5983The eval {} and sort {} constructs were added in version 4.018.
5984.PP
5985The v and V (little-endian) template options for pack and unpack were
5986added in 4.019.
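.PP
As an illustration of the byte order these templates produce (the values
are arbitrary):
.nf

.ne 3
    @b2 = unpack('C2', pack('v', 0x0102));      # (2, 1): low byte first
    @b4 = unpack('C4', pack('V', 0x01020304));  # (4, 3, 2, 1)
    $n = unpack('v', pack('C2', 2, 1));         # 258, which is 0x0102

.fi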
5987.SH BUGS
5988.PP
5989.I Perl
5990is at the mercy of your machine's definitions of various operations
5991such as type casting, atof() and sprintf().
5992.PP
If your stdio requires a seek or eof between reads and writes on a particular
5994stream, so does
5995.IR perl .
5996(This doesn't apply to sysread() and syswrite().)
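.PP
One common way to satisfy such a stdio is a seek of zero bytes from the
current position before changing direction.
The following is a sketch only; the scratch file name is an assumption
made for the example:
.nf

.ne 14
    $file = "/tmp/seekdemo.$$";             # illustrative scratch file
    open(FH, ">$file") || die "Can't create $file: $!";
    print FH 'first';
    close(FH);

    open(FH, "+<$file") || die "Can't reopen $file: $!";
    $got = <FH>;                            # read ...
    seek(FH, 0, 1);                         # ... seek nowhere ...
    print FH ' second';                     # ... then write
    close(FH);

    open(FH, $file) || die "Can't read $file: $!";
    $all = <FH>;                            # 'first second'
    close(FH);
    unlink($file);

.fi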
5997.PP
5998While none of the built-in data types have any arbitrary size limits (apart
5999from memory size), there are still a few arbitrary limits:
6000a given identifier may not be longer than 255 characters,
and no component of your PATH may be longer than 255 characters if you use \-S.
6002A regular expression may not compile to more than 32767 bytes internally.
6003.PP
6004.I Perl
6005actually stands for Pathologically Eclectic Rubbish Lister, but don't tell
6006anyone I said that.
6007.rn }` ''