.\" Automatically generated by Pod::Man v1.37, Pod::Parser v1.32
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sh \" Subsection heading
.br
.if t .Sp
.ne 5
.PP
\fB\\$1\fR
.PP
..
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" Set up some character translations and predefined strings. \*(-- will
.\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left
.\" double quote, and \*(R" will give a right double quote. | will give a
.\" real vertical bar. \*(C+ will give a nicer C++. Capital omega is used to
.\" do unbreakable dashes and therefore won't be available. \*(C` and \*(C'
.\" expand to `' in nroff, nothing in troff, for use with C<>.
.tr \(*W-|\(bv\*(Tr
.ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p'
.ie n \{\
. ds -- \(*W-
. ds PI pi
. if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
. if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch
. ds L" ""
. ds R" ""
. ds C` ""
. ds C' ""
'br\}
.el\{\
. ds -- \|\(em\|
. ds PI \(*p
. ds L" ``
. ds R" ''
'br\}
.\"
.\" If the F register is turned on, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.Sh), items (.Ip), and index
.\" entries marked with X<> in POD. Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.if \nF \{\
. de IX
. tm Index:\\$1\t\\n%\t"\\$2"
..
. nr % 0
. rr F
.\}
.\"
.\" For nroff, turn off justification. Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.hy 0
.if n .na
.\"
.\" Accent mark definitions (@(#)ms.acc 1.5 88/02/08 SMI; from UCB 4.2).
.\" Fear. Run. Save yourself. No user-serviceable parts.
. \" fudge factors for nroff and troff
.if n \{\
. ds #H 0
. ds #V .8m
. ds #F .3m
. ds #[ \f1
. ds #] \fP
.\}
.if t \{\
. ds #H ((1u-(\\\\n(.fu%2u))*.13m)
. ds #V .6m
. ds #F 0
. ds #[ \&
. ds #] \&
.\}
. \" simple accents for nroff and troff
.if n \{\
. ds ' \&
. ds ` \&
. ds ^ \&
. ds , \&
. ds ~ ~
. ds /
.\}
.if t \{\
. ds ' \\k:\h'-(\\n(.wu*8/10-\*(#H)'\'\h"|\\n:u"
. ds ` \\k:\h'-(\\n(.wu*8/10-\*(#H)'\`\h'|\\n:u'
. ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'^\h'|\\n:u'
. ds , \\k:\h'-(\\n(.wu*8/10)',\h'|\\n:u'
. ds ~ \\k:\h'-(\\n(.wu-\*(#H-.1m)'~\h'|\\n:u'
. ds / \\k:\h'-(\\n(.wu*8/10-\*(#H)'\z\(sl\h'|\\n:u'
.\}
. \" troff and (daisy-wheel) nroff accents
.ds : \\k:\h'-(\\n(.wu*8/10-\*(#H+.1m+\*(#F)'\v'-\*(#V'\z.\h'.2m+\*(#F'.\h'|\\n:u'\v'\*(#V'
.ds 8 \h'\*(#H'\(*b\h'-\*(#H'
.ds o \\k:\h'-(\\n(.wu+\w'\(de'u-\*(#H)/2u'\v'-.3n'\*(#[\z\(de\v'.3n'\h'|\\n:u'\*(#]
.ds d- \h'\*(#H'\(pd\h'-\w'~'u'\v'-.25m'\f2\(hy\fP\v'.25m'\h'-\*(#H'
.ds D- D\\k:\h'-\w'D'u'\v'-.11m'\z\(hy\v'.11m'\h'|\\n:u'
.ds th \*(#[\v'.3m'\s+1I\s-1\v'-.3m'\h'-(\w'I'u*2/3)'\s-1o\s+1\*(#]
.ds Th \*(#[\s+2I\s-2\h'-\w'I'u*3/5'\v'-.3m'o\v'.3m'\*(#]
.ds ae a\h'-(\w'a'u*4/10)'e
.ds Ae A\h'-(\w'A'u*4/10)'E
. \" corrections for vroff
.if v .ds ~ \\k:\h'-(\\n(.wu*9/10-\*(#H)'\s-2\u~\d\s+2\h'|\\n:u'
.if v .ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'\v'-.4m'^\v'.4m'\h'|\\n:u'
. \" for low resolution devices (crt and lpr)
.if \n(.H>23 .if \n(.V>19 \
\{\
. ds : e
. ds 8 ss
. ds o a
. ds d- d\h'-1'\(ga
. ds D- D\h'-1'\(hy
. ds th \o'bp'
. ds Th \o'LP'
. ds ae ae
. ds Ae AE
.\}
.rm #[ #] #H #V #F C
.\" ========================================================================
.\"
.IX Title "PERLSEC 1"
.TH PERLSEC 1 "2006-01-07" "perl v5.8.8" "Perl Programmers Reference Guide"
.SH "NAME"
perlsec \- Perl security
.SH "DESCRIPTION"
.IX Header "DESCRIPTION"
Perl is designed to make it easy to program securely even when running
with extra privileges, like setuid or setgid programs. Unlike most
command line shells, which are based on multiple substitution passes on
each line of the script, Perl uses a more conventional evaluation scheme
with fewer hidden snags. Additionally, because the language has more
builtin functionality, it can rely less upon external (and possibly
untrustworthy) programs to accomplish its purposes.
.PP
Perl automatically enables a set of special security checks, called \fItaint
mode\fR, when it detects its program running with differing real and effective
user or group IDs. The setuid bit in Unix permissions is mode 04000, the
setgid bit mode 02000; either or both may be set. You can also enable taint
mode explicitly by using the \fB\-T\fR command line flag. This flag is
\&\fIstrongly\fR suggested for server programs and any program run on behalf of
someone else, such as a \s-1CGI\s0 script. Once taint mode is on, it's on for
the remainder of your script.
.PP
While in this mode, Perl takes special precautions called \fItaint
checks\fR to prevent both obvious and subtle traps. Some of these checks
are reasonably simple, such as verifying that path directories aren't
writable by others; careful programmers have always used checks like
these. Other checks, however, are best supported by the language itself,
and it is these checks especially that contribute to making a set-id Perl
program more secure than the corresponding C program.
.PP
You may not use data derived from outside your program to affect
something else outside your program\*(--at least, not by accident. All
command line arguments, environment variables, locale information (see
perllocale), results of certain system calls (\f(CW\*(C`readdir()\*(C'\fR,
\&\f(CW\*(C`readlink()\*(C'\fR, the variable of \f(CW\*(C`shmread()\*(C'\fR, the messages returned by
\&\f(CW\*(C`msgrcv()\*(C'\fR, the password, gcos and shell fields returned by the
\&\f(CW\*(C`getpwxxx()\*(C'\fR calls), and all file input are marked as \*(L"tainted\*(R".
Tainted data may not be used directly or indirectly in any command
that invokes a sub\-shell, nor in any command that modifies files,
directories, or processes, \fBwith the following exceptions\fR:
.IP "\(bu" 4
Arguments to \f(CW\*(C`print\*(C'\fR and \f(CW\*(C`syswrite\*(C'\fR are \fBnot\fR checked for taintedness.
.IP "\(bu" 4
Symbolic methods
.Sp
.Vb 1
\&    $obj->$method(@args);
.Ve
.Sp
and symbolic sub references
.Sp
.Vb 2
\&    &{$foo}(@args);
\&    $foo->(@args);
.Ve
.Sp
are not checked for taintedness. This requires extra care
unless you want external data to affect your control flow. Unless
you carefully limit what these symbolic values are, people are able
to call functions \fBoutside\fR your Perl code, such as POSIX::system,
in which case they are able to run arbitrary external code.
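.PP
One way to limit these symbolic values is to look the externally supplied
name up in a table of permitted handlers rather than calling it directly.
The following is a minimal sketch; the names \f(CW%dispatch\fR and
\&\f(CW\*(C`run_action()\*(C'\fR and the two handlers are hypothetical and exist only
for illustration.
.PP
.Vb 18
\&    use strict;
\&    use warnings;
\&
\&    # Hypothetical handlers; only names listed here can ever be invoked.
\&    my %dispatch = (
\&        status  => sub { return "status ok" },
\&        version => sub { return "version 1.0" },
\&    );
\&
\&    # Reject any externally supplied name that is not a key of %dispatch.
\&    sub run_action {
\&        my ($name, @args) = @_;
\&        my $handler = $dispatch{$name}
\&            or die "Unknown action '$name'\en";
\&        return $handler->(@args);
\&    }
\&
\&    print run_action('status'), "\en";
.Ve
.PP
Because only keys of \f(CW%dispatch\fR can be reached, a name such as
POSIX::system is rejected with an error instead of being called.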
.PP
For efficiency reasons, Perl takes a conservative view of
whether data is tainted. If an expression contains tainted data,
any subexpression may be considered tainted, even if the value
of the subexpression is not itself affected by the tainted data.
.PP
Because taintedness is associated with each scalar value, some
elements of an array or hash can be tainted and others not.
The keys of a hash are never tainted.
.PP
For example:
.PP
.Vb 8
\&    $arg = shift;               # $arg is tainted
\&    $hid = $arg . 'bar';        # $hid is also tainted
\&    $line = <>;                 # Tainted
\&    $line = <STDIN>;            # Also tainted
\&    open FOO, "/home/me/bar" or die $!;
\&    $line = <FOO>;              # Still tainted
\&    $path = $ENV{'PATH'};       # Tainted, but see below
\&    $data = 'abc';              # Not tainted
.Ve
.PP
.Vb 5
\&    system "echo $arg";         # Insecure
\&    system "/bin/echo", $arg;   # Considered insecure
\&                                # (Perl doesn't know about /bin/echo)
\&    system "echo $hid";         # Insecure
\&    system "echo $data";        # Insecure until PATH set
.Ve
.PP
.Vb 1
\&    $path = $ENV{'PATH'};       # $path now tainted
.Ve
.PP
.Vb 2
\&    $ENV{'PATH'} = '/bin:/usr/bin';
\&    delete @ENV{'IFS', 'CDPATH', 'ENV', 'BASH_ENV'};
.Ve
.PP
.Vb 2
\&    $path = $ENV{'PATH'};       # $path now NOT tainted
\&    system "echo $data";        # Is secure now!
.Ve
.PP
.Vb 2
\&    open(FOO, "< $arg");        # OK - read-only file
\&    open(FOO, "> $arg");        # Not OK - trying to write
.Ve
.PP
.Vb 3
\&    open(FOO,"echo $arg|");     # Not OK
\&    open(FOO,"-|")
\&        or exec 'echo', $arg;   # Also not OK
.Ve
.PP
.Vb 1
\&    $shout = `echo $arg`;       # Insecure, $shout now tainted
.Ve
.PP
.Vb 2
\&    unlink $data, $arg;         # Insecure
\&    umask $arg;                 # Insecure
.Ve
.PP
.Vb 3
\&    exec "echo $arg";           # Insecure
\&    exec "echo", $arg;          # Insecure
\&    exec "sh", '-c', $arg;      # Very insecure!
.Ve
.PP
.Vb 2
\&    @files = <*.c>;             # insecure (uses readdir() or similar)
\&    @files = glob('*.c');       # insecure (uses readdir() or similar)
.Ve
.PP
.Vb 4
\&    # In Perl releases older than 5.6.0 the <*.c> and glob('*.c') would
\&    # have used an external program to do the filename expansion; but in
\&    # either case the result is tainted since the list of filenames comes
\&    # from outside of the program.
.Ve
.PP
.Vb 2
\&    $bad = ($arg, 23);          # $bad will be tainted
\&    $arg, `true`;               # Insecure (although it isn't really)
.Ve
.PP
If you try to do something insecure, you will get a fatal error saying
something like \*(L"Insecure dependency\*(R" or \*(L"Insecure \f(CW$ENV\fR{\s-1PATH\s0}\*(R".
.PP
The exception to the principle of \*(L"one tainted value taints the whole
expression\*(R" is with the ternary conditional operator \f(CW\*(C`?:\*(C'\fR. Since code
with a ternary conditional
.PP
.Vb 1
\&    $result = $tainted_value ? "Untainted" : "Also untainted";
.Ve
.PP
is effectively
.PP
.Vb 5
\&    if ( $tainted_value ) {
\&        $result = "Untainted";
\&    } else {
\&        $result = "Also untainted";
\&    }
.Ve
.PP
it doesn't make sense for \f(CW$result\fR to be tainted.
.Sh "Laundering and Detecting Tainted Data"
.IX Subsection "Laundering and Detecting Tainted Data"
To test whether a variable contains tainted data, and whose use would
thus trigger an \*(L"Insecure dependency\*(R" message, you can use the
\&\f(CW\*(C`tainted()\*(C'\fR function of the Scalar::Util module, available in your
nearby \s-1CPAN\s0 mirror, and included in Perl starting from the release 5.8.0.
Or you may be able to use the following \f(CW\*(C`is_tainted()\*(C'\fR function.
.PP
.Vb 3
\&    sub is_tainted {
\&        return ! eval { eval("#" . substr(join("", @_), 0, 0)); 1 };
\&    }
.Ve
.PP
This function makes use of the fact that the presence of tainted data
anywhere within an expression renders the entire expression tainted. It
would be inefficient for every operator to test every argument for
taintedness. Instead, Perl takes the slightly more efficient and
conservative approach: if any tainted value has been accessed within the
same expression, the whole expression is considered tainted.
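.PP
As an illustration of the first option, the sketch below calls
\&\f(CW\*(C`tainted()\*(C'\fR from Scalar::Util on one value defined in the program and
one value taken from the environment. The variable names are
hypothetical. It reports both values as clean unless taint mode is
active, for example when the script is run with \fB\-T\fR.
.PP
.Vb 11
\&    use strict;
\&    use warnings;
\&    use Scalar::Util qw(tainted);
\&
\&    my $literal  = 'abc';        # defined inside the program
\&    my $external = $ENV{PATH};   # comes from outside the program
\&
\&    # ${^TAINT} is true only when taint mode is on (for example under -T).
\&    print "taint mode: ", (${^TAINT} ? "on" : "off"), "\en";
\&    print "literal:    ", (tainted($literal)  ? "tainted" : "clean"), "\en";
\&    print "external:   ", (tainted($external) ? "tainted" : "clean"), "\en";
.Ve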
.PP
But testing for taintedness gets you only so far. Sometimes you just have
to clear your data's taintedness. Values may be untainted by using them
as keys in a hash; otherwise the only way to bypass the tainting
mechanism is by referencing subpatterns from a regular expression match.
Perl presumes that if you reference a substring using \f(CW$1\fR, \f(CW$2\fR, etc., you
knew what you were doing when you wrote the pattern. That means using
a bit of thought\*(--don't just blindly untaint anything, or you defeat the
entire mechanism. It's better to verify that the variable has only good
characters (for certain values of \*(L"good\*(R") rather than checking whether it
has any bad characters. That's because it's far too easy to miss bad
characters that you never thought of.
.PP
Here's a test to make sure that the data contains nothing but \*(L"word\*(R"
characters (alphabetics, numerics, and underscores), a hyphen, an at sign,
or a dot.
.PP
.Vb 5
\&    if ($data =~ /^([-\e@\ew.]+)$/) {
\&        $data = $1;                     # $data now untainted
\&    } else {
\&        die "Bad data in '$data'";      # log this somewhere
\&    }
.Ve
.PP
This is fairly secure because \f(CW\*(C`/\ew+/\*(C'\fR doesn't normally match shell
metacharacters, nor are dot, dash, or at going to mean something special
to the shell. Use of \f(CW\*(C`/.+/\*(C'\fR would have been insecure in theory because
it lets everything through, but Perl doesn't check for that. The lesson
is that when untainting, you must be exceedingly careful with your patterns.
Apart from the hash-key route mentioned above, laundering data with a
regular expression is the \fIonly\fR mechanism for untainting dirty data,
unless you use the strategy detailed below to fork a child of lesser
privilege.
.PP
The example does not untaint \f(CW$data\fR if \f(CW\*(C`use locale\*(C'\fR is in effect,
because the characters matched by \f(CW\*(C`\ew\*(C'\fR are determined by the locale.
Perl considers that locale definitions are untrustworthy because they
contain data from outside the program. If you are writing a
locale-aware program, and want to launder data with a regular expression
containing \f(CW\*(C`\ew\*(C'\fR, put \f(CW\*(C`no locale\*(C'\fR ahead of the expression in the same
block. See \*(L"\s-1SECURITY\s0\*(R" in perllocale for further discussion and examples.
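.PP
The \f(CW\*(C`no locale\*(C'\fR advice can be sketched as follows. The helper name
\&\f(CW\*(C`launder_word()\*(C'\fR and the sample address are hypothetical; the pattern is
the one used in the test above.
.PP
.Vb 15
\&    use strict;
\&    use warnings;
\&    use locale;                     # the program as a whole is locale-aware
\&
\&    # Hypothetical helper: return an untainted copy of $data or die.
\&    sub launder_word {
\&        my ($data) = @_;
\&        no locale;                  # \ew ignores the locale inside this block
\&        if ($data =~ /^([-\e@\ew.]+)$/) {
\&            return $1;
\&        }
\&        die "Bad data in '$data'\en";
\&    }
\&
\&    print launder_word('user@example.com'), "\en";
.Ve
.PP
Because \f(CW\*(C`no locale\*(C'\fR is lexically scoped, the rest of the program stays
locale-aware while the match inside the helper does not depend on the
locale.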
.ie n .Sh "Switches On the ""#!"" Line"
.el .Sh "Switches On the ``#!'' Line"
.IX Subsection "Switches On the #! Line"
When you make a script executable, in order to make it usable as a
command, the system will pass switches to perl from the script's #!
line. Perl checks that any command line switches given to a setuid
(or setgid) script actually match the ones set on the #! line. Some
Unix and Unix-like environments impose a one-switch limit on the #!
line, so you may need to use something like \f(CW\*(C`\-wU\*(C'\fR instead of \f(CW\*(C`\-w \-U\*(C'\fR
under such systems. (This issue should arise only in Unix or
Unix-like environments that support #! and setuid or setgid scripts.)
.ie n .Sh "Taint mode and @INC"
.el .Sh "Taint mode and \f(CW@INC\fP"
.IX Subsection "Taint mode and @INC"
When the taint mode (\f(CW\*(C`\-T\*(C'\fR) is in effect, the \*(L".\*(R" directory is removed
from \f(CW@INC\fR, and the environment variables \f(CW\*(C`PERL5LIB\*(C'\fR and \f(CW\*(C`PERLLIB\*(C'\fR
are ignored by Perl. You can still adjust \f(CW@INC\fR from outside the
program by using the \f(CW\*(C`\-I\*(C'\fR command line option as explained in
perlrun. The two environment variables are ignored because
they are obscured, and a user running a program could be unaware that
they are set, whereas the \f(CW\*(C`\-I\*(C'\fR option is clearly visible and
therefore permitted.
.PP
Another way to modify \f(CW@INC\fR without modifying the program is to use
the \f(CW\*(C`lib\*(C'\fR pragma, e.g.:
.PP
.Vb 1
\&    perl -Mlib=/foo program
.Ve
.PP
The benefit of using \f(CW\*(C`\-Mlib=/foo\*(C'\fR over \f(CW\*(C`\-I/foo\*(C'\fR is that the former
will automatically remove any duplicated directories, while the latter
will not.
.PP
Note that if a tainted string is added to \f(CW@INC\fR, the following
problem will be reported:
.PP
.Vb 1
\&    Insecure dependency in require while running with -T switch
.Ve
.Sh "Cleaning Up Your Path"
.IX Subsection "Cleaning Up Your Path"
For "Insecure \f(CW$ENV{PATH}\fR" messages, you need to set \f(CW$ENV{'PATH'}\fR to
a known value, and each directory in the path must be absolute and
non-writable by others than its owner and group. You may be surprised to
get this message even if the pathname to your executable is fully
qualified. This is \fInot\fR generated because you didn't supply a full path
to the program; instead, it's generated because you never set your \s-1PATH\s0
environment variable, or you didn't set it to something that was safe.
Because Perl can't guarantee that the executable in question isn't itself
going to turn around and execute some other program that is dependent on
your \s-1PATH\s0, it makes sure you set the \s-1PATH\s0.
.PP
The \s-1PATH\s0 isn't the only environment variable which can cause problems.
Because some shells may use the variables \s-1IFS\s0, \s-1CDPATH\s0, \s-1ENV\s0, and
\&\s-1BASH_ENV\s0, Perl checks that those are either empty or untainted when
starting subprocesses. You may wish to add something like this to your
setid and taint-checking scripts.
.PP
.Vb 1
\&    delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};   # Make %ENV safer
.Ve
.PP
It's also possible to get into trouble with other operations that don't
care whether they use tainted values. Make judicious use of the file
tests in dealing with any user-supplied filenames. When possible, do
opens and such \fBafter\fR properly dropping any special user (or group!)
privileges. Perl doesn't prevent you from opening tainted filenames for reading,
so be careful what you print out. The tainting mechanism is intended to
prevent stupid mistakes, not to remove the need for thought.
.PP
Perl does not call the shell to expand wildcards when you pass \f(CW\*(C`system\*(C'\fR
and \f(CW\*(C`exec\*(C'\fR explicit parameter lists instead of strings with possible shell
wildcards in them. Unfortunately, the \f(CW\*(C`open\*(C'\fR, \f(CW\*(C`glob\*(C'\fR, and
backtick functions provide no such alternate calling convention, so more
subterfuge will be required.
.PP
Perl provides a reasonably safe way to open a file or pipe from a setuid
or setgid program: just create a child process with reduced privilege that
does the dirty work for you. First, fork a child using the special
\&\f(CW\*(C`open\*(C'\fR syntax that connects the parent and child by a pipe. Now the
child resets its \s-1ID\s0 set and any other per-process attributes, like
environment variables, umasks, current working directories, back to the
originals or known safe values. Then the child process, which no longer
has any special permissions, does the \f(CW\*(C`open\*(C'\fR or other system call.
Finally, the child passes the data it managed to access back to the
parent. Because the file or pipe was opened in the child while running
under less privilege than the parent, it's not apt to be tricked into
doing something it shouldn't.
.PP
Here's a way to do backticks reasonably safely. Notice how the \f(CW\*(C`exec\*(C'\fR is
not called with a string that the shell could expand. This is by far the
best way to call something that might be subjected to shell escapes: just
never call the shell at all.
.PP
.Vb 25
\&    use English '-no_match_vars';
\&    die "Can't fork: $!" unless defined($pid = open(KID, "-|"));
\&    if ($pid) {           # parent
\&        while (<KID>) {
\&            # do something
\&        }
\&        close KID;
\&    } else {
\&        my @temp     = ($EUID, $EGID);
\&        my $orig_uid = $UID;
\&        my $orig_gid = $GID;
\&        $EUID = $UID;
\&        $EGID = $GID;
\&        # Drop privileges
\&        $UID  = $orig_uid;
\&        $GID  = $orig_gid;
\&        # Make sure privs are really gone
\&        ($EUID, $EGID) = @temp;
\&        die "Can't drop privileges"
\&            unless $UID == $EUID && $GID eq $EGID;
\&        $ENV{PATH} = "/bin:/usr/bin"; # Minimal PATH.
\&        # Consider sanitizing the environment even more.
\&        exec 'myprog', 'arg1', 'arg2'
\&            or die "can't exec myprog: $!";
\&    }
.Ve
.PP
A similar strategy would work for wildcard expansion via \f(CW\*(C`glob\*(C'\fR, although
you can use \f(CW\*(C`readdir\*(C'\fR instead.
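.PP
The \f(CW\*(C`readdir\*(C'\fR alternative might look like the sketch below, which
lists the \fI*.c\fR files of a directory without involving a shell. The helper
name \f(CW\*(C`c_files()\*(C'\fR is hypothetical. As noted earlier, the names that
\&\f(CW\*(C`readdir\*(C'\fR returns are still tainted under taint mode.
.PP
.Vb 13
\&    use strict;
\&    use warnings;
\&
\&    # Hypothetical helper: list the *.c files in $dir without a shell.
\&    sub c_files {
\&        my ($dir) = @_;
\&        opendir(my $dh, $dir) or die "Can't opendir $dir: $!";
\&        my @files = grep { /\e.c$/ } readdir($dh);
\&        closedir($dh);
\&        return sort @files;
\&    }
\&
\&    print "$_\en" for c_files('.');
.Ve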
.PP
Taint checking is most useful when, although you trust yourself not to have
written a program to give away the farm, you don't necessarily trust those
who end up using it not to try to trick it into doing something bad. This
is the kind of security checking that's useful for set-id programs and
programs launched on someone else's behalf, like \s-1CGI\s0 programs.
.PP
This is quite different, however, from not even trusting the writer of the
code not to try to do something evil. That's the kind of trust needed
when someone hands you a program you've never seen before and says, \*(L"Here,
run this.\*(R" For that kind of safety, check out the Safe module,
included standard in the Perl distribution. This module allows the
programmer to set up special compartments in which all system operations
are trapped and namespace access is carefully controlled.
.Sh "Security Bugs"
.IX Subsection "Security Bugs"
Beyond the obvious problems that stem from giving special privileges to
systems as flexible as scripts, on many versions of Unix, set-id scripts
are inherently insecure right from the start. The problem is a race
condition in the kernel. Between the time the kernel opens the file to
see which interpreter to run and when the (now\-set\-id) interpreter turns
around and reopens the file to interpret it, the file in question may have
changed, especially if you have symbolic links on your system.
.PP
Fortunately, sometimes this kernel \*(L"feature\*(R" can be disabled.
Unfortunately, there are two ways to disable it. The system can simply
outlaw scripts with any set-id bit set, which doesn't help much.
Alternatively, it can simply ignore the set-id bits on scripts. If the
latter is true, Perl can emulate the setuid and setgid mechanism when it
notices the otherwise useless setuid/gid bits on Perl scripts. It does
this via a special executable called \fIsuidperl\fR that is automatically
invoked for you if it's needed.
.PP
However, if the kernel set-id script feature isn't disabled, Perl will
complain loudly that your set-id script is insecure. You'll need to
either disable the kernel set-id script feature, or put a C wrapper around
the script. A C wrapper is just a compiled program that does nothing
except call your Perl program. Compiled programs are not subject to the
kernel bug that plagues set-id scripts. Here's a simple wrapper, written
in C:
.PP
.Vb 8
\&    #include <unistd.h>
\&
\&    #define REAL_PATH "/path/to/script"
\&    int main(int ac, char **av)
\&    {
\&        execv(REAL_PATH, av);
\&        return 1;   /* reached only if execv() fails */
\&    }
.Ve
.PP
Compile this wrapper into a binary executable and then make \fIit\fR rather
than your script setuid or setgid.
.PP
In recent years, vendors have begun to supply systems free of this
inherent security bug. On such systems, when the kernel passes the name
of the set-id script to open to the interpreter, rather than using a
pathname subject to meddling, it instead passes \fI/dev/fd/3\fR. This is a
special file already opened on the script, so that there can be no race
condition for evil scripts to exploit. On these systems, Perl should be
compiled with \f(CW\*(C`\-DSETUID_SCRIPTS_ARE_SECURE_NOW\*(C'\fR. The \fIConfigure\fR
program that builds Perl tries to figure this out for itself, so you
should never have to specify this yourself. Most modern releases of
SysVr4 and \s-1BSD\s0 4.4 use this approach to avoid the kernel race condition.
.PP
Prior to release 5.6.1 of Perl, bugs in the code of \fIsuidperl\fR could
introduce a security hole.
.Sh "Protecting Your Programs"
.IX Subsection "Protecting Your Programs"
There are a number of ways to hide the source to your Perl programs,
with varying levels of \*(L"security\*(R".
.PP
First of all, however, you \fIcan't\fR take away read permission, because
the source code has to be readable in order to be compiled and
interpreted. (That doesn't mean that a \s-1CGI\s0 script's source is
readable by people on the web, though.) So you have to leave the
permissions at the socially friendly 0755 level. This lets only
people on your local system see your source.
.PP
Some people mistakenly regard this as a security problem. If your program does
insecure things, and relies on people not knowing how to exploit those
insecurities, it is not secure. It is often possible for someone to
determine the insecure things and exploit them without viewing the
source. Security through obscurity, the name for hiding your bugs
instead of fixing them, is little security indeed.
.PP
You can try using encryption via source filters (Filter::* from \s-1CPAN\s0,
or Filter::Util::Call and Filter::Simple since Perl 5.8).
But crackers might be able to decrypt it. You can try using the byte
code compiler and interpreter described below, but crackers might be
able to de-compile it. You can try using the native-code compiler
described below, but crackers might be able to disassemble it. These
pose varying degrees of difficulty to people wanting to get at your
code, but none can definitively conceal it (this is true of every
language, not just Perl).
.PP
If you're concerned about people profiting from your code, then the
bottom line is that nothing but a restrictive licence will give you
legal security. License your software and pepper it with threatening
statements like \*(L"This is unpublished proprietary software of \s-1XYZ\s0 Corp.
Your access to it does not give you permission to use it blah blah
blah.\*(R" You should see a lawyer to be sure your licence's wording will
stand up in court.
.Sh "Unicode"
.IX Subsection "Unicode"
Unicode is a new and complex technology and one may easily overlook
certain security pitfalls. See perluniintro for an overview and
perlunicode for details, and \*(L"Security Implications of Unicode\*(R" in perlunicode for security implications in particular.
.Sh "Algorithmic Complexity Attacks"
.IX Subsection "Algorithmic Complexity Attacks"
Certain internal algorithms used in the implementation of Perl can
be attacked by choosing the input carefully to consume large amounts
of either time or space or both. This can lead to so-called
\&\fIDenial of Service\fR (DoS) attacks.
.IP "\(bu" 4
Hash Function \- the algorithm used to \*(L"order\*(R" hash elements has been
changed several times during the development of Perl, mainly to be
reasonably fast. In Perl 5.8.1 the security aspect was also taken
into account.
.Sp
In Perls before 5.8.1 one could rather easily generate data that, as
hash keys, would cause Perl to consume large amounts of time because
the internal structure of hashes would badly degenerate. In Perl 5.8.1
the hash function is randomly perturbed by a pseudorandom seed, which
makes generating such naughty hash keys harder.
See \*(L"\s-1PERL_HASH_SEED\s0\*(R" in perlrun for more information.
.Sp
The random perturbation is done by default, but if one wants for some
reason to emulate the old behaviour one can set the environment variable
\&\s-1PERL_HASH_SEED\s0 to zero (or any other integer). One possible reason
for wanting to emulate the old behaviour is that in the new behaviour
consecutive runs of Perl will order hash keys differently, which may
confuse some applications (like Data::Dumper: the outputs of two
different runs are no longer identical).
.Sp
\&\fBPerl has never guaranteed any ordering of the hash keys\fR, and the
ordering has already changed several times during the lifetime of
Perl 5. Also, the ordering of hash keys has always been, and
continues to be, affected by the insertion order.
.Sp
Also note that while the order of the hash elements might be
randomised, this \*(L"pseudoordering\*(R" should \fBnot\fR be used for
applications like shuffling a list randomly (use \fIList::Util::shuffle()\fR
for that, see List::Util, a standard core module since Perl 5.8.0;
or the \s-1CPAN\s0 module Algorithm::Numerical::Shuffle), or for generating
permutations (use e.g. the \s-1CPAN\s0 modules Algorithm::Permute or
Algorithm::FastPermute), or for any cryptographic applications.
.IP "\(bu" 4
Regular expressions \- Perl's regular expression engine is a so-called
\&\s-1NFA\s0 (Nondeterministic Finite Automaton), which among other things means that it can
rather easily consume large amounts of both time and space if the
regular expression may match in several ways. Careful crafting of the
regular expressions can help but quite often there really isn't much
one can do (the book \*(L"Mastering Regular Expressions\*(R" is required
reading, see perlfaq2). Running out of space manifests itself by
Perl running out of memory.
.IP "\(bu" 4
Sorting \- the quicksort algorithm used in Perls before 5.8.0 to
implement the \fIsort()\fR function is very easy to trick into misbehaving
so that it consumes a lot of time. Nothing more is required than
re-sorting a list that is already sorted. Starting from Perl 5.8.0 a different
sorting algorithm, mergesort, is used. Mergesort is insensitive to
its input data, so it cannot be similarly fooled.
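.PP
For programs that were relying on a particular hash ordering, the usual
remedy is to impose an order explicitly. The sketch below, using a
hypothetical \f(CW%age\fR hash, sorts the keys for repeatable output and uses
\&\fIList::Util::shuffle()\fR where a random order is actually wanted.
.PP
.Vb 14
\&    use strict;
\&    use warnings;
\&    use List::Util qw(shuffle);
\&
\&    my %age = (alice => 31, bob => 27, carol => 45);
\&
\&    # Sort the keys when output must be identical from run to run.
\&    for my $name (sort keys %age) {
\&        print "$name => $age{$name}\en";
\&    }
\&
\&    # Use shuffle(), not hash ordering, when a random order is wanted.
\&    my @random_order = shuffle(keys %age);
\&    print scalar(@random_order), " names shuffled\en";
.Ve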
.PP
See <http://www.cs.rice.edu/~scrosby/hash/> for more information,
and any computer science textbook on algorithmic complexity.
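.PP
To illustrate what \*(L"may match in several ways\*(R" means for regular
expressions, the sketch below compares two hypothetical patterns that
accept the same strings. The first nests one quantifier inside another,
so a run of \*(L"a\*(R" characters can be divided up in many ways; the second
leaves only one way. The inputs are deliberately short, and how much a
given perl actually backtracks on longer ones depends on its optimizations.
.PP
.Vb 11
\&    use strict;
\&    use warnings;
\&
\&    my $nested = qr/^(?:a+)+b$/;   # many ways to split a run of "a"
\&    my $flat   = qr/^a+b$/;        # same strings, one way to match
\&
\&    for my $input ('aaab', 'aaa', 'b') {
\&        my $n = $input =~ $nested ? 1 : 0;
\&        my $f = $input =~ $flat   ? 1 : 0;
\&        print "$input: nested=$n flat=$f\en";
\&    }
.Ve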
.SH "SEE ALSO"
.IX Header "SEE ALSO"
perlrun for its description of cleaning up environment variables.