git.subgeniuskitty.com - OpenSPARC-T2-SAM/.git/blame_incremental - sam-t2/devtools/v8plus/lib/perl5/5.8.8/pod/perlsec.pod

Commit	Line	Data
	1	=head1 NAME
	2
	3	perlsec - Perl security
	4
	5	=head1 DESCRIPTION
	6
	7	Perl is designed to make it easy to program securely even when running
	8	with extra privileges, like setuid or setgid programs. Unlike most
	9	command line shells, which are based on multiple substitution passes on
	10	each line of the script, Perl uses a more conventional evaluation scheme
	11	with fewer hidden snags. Additionally, because the language has more
	12	builtin functionality, it can rely less upon external (and possibly
	13	untrustworthy) programs to accomplish its purposes.
	14
	15	Perl automatically enables a set of special security checks, called I<taint
	16	mode>, when it detects its program running with differing real and effective
	17	user or group IDs. The setuid bit in Unix permissions is mode 04000, the
	18	setgid bit mode 02000; either or both may be set. You can also enable taint
	19	mode explicitly by using the B<-T> command line flag. This flag is
	20	I<strongly> suggested for server programs and any program run on behalf of
	21	someone else, such as a CGI script. Once taint mode is on, it's on for
	22	the remainder of your script.
	23
	24	While in this mode, Perl takes special precautions called I<taint
	25	checks> to prevent both obvious and subtle traps. Some of these checks
	26	are reasonably simple, such as verifying that path directories aren't
	27	writable by others; careful programmers have always used checks like
	28	these. Other checks, however, are best supported by the language itself,
	29	and it is these checks especially that contribute to making a set-id Perl
	30	program more secure than the corresponding C program.
	31
	32	You may not use data derived from outside your program to affect
	33	something else outside your program--at least, not by accident. All
	34	command line arguments, environment variables, locale information (see
	35	L<perllocale>), results of certain system calls (C<readdir()>,
	36	C<readlink()>, the variable of C<shmread()>, the messages returned by
	37	C<msgrcv()>, the password, gcos and shell fields returned by the
	38	C<getpwxxx()> calls), and all file input are marked as "tainted".
	39	Tainted data may not be used directly or indirectly in any command
	40	that invokes a sub-shell, nor in any command that modifies files,
	41	directories, or processes, B<with the following exceptions>:
	42
	43	=over 4
	44
	45	=item *
	46
	47	Arguments to C<print> and C<syswrite> are B<not> checked for taintedness.
	48
	49	=item *
	50
	51	Symbolic methods
	52
	53	    $obj->$method(@args);
	54
	55	and symbolic sub references
	56
	57	    &{$foo}(@args);
	58	    $foo->(@args);
	59
	60	are not checked for taintedness. This requires extra care
	61	unless you want external data to affect your control flow. Unless
	62	you carefully limit what these symbolic values are, people are able
	63	to call functions B<outside> your Perl code, such as POSIX::system,
	64	in which case they are able to run arbitrary external code.
	65
	66	=back
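
One way to limit what a symbolic name can reach is to look it up in a
fixed table rather than calling it directly. The following is only a
minimal sketch: the C<%dispatch> table, its two handlers, and the
C<run_command()> helper are hypothetical names invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical handlers: only these names can be reached from outside input.
my %dispatch = (
    status  => sub { return "status: ok" },
    version => sub { return "version: 1" },
);

# Look an externally supplied name up in the fixed table instead of
# calling &{$name}() or $obj->$name() on it directly.
sub run_command {
    my ($name, @args) = @_;
    my $handler = defined $name ? $dispatch{$name} : undef;
    die "Unknown command\n" unless $handler;
    return $handler->(@args);
}

print run_command('status'), "\n";
```

A name such as C<POSIX::system> is simply absent from the table, so the
call dies instead of reaching code outside the program.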
	67
	68	For efficiency reasons, Perl takes a conservative view of
	69	whether data is tainted. If an expression contains tainted data,
	70	any subexpression may be considered tainted, even if the value
	71	of the subexpression is not itself affected by the tainted data.
	72
	73	Because taintedness is associated with each scalar value, some
	74	elements of an array or hash can be tainted and others not.
	75	The keys of a hash are never tainted.
	76
	77	For example:
	78
	79	    $arg = shift;               # $arg is tainted
	80	    $hid = $arg, 'bar';         # $hid is also tainted
	81	    $line = <>;                 # Tainted
	82	    $line = <STDIN>;            # Also tainted
	83	    open FOO, "/home/me/bar" or die $!;
	84	    $line = <FOO>;              # Still tainted
	85	    $path = $ENV{'PATH'};       # Tainted, but see below
	86	    $data = 'abc';              # Not tainted
	87
	88	    system "echo $arg";         # Insecure
	89	    system "/bin/echo", $arg;   # Considered insecure
	90	                                # (Perl doesn't know about /bin/echo)
	91	    system "echo $hid";         # Insecure
	92	    system "echo $data";        # Insecure until PATH set
	93
	94	    $path = $ENV{'PATH'};       # $path now tainted
	95
	96	    $ENV{'PATH'} = '/bin:/usr/bin';
	97	    delete @ENV{'IFS', 'CDPATH', 'ENV', 'BASH_ENV'};
	98
	99	    $path = $ENV{'PATH'};       # $path now NOT tainted
	100	    system "echo $data";        # Is secure now!
	101
	102	    open(FOO, "< $arg");        # OK - read-only file
	103	    open(FOO, "> $arg");        # Not OK - trying to write
	104
	105	    open(FOO,"echo $arg|");     # Not OK
	106	    open(FOO,"-|")
	107	        or exec 'echo', $arg;   # Also not OK
	108
	109	    $shout = `echo $arg`;       # Insecure, $shout now tainted
	110
	111	    unlink $data, $arg;         # Insecure
	112	    umask $arg;                 # Insecure
	113
	114	    exec "echo $arg";           # Insecure
	115	    exec "echo", $arg;          # Insecure
	116	    exec "sh", '-c', $arg;      # Very insecure!
	117
	118	    @files = <*.c>;             # insecure (uses readdir() or similar)
	119	    @files = glob('*.c');       # insecure (uses readdir() or similar)
	120
	121	    # In Perl releases older than 5.6.0 the <*.c> and glob('*.c') would
	122	    # have used an external program to do the filename expansion; but in
	123	    # either case the result is tainted since the list of filenames comes
	124	    # from outside of the program.
	125
	126	    $bad = ($arg, 23);          # $bad will be tainted
	127	    $arg, `true`;               # Insecure (although it isn't really)
	128
	129	If you try to do something insecure, you will get a fatal error saying
	130	something like "Insecure dependency" or "Insecure $ENV{PATH}".
	131
	132	The exception to the principle of "one tainted value taints the whole
	133	expression" is with the ternary conditional operator C<?:>. Since code
	134	with a ternary conditional
	135
	136	    $result = $tainted_value ? "Untainted" : "Also untainted";
	137
	138	is effectively
	139
	140	    if ( $tainted_value ) {
	141	        $result = "Untainted";
	142	    } else {
	143	        $result = "Also untainted";
	144	    }
	145
	146	it doesn't make sense for C<$result> to be tainted.
	147
	148	=head2 Laundering and Detecting Tainted Data
	149
	150	To test whether a variable contains tainted data, and whether using it
	151	would thus trigger an "Insecure dependency" message, you can use the
	152	C<tainted()> function of the Scalar::Util module, available from your
	153	nearest CPAN mirror and included in Perl starting with release 5.8.0.
	154	Or you may be able to use the following C<is_tainted()> function.
	155
	156	    sub is_tainted {
	157	        return ! eval { eval("#" . substr(join("", @_), 0, 0)); 1 };
	158	    }
	159
	160	This function makes use of the fact that the presence of tainted data
	161	anywhere within an expression renders the entire expression tainted. It
	162	would be inefficient for every operator to test every argument for
	163	taintedness. Instead, the slightly more efficient and conservative
	164	approach is used that if any tainted value has been accessed within the
	165	same expression, the whole expression is considered tainted.
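
The Scalar::Util route might look like the sketch below. The
C<describe()> helper is a hypothetical name used only for illustration.
When the program is not running in taint mode, C<tainted()> reports
every value as untainted, so the check is harmless but uninformative.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Describe whether a value is currently marked as tainted.
# Without -T nothing is tainted, so every value reports "not tainted".
sub describe {
    my ($label, $value) = @_;
    return sprintf "%s is %s", $label,
        tainted($value) ? "tainted" : "not tainted";
}

print describe('literal', 'abc'), "\n";
print describe('PATH', defined $ENV{PATH} ? $ENV{PATH} : ''), "\n";
```

A string literal defined in the program is never tainted, whereas an
environment variable is reported as tainted when C<-T> is in effect.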
	166
	167	But testing for taintedness gets you only so far. Sometimes you just have
	168	to clear your data's taintedness. Values may be untainted by using them
	169	as keys in a hash; otherwise the only way to bypass the tainting
	170	mechanism is by referencing subpatterns from a regular expression match.
	171	Perl presumes that if you reference a substring using $1, $2, etc., then
	172	you knew what you were doing when you wrote the pattern. That means using
	173	a bit of thought--don't just blindly untaint anything, or you defeat the
	174	entire mechanism. It's better to verify that the variable has only good
	175	characters (for certain values of "good") rather than checking whether it
	176	has any bad characters. That's because it's far too easy to miss bad
	177	characters that you never thought of.
	178
	179	Here's a test to make sure that the data contains nothing but "word"
	180	characters (alphabetics, numerics, and underscores), a hyphen, an at sign,
	181	or a dot.
	182
	183	    if ($data =~ /^([-\@\w.]+)$/) {
	184	        $data = $1;                     # $data now untainted
	185	    } else {
	186	        die "Bad data in '$data'";      # log this somewhere
	187	    }
	188
	189	This is fairly secure because C</\w+/> doesn't normally match shell
	190	metacharacters, nor are dot, dash, or at going to mean something special
	191	to the shell. Use of C</.+/> would have been insecure in theory because
	192	it lets everything through, but Perl doesn't check for that. The lesson
	193	is that when untainting, you must be exceedingly careful with your patterns.
	194	Laundering data using a regular expression is the I<only> mechanism for
	195	untainting dirty data, unless you use the strategy detailed below to fork
	196	a child of lesser privilege.
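
The same allow-list test can be wrapped in a small reusable routine.
This is a sketch only: C<launder_name()> is a hypothetical helper, and
anchoring with C<\A> and C<\z> is a choice made here so that input with
a trailing newline is rejected rather than silently trimmed.

```perl
use strict;
use warnings;

# Return a laundered copy of $input if it consists only of word
# characters, hyphens, at signs and dots; otherwise return undef.
sub launder_name {
    my ($input) = @_;
    return unless defined $input;
    if ($input =~ /\A([-\@\w.]+)\z/) {
        return $1;    # the captured substring is untainted
    }
    return;
}

for my $candidate ('user@example.com', 'a; rm -rf /') {
    my $clean = launder_name($candidate);
    print defined $clean ? "accepted: $clean\n" : "rejected\n";
}
```

Returning undef instead of dying lets the caller decide how to log or
report the rejected value.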
	197
	198	The example does not untaint C<$data> if C<use locale> is in effect,
	199	because the characters matched by C<\w> are determined by the locale.
	200	Perl considers that locale definitions are untrustworthy because they
	201	contain data from outside the program. If you are writing a
	202	locale-aware program, and want to launder data with a regular expression
	203	containing C<\w>, put C<no locale> ahead of the expression in the same
	204	block. See L<perllocale/SECURITY> for further discussion and examples.
	205
	206	=head2 Switches On the "#!" Line
	207
	208	When you make a script executable, in order to make it usable as a
	209	command, the system will pass switches to perl from the script's #!
	210	line. Perl checks that any command line switches given to a setuid
	211	(or setgid) script actually match the ones set on the #! line. Some
	212	Unix and Unix-like environments impose a one-switch limit on the #!
	213	line, so you may need to use something like C<-wU> instead of C<-w -U>
	214	under such systems. (This issue should arise only in Unix or
	215	Unix-like environments that support #! and setuid or setgid scripts.)
	216
	217	=head2 Taint mode and @INC
	218
	219	When the taint mode (C<-T>) is in effect, the "." directory is removed
	220	from C<@INC>, and the environment variables C<PERL5LIB> and C<PERLLIB>
	221	are ignored by Perl. You can still adjust C<@INC> from outside the
	222	program by using the C<-I> command line option as explained in
	223	L<perlrun>. The two environment variables are ignored because
	224	they are obscured, and a user running a program could be unaware that
	225	they are set, whereas the C<-I> option is clearly visible and
	226	therefore permitted.
	227
	228	Another way to modify C<@INC> without modifying the program is to use
	229	the C<lib> pragma, e.g.:
	230
	231	    perl -Mlib=/foo program
	232
	233	The benefit of using C<-Mlib=/foo> over C<-I/foo> is that the former
	234	will automagically remove any duplicated directories, while the latter
	235	will not.
	236
	237	Note that if a tainted string is added to C<@INC>, the following
	238	problem will be reported:
	239
	240	    Insecure dependency in require while running with -T switch
	241
	242	=head2 Cleaning Up Your Path
	243
	244	For "Insecure C<$ENV{PATH}>" messages, you need to set C<$ENV{'PATH'}> to
	245	a known value, and each directory in the path must be absolute and
	246	non-writable by others than its owner and group. You may be surprised to
	247	get this message even if the pathname to your executable is fully
	248	qualified. This is I<not> generated because you didn't supply a full path
	249	to the program; instead, it's generated because you never set your PATH
	250	environment variable, or you didn't set it to something that was safe.
	251	Because Perl can't guarantee that the executable in question isn't itself
	252	going to turn around and execute some other program that is dependent on
	253	your PATH, it makes sure you set the PATH.
	254
	255	The PATH isn't the only environment variable which can cause problems.
	256	Because some shells may use the variables IFS, CDPATH, ENV, and
	257	BASH_ENV, Perl checks that those are either empty or untainted when
	258	starting subprocesses. You may wish to add something like this to your
	259	setid and taint-checking scripts.
	260
	261	    delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};   # Make %ENV safer
	262
	263	It's also possible to get into trouble with other operations that don't
	264	care whether they use tainted values. Make judicious use of the file
	265	tests in dealing with any user-supplied filenames. When possible, do
	266	opens and such B<after> properly dropping any special user (or group!)
	267	privileges. Perl doesn't prevent you from opening tainted filenames for reading,
	268	so be careful what you print out. The tainting mechanism is intended to
	269	prevent stupid mistakes, not to remove the need for thought.
	270
	271	Perl does not call the shell to expand wild cards when you pass C<system>
	272	and C<exec> explicit parameter lists instead of strings with possible shell
	273	wildcards in them. Unfortunately, the C<open>, C<glob>, and
	274	backtick functions provide no such alternate calling convention, so more
	275	subterfuge will be required.
	276
	277	Perl provides a reasonably safe way to open a file or pipe from a setuid
	278	or setgid program: just create a child process with reduced privilege that
	279	does the dirty work for you. First, fork a child using the special
	280	C<open> syntax that connects the parent and child by a pipe. Now the
	281	child resets its ID set and any other per-process attributes, like
	282	environment variables, umasks, current working directories, back to the
	283	originals or known safe values. Then the child process, which no longer
	284	has any special permissions, does the C<open> or other system call.
	285	Finally, the child passes the data it managed to access back to the
	286	parent. Because the file or pipe was opened in the child while running
	287	under less privilege than the parent, it's not apt to be tricked into
	288	doing something it shouldn't.
	289
	290	Here's a way to do backticks reasonably safely. Notice how the C<exec> is
	291	not called with a string that the shell could expand. This is by far the
	292	best way to call something that might be subjected to shell escapes: just
	293	never call the shell at all.
	294
	295	    use English '-no_match_vars';
	296	    die "Can't fork: $!" unless defined($pid = open(KID, "-|"));
	297	    if ($pid) {           # parent
	298	        while (<KID>) {
	299	            # do something
	300	        }
	301	        close KID;
	302	    } else {
	303	        my @temp     = ($EUID, $EGID);
	304	        my $orig_uid = $UID;
	305	        my $orig_gid = $GID;
	306	        $EUID = $UID;
	307	        $EGID = $GID;
	308	        # Drop privileges
	309	        $UID  = $orig_uid;
	310	        $GID  = $orig_gid;
	311	        # Make sure privs are really gone
	312	        ($EUID, $EGID) = @temp;
	313	        die "Can't drop privileges"
	314	            unless $UID == $EUID  && $GID eq $EGID;
	315	        $ENV{PATH} = "/bin:/usr/bin"; # Minimal PATH.
	316	        # Consider sanitizing the environment even more.
	317	        exec 'myprog', 'arg1', 'arg2'
	318	            or die "can't exec myprog: $!";
	319	    }
	320
	321	A similar strategy would work for wildcard expansion via C<glob>, although
	322	you can use C<readdir> instead.
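
The C<readdir> alternative might be sketched as follows. The
C<files_with_suffix()> helper is a hypothetical name, and the names it
returns are still tainted in taint mode because they come from outside
the program.

```perl
use strict;
use warnings;

# List the plain files in $dir whose names end in $suffix, using
# opendir/readdir so that no shell or external program is involved.
sub files_with_suffix {
    my ($dir, $suffix) = @_;
    opendir(my $dh, $dir) or die "Can't opendir $dir: $!";
    my @names = sort grep { /\Q$suffix\E\z/ && -f "$dir/$_" } readdir($dh);
    closedir($dh);
    return @names;
}

print "$_\n" for files_with_suffix('.', '.c');
```

The C<\Q...\E> quoting keeps the dot in the suffix from acting as a
regular expression metacharacter.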
	323
	324	Taint checking is most useful when, although you trust yourself not to have
	325	written a program to give away the farm, you don't necessarily trust those
	326	who end up using it not to try to trick it into doing something bad. This
	327	is the kind of security checking that's useful for set-id programs and
	328	programs launched on someone else's behalf, like CGI programs.
	329
	330	This is quite different, however, from not even trusting the writer of the
	331	code not to try to do something evil. That's the kind of trust needed
	332	when someone hands you a program you've never seen before and says, "Here,
	333	run this." For that kind of safety, check out the Safe module,
	334	included standard in the Perl distribution. This module allows the
	335	programmer to set up special compartments in which all system operations
	336	are trapped and namespace access is carefully controlled.
	337
	338	=head2 Security Bugs
	339
	340	Beyond the obvious problems that stem from giving special privileges to
	341	systems as flexible as scripts, on many versions of Unix, set-id scripts
	342	are inherently insecure right from the start. The problem is a race
	343	condition in the kernel. Between the time the kernel opens the file to
	344	see which interpreter to run and when the (now-set-id) interpreter turns
	345	around and reopens the file to interpret it, the file in question may have
	346	changed, especially if you have symbolic links on your system.
	347
	348	Fortunately, sometimes this kernel "feature" can be disabled.
	349	Unfortunately, there are two ways to disable it. The system can simply
	350	outlaw scripts with any set-id bit set, which doesn't help much.
	351	Alternatively, it can simply ignore the set-id bits on scripts. If the
	352	latter is true, Perl can emulate the setuid and setgid mechanism when it
	353	notices the otherwise useless setuid/gid bits on Perl scripts. It does
	354	this via a special executable called F<suidperl> that is automatically
	355	invoked for you if it's needed.
	356
	357	However, if the kernel set-id script feature isn't disabled, Perl will
	358	complain loudly that your set-id script is insecure. You'll need to
	359	either disable the kernel set-id script feature, or put a C wrapper around
	360	the script. A C wrapper is just a compiled program that does nothing
	361	except call your Perl program. Compiled programs are not subject to the
	362	kernel bug that plagues set-id scripts. Here's a simple wrapper, written
	363	in C:
	364
	365	    #include <unistd.h>
	366	    #define REAL_PATH "/path/to/script"
	367	    int main(int ac, char **av) {
	368	        execv(REAL_PATH, av);
	369	        return 1;   /* reached only if execv() fails */
	370	    }
	371
	372	Compile this wrapper into a binary executable and then make I<it> rather
	373	than your script setuid or setgid.
	374
	375	In recent years, vendors have begun to supply systems free of this
	376	inherent security bug. On such systems, when the kernel passes the name
	377	of the set-id script to open to the interpreter, rather than using a
	378	pathname subject to meddling, it instead passes I</dev/fd/3>. This is a
	379	special file already opened on the script, so that there can be no race
	380	condition for evil scripts to exploit. On these systems, Perl should be
	381	compiled with C<-DSETUID_SCRIPTS_ARE_SECURE_NOW>. The F<Configure>
	382	program that builds Perl tries to figure this out for itself, so you
	383	should never have to specify this yourself. Most modern releases of
	384	SysVr4 and BSD 4.4 use this approach to avoid the kernel race condition.
	385
	386	Prior to release 5.6.1 of Perl, bugs in the code of F<suidperl> could
	387	introduce a security hole.
	388
	389	=head2 Protecting Your Programs
	390
	391	There are a number of ways to hide the source to your Perl programs,
	392	with varying levels of "security".
	393
	394	First of all, however, you I<can't> take away read permission, because
	395	the source code has to be readable in order to be compiled and
	396	interpreted. (That doesn't mean that a CGI script's source is
	397	readable by people on the web, though.) So you have to leave the
	398	permissions at the socially friendly 0755 level. This lets only
	399	people on your local system see your source.
	400
	401	Some people mistakenly regard this as a security problem. If your program does
	402	insecure things, and relies on people not knowing how to exploit those
	403	insecurities, it is not secure. It is often possible for someone to
	404	determine the insecure things and exploit them without viewing the
	405	source. Security through obscurity, the name for hiding your bugs
	406	instead of fixing them, is little security indeed.
	407
	408	You can try using encryption via source filters (Filter::* from CPAN,
	409	or Filter::Util::Call and Filter::Simple since Perl 5.8).
	410	But crackers might be able to decrypt it. You can try using the byte
	411	code compiler and interpreter described below, but crackers might be
	412	able to de-compile it. You can try using the native-code compiler
	413	described below, but crackers might be able to disassemble it. These
	414	pose varying degrees of difficulty to people wanting to get at your
	415	code, but none can definitively conceal it (this is true of every
	416	language, not just Perl).
	417
	418	If you're concerned about people profiting from your code, then the
	419	bottom line is that nothing but a restrictive licence will give you
	420	legal security. License your software and pepper it with threatening
	421	statements like "This is unpublished proprietary software of XYZ Corp.
	422	Your access to it does not give you permission to use it blah blah
	423	blah." You should see a lawyer to be sure your licence's wording will
	424	stand up in court.
	425
	426	=head2 Unicode
	427
	428	Unicode is a new and complex technology and one may easily overlook
	429	certain security pitfalls. See L<perluniintro> for an overview and
	430	L<perlunicode> for details, and L<perlunicode/"Security Implications
	431	of Unicode"> for security implications in particular.
	432
	433	=head2 Algorithmic Complexity Attacks
	434
	435	Certain internal algorithms used in the implementation of Perl can
	436	be attacked by choosing the input carefully to consume large amounts
	437	of either time or space or both. This can lead to so-called
	438	I<Denial of Service> (DoS) attacks.
	439
	440	=over 4
	441
	442	=item *
	443
	444	Hash Function - the algorithm used to "order" hash elements has been
	445	changed several times during the development of Perl, mainly to be
	446	reasonably fast. In Perl 5.8.1 the security aspect was also taken
	447	into account.
	448
	449	In Perls before 5.8.1 one could rather easily generate data that, as
	450	hash keys, would cause Perl to consume large amounts of time because
	451	the internal structure of hashes would badly degenerate. In Perl 5.8.1
	452	the hash function is randomly perturbed by a pseudorandom seed which
	453	makes generating such naughty hash keys harder.
	454	See L<perlrun/PERL_HASH_SEED> for more information.
	455
	456	The random perturbation is done by default, but if one wants for some
	457	reason to emulate the old behaviour one can set the environment variable
	458	PERL_HASH_SEED to zero (or any other integer). One possible reason
	459	for wanting to emulate the old behaviour is that in the new behaviour
	460	consecutive runs of Perl will order hash keys differently, which may
	461	confuse some applications (like Data::Dumper: the outputs of two
	462	different runs are no longer identical).
	463
	464	B<Perl has never guaranteed any ordering of the hash keys>, and the
	465	ordering has already changed several times during the lifetime of
	466	Perl 5. Also, the ordering of hash keys has always been, and
	467	continues to be, affected by the insertion order.
	468
	469	Also note that while the order of the hash elements might be
	470	randomised, this "pseudoordering" should B<not> be used for
	471	applications like shuffling a list randomly (use List::Util::shuffle()
	472	for that, see L<List::Util>, a standard core module since Perl 5.8.0;
	473	or the CPAN module Algorithm::Numerical::Shuffle), or for generating
	474	permutations (use e.g. the CPAN modules Algorithm::Permute or
	475	Algorithm::FastPermute), or for any cryptographic applications.
	476
	477	=item *
	478
	479	Regular expressions - Perl's regular expression engine is a so-called
	480	NFA (Non-deterministic Finite Automaton), which among other things means that it can
	481	rather easily consume large amounts of both time and space if the
	482	regular expression may match in several ways. Careful crafting of the
	483	regular expressions can help but quite often there really isn't much
	484	one can do (the book "Mastering Regular Expressions" is required
	485	reading, see L<perlfaq2>). Running out of space manifests itself by
	486	Perl running out of memory.
	487
	488	=item *
	489
	490	Sorting - the quicksort algorithm used in Perls before 5.8.0 to
	491	implement the sort() function is very easy to trick into misbehaving
	492	so that it consumes a lot of time. Nothing more is required than
	493	resorting a list already sorted. Starting from Perl 5.8.0 a different
	494	sorting algorithm, mergesort, is used. Mergesort is insensitive to
	495	its input data, so it cannot be similarly fooled.
	496
	497	=back
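
Since hash ordering may differ from run to run, a program that needs
repeatable output can impose its own order rather than relying on
PERL_HASH_SEED. The following is a minimal sketch: C<stable_dump()> and
the sample C<%count> data are hypothetical, and it assumes a
Data::Dumper recent enough to support the C<Sortkeys> setting.

```perl
use strict;
use warnings;
use Data::Dumper;

my %count = (pear => 3, apple => 7, fig => 1);

# Sort the keys explicitly so the output does not depend on the
# (possibly randomised) internal hash order.
for my $key (sort keys %count) {
    print "$key=$count{$key}\n";
}

# Ask Data::Dumper to emit hash keys in sorted order as well.
sub stable_dump {
    my ($hashref) = @_;
    local $Data::Dumper::Sortkeys = 1;
    local $Data::Dumper::Indent   = 1;
    return Dumper($hashref);
}

print stable_dump(\%count);
```

Sorting costs a little time on each traversal, but it keeps test
fixtures and logged dumps comparable between runs.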
	498
	499	See L<http://www.cs.rice.edu/~scrosby/hash/> for more information,
	500	and any computer science textbook on algorithmic complexity.
	501
	502	=head1 SEE ALSO
	503
	504	L<perlrun> for its description of cleaning up environment variables.