| 1 | =head1 NAME |
| 2 | |
| 3 | perlfaq7 - General Perl Language Issues ($Revision: 1.28 $, $Date: 2005/12/31 00:54:37 $) |
| 4 | |
| 5 | =head1 DESCRIPTION |
| 6 | |
| 7 | This section deals with general Perl language issues that don't |
| 8 | clearly fit into any of the other sections. |
| 9 | |
| 10 | =head2 Can I get a BNF/yacc/RE for the Perl language? |
| 11 | |
| 12 | There is no BNF, but you can paw your way through the yacc grammar in |
| 13 | perly.y in the source distribution if you're particularly brave. The |
| 14 | grammar relies on very smart tokenizing code, so be prepared to |
| 15 | venture into toke.c as well. |
| 16 | |
| 17 | In the words of Chaim Frenkel: "Perl's grammar can not be reduced to BNF. |
| 18 | The work of parsing perl is distributed between yacc, the lexer, smoke |
| 19 | and mirrors." |
| 20 | |
| 21 | =head2 What are all these $@%&* punctuation signs, and how do I know when to use them? |
| 22 | |
| 23 | They are type specifiers, as detailed in L<perldata>: |
| 24 | |
| 25 | $ for scalar values (number, string or reference) |
| 26 | @ for arrays |
| 27 | % for hashes (associative arrays) |
| 28 | & for subroutines (aka functions, procedures, methods) |
| 29 | * for all types of that symbol name. In version 4 you used them like |
| 30 | pointers, but in modern perls you can just use references. |
| 31 | |
| 32 | There are a couple of other symbols that you're likely to encounter that aren't |
| 33 | really type specifiers: |
| 34 | |
| 35 | <> are used for inputting a record from a filehandle. |
| 36 | \ takes a reference to something. |
| 37 | |
| 38 | Note that <FILE> is I<neither> the type specifier for files |
| 39 | nor the name of the handle. It is the C<< <> >> operator applied |
| 40 | to the handle FILE. It reads one line (well, record--see |
| 41 | L<perlvar/$E<sol>>) from the handle FILE in scalar context, or I<all> lines |
| 42 | in list context. When performing open, close, or any other operation |
| 43 | besides C<< <> >> on files, or even when talking about the handle, do |
| 44 | I<not> use the brackets. These are correct: C<eof(FH)>, C<seek(FH, 0, |
| 45 | 2)> and "copying from STDIN to FILE". |
| 46 | |
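The context rule above can be seen in a minimal sketch. It assumes Perl 5.8 or later so that an in-memory filehandle can stand in for a real file; the variable names are illustrative only.

```perl
use strict;
use warnings;

# An in-memory filehandle stands in for a real file (an assumption for this demo).
my $text = "first\nsecond\nthird\n";
open(my $fh, '<', \$text) or die "Cannot open in-memory file: $!";

my $one  = <$fh>;    # scalar context: reads one record
my @rest = <$fh>;    # list context: reads all remaining records
close($fh);

chomp($one, @rest);
print "one: $one\n";
print "rest: @rest\n";
```

The same `<>` operator is applied both times; only the context of the assignment differs.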
| 47 | =head2 Do I always/never have to quote my strings or use semicolons and commas? |
| 48 | |
| 49 | Normally, a bareword doesn't need to be quoted, but in most cases |
| 50 | probably should be (and must be under C<use strict>). But a hash key |
| 51 | consisting of a simple word (that isn't the name of a defined |
| 52 | subroutine) and the left-hand operand to the C<< => >> operator both |
| 53 | count as though they were quoted: |
| 54 | |
| 55 | This is like this |
| 56 | ------------ --------------- |
| 57 | $foo{line} $foo{'line'} |
| 58 | bar => stuff 'bar' => stuff |
| 59 | |
| 60 | The final semicolon in a block is optional, as is the final comma in a |
| 61 | list. Good style (see L<perlstyle>) says to put them in except for |
| 62 | one-liners: |
| 63 | |
| 64 | if ($whoops) { exit 1 } |
| 65 | @nums = (1, 2, 3); |
| 66 | |
| 67 | if ($whoops) { |
| 68 | exit 1; |
| 69 | } |
| 70 | @lines = ( |
| 71 | "There Beren came from mountains cold", |
| 72 | "And lost he wandered under leaves", |
| 73 | ); |
| 74 | |
| 75 | =head2 How do I skip some return values? |
| 76 | |
| 77 | One way is to treat the return values as a list and index into it: |
| 78 | |
| 79 | $dir = (getpwnam($user))[7]; |
| 80 | |
| 81 | Another way is to use undef as an element on the left-hand-side: |
| 82 | |
| 83 | ($dev, $ino, undef, undef, $uid, $gid) = stat($file); |
| 84 | |
| 85 | You can also use a list slice to select only the elements that |
| 86 | you need: |
| 87 | |
| 88 | ($dev, $ino, $uid, $gid) = ( stat($file) )[0,1,4,5]; |
| 89 | |
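The three techniques can be tried side by side without touching the password file or the filesystem. In this sketch get_record() is a hypothetical function, not a Perl builtin, that returns a fixed five-element list.

```perl
use strict;
use warnings;

# get_record() is a hypothetical stand-in for a function returning a list.
sub get_record { return ('alice', 'x', 1001, 100, '/home/alice') }

my $home               = (get_record())[4];       # index into the returned list
my ($name, undef, $id) = get_record();            # undef as a placeholder
my ($user, $dir)       = (get_record())[0, -1];   # list slice; negative indices work

print "$home $name $id $user $dir\n";
```

Any elements beyond the last variable on the left-hand side are simply discarded.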
| 90 | =head2 How do I temporarily block warnings? |
| 91 | |
| 92 | If you are running Perl 5.6.0 or better, the C<use warnings> pragma |
| 93 | allows fine control of which warnings are produced. |
| 94 | See L<perllexwarn> for more details. |
| 95 | |
| 96 | { |
| 97 | no warnings; # temporarily turn off warnings |
| 98 | $a = $b + $c; # I know these might be undef |
| 99 | } |
| 100 | |
| 101 | Additionally, you can enable and disable categories of warnings. |
| 102 | You turn off the categories you want to ignore and you can still |
| 103 | get other categories of warnings. See L<perllexwarn> for the |
| 104 | complete details, including the category names and hierarchy. |
| 105 | |
| 106 | { |
| 107 | no warnings 'uninitialized'; |
| 108 | $a = $b + $c; |
| 109 | } |
| 110 | |
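To check that only the named category is silenced, one possible sketch is to collect warnings with a C<$SIG{__WARN__}> handler. The variable names here are illustrative, and the handler is used only to make the effect visible.

```perl
use strict;
use warnings;

# Collect warnings instead of printing them, so the effect can be inspected.
my @caught;
local $SIG{__WARN__} = sub { push @caught, $_[0] };

my ($x, $y);
my $str = "3 apples";
{
    no warnings 'uninitialized';
    my $sum = $x + $y;      # silent: this category is disabled in the block
    my $n   = $str + 1;     # the 'numeric' category still warns
}

print "caught ", scalar(@caught), " warning(s)\n";
print $caught[0];
```

Only the non-numeric string triggers the handler; the undefined values pass silently inside the block.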
| 111 | If you have an older version of Perl, the C<$^W> variable (documented |
| 112 | in L<perlvar>) controls runtime warnings for a block: |
| 113 | |
| 114 | { |
| 115 | local $^W = 0; # temporarily turn off warnings |
| 116 | $a = $b + $c; # I know these might be undef |
| 117 | } |
| 118 | |
| 119 | Note that like all the punctuation variables, you cannot currently |
| 120 | use my() on C<$^W>, only local(). |
| 121 | |
| 122 | =head2 What's an extension? |
| 123 | |
| 124 | An extension is a way of calling compiled C code from Perl. Reading |
| 125 | L<perlxstut> is a good place to learn more about extensions. |
| 126 | |
| 127 | =head2 Why do Perl operators have different precedence than C operators? |
| 128 | |
| 129 | Actually, they don't. All C operators that Perl copies have the same |
| 130 | precedence in Perl as they do in C. The problem is with operators that C |
| 131 | doesn't have, especially functions that give a list context to everything |
| 132 | on their right, e.g., print, chmod, exec, and so on. Such functions are |
| 133 | called "list operators" and appear as such in the precedence table in |
| 134 | L<perlop>. |
| 135 | |
| 136 | A common mistake is to write: |
| 137 | |
| 138 | unlink $file || die "snafu"; |
| 139 | |
| 140 | This gets interpreted as: |
| 141 | |
| 142 | unlink ($file || die "snafu"); |
| 143 | |
| 144 | To avoid this problem, either put in extra parentheses or use the |
| 145 | super low precedence C<or> operator: |
| 146 | |
| 147 | (unlink $file) || die "snafu"; |
| 148 | unlink $file or die "snafu"; |
| 149 | |
| 150 | The "English" operators (C<and>, C<or>, C<xor>, and C<not>) |
| 151 | deliberately have precedence lower than that of list operators for |
| 152 | just such situations as the one above. |
| 153 | |
| 154 | Another operator with surprising precedence is exponentiation. It |
| 155 | binds more tightly even than unary minus, making C<-2**2> produce a |
| 156 | negative, not a positive, four. It is also right-associating, meaning |
| 157 | that C<2**3**2> is two raised to the ninth power, not eight squared. |
| 158 | |
| 159 | Although it has the same precedence as in C, Perl's C<?:> operator |
| 160 | produces an lvalue. This assigns $x to either $a or $b, depending |
| 161 | on the trueness of $maybe: |
| 162 | |
| 163 | ($maybe ? $a : $b) = $x; |
| 164 | |
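The exponentiation and C<?:> rules above are easy to verify directly. This is a small sketch with made-up variable names.

```perl
use strict;
use warnings;

my $neg   = -2**2;      # parsed as -(2**2)
my $power = 2**3**2;    # parsed as 2**(3**2)

my ($left, $right) = (0, 0);
my $pick_left = 1;
($pick_left ? $left : $right) = 42;    # ?: used as an lvalue

print "$neg $power $left $right\n";
```

When in doubt about how an expression groups, adding explicit parentheses costs nothing and documents the intent.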
| 165 | =head2 How do I declare/create a structure? |
| 166 | |
| 167 | In general, you don't "declare" a structure. Just use a (probably |
| 168 | anonymous) hash reference. See L<perlref> and L<perldsc> for details. |
| 169 | Here's an example: |
| 170 | |
| 171 | $person = {}; # new anonymous hash |
| 172 | $person->{AGE} = 24; # set field AGE to 24 |
| 173 | $person->{NAME} = "Nat"; # set field NAME to "Nat" |
| 174 | |
| 175 | If you're looking for something a bit more rigorous, try L<perltoot>. |
| 176 | |
| 177 | =head2 How do I create a module? |
| 178 | |
| 179 | (contributed by brian d foy) |
| 180 | |
| 181 | L<perlmod>, L<perlmodlib>, L<perlmodstyle> explain modules |
| 182 | in all the gory details. L<perlnewmod> gives a brief |
| 183 | overview of the process along with a couple of suggestions |
| 184 | about style. |
| 185 | |
| 186 | If you need to include C code or C library interfaces in |
| 187 | your module, you'll need h2xs. h2xs will create the module |
| 188 | distribution structure and the initial interface files |
| 189 | you'll need. L<perlxs> and L<perlxstut> explain the details. |
| 190 | |
| 191 | If you don't need to use C code, other tools such as |
| 192 | ExtUtils::ModuleMaker and Module::Starter, can help you |
| 193 | create a skeleton module distribution. |
| 194 | |
| 195 | You may also want to see Sam Tregar's "Writing Perl Modules |
| 196 | for CPAN" ( http://apress.com/book/bookDisplay.html?bID=14 ) |
| 197 | which is the best hands-on guide to creating module |
| 198 | distributions. |
| 199 | |
| 200 | =head2 How do I create a class? |
| 201 | |
| 202 | See L<perltoot> for an introduction to classes and objects, as well as |
| 203 | L<perlobj> and L<perlbot>. |
| 204 | |
| 205 | =head2 How can I tell if a variable is tainted? |
| 206 | |
| 207 | You can use the tainted() function of the Scalar::Util module, available |
| 208 | from CPAN (or included with Perl since release 5.8.0). |
| 209 | See also L<perlsec/"Laundering and Detecting Tainted Data">. |
| 210 | |
| 211 | =head2 What's a closure? |
| 212 | |
| 213 | Closures are documented in L<perlref>. |
| 214 | |
| 215 | I<Closure> is a computer science term with a precise but |
| 216 | hard-to-explain meaning. Closures are implemented in Perl as anonymous |
| 217 | subroutines with lasting references to lexical variables outside their |
| 218 | own scopes. These lexicals magically refer to the variables that were |
| 219 | around when the subroutine was defined (deep binding). |
| 220 | |
| 221 | Closures make sense in any programming language where you can have the |
| 222 | return value of a function be itself a function, as you can in Perl. |
| 223 | Note that some languages provide anonymous functions but are not |
| 224 | capable of providing proper closures: older releases of the Python |
| 225 | language (before it gained nested scopes), for example. For more information on closures, check out any textbook on |
| 226 | functional programming. Scheme is a language that not only supports |
| 227 | but encourages closures. |
| 228 | |
| 229 | Here's a classic function-generating function: |
| 230 | |
| 231 | sub add_function_generator { |
| 232 | return sub { shift() + shift() }; |
| 233 | } |
| 234 | |
| 235 | $add_sub = add_function_generator(); |
| 236 | $sum = $add_sub->(4,5); # $sum is 9 now. |
| 237 | |
| 238 | The closure works as a I<function template> with some customization |
| 239 | slots left out to be filled later. The anonymous subroutine returned |
| 240 | by add_function_generator() isn't technically a closure because it |
| 241 | refers to no lexicals outside its own scope. |
| 242 | |
| 243 | Contrast this with the following make_adder() function, in which the |
| 244 | returned anonymous function contains a reference to a lexical variable |
| 245 | outside the scope of that function itself. Such a reference requires |
| 246 | that Perl return a proper closure, which keeps that particular lexical |
| 247 | variable (and the value it was given) alive for as long as the function exists. |
| 248 | |
| 249 | sub make_adder { |
| 250 | my $addpiece = shift; |
| 251 | return sub { shift() + $addpiece }; |
| 252 | } |
| 253 | |
| 254 | $f1 = make_adder(20); |
| 255 | $f2 = make_adder(555); |
| 256 | |
| 257 | Now C<&$f1($n)> is always 20 plus whatever $n you pass in, whereas |
| 258 | C<&$f2($n)> is always 555 plus whatever $n you pass in. The $addpiece |
| 259 | in the closure sticks around. |
| 260 | |
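Here is a runnable sketch of the same idea. make_adder() is the function shown above; make_counter() is a hypothetical addition that shows each closure keeps its own variable, not just a frozen value.

```perl
use strict;
use warnings;

sub make_adder {
    my $addpiece = shift;
    return sub { return shift() + $addpiece };
}

# make_counter() is a hypothetical helper: each call creates a fresh $n,
# and the returned closure keeps updating that same $n.
sub make_counter {
    my $n = shift;
    return sub { return $n++ };
}

my $f1 = make_adder(20);
my $f2 = make_adder(555);
my $c1 = make_counter(5);
my $c2 = make_counter(100);

print $f1->(10), " ", $f2->(10), "\n";
print $c1->(), " ", $c1->(), " ", $c2->(), "\n";
```

The two counters advance independently because each call to make_counter() creates a new lexical for its closure to capture.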
| 261 | Closures are often used for less esoteric purposes. For example, when |
| 262 | you want to pass in a bit of code into a function: |
| 263 | |
| 264 | my $line; |
| 265 | timeout( 30, sub { $line = <STDIN> } ); |
| 266 | |
| 267 | If the code to execute had been passed in as a string, |
| 268 | C<< '$line = <STDIN>' >>, there would have been no way for the |
| 269 | hypothetical timeout() function to access the lexical variable |
| 270 | $line back in its caller's scope. |
| 271 | |
| 272 | =head2 What is variable suicide and how can I prevent it? |
| 273 | |
| 274 | This problem was fixed in perl 5.004_05, so preventing it means upgrading |
| 275 | your version of perl. ;) |
| 276 | |
| 277 | Variable suicide is when you (temporarily or permanently) lose the value |
| 278 | of a variable. It is caused by scoping through my() and local() |
| 279 | interacting with either closures or aliased foreach() iterator variables |
| 280 | and subroutine arguments. It used to be easy to inadvertently lose a |
| 281 | variable's value this way, but now it's much harder. Take this code: |
| 282 | |
| 283 | my $f = 'foo'; |
| 284 | sub T { |
| 285 | while ($i++ < 3) { my $f = $f; $f .= "bar"; print $f, "\n" } |
| 286 | } |
| 287 | T; |
| 288 | print "Finally $f\n"; |
| 289 | |
| 290 | If you are experiencing variable suicide, that C<my $f> in the subroutine |
| 291 | doesn't pick up a fresh copy of the C<$f> whose value is C<foo>. The output |
| 292 | shows that inside the subroutine the value of C<$f> leaks through when it |
| 293 | shouldn't, as in this output: |
| 294 | |
| 295 | foobar |
| 296 | foobarbar |
| 297 | foobarbarbar |
| 298 | Finally foo |
| 299 | |
| 300 | The $f that has "bar" added to it three times should be a new C<$f>; |
| 301 | C<my $f> should create a new lexical variable each time through the loop. |
| 302 | The expected output is: |
| 303 | |
| 304 | foobar |
| 305 | foobar |
| 306 | foobar |
| 307 | Finally foo |
| 308 | |
| 309 | =head2 How can I pass/return a {Function, FileHandle, Array, Hash, Method, Regex}? |
| 310 | |
| 311 | With the exception of regexes, you need to pass references to these |
| 312 | objects. See L<perlsub/"Pass by Reference"> for this particular |
| 313 | question, and L<perlref> for information on references. |
| 314 | |
| 315 | See "Passing Regexes", below, for information on passing regular |
| 316 | expressions. |
| 317 | |
| 318 | =over 4 |
| 319 | |
| 320 | =item Passing Variables and Functions |
| 321 | |
| 322 | Regular variables and functions are quite easy to pass: just pass in a |
| 323 | reference to an existing or anonymous variable or function: |
| 324 | |
| 325 | func( \$some_scalar ); |
| 326 | |
| 327 | func( \@some_array ); |
| 328 | func( [ 1 .. 10 ] ); |
| 329 | |
| 330 | func( \%some_hash ); |
| 331 | func( { this => 10, that => 20 } ); |
| 332 | |
| 333 | func( \&some_func ); |
| 334 | func( sub { $_[0] ** $_[1] } ); |
| 335 | |
| 336 | =item Passing Filehandles |
| 337 | |
| 338 | As of Perl 5.6, you can represent filehandles with scalar variables |
| 339 | which you treat as any other scalar. |
| 340 | |
| 341 | open my $fh, $filename or die "Cannot open $filename! $!"; |
| 342 | func( $fh ); |
| 343 | |
| 344 | sub func { |
| 345 | my $passed_fh = shift; |
| 346 | |
| 347 | my $line = <$passed_fh>; |
| 348 | } |
| 349 | |
| 350 | Before Perl 5.6, you had to use the C<*FH> or C<\*FH> notations. |
| 351 | These are "typeglobs"--see L<perldata/"Typeglobs and Filehandles"> |
| 352 | and especially L<perlsub/"Pass by Reference"> for more information. |
| 353 | |
| 354 | =item Passing Regexes |
| 355 | |
| 356 | To pass regexes around, you'll need to be using a release of Perl |
| 357 | sufficiently recent as to support the C<qr//> construct, pass around |
| 358 | strings and use an exception-trapping eval, or else be very, very clever. |
| 359 | |
| 360 | Here's an example of how to pass in a string to be regex compared |
| 361 | using C<qr//>: |
| 362 | |
| 363 | sub compare($$) { |
| 364 | my ($val1, $regex) = @_; |
| 365 | my $retval = $val1 =~ /$regex/; |
| 366 | return $retval; |
| 367 | } |
| 368 | $match = compare("old McDonald", qr/d.*D/i); |
| 369 | |
| 370 | Notice how C<qr//> allows flags at the end. That pattern was compiled |
| 371 | at compile time, although it was executed later. The nifty C<qr//> |
| 372 | notation wasn't introduced until the 5.005 release. Before that, you |
| 373 | had to approach this problem much less intuitively. For example, here |
| 374 | it is again if you don't have C<qr//>: |
| 375 | |
| 376 | sub compare($$) { |
| 377 | my ($val1, $regex) = @_; |
| 378 | my $retval = eval { $val1 =~ /$regex/ }; |
| 379 | die if $@; |
| 380 | return $retval; |
| 381 | } |
| 382 | |
| 383 | $match = compare("old McDonald", q/(?i)d.*D/); |
| 384 | |
| 385 | Make sure you never say something like this: |
| 386 | |
| 387 | return eval "\$val =~ /$regex/"; # WRONG |
| 388 | |
| 389 | or someone can sneak shell escapes into the regex due to the double |
| 390 | interpolation of the eval and the double-quoted string. For example: |
| 391 | |
| 392 | $pattern_of_evil = 'danger ${ system("rm -rf * &") } danger'; |
| 393 | |
| 394 | eval "\$string =~ /$pattern_of_evil/"; |
| 395 | |
| 396 | Those preferring to be very, very clever might see the O'Reilly book, |
| 397 | I<Mastering Regular Expressions>, by Jeffrey Friedl. Page 273's |
| 398 | Build_MatchMany_Function() is particularly interesting. A complete |
| 399 | citation of this book is given in L<perlfaq2>. |
| 400 | |
| 401 | =item Passing Methods |
| 402 | |
| 403 | To pass an object method into a subroutine, you can do this: |
| 404 | |
| 405 | call_a_lot(10, $some_obj, "methname"); |
| 406 | sub call_a_lot { |
| 407 | my ($count, $widget, $trick) = @_; |
| 408 | for (my $i = 0; $i < $count; $i++) { |
| 409 | $widget->$trick(); |
| 410 | } |
| 411 | } |
| 412 | |
| 413 | Or, you can use a closure to bundle up the object, its |
| 414 | method call, and arguments: |
| 415 | |
| 416 | my $whatnot = sub { $some_obj->obfuscate(@args) }; |
| 417 | func($whatnot); |
| 418 | sub func { |
| 419 | my $code = shift; |
| 420 | &$code(); |
| 421 | } |
| 422 | |
| 423 | You could also investigate the can() method in the UNIVERSAL class |
| 424 | (part of the standard perl distribution). |
| 425 | |
| 426 | =back |
| 427 | |
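As a follow-up to the can() suggestion under "Passing Methods", here is a minimal sketch. Widget and its poke() method are hypothetical names invented for this example; can() itself comes from UNIVERSAL and returns a code reference when the method exists, or a false value when it does not.

```perl
use strict;
use warnings;

package Widget;    # hypothetical class for the demo
sub new  { my $class = shift; return bless { count => 0 }, $class }
sub poke { my $self = shift; return ++$self->{count} }

package main;

my $obj = Widget->new;

# can() returns a code reference for an existing method.
if (my $method = $obj->can('poke')) {
    $obj->$method() for 1 .. 3;
}

# For a method that does not exist, can() returns a false value.
my $missing = $obj->can('no_such_method') ? 'yes' : 'no';
print "count: $obj->{count}, has no_such_method: $missing\n";
```

The code reference can be passed to another subroutine and invoked there with C<< $obj->$method() >>.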
| 428 | =head2 How do I create a static variable? |
| 429 | |
| 430 | (contributed by brian d foy) |
| 431 | |
| 432 | Perl doesn't have "static" variables, which can only be accessed from |
| 433 | the function in which they are declared. You can get the same effect |
| 434 | with lexical variables, though. |
| 435 | |
| 436 | You can fake a static variable by using a lexical variable which goes |
| 437 | out of scope. In this example, you define the subroutine C<counter>, and |
| 438 | it uses the lexical variable C<$count>. Since you wrap this in a BEGIN |
| 439 | block, C<$count> is defined at compile-time, but also goes out of |
| 440 | scope at the end of the BEGIN block. The BEGIN block also ensures that |
| 441 | the subroutine and the value it uses are defined at compile-time so the |
| 442 | subroutine is ready to use just like any other subroutine, and you can |
| 443 | put this code in the same place as other subroutines in the program |
| 444 | text (i.e. at the end of the code, typically). The subroutine |
| 445 | C<counter> still has a reference to the data, and is the only way you |
| 446 | can access the value (and each time you do, you increment the value). |
| 447 | The data in the chunk of memory defined by C<$count> is private to |
| 448 | C<counter>. |
| 449 | |
| 450 | BEGIN { |
| 451 | my $count = 1; |
| 452 | sub counter { $count++ } |
| 453 | } |
| 454 | |
| 455 | my $start = counter(); |
| 456 | |
| 457 | .... # code that calls counter(); |
| 458 | |
| 459 | my $end = counter(); |
| 460 | |
| 461 | In the previous example, you created a function-private variable |
| 462 | because only one function remembered its reference. You could define |
| 463 | multiple functions while the variable is in scope, and each function |
| 464 | can share the "private" variable. It's not really "static" because you |
| 465 | can access it outside the function while the lexical variable is in |
| 466 | scope, and even create references to it. In this example, |
| 467 | C<increment_count> and C<return_count> share the variable. One |
| 468 | function adds to the value and the other simply returns the value. |
| 469 | They can both access C<$count>, and since it has gone out of scope, |
| 470 | there is no other way to access it. |
| 471 | |
| 472 | BEGIN { |
| 473 | my $count = 1; |
| 474 | sub increment_count { $count++ } |
| 475 | sub return_count { $count } |
| 476 | } |
| 477 | |
| 478 | To declare a file-private variable, you still use a lexical variable. |
| 479 | A file is also a scope, so a lexical variable defined in the file |
| 480 | cannot be seen from any other file. |
| 481 | |
| 482 | See L<perlsub/"Persistent Private Variables"> for more information. |
| 483 | The discussion of closures in L<perlref> may help you even though we |
| 484 | did not use anonymous subroutines in this answer. |
| 486 | |
| 487 | =head2 What's the difference between dynamic and lexical (static) scoping? Between local() and my()? |
| 488 | |
| 489 | C<local($x)> saves away the old value of the global variable C<$x> |
| 490 | and assigns a new value for the duration of the subroutine I<which is |
| 491 | visible in other functions called from that subroutine>. This is done |
| 492 | at run-time, so is called dynamic scoping. local() always affects global |
| 493 | variables, also called package variables or dynamic variables. |
| 494 | |
| 495 | C<my($x)> creates a new variable that is only visible in the current |
| 496 | subroutine. This is done at compile-time, so it is called lexical or |
| 497 | static scoping. my() always affects private variables, also called |
| 498 | lexical variables or (improperly) static(ly scoped) variables. |
| 499 | |
| 500 | For instance: |
| 501 | |
| 502 | sub visible { |
| 503 | print "var has value $var\n"; |
| 504 | } |
| 505 | |
| 506 | sub dynamic { |
| 507 | local $var = 'local'; # new temporary value for the still-global |
| 508 | visible(); # variable called $var |
| 509 | } |
| 510 | |
| 511 | sub lexical { |
| 512 | my $var = 'private'; # new private variable, $var |
| 513 | visible(); # (invisible outside of sub scope) |
| 514 | } |
| 515 | |
| 516 | $var = 'global'; |
| 517 | |
| 518 | visible(); # prints global |
| 519 | dynamic(); # prints local |
| 520 | lexical(); # prints global |
| 521 | |
| 522 | Notice how at no point does the value "private" get printed. That's |
| 523 | because $var only has that value within the block of the lexical() |
| 524 | function, and it is hidden from the called subroutine. |
| 525 | |
| 526 | In summary, local() doesn't make what you think of as private, local |
| 527 | variables. It gives a global variable a temporary value. my() is |
| 528 | what you're looking for if you want private variables. |
| 529 | |
| 530 | See L<perlsub/"Private Variables via my()"> and |
| 531 | L<perlsub/"Temporary Values via local()"> for excruciating details. |
| 532 | |
| 533 | =head2 How can I access a dynamic variable while a similarly named lexical is in scope? |
| 534 | |
| 535 | If you know your package, you can just mention it explicitly, as in |
| 536 | $Some_Pack::var. Note that the notation $::var is B<not> the dynamic $var |
| 537 | in the current package, but rather the one in the "main" package, as |
| 538 | though you had written $main::var. |
| 539 | |
| 540 | use vars '$var'; |
| 541 | local $var = "global"; |
| 542 | my $var = "lexical"; |
| 543 | |
| 544 | print "lexical is $var\n"; |
| 545 | print "global is $main::var\n"; |
| 546 | |
| 547 | Alternatively you can use the compiler directive our() to bring a |
| 548 | dynamic variable into the current lexical scope. |
| 549 | |
| 550 | require 5.006; # our() did not exist before 5.6 |
| 551 | use vars '$var'; |
| 552 | |
| 553 | local $var = "global"; |
| 554 | my $var = "lexical"; |
| 555 | |
| 556 | print "lexical is $var\n"; |
| 557 | |
| 558 | { |
| 559 | our $var; |
| 560 | print "global is $var\n"; |
| 561 | } |
| 562 | |
| 563 | =head2 What's the difference between deep and shallow binding? |
| 564 | |
| 565 | In deep binding, lexical variables mentioned in anonymous subroutines |
| 566 | are the same ones that were in scope when the subroutine was created. |
| 567 | In shallow binding, they are whichever variables with the same names |
| 568 | happen to be in scope when the subroutine is called. Perl always uses |
| 569 | deep binding of lexical variables (i.e., those created with my()). |
| 570 | However, dynamic variables (aka global, local, or package variables) |
| 571 | are effectively shallowly bound. Consider this just one more reason |
| 572 | not to use them. See the answer to L<"What's a closure?">. |
| 573 | |
| 574 | =head2 Why doesn't "my($foo) = E<lt>FILEE<gt>;" work right? |
| 575 | |
| 576 | C<my()> and C<local()> give list context to the right hand side |
| 577 | of C<=>. The C<< <FH> >> read operation, like so many of Perl's |
| 578 | functions and operators, can tell which context it was called in and |
| 579 | behaves appropriately. In general, the scalar() function can help. |
| 580 | This function does nothing to the data itself (contrary to popular myth) |
| 581 | but rather tells its argument to behave in whatever its scalar fashion is. |
| 582 | If that function doesn't have a defined scalar behavior, this of course |
| 583 | doesn't help you (such as with sort()). |
| 584 | |
| 585 | To enforce scalar context in this particular case, however, you need |
| 586 | merely omit the parentheses: |
| 587 | |
| 588 | local($foo) = <FILE>; # WRONG |
| 589 | local($foo) = scalar(<FILE>); # ok |
| 590 | local $foo = <FILE>; # right |
| 591 | |
| 592 | You should probably be using lexical variables anyway, although the |
| 593 | issue is the same here: |
| 594 | |
| 595 | my($foo) = <FILE>; # WRONG |
| 596 | my $foo = <FILE>; # right |
| 597 | |
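The cost of the list-context form is easy to miss, because the first line still arrives correctly. This sketch assumes Perl 5.8 or later for the in-memory filehandle, with illustrative variable names. It shows that the remaining lines are read and thrown away.

```perl
use strict;
use warnings;

my $data = "line 1\nline 2\nline 3\n";

open(my $fh, '<', \$data) or die "Cannot open in-memory file: $!";
my ($first) = <$fh>;    # list context: every line is read, only one is kept
my $after   = <$fh>;    # the handle is already exhausted
close($fh);

open($fh, '<', \$data) or die "Cannot reopen in-memory file: $!";
my $one = <$fh>;        # scalar context: exactly one line is read
my $two = <$fh>;        # the next line is still available
close($fh);

print defined $after ? "still data\n" : "nothing left after my(\$first)\n";
print "scalar reads: $one$two";
```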
| 598 | =head2 How do I redefine a builtin function, operator, or method? |
| 599 | |
| 600 | Why do you want to do that? :-) |
| 601 | |
| 602 | If you want to override a predefined function, such as open(), |
| 603 | then you'll have to import the new definition from a different |
| 604 | module. See L<perlsub/"Overriding Built-in Functions">. There's |
| 605 | also an example in L<perltoot/"Class::Template">. |
| 606 | |
| 607 | If you want to overload a Perl operator, such as C<+> or C<**>, |
| 608 | then you'll want to use the C<use overload> pragma, documented |
| 609 | in L<overload>. |
| 610 | |
| 611 | If you're talking about obscuring method calls in parent classes, |
| 612 | see L<perltoot/"Overridden Methods">. |
| 613 | |
| 614 | =head2 What's the difference between calling a function as &foo and foo()? |
| 615 | |
| 616 | When you call a function as C<&foo>, you allow that function access to |
| 617 | your current @_ values, and you bypass prototypes. |
| 618 | The function doesn't get an empty @_--it gets yours! While not |
| 619 | strictly speaking a bug (it's documented that way in L<perlsub>), it |
| 620 | would be hard to consider this a feature in most cases. |
| 621 | |
| 622 | When you call your function as C<&foo()>, then you I<do> get a new @_, |
| 623 | but prototyping is still circumvented. |
| 624 | |
| 625 | Normally, you want to call a function using C<foo()>. You may only |
| 626 | omit the parentheses if the function is already known to the compiler |
| 627 | because it already saw the definition (C<use> but not C<require>), |
| 628 | or via a forward reference or C<use subs> declaration. Even in this |
| 629 | case, you get a clean @_ without any of the old values leaking through |
| 630 | where they don't belong. |
| 631 | |
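A short sketch makes the difference in @_ handling visible. count_args() and caller_sub() are hypothetical names used only for this illustration.

```perl
use strict;
use warnings;

sub count_args { return scalar @_ }

sub caller_sub {
    my $shared = &count_args;      # no parentheses: callee sees the caller's @_
    my $fresh  = &count_args();    # parentheses: a new, empty @_
    my $normal = count_args();     # the usual call: a new, empty @_
    return ($shared, $fresh, $normal);
}

my @seen = caller_sub('a', 'b', 'c');
print "@seen\n";
```

Only the bare C<&count_args> form sees the three arguments that were passed to caller_sub().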
| 632 | =head2 How do I create a switch or case statement? |
| 633 | |
| 634 | This is explained in more depth in L<perlsyn>. Briefly, there's |
| 635 | no official case statement, because of the variety of tests possible |
| 636 | in Perl (numeric comparison, string comparison, glob comparison, |
| 637 | regex matching, overloaded comparisons, ...). |
| 638 | Larry couldn't decide how best to do this, so he left it out, even |
| 639 | though it's been on the wish list since perl1. |
| 640 | |
| 641 | Starting with Perl 5.8, you can get switch and case by using the |
| 642 | Switch extension and saying: |
| 643 | |
| 644 | use Switch; |
| 645 | |
| 646 | after which one has switch and case. It is not as fast as it could be |
| 647 | because it's not really part of the language (it's done using source |
| 648 | filters) but it is available, and it's very flexible. |
| 649 | |
| 650 | But if one wants to use pure Perl, the general answer is to write a |
| 651 | construct like this: |
| 652 | |
| 653 | for ($variable_to_test) { |
| 654 | if (/pat1/) { } # do something |
| 655 | elsif (/pat2/) { } # do something else |
| 656 | elsif (/pat3/) { } # do something else |
| 657 | else { } # default |
| 658 | } |
| 659 | |
| 660 | Here's a simple example of a switch based on pattern matching, this |
| 661 | time lined up in a way to make it look more like a switch statement. |
| 662 | We'll do a multiway conditional based on the type of reference stored |
| 663 | in $whatchamacallit: |
| 664 | |
| 665 | SWITCH: for (ref $whatchamacallit) { |
| 666 | |
| 667 | /^$/ && die "not a reference"; |
| 668 | |
| 669 | /SCALAR/ && do { |
| 670 | print_scalar($$whatchamacallit); |
| 671 | last SWITCH; |
| 672 | }; |
| 673 | |
| 674 | /ARRAY/ && do { |
| 675 | print_array(@$whatchamacallit); |
| 676 | last SWITCH; |
| 677 | }; |
| 678 | |
| 679 | /HASH/ && do { |
| 680 | print_hash(%$whatchamacallit); |
| 681 | last SWITCH; |
| 682 | }; |
| 683 | |
| 684 | /CODE/ && do { |
| 685 | warn "can't print function ref"; |
| 686 | last SWITCH; |
| 687 | }; |
| 688 | |
| 689 | # DEFAULT |
| 690 | |
| 691 | warn "User defined type skipped"; |
| 692 | |
| 693 | } |
| 694 | |
| 695 | See L<perlsyn/"Basic BLOCKs and Switch Statements"> for many other |
| 696 | examples in this style. |
| 697 | |
| 698 | Sometimes you should change the positions of the constant and the variable. |
| 699 | For example, let's say you wanted to test which of many answers you were |
| 700 | given, but in a case-insensitive way that also allows abbreviations. |
| 701 | You can use the following technique if the strings all start with |
| 702 | different characters or if you want to arrange the matches so that |
| 703 | one takes precedence over another, as C<"SEND"> has precedence over |
| 704 | C<"STOP"> here: |
| 705 | |
| 706 | chomp($answer = <>); |
| 707 | if ("SEND" =~ /^\Q$answer/i) { print "Action is send\n" } |
| 708 | elsif ("STOP" =~ /^\Q$answer/i) { print "Action is stop\n" } |
| 709 | elsif ("ABORT" =~ /^\Q$answer/i) { print "Action is abort\n" } |
| 710 | elsif ("LIST" =~ /^\Q$answer/i) { print "Action is list\n" } |
| 711 | elsif ("EDIT" =~ /^\Q$answer/i) { print "Action is edit\n" } |
| 712 | |
| 713 | A totally different approach is to create a hash of function references. |
| 714 | |
| 715 | my %commands = ( |
| 716 | "happy" => \&joy, |
| 717 | "sad" => \&sullen, |
| 718 | "done" => sub { die "See ya!" }, |
| 719 | "mad" => \&angry, |
| 720 | ); |
| 721 | |
| 722 | print "How are you? "; |
| 723 | chomp($string = <STDIN>); |
| 724 | if ($commands{$string}) { |
| 725 | $commands{$string}->(); |
| 726 | } else { |
| 727 | print "No such command: $string\n"; |
| 728 | } |
| 729 | |
| 730 | =head2 How can I catch accesses to undefined variables, functions, or methods? |
| 731 | |
| 732 | The AUTOLOAD method, discussed in L<perlsub/"Autoloading"> and |
| 733 | L<perltoot/"AUTOLOAD: Proxy Methods">, lets you capture calls to |
| 734 | undefined functions and methods. |
| 735 | |
| 736 | When it comes to undefined variables that would trigger a warning |
| 737 | under C<use warnings>, you can promote the warning to an error. |
| 738 | |
| 739 | use warnings FATAL => qw(uninitialized); |
| 740 | |
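Both techniques can be sketched in a few lines. The Lazy package and the frobnicate() call are hypothetical names chosen for this example; C<$AUTOLOAD> and the C<FATAL> warnings syntax are standard Perl.

```perl
use strict;
use warnings;

package Lazy;    # hypothetical class with no real methods besides new()
our $AUTOLOAD;

sub new { my $class = shift; return bless {}, $class }

sub AUTOLOAD {
    my $self = shift;
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;               # strip the package qualifier
    return if $name eq 'DESTROY';    # do not intercept object destruction
    return "caught call to $name(@_)";
}

package main;

my $obj = Lazy->new;
print $obj->frobnicate(1, 2), "\n";

# Promote the "uninitialized" warning to an error and trap it with eval.
my $err = '';
{
    use warnings FATAL => qw(uninitialized);
    my $undef;
    eval { my $x = $undef + 1; 1 } or $err = $@;
}
print "fatal warning: $err";
```

Skipping C<DESTROY> in AUTOLOAD avoids treating object destruction as an ordinary missing method.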
| 741 | =head2 Why can't a method included in this same file be found? |
| 742 | |
| 743 | Some possible reasons: your inheritance is getting confused, you've |
| 744 | misspelled the method name, or the object is of the wrong type. Check |
| 745 | out L<perltoot> for details about any of the above cases. You may |
| 746 | also use C<print ref($object)> to find out the class C<$object> was |
| 747 | blessed into. |
| 748 | |
| 749 | Another possible reason for problems is that you've used the |
| 750 | indirect object syntax (e.g., C<find Guru "Samy">) on a class name |
| 751 | before Perl has seen that such a package exists. It's wisest to make |
| 752 | sure your packages are all defined before you start using them, which |
| 753 | will be taken care of if you use the C<use> statement instead of |
| 754 | C<require>. If not, make sure to use arrow notation (eg., |
| 755 | C<< Guru->find("Samy") >>) instead. Object notation is explained in |
| 756 | L<perlobj>. |
| 757 | |
| 758 | Make sure to read about creating modules in L<perlmod> and |
| 759 | the perils of indirect objects in L<perlobj/"Method Invocation">. |
| 760 | |
| 761 | =head2 How can I find out my current package? |
| 762 | |
| 763 | If you're just a random program, you can do this to find |
| 764 | out what the currently compiled package is: |
| 765 | |
| 766 | my $packname = __PACKAGE__; |
| 767 | |
| 768 | But, if you're a method and you want to print an error message |
| 769 | that includes the kind of object you were called on (which is |
| 770 | not necessarily the same as the one in which you were compiled), ask the invocant: |
| 771 | |
| 772 | sub amethod { |
| 773 |     my $self = shift; |
| 774 |     my $class = ref($self) || $self; |
| 775 |     warn "called me from a $class object"; |
| 776 | } |
| 777 | |
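To see the difference between the two answers, here is a small sketch with an inherited method. The C<Animal> and C<Dog> classes are hypothetical and exist only for this example.

```perl
use strict;
use warnings;

{
    package Animal;     # hypothetical base class

    sub new { my $class = shift; return bless {}, $class }

    sub describe {
        my $self  = shift;
        my $class = ref($self) || $self;
        # __PACKAGE__ is fixed at compile time; $class depends on the caller.
        return "compiled in " . __PACKAGE__ . ", called on $class";
    }
}

{
    package Dog;        # hypothetical subclass that inherits describe()
    our @ISA = ('Animal');
}

print Dog->new->describe, "\n";
```

The method is compiled in C<Animal>, so C<__PACKAGE__> reports C<Animal>, while the object it was called on reports C<Dog>.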
| 778 | =head2 How can I comment out a large block of perl code? |
| 779 | |
| 780 | You can use embedded POD to discard it. Enclose the blocks you want |
| 781 | to comment out in POD markers. The C<=begin> directive marks a section |
| 782 | for a specific formatter. Use the C<comment> format, which no formatter |
| 783 | should claim to understand (by policy). Mark the end of the block |
| 784 | with C<=end>, then use C<=cut> to return to code. |
| 785 | |
| 786 | # program is here |
| 787 | |
| 788 | =begin comment |
| 789 | |
| 790 | all of this stuff |
| 791 | |
| 792 | here will be ignored |
| 793 | by everyone |
| 794 | |
| 795 | =end comment |
| 796 | |
| 797 | =cut |
| 798 | |
| 799 | # program continues |
| 800 | |
| 801 | The pod directives cannot go just anywhere. You must put a |
| 802 | pod directive where the parser is expecting a new statement, |
| 803 | not just in the middle of an expression or some other |
| 804 | arbitrary grammar production. |
| 805 | |
| 806 | See L<perlpod> for more details. |
| 807 | |
| 808 | =head2 How do I clear a package? |
| 809 | |
| 810 | Use this code, provided by Mark-Jason Dominus: |
| 811 | |
| 812 | sub scrub_package { |
| 813 |     no strict 'refs'; |
| 814 |     my $pack = shift; |
| 815 |     die "Shouldn't delete main package" |
| 816 |         if $pack eq "" || $pack eq "main"; |
| 817 |     my $stash = *{$pack . '::'}{HASH}; |
| 818 |     my $name; |
| 819 |     foreach $name (keys %$stash) { |
| 820 |         my $fullname = $pack . '::' . $name; |
| 821 |         # Get rid of everything with that name. |
| 822 |         undef $$fullname; |
| 823 |         undef @$fullname; |
| 824 |         undef %$fullname; |
| 825 |         undef &$fullname; |
| 826 |         undef *$fullname; |
| 827 |     } |
| 828 | } |
| 829 | |
| 830 | Or, if you're using a recent release of Perl, you can |
| 831 | just use the Symbol::delete_package() function instead. |
| 832 | |
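A short sketch of the C<Symbol::delete_package()> route follows. The C<Scratch> package is a throwaway name chosen for the example, and the check simply looks for the package's entry in the main symbol table.

```perl
use strict;
use warnings;
use Symbol qw(delete_package);

{
    package Scratch;    # hypothetical package to be cleared
    our $counter = 42;
    sub hello { return "hi" }
}

# A package shows up as a key ending in '::' in its parent's symbol table.
print "before: ", (exists $main::{'Scratch::'} ? "present" : "absent"), "\n";

delete_package('Scratch');

print "after:  ", (exists $main::{'Scratch::'} ? "present" : "absent"), "\n";
```

Code that was already compiled with references to the package's symbols may still hold on to them, so treat this as a cleanup tool rather than a guarantee.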
| 833 | =head2 How can I use a variable as a variable name? |
| 834 | |
| 835 | Beginners often think they want to have a variable contain the name |
| 836 | of a variable. |
| 837 | |
| 838 | $fred = 23; |
| 839 | $varname = "fred"; |
| 840 | ++$$varname; # $fred now 24 |
| 841 | |
| 842 | This works I<sometimes>, but it is a very bad idea for two reasons. |
| 843 | |
| 844 | The first reason is that this technique I<only works on global |
| 845 | variables>. That means that if $fred is a lexical variable created |
| 846 | with my() in the above example, the code wouldn't work at all: you'd |
| 847 | accidentally access the global and skip right over the private lexical |
| 848 | altogether. Global variables are bad because they can easily collide |
| 849 | accidentally and in general make for non-scalable and confusing code. |
| 850 | |
| 851 | Symbolic references are forbidden under the C<use strict> pragma. |
| 852 | They are not true references and consequently are not reference counted |
| 853 | or garbage collected. |
| 854 | |
| 855 | The other reason why using a variable to hold the name of another |
| 856 | variable is a bad idea is that the question often stems from a lack of |
| 857 | understanding of Perl data structures, particularly hashes. By using |
| 858 | symbolic references, you are just using the package's symbol-table hash |
| 859 | (like C<%main::>) instead of a user-defined hash. The solution is to |
| 860 | use your own hash or a real reference instead. |
| 861 | |
| 862 | $USER_VARS{"fred"} = 23; |
| 863 | $varname = "fred"; |
| 864 | $USER_VARS{$varname}++; # not $$varname++ |
| 865 | |
| 866 | There we're using the %USER_VARS hash instead of symbolic references. |
| 867 | Sometimes this comes up in reading strings from the user with variable |
| 868 | references and wanting to expand them to the values of your perl |
| 869 | program's variables. This is also a bad idea because it conflates the |
| 870 | program-addressable namespace and the user-addressable one. Instead of |
| 871 | reading a string and expanding it to the actual contents of your program's |
| 872 | own variables: |
| 873 | |
| 874 | $str = 'this has a $fred and $barney in it'; |
| 875 | $str =~ s/(\$\w+)/$1/eeg; # need double eval |
| 876 | |
| 877 | it would be better to keep a hash around like %USER_VARS and have |
| 878 | variable references actually refer to entries in that hash: |
| 879 | |
| 880 | $str =~ s/\$(\w+)/$USER_VARS{$1}/g; # no /e here at all |
| 881 | |
| 882 | That's faster, cleaner, and safer than the previous approach. Of course, |
| 883 | you don't need to use a dollar sign. You could use your own scheme to |
| 884 | make it less confusing, like bracketed percent symbols, etc. |
| 885 | |
| 886 | $str = 'this has a %fred% and %barney% in it'; |
| 887 | $str =~ s/%(\w+)%/$USER_VARS{$1}/g; # no /e here at all |
| 888 | |
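Here is one way the C<%USER_VARS> approach could look as a complete program. The values and the C<%dino%> placeholder are invented for the example, and restricting the pattern to known keys is just one possible policy for unknown names.

```perl
use strict;
use warnings;

my %USER_VARS = (fred => 'Flintstone', barney => 'Rubble');
my $str = 'this has a %fred% and %barney% and %dino% in it';

# Build an alternation of the known names, so that unknown
# placeholders such as %dino% are left alone instead of vanishing.
my $names = join '|', map { quotemeta($_) } keys %USER_VARS;

$str =~ s/%($names)%/$USER_VARS{$1}/g;    # still no /e needed

print "$str\n";
```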
| 889 | Another reason that folks sometimes think they want a variable to |
| 890 | contain the name of a variable is that they don't know how to build |
| 891 | proper data structures using hashes. For example, let's say they |
| 892 | wanted two hashes in their program: %fred and %barney, and that they |
| 893 | wanted to use another scalar variable to refer to those by name. |
| 894 | |
| 895 | $name = "fred"; |
| 896 | $$name{WIFE} = "wilma"; # set %fred |
| 897 | |
| 898 | $name = "barney"; |
| 899 | $$name{WIFE} = "betty"; # set %barney |
| 900 | |
| 901 | This is still a symbolic reference, and is still saddled with the |
| 902 | problems enumerated above. It would be far better to write: |
| 903 | |
| 904 | $folks{"fred"}{WIFE} = "wilma"; |
| 905 | $folks{"barney"}{WIFE} = "betty"; |
| 906 | |
| 907 | And just use a multilevel hash to start with. |
| 908 | |
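As a sketch of how the multilevel hash replaces the "variable holding a name" idea, the name becomes an ordinary hash key and a hard reference gives you a handle on one record. The C<PET> field is made up for the example.

```perl
use strict;
use warnings;

my %folks = (
    fred   => { WIFE => 'wilma' },
    barney => { WIFE => 'betty' },
);

my $name   = 'fred';
my $person = $folks{$name};    # a hard reference to fred's record
$person->{PET} = 'dino';       # updates %folks through the reference

for my $who (sort keys %folks) {
    print "$who is married to $folks{$who}{WIFE}\n";
}
```

All of this works with lexical variables and under C<use strict>, which the symbolic version cannot do.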
| 909 | The only times that you absolutely I<must> use symbolic references are |
| 910 | when you really must refer to the symbol table. This may be because it's |
| 911 | something that you can't take a real reference to, such as a format name. |
| 912 | Doing so may also be important for method calls, since these always go |
| 913 | through the symbol table for resolution. |
| 914 | |
| 915 | In those cases, you would turn off C<strict 'refs'> temporarily so you |
| 916 | can play around with the symbol table. For example: |
| 917 | |
| 918 | @colors = qw(red blue green yellow orange purple violet); |
| 919 | for my $name (@colors) { |
| 920 |     no strict 'refs'; # renege for the block |
| 921 |     *$name = sub { "<FONT COLOR='$name'>@_</FONT>" }; |
| 922 | } |
| 923 | |
| 924 | All those functions (red(), blue(), green(), etc.) appear to be separate, |
| 925 | but the real code in the closure actually was compiled only once. |
| 926 | |
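A complete version of that idea, including calls to the generated functions, might look like the following sketch. It assumes the code runs in package C<main> and uses a shorter color list than the text.

```perl
use strict;
use warnings;

my @colors = qw(red blue green);

for my $name (@colors) {
    no strict 'refs';    # allow the symbolic glob assignment in this block only
    # Each closure captures its own copy of $name.
    *{$name} = sub { "<FONT COLOR='$name'>@_</FONT>" };
}

# The parentheses let these calls compile before the subs exist.
print red('stop'), "\n";
print green('go', 'now'), "\n";
```

Limiting C<no strict 'refs'> to the loop body keeps the rest of the program under full strictures.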
| 927 | So, sometimes you might want to use symbolic references to directly |
| 928 | manipulate the symbol table. This doesn't matter for formats, handles, and |
| 929 | subroutines, because they are always global--you can't use my() on them. |
| 930 | For scalars, arrays, and hashes, though--and usually for subroutines-- |
| 931 | you probably only want to use hard references. |
| 932 | |
| 933 | =head2 What does "bad interpreter" mean? |
| 934 | |
| 935 | (contributed by brian d foy) |
| 936 | |
| 937 | The "bad interpreter" message comes from the shell, not perl. The |
| 938 | actual message may vary depending on your platform, shell, and locale |
| 939 | settings. |
| 940 | |
| 941 | If you see "bad interpreter - no such file or directory", the first |
| 942 | line in your perl script (the "shebang" line) does not contain the |
| 943 | right path to perl (or any other program capable of running scripts). |
| 944 | Sometimes this happens when you move the script from one machine to |
| 945 | another and each machine has a different path to perl---/usr/bin/perl |
| 946 | versus /usr/local/bin/perl for instance. It may also indicate |
| 947 | that the source machine has CRLF line terminators and the |
| 948 | destination machine has LF only: the shell tries to find |
| 949 | /usr/bin/perl<CR>, but can't. |
| 950 | |
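If you want to check a shebang line from Perl itself, a rough sketch could look like this. The C<check_shebang> helper and its messages are hypothetical, and the file tests only make sense on a system with Unix-style paths.

```perl
use strict;
use warnings;

# Inspect the first line of a script and describe any obvious problem.
sub check_shebang {
    my ($line) = @_;
    return "no shebang line" unless $line =~ /^#!\s*(\S+)/;
    my $interp = $1;    # \S+ stops before any trailing carriage return
    return "carriage return in shebang line" if $line =~ /\r/;
    return "interpreter $interp not found" unless -e $interp;
    return "interpreter $interp not executable" unless -x $interp;
    return "ok";
}

print check_shebang("#!/usr/bin/perl\r\n"), "\n";
print check_shebang("print 1;\n"), "\n";
```

To check a real script, read its first line from the file and pass that line to the helper.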
| 951 | If you see "bad interpreter: Permission denied", you need to make your |
| 952 | script executable. |
| 953 | |
| 954 | In either case, you should still be able to run the scripts with perl |
| 955 | explicitly: |
| 956 | |
| 957 | % perl script.pl |
| 958 | |
| 959 | If you get a message like "perl: command not found", perl is not in |
| 960 | your PATH. That may also mean perl is not installed where you expect, |
| 961 | in which case you need to adjust your shebang line. |
| 962 | |
| 963 | =head1 AUTHOR AND COPYRIGHT |
| 964 | |
| 965 | Copyright (c) 1997-2006 Tom Christiansen, Nathan Torkington, and |
| 966 | other authors as noted. All rights reserved. |
| 967 | |
| 968 | This documentation is free; you can redistribute it and/or modify it |
| 969 | under the same terms as Perl itself. |
| 970 | |
| 971 | Irrespective of its distribution, all code examples in this file |
| 972 | are hereby placed into the public domain. You are permitted and |
| 973 | encouraged to use this code in your own programs for fun |
| 974 | or for profit as you see fit. A simple comment in the code giving |
| 975 | credit would be courteous but is not required. |
| 976 | |