Commit | Line | Data |
---|---|---|
86530b38 AT |
1 | =head1 NAME |
2 | ||
3 | perltie - how to hide an object class in a simple variable | |
4 | ||
5 | =head1 SYNOPSIS | |
6 | ||
7 | tie VARIABLE, CLASSNAME, LIST | |
8 | ||
9 | $object = tied VARIABLE | |
10 | ||
11 | untie VARIABLE | |
12 | ||
13 | =head1 DESCRIPTION | |
14 | ||
15 | Prior to release 5.0 of Perl, a programmer could use dbmopen() | |
16 | to connect an on-disk database in the standard Unix dbm(3x) | |
17 | format magically to a %HASH in their program. However, their Perl was either | |
18 | built with one particular dbm library or another, but not both, and | |
19 | you couldn't extend this mechanism to other packages or types of variables. | |
20 | ||
21 | Now you can. | |
22 | ||
23 | The tie() function binds a variable to a class (package) that will provide | |
24 | the implementation for access methods for that variable. Once this magic | |
25 | has been performed, accessing a tied variable automatically triggers | |
26 | method calls in the proper class. The complexity of the class is | |
27 | hidden behind magic method calls. The method names are in ALL CAPS, | |
28 | which is a convention that Perl uses to indicate that they're called | |
29 | implicitly rather than explicitly--just like the BEGIN() and END() | |
30 | functions. | |
31 | ||
32 | In the tie() call, C<VARIABLE> is the name of the variable to be | |
33 | enchanted. C<CLASSNAME> is the name of a class implementing objects of | |
34 | the correct type. Any additional arguments in the C<LIST> are passed to | |
35 | the appropriate constructor method for that class--meaning TIESCALAR(), | |
36 | TIEARRAY(), TIEHASH(), or TIEHANDLE(). (Typically these are arguments | |
37 | such as might be passed to the dbminit() function of C.) The object | |
38 | returned by that constructor is also returned by the tie() function, | |
39 | which would be useful if you wanted to access other methods in | |
40 | C<CLASSNAME>. (You don't actually have to return a reference to the right | |
41 | "type" (e.g., HASH or C<CLASSNAME>) so long as it's a properly blessed | |
42 | object.) You can also retrieve a reference to the underlying object | |
43 | using the tied() function. | |
44 | ||
45 | Unlike dbmopen(), the tie() function will not C<use> or C<require> a module | |
46 | for you--you need to do that explicitly yourself. | |
47 | ||
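As a minimal sketch of how tie(), tied(), and untie() relate, the following uses a hypothetical class named Counter (made up for this illustration, not part of Perl) whose value goes up by one on every read. It shows that tie() returns the object built by the constructor, that tied() hands back that same object, and that untie() turns the variable back into an ordinary scalar.

```perl
use strict;
use warnings;

package Counter;    # hypothetical class, for illustration only

sub TIESCALAR {
    my ($class, $start) = @_;
    my $count = defined $start ? $start : 0;
    return bless \$count, $class;    # the object is a blessed SCALAR ref
}

sub FETCH {
    my $self = shift;
    return $$self++;                 # return the count, then bump it
}

sub STORE {
    my ($self, $value) = @_;
    $$self = $value;                 # an assignment resets the counter
}

package main;

my $n;
my $object = tie $n, 'Counter', 5;   # tie() returns the new object

my $first  = $n;                     # FETCH: 5
my $second = $n;                     # FETCH: 6
$n = 100;                            # STORE
my $after_store = $n;                # FETCH: 100

# tied() returns the very same object that tie() returned.
my $same = ($object == tied $n) ? 1 : 0;

undef $object;                       # drop our extra reference first
untie $n;                            # $n is an ordinary scalar again
$n = 'plain';

print "$first $second $after_store $same $n\n";
```

The extra reference in $object is released before untie() because a surviving reference to the tie object keeps that object alive after the variable is untied.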
48 | =head2 Tying Scalars | |
49 | ||
50 | A class implementing a tied scalar should define the following methods: | |
51 | TIESCALAR, FETCH, STORE, and possibly UNTIE and/or DESTROY. | |
52 | ||
53 | Let's look at each in turn, using as an example a tie class for | |
54 | scalars that allows the user to do something like: | |
55 | ||
56 | tie $his_speed, 'Nice', getppid(); | |
57 | tie $my_speed, 'Nice', $$; | |
58 | ||
59 | And now whenever either of those variables is accessed, its current | |
60 | system priority is retrieved and returned. If those variables are set, | |
61 | then the process's priority is changed! | |
62 | ||
63 | We'll use Jarkko Hietaniemi <F<jhi@iki.fi>>'s BSD::Resource class (not | |
64 | included) to access the PRIO_PROCESS, PRIO_MIN, and PRIO_MAX constants | |
65 | from your system, as well as the getpriority() and setpriority() system | |
66 | calls. Here's the preamble of the class. | |
67 | ||
68 | package Nice; | |
69 | use Carp; | |
70 | use BSD::Resource; | |
71 | use strict; | |
72 | $Nice::DEBUG = 0 unless defined $Nice::DEBUG; | |
73 | ||
74 | =over 4 | |
75 | ||
76 | =item TIESCALAR classname, LIST | |
77 | ||
78 | This is the constructor for the class. That means it is | |
79 | expected to return a blessed reference to a new scalar | |
80 | (probably anonymous) that it's creating. For example: | |
81 | ||
82 | sub TIESCALAR { | |
83 | my $class = shift; | |
84 | my $pid = shift || $$; # 0 means me | |
85 | ||
86 | if ($pid !~ /^\d+$/) { | |
87 | carp "Nice::Tie::Scalar got non-numeric pid $pid" if $^W; | |
88 | return undef; | |
89 | } | |
90 | ||
91 | unless (kill 0, $pid) { # EPERM or ESRCH, no doubt | |
92 | carp "Nice::Tie::Scalar got bad pid $pid: $!" if $^W; | |
93 | return undef; | |
94 | } | |
95 | ||
96 | return bless \$pid, $class; | |
97 | } | |
98 | ||
99 | This tie class has chosen to return an error rather than raising an | |
100 | exception if its constructor should fail. While this is how dbmopen() works, | |
101 | other classes may well not wish to be so forgiving. It checks the global | |
102 | variable C<$^W> to see whether to emit a bit of noise anyway. | |
103 | ||
104 | =item FETCH this | |
105 | ||
106 | This method will be triggered every time the tied variable is accessed | |
107 | (read). It takes no arguments beyond its self reference, which is the | |
108 | object representing the scalar we're dealing with. Because in this case | |
109 | we're using just a SCALAR ref for the tied scalar object, a simple $$self | |
110 | allows the method to get at the real value stored there. In our example | |
111 | below, that real value is the process ID to which we've tied our variable. | |
112 | ||
113 | sub FETCH { | |
114 | my $self = shift; | |
115 | confess "wrong type" unless ref $self; | |
116 | croak "usage error" if @_; | |
117 | my $nicety; | |
118 | local($!) = 0; | |
119 | $nicety = getpriority(PRIO_PROCESS, $$self); | |
120 | if ($!) { croak "getpriority failed: $!" } | |
121 | return $nicety; | |
122 | } | |
123 | ||
124 | This time we've decided to blow up (raise an exception) if getpriority() | |
125 | fails--there's no place for us to return an error otherwise, and it's | |
126 | probably the right thing to do. | |
127 | ||
128 | =item STORE this, value | |
129 | ||
130 | This method will be triggered every time the tied variable is set | |
131 | (assigned). Beyond its self reference, it also expects one (and only one) | |
132 | argument--the new value the user is trying to assign. | |
133 | ||
134 | sub STORE { | |
135 | my $self = shift; | |
136 | confess "wrong type" unless ref $self; | |
137 | my $new_nicety = shift; | |
138 | croak "usage error" if @_; | |
139 | ||
140 | if ($new_nicety < PRIO_MIN) { | |
141 | carp sprintf | |
142 | "WARNING: priority %d less than minimum system priority %d", | |
143 | $new_nicety, PRIO_MIN if $^W; | |
144 | $new_nicety = PRIO_MIN; | |
145 | } | |
146 | ||
147 | if ($new_nicety > PRIO_MAX) { | |
148 | carp sprintf | |
149 | "WARNING: priority %d greater than maximum system priority %d", | |
150 | $new_nicety, PRIO_MAX if $^W; | |
151 | $new_nicety = PRIO_MAX; | |
152 | } | |
153 | ||
154 | unless (defined setpriority(PRIO_PROCESS, $$self, $new_nicety)) { | |
155 | confess "setpriority failed: $!"; | |
156 | } | |
157 | return $new_nicety; | |
158 | } | |
159 | ||
160 | =item UNTIE this | |
161 | ||
162 | This method will be triggered when the C<untie> occurs. This can be useful | |
163 | if the class needs to know when no further calls will be made. (Except DESTROY | |
164 | of course.) See L<The C<untie> Gotcha> below for more details. | |
165 | ||
166 | =item DESTROY this | |
167 | ||
168 | This method will be triggered when the tied variable needs to be destructed. | |
169 | As with other object classes, such a method is seldom necessary, because Perl | |
170 | deallocates its moribund object's memory for you automatically--this isn't | |
171 | C++, you know. We'll use a DESTROY method here for debugging purposes only. | |
172 | ||
173 | sub DESTROY { | |
174 | my $self = shift; | |
175 | confess "wrong type" unless ref $self; | |
176 | carp "[ Nice::DESTROY pid $$self ]" if $Nice::DEBUG; | |
177 | } | |
178 | ||
179 | =back | |
180 | ||
181 | That's about all there is to it. Actually, it's more than all there | |
182 | is to it, because we've done a few nice things here for the sake | |
183 | of completeness, robustness, and general aesthetics. Simpler | |
184 | TIESCALAR classes are certainly possible. | |
185 | ||
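For instance, here is a sketch of a smaller tied scalar that keeps only the range-clamping idea from the STORE method above and drops the system calls, so it runs without BSD::Resource. The class name Bounded and its constructor arguments (a minimum and a maximum) are assumptions invented for this illustration.

```perl
use strict;
use warnings;

package Bounded;    # hypothetical class, not part of the Nice example

sub TIESCALAR {
    my ($class, $min, $max) = @_;
    die "usage: tie SCALAR, '$class', MIN, MAX\n"
        unless defined $min && defined $max && $min <= $max;
    # A hash-based object: a tied scalar need not be a SCALAR ref.
    return bless { MIN => $min, MAX => $max, VALUE => $min }, $class;
}

sub FETCH {
    my $self = shift;
    return $self->{VALUE};
}

sub STORE {
    my ($self, $value) = @_;
    # Clamp out-of-range values instead of raising an exception.
    $value = $self->{MIN} if $value < $self->{MIN};
    $value = $self->{MAX} if $value > $self->{MAX};
    $self->{VALUE} = $value;
}

package main;

tie my $level, 'Bounded', -20, 20;

$level = 5;
my $in_range = $level;    # stored as given
$level = 99;
my $too_high = $level;    # clamped to the maximum
$level = -50;
my $too_low = $level;     # clamped to the minimum

print "$in_range $too_high $too_low\n";
```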
186 | =head2 Tying Arrays | |
187 | ||
188 | A class implementing a tied ordinary array should define the following | |
189 | methods: TIEARRAY, FETCH, STORE, FETCHSIZE, STORESIZE and perhaps UNTIE and/or DESTROY. | |
190 | ||
191 | FETCHSIZE and STORESIZE are used to provide C<$#array> and | |
192 | equivalent C<scalar(@array)> access. | |
193 | ||
194 | The methods POP, PUSH, SHIFT, UNSHIFT, SPLICE, DELETE, and EXISTS are | |
195 | required if the perl operator with the corresponding (but lowercase) name | |
196 | is to operate on the tied array. The B<Tie::Array> class can be used as a | |
197 | base class to implement the first five of these in terms of the basic | |
198 | methods above. The default implementations of DELETE and EXISTS in | |
199 | B<Tie::Array> simply C<croak>. | |
200 | ||
201 | In addition EXTEND will be called when perl would have pre-extended | |
202 | allocation in a real array. | |
203 | ||
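As a rough sketch of the Tie::Array approach just described, the hypothetical class below (UpperArray, a name invented here) defines only the constructor and the four basic methods and inherits the rest. The inherited PUSH and POP then work in terms of STORE, FETCH, FETCHSIZE, and STORESIZE.

```perl
use strict;
use warnings;

package UpperArray;    # hypothetical class, for illustration only

use Tie::Array;
our @ISA = ('Tie::Array');    # supplies PUSH, POP, SHIFT, UNSHIFT, SPLICE

sub TIEARRAY {
    my $class = shift;
    return bless { ARRAY => [] }, $class;
}

sub FETCH {
    my ($self, $index) = @_;
    return $self->{ARRAY}[$index];
}

sub STORE {
    my ($self, $index, $value) = @_;
    $self->{ARRAY}[$index] = uc $value;    # every write is upper-cased
}

sub FETCHSIZE {
    my $self = shift;
    return scalar @{ $self->{ARRAY} };
}

sub STORESIZE {
    my ($self, $count) = @_;
    $#{ $self->{ARRAY} } = $count - 1;     # grow or truncate the real array
}

package main;

tie my @words, 'UpperArray';

push @words, 'cat', 'dog';    # inherited PUSH calls our STORE
my $last  = pop @words;       # inherited POP calls FETCH and STORESIZE
my $count = scalar @words;    # FETCHSIZE
my $first = $words[0];        # FETCH

print "$first / $last / $count\n";
```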
204 | For this discussion, we'll implement an array whose elements are a fixed | |
205 | size at creation. If you try to create an element larger than the fixed | |
206 | size, you'll take an exception. For example: | |
207 | ||
208 | use FixedElem_Array; | |
209 | tie @array, 'FixedElem_Array', 3; | |
210 | $array[0] = 'cat'; # ok. | |
211 | $array[1] = 'dogs'; # exception, length('dogs') > 3. | |
212 | ||
213 | The preamble code for the class is as follows: | |
214 | ||
215 | package FixedElem_Array; | |
216 | use Carp; | |
217 | use strict; | |
218 | ||
219 | =over 4 | |
220 | ||
221 | =item TIEARRAY classname, LIST | |
222 | ||
223 | This is the constructor for the class. That means it is expected to | |
224 | return a blessed reference through which the new array (probably an | |
225 | anonymous ARRAY ref) will be accessed. | |
226 | ||
227 | In our example, just to show you that you don't I<really> have to return an | |
228 | ARRAY reference, we'll choose a HASH reference to represent our object. | |
229 | A HASH works out well as a generic record type: the C<{ELEMSIZE}> field will | |
230 | store the maximum element size allowed, and the C<{ARRAY}> field will hold the | |
231 | true ARRAY ref. If someone outside the class tries to dereference the | |
232 | object returned (doubtless thinking it an ARRAY ref), they'll blow up. | |
233 | This just goes to show you that you should respect an object's privacy. | |
234 | ||
235 | sub TIEARRAY { | |
236 | my $class = shift; | |
237 | my $elemsize = shift; | |
238 | if ( @_ || $elemsize =~ /\D/ ) { | |
239 | croak "usage: tie ARRAY, '" . __PACKAGE__ . "', elem_size"; | |
240 | } | |
241 | return bless { | |
242 | ELEMSIZE => $elemsize, | |
243 | ARRAY => [], | |
244 | }, $class; | |
245 | } | |
246 | ||
247 | =item FETCH this, index | |
248 | ||
249 | This method will be triggered every time an individual element of the tied array | |
250 | is accessed (read). It takes one argument beyond its self reference: the | |
251 | index whose value we're trying to fetch. | |
252 | ||
253 | sub FETCH { | |
254 | my $self = shift; | |
255 | my $index = shift; | |
256 | return $self->{ARRAY}->[$index]; | |
257 | } | |
258 | ||
259 | If a negative array index is used to read from an array, the index | |
260 | will be translated to a positive one internally by calling FETCHSIZE | |
261 | before being passed to FETCH. | |
262 | ||
263 | As you may have noticed, the name of the FETCH method (et al.) is the same | |
264 | for all accesses, even though the constructors differ in names (TIESCALAR | |
265 | vs TIEARRAY). While in theory you could have the same class servicing | |
266 | several tied types, in practice this becomes cumbersome, and it's easiest | |
267 | to keep them at simply one tie type per class. | |
268 | ||
269 | =item STORE this, index, value | |
270 | ||
271 | This method will be triggered every time an element in the tied array is set | |
272 | (written). It takes two arguments beyond its self reference: the index at | |
273 | which we're trying to store something and the value we're trying to put | |
274 | there. | |
275 | ||
276 | In our example, C<undef> is really C<$self-E<gt>{ELEMSIZE}> number of | |
277 | spaces so we have a little more work to do here: | |
278 | ||
279 | sub STORE { | |
280 | my $self = shift; | |
281 | my( $index, $value ) = @_; | |
282 | if ( length $value > $self->{ELEMSIZE} ) { | |
283 | croak "length of $value is greater than $self->{ELEMSIZE}"; | |
284 | } | |
285 | # fill in the blanks | |
286 | $self->EXTEND( $index ) if $index > $self->FETCHSIZE(); | |
287 | # right justify to keep element size for smaller elements | |
288 | $self->{ARRAY}->[$index] = sprintf "%$self->{ELEMSIZE}s", $value; | |
289 | } | |
290 | ||
291 | Negative indexes are treated the same as with FETCH. | |
292 | ||
293 | =item FETCHSIZE this | |
294 | ||
295 | Returns the total number of items in the tied array associated with | |
296 | object I<this>. (Equivalent to C<scalar(@array)>.) For example: | |
297 | ||
298 | sub FETCHSIZE { | |
299 | my $self = shift; | |
300 | return scalar @{$self->{ARRAY}}; | |
301 | } | |
302 | ||
303 | =item STORESIZE this, count | |
304 | ||
305 | Sets the total number of items in the tied array associated with | |
306 | object I<this> to be I<count>. If this makes the array larger, then the | |
307 | class's mapping of C<undef> should be returned for new positions. | |
308 | If the array becomes smaller, then entries beyond I<count> should be | |
309 | deleted. | |
310 | ||
311 | In our example, 'undef' is really an element containing | |
312 | C<$self-E<gt>{ELEMSIZE}> number of spaces. Observe: | |
313 | ||
314 | sub STORESIZE { | |
315 | my $self = shift; | |
316 | my $count = shift; | |
317 | if ( $count > $self->FETCHSIZE() ) { | |
318 | foreach ( $self->FETCHSIZE() .. $count - 1 ) { # pad the new slots | |
319 | $self->STORE( $_, '' ); | |
320 | } | |
321 | } elsif ( $count < $self->FETCHSIZE() ) { | |
322 | foreach ( 1 .. $self->FETCHSIZE() - $count ) { # drop the extras | |
323 | $self->POP(); | |
324 | } | |
325 | } | |
326 | } | |
327 | ||
328 | =item EXTEND this, count | |
329 | ||
330 | Informative call that the array is likely to grow to have I<count> entries. | |
331 | Can be used to optimize allocation. This method need do nothing. | |
332 | ||
333 | In our example, we want to make sure there are no blank (C<undef>) | |
334 | entries, so C<EXTEND> will make use of C<STORESIZE> to fill elements | |
335 | as needed: | |
336 | ||
337 | sub EXTEND { | |
338 | my $self = shift; | |
339 | my $count = shift; | |
340 | $self->STORESIZE( $count ); | |
341 | } | |
342 | ||
343 | =item EXISTS this, key | |
344 | ||
345 | Verify that the element at index I<key> exists in the tied array I<this>. | |
346 | ||
347 | In our example, we will determine that if an element consists of | |
348 | C<$self-E<gt>{ELEMSIZE}> spaces only, it does not exist: | |
349 | ||
350 | sub EXISTS { | |
351 | my $self = shift; | |
352 | my $index = shift; | |
353 | return 0 if ! defined $self->{ARRAY}->[$index] || | |
354 | $self->{ARRAY}->[$index] eq ' ' x $self->{ELEMSIZE}; | |
355 | return 1; | |
356 | } | |
357 | ||
358 | =item DELETE this, key | |
359 | ||
360 | Delete the element at index I<key> from the tied array I<this>. | |
361 | ||
362 | In our example, a deleted item is C<$self-E<gt>{ELEMSIZE}> spaces: | |
363 | ||
364 | sub DELETE { | |
365 | my $self = shift; | |
366 | my $index = shift; | |
367 | return $self->STORE( $index, '' ); | |
368 | } | |
369 | ||
370 | =item CLEAR this | |
371 | ||
372 | Clear (remove, delete, ...) all values from the tied array associated with | |
373 | object I<this>. For example: | |
374 | ||
375 | sub CLEAR { | |
376 | my $self = shift; | |
377 | return $self->{ARRAY} = []; | |
378 | } | |
379 | ||
380 | =item PUSH this, LIST | |
381 | ||
382 | Append elements of I<LIST> to the array. For example: | |
383 | ||
384 | sub PUSH { | |
385 | my $self = shift; | |
386 | my @list = @_; | |
387 | my $last = $self->FETCHSIZE(); | |
388 | $self->STORE( $last + $_, $list[$_] ) foreach 0 .. $#list; | |
389 | return $self->FETCHSIZE(); | |
390 | } | |
391 | ||
392 | =item POP this | |
393 | ||
394 | Remove last element of the array and return it. For example: | |
395 | ||
396 | sub POP { | |
397 | my $self = shift; | |
398 | return pop @{$self->{ARRAY}}; | |
399 | } | |
400 | ||
401 | =item SHIFT this | |
402 | ||
403 | Remove the first element of the array (shifting other elements down) | |
404 | and return it. For example: | |
405 | ||
406 | sub SHIFT { | |
407 | my $self = shift; | |
408 | return shift @{$self->{ARRAY}}; | |
409 | } | |
410 | ||
411 | =item UNSHIFT this, LIST | |
412 | ||
413 | Insert LIST elements at the beginning of the array, moving existing elements | |
414 | up to make room. For example: | |
415 | ||
416 | sub UNSHIFT { | |
417 | my $self = shift; | |
418 | my @list = @_; | |
419 | my $size = scalar( @list ); | |
420 | # make room for our list | |
421 | @{$self->{ARRAY}}[ $size .. $#{$self->{ARRAY}} + $size ] | |
422 | = @{$self->{ARRAY}}; | |
423 | $self->STORE( $_, $list[$_] ) foreach 0 .. $#list; | |
424 | } | |
425 | ||
426 | =item SPLICE this, offset, length, LIST | |
427 | ||
428 | Perform the equivalent of C<splice> on the array. | |
429 | ||
430 | I<offset> is optional and defaults to zero; negative values count back | |
431 | from the end of the array. | |
432 | ||
433 | I<length> is optional and defaults to the rest of the array. | |
434 | ||
435 | I<LIST> may be empty. | |
436 | ||
437 | Returns a list of the original I<length> elements at I<offset>. | |
438 | ||
439 | In our example, we'll use a little shortcut if there is a I<LIST>: | |
440 | ||
441 | sub SPLICE { | |
442 | my $self = shift; | |
443 | my $offset = shift || 0; | |
444 | my $length = shift || $self->FETCHSIZE() - $offset; | |
445 | my @list = (); | |
446 | if ( @_ ) { | |
447 | tie @list, __PACKAGE__, $self->{ELEMSIZE}; | |
448 | @list = @_; | |
449 | } | |
450 | return splice @{$self->{ARRAY}}, $offset, $length, @list; | |
451 | } | |
452 | ||
453 | =item UNTIE this | |
454 | ||
455 | Will be called when C<untie> happens. (See L<The C<untie> Gotcha> below.) | |
456 | ||
457 | =item DESTROY this | |
458 | ||
459 | This method will be triggered when the tied variable needs to be destructed. | |
460 | As with the scalar tie class, this is almost never needed in a | |
461 | language that does its own garbage collection, so this time we'll | |
462 | just leave it out. | |
463 | ||
464 | =back | |
465 | ||
466 | =head2 Tying Hashes | |
467 | ||
468 | Hashes were the first Perl data type to be tied (see dbmopen()). A class | |
469 | implementing a tied hash should define the following methods: TIEHASH is | |
470 | the constructor. FETCH and STORE access the key and value pairs. EXISTS | |
471 | reports whether a key is present in the hash, and DELETE deletes one. | |
472 | CLEAR empties the hash by deleting all the key and value pairs. FIRSTKEY | |
473 | and NEXTKEY implement the keys() and each() functions to iterate over all | |
474 | the keys. UNTIE is called when C<untie> happens, and DESTROY is called when | |
475 | the tied variable is garbage collected. | |
476 | ||
477 | If this seems like a lot, then feel free to inherit most of your methods | |
478 | from the standard Tie::StdHash module, redefining only the interesting | |
479 | ones. See L<Tie::Hash> for details. | |
480 | ||
481 | Remember that Perl distinguishes between a key not existing in the hash, | |
482 | and the key existing in the hash but having a corresponding value of | |
483 | C<undef>. The two possibilities can be tested with the C<exists()> and | |
484 | C<defined()> functions. | |
485 | ||
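As a sketch of both points above (inheriting from Tie::StdHash, and the difference between exists() and defined()), the hypothetical class below folds keys to lower case and redefines only the methods that take a key. The name CaselessHash is invented for this illustration.

```perl
use strict;
use warnings;

package CaselessHash;    # hypothetical class, for illustration only

require Tie::Hash;       # Tie::StdHash lives in Tie/Hash.pm
our @ISA = ('Tie::StdHash');

# Only the key-taking methods are redefined. TIEHASH, CLEAR, FIRSTKEY,
# and NEXTKEY are inherited unchanged from Tie::StdHash.
sub STORE {
    my ($self, $key, $value) = @_;
    return $self->SUPER::STORE(lc $key, $value);
}

sub FETCH {
    my ($self, $key) = @_;
    return $self->SUPER::FETCH(lc $key);
}

sub EXISTS {
    my ($self, $key) = @_;
    return $self->SUPER::EXISTS(lc $key);
}

sub DELETE {
    my ($self, $key) = @_;
    return $self->SUPER::DELETE(lc $key);
}

package main;

tie my %config, 'CaselessHash';

$config{Editor} = 'vi';
$config{PAGER}  = undef;    # the key exists, but its value is undef

my $editor         = $config{EDITOR};
my $pager_exists   = exists $config{pager}   ? 1 : 0;
my $pager_defined  = defined $config{pager}  ? 1 : 0;
my $missing_exists = exists $config{missing} ? 1 : 0;
my @keys           = sort keys %config;

print "$editor $pager_exists $pager_defined $missing_exists @keys\n";
```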
486 | Here's an example of a somewhat interesting tied hash class: it gives you | |
487 | a hash representing a particular user's dot files. You index into the hash | |
488 | with the name of the file (minus the dot) and you get back that dot file's | |
489 | contents. For example: | |
490 | ||
491 | use DotFiles; | |
492 | tie %dot, 'DotFiles'; | |
493 | if ( $dot{profile} =~ /MANPATH/ || | |
494 | $dot{login} =~ /MANPATH/ || | |
495 | $dot{cshrc} =~ /MANPATH/ ) | |
496 | { | |
497 | print "you seem to set your MANPATH\n"; | |
498 | } | |
499 | ||
500 | Or here's another sample of using our tied class: | |
501 | ||
502 | tie %him, 'DotFiles', 'daemon'; | |
503 | foreach $f ( keys %him ) { | |
504 | printf "daemon dot file %s is size %d\n", | |
505 | $f, length $him{$f}; | |
506 | } | |
507 | ||
508 | In our tied hash DotFiles example, we use a regular | |
509 | hash for the object containing several important | |
510 | fields, of which only the C<{LIST}> field will be what the | |
511 | user thinks of as the real hash. | |
512 | ||
513 | =over 5 | |
514 | ||
515 | =item USER | |
516 | ||
517 | whose dot files this object represents | |
518 | ||
519 | =item HOME | |
520 | ||
521 | where those dot files live | |
522 | ||
523 | =item CLOBBER | |
524 | ||
525 | whether we should try to change or remove those dot files | |
526 | ||
527 | =item LIST | |
528 | ||
529 | the hash of dot file names and content mappings | |
530 | ||
531 | =back | |
532 | ||
533 | Here's the start of F<DotFiles.pm>: | |
534 | ||
535 | package DotFiles; | |
536 | use Carp; | |
537 | sub whowasi { (caller(1))[3] . '()' } | |
538 | my $DEBUG = 0; | |
539 | sub debug { $DEBUG = @_ ? shift : 1 } | |
540 | ||
541 | For our example, we want to be able to emit debugging info to help in tracing | |
542 | during development. We also keep one convenience function around | |
543 | internally to help print out warnings; whowasi() returns the function name | |
544 | that calls it. | |
545 | ||
546 | Here are the methods for the DotFiles tied hash. | |
547 | ||
548 | =over 4 | |
549 | ||
550 | =item TIEHASH classname, LIST | |
551 | ||
552 | This is the constructor for the class. That means it is expected to | |
553 | return a blessed reference through which the new object (probably but not | |
554 | necessarily an anonymous hash) will be accessed. | |
555 | ||
556 | Here's the constructor: | |
557 | ||
558 | sub TIEHASH { | |
559 | my $self = shift; | |
560 | my $user = shift || $>; | |
561 | my $dotdir = shift || ''; | |
562 | croak "usage: @{[&whowasi]} [USER [DOTDIR]]" if @_; | |
563 | $user = getpwuid($user) if $user =~ /^\d+$/; | |
564 | my $dir = (getpwnam($user))[7] | |
565 | || croak "@{[&whowasi]}: no user $user"; | |
566 | $dir .= "/$dotdir" if $dotdir; | |
567 | ||
568 | my $node = { | |
569 | USER => $user, | |
570 | HOME => $dir, | |
571 | LIST => {}, | |
572 | CLOBBER => 0, | |
573 | }; | |
574 | ||
575 | opendir(DIR, $dir) | |
576 | || croak "@{[&whowasi]}: can't opendir $dir: $!"; | |
577 | foreach $dot ( grep /^\./ && -f "$dir/$_", readdir(DIR)) { | |
578 | $dot =~ s/^\.//; | |
579 | $node->{LIST}{$dot} = undef; | |
580 | } | |
581 | closedir DIR; | |
582 | return bless $node, $self; | |
583 | } | |
584 | ||
585 | It's probably worth mentioning that if you're going to filetest the | |
586 | return values out of a readdir, you'd better prepend the directory | |
587 | in question. Otherwise, because we didn't chdir() there, it would | |
588 | have been testing the wrong file. | |
589 | ||
590 | =item FETCH this, key | |
591 | ||
592 | This method will be triggered every time an element in the tied hash is | |
593 | accessed (read). It takes one argument beyond its self reference: the key | |
594 | whose value we're trying to fetch. | |
595 | ||
596 | Here's the fetch for our DotFiles example. | |
597 | ||
598 | sub FETCH { | |
599 | carp &whowasi if $DEBUG; | |
600 | my $self = shift; | |
601 | my $dot = shift; | |
602 | my $dir = $self->{HOME}; | |
603 | my $file = "$dir/.$dot"; | |
604 | ||
605 | unless (exists $self->{LIST}->{$dot} || -f $file) { | |
606 | carp "@{[&whowasi]}: no $dot file" if $DEBUG; | |
607 | return undef; | |
608 | } | |
609 | ||
610 | if (defined $self->{LIST}->{$dot}) { | |
611 | return $self->{LIST}->{$dot}; | |
612 | } else { | |
613 | return $self->{LIST}->{$dot} = `cat $dir/.$dot`; | |
614 | } | |
615 | } | |
616 | ||
617 | It was easy to write by having it call the Unix cat(1) command, but it | |
618 | would probably be more portable to open the file manually (and somewhat | |
619 | more efficient). Of course, because dot files are a Unixy concept, we're | |
620 | not that concerned. | |
621 | ||
622 | =item STORE this, key, value | |
623 | ||
624 | This method will be triggered every time an element in the tied hash is set | |
625 | (written). It takes two arguments beyond its self reference: the key at | |
626 | which we're trying to store something, and the value we're trying to put | |
627 | there. | |
628 | ||
629 | Here in our DotFiles example, we'll be careful not to let | |
630 | them try to overwrite the file unless they've called the clobber() | |
631 | method on the original object reference returned by tie(). | |
632 | ||
633 | sub STORE { | |
634 | carp &whowasi if $DEBUG; | |
635 | my $self = shift; | |
636 | my $dot = shift; | |
637 | my $value = shift; | |
638 | my $file = $self->{HOME} . "/.$dot"; | |
639 | my $user = $self->{USER}; | |
640 | ||
641 | croak "@{[&whowasi]}: $file not clobberable" | |
642 | unless $self->{CLOBBER}; | |
643 | ||
644 | open(F, "> $file") || croak "can't open $file: $!"; | |
645 | print F $value; | |
646 | close(F); | |
647 | } | |
648 | ||
649 | If they wanted to clobber something, they might say: | |
650 | ||
651 | $ob = tie %daemon_dots, 'DotFiles', 'daemon'; | |
652 | $ob->clobber(1); | |
653 | $daemon_dots{signature} = "A true daemon\n"; | |
654 | ||
655 | Another way to lay hands on a reference to the underlying object is to | |
656 | use the tied() function, so they might alternately have set clobber | |
657 | using: | |
658 | ||
659 | tie %daemon_dots, 'DotFiles', 'daemon'; | |
660 | tied(%daemon_dots)->clobber(1); | |
661 | ||
662 | The clobber method is simply: | |
663 | ||
664 | sub clobber { | |
665 | my $self = shift; | |
666 | $self->{CLOBBER} = @_ ? shift : 1; | |
667 | } | |
668 | ||
669 | =item DELETE this, key | |
670 | ||
671 | This method is triggered when we remove an element from the hash, | |
672 | typically by using the delete() function. Again, we'll | |
673 | be careful to check whether they really want to clobber files. | |
674 | ||
675 | sub DELETE { | |
676 | carp &whowasi if $DEBUG; | |
677 | ||
678 | my $self = shift; | |
679 | my $dot = shift; | |
680 | my $file = $self->{HOME} . "/.$dot"; | |
681 | croak "@{[&whowasi]}: won't remove file $file" | |
682 | unless $self->{CLOBBER}; | |
683 | delete $self->{LIST}->{$dot}; | |
684 | my $success = unlink($file); | |
685 | carp "@{[&whowasi]}: can't unlink $file: $!" unless $success; | |
686 | $success; | |
687 | } | |
688 | ||
689 | The value returned by DELETE becomes the return value of the call | |
690 | to delete(). If you want to emulate the normal behavior of delete(), | |
691 | you should return whatever FETCH would have returned for this key. | |
692 | In this example, we have chosen instead to return a value which tells | |
693 | the caller whether the file was successfully deleted. | |
694 | ||
695 | =item CLEAR this | |
696 | ||
697 | This method is triggered when the whole hash is to be cleared, usually by | |
698 | assigning the empty list to it. | |
699 | ||
700 | In our example, that would remove all the user's dot files! It's such a | |
701 | dangerous thing that they'll have to set CLOBBER to something higher than | |
702 | 1 to make it happen. | |
703 | ||
704 | sub CLEAR { | |
705 | carp &whowasi if $DEBUG; | |
706 | my $self = shift; | |
707 | croak "@{[&whowasi]}: won't remove all dot files for $self->{USER}" | |
708 | unless $self->{CLOBBER} > 1; | |
709 | my $dot; | |
710 | foreach $dot ( keys %{$self->{LIST}}) { | |
711 | $self->DELETE($dot); | |
712 | } | |
713 | } | |
714 | ||
715 | =item EXISTS this, key | |
716 | ||
717 | This method is triggered when the user uses the exists() function | |
718 | on a particular hash. In our example, we'll look at the C<{LIST}> | |
719 | hash element for this: | |
720 | ||
721 | sub EXISTS { | |
722 | carp &whowasi if $DEBUG; | |
723 | my $self = shift; | |
724 | my $dot = shift; | |
725 | return exists $self->{LIST}->{$dot}; | |
726 | } | |
727 | ||
728 | =item FIRSTKEY this | |
729 | ||
730 | This method will be triggered when the user is going | |
731 | to iterate through the hash, such as via a keys() or each() | |
732 | call. | |
733 | ||
734 | sub FIRSTKEY { | |
735 | carp &whowasi if $DEBUG; | |
736 | my $self = shift; | |
737 | my $a = keys %{$self->{LIST}}; # reset each() iterator | |
738 | each %{$self->{LIST}} | |
739 | } | |
740 | ||
741 | =item NEXTKEY this, lastkey | |
742 | ||
743 | This method gets triggered during a keys() or each() iteration. It has a | |
744 | second argument which is the last key that had been accessed. This is | |
745 | useful if you're caring about ordering or calling the iterator from more | |
746 | than one sequence, or not really storing things in a hash anywhere. | |
747 | ||
748 | For our example, we're using a real hash so we'll do just the simple | |
749 | thing, but we'll have to go through the LIST field indirectly. | |
750 | ||
751 | sub NEXTKEY { | |
752 | carp &whowasi if $DEBUG; | |
753 | my $self = shift; | |
754 | return each %{ $self->{LIST} } | |
755 | } | |
756 | ||
757 | =item UNTIE this | |
758 | ||
759 | This is called when C<untie> occurs. See L<The C<untie> Gotcha> below. | |
760 | ||
761 | =item DESTROY this | |
762 | ||
763 | This method is triggered when a tied hash is about to go out of | |
764 | scope. You don't really need it unless you're trying to add debugging | |
765 | or have auxiliary state to clean up. Here's a very simple function: | |
766 | ||
767 | sub DESTROY { | |
768 | carp &whowasi if $DEBUG; | |
769 | } | |
770 | ||
771 | =back | |
772 | ||
773 | Note that functions such as keys() and values() may return huge lists | |
774 | when used on large objects, like DBM files. You may prefer to use the | |
775 | each() function to iterate over such. Example: | |
776 | ||
777 | # print out history file offsets | |
778 | use NDBM_File; | |
779 | tie(%HIST, 'NDBM_File', '/usr/lib/news/history', 1, 0); | |
780 | while (($key,$val) = each %HIST) { | |
781 | print $key, ' = ', unpack('L',$val), "\n"; | |
782 | } | |
783 | untie(%HIST); | |
784 | ||
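To illustrate each() driving FIRSTKEY and NEXTKEY one pair at a time, here is a sketch of a tied hash that stores nothing at all and computes every key from the previous one. The class name Squares and its single constructor argument (an upper limit) are assumptions made up for this example; a real DBM-backed class would ask the database for the next key instead.

```perl
use strict;
use warnings;

package Squares;    # hypothetical class, for illustration only

# Maps each integer 1..LIMIT to its square without storing any pairs.
sub TIEHASH {
    my ($class, $limit) = @_;
    return bless { LIMIT => $limit }, $class;
}

sub EXISTS {
    my ($self, $key) = @_;
    return $key =~ /^\d+$/ && $key >= 1 && $key <= $self->{LIMIT};
}

sub FETCH {
    my ($self, $key) = @_;
    return $self->EXISTS($key) ? $key * $key : undef;
}

sub FIRSTKEY {
    my $self = shift;
    return $self->{LIMIT} >= 1 ? 1 : undef;
}

sub NEXTKEY {
    my ($self, $lastkey) = @_;
    # The next key is derived from the last one; undef ends the iteration.
    return $lastkey < $self->{LIMIT} ? $lastkey + 1 : undef;
}

package main;

tie my %square, 'Squares', 5;

my @seen;
while ( my ($key, $value) = each %square ) {    # one pair per iteration
    push @seen, "$key=$value";
}

print "@seen\n";
```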
785 | =head2 Tying FileHandles | |
786 | ||
787 | This is partially implemented now. | |
788 | ||
789 | A class implementing a tied filehandle should define the following | |
790 | methods: TIEHANDLE, at least one of PRINT, PRINTF, WRITE, READLINE, GETC, | |
791 | READ, and possibly CLOSE, UNTIE and DESTROY. The class can also provide: BINMODE, | |
792 | OPEN, EOF, FILENO, SEEK, TELL - if the corresponding perl operators are | |
793 | used on the handle. | |
794 | ||
795 | It is especially useful when perl is embedded in some other program, | |
796 | where output to STDOUT and STDERR may have to be redirected in some | |
797 | special way. See nvi and the Apache module for examples. | |
798 | ||
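As a sketch of that kind of redirection, the hypothetical class below (Capture, a name invented here) collects everything printed to a tied handle in an in-memory buffer rather than sending it anywhere. The contents() accessor is likewise an assumption of this example, not part of the tie interface.

```perl
use strict;
use warnings;

package Capture;    # hypothetical class, for illustration only

sub TIEHANDLE {
    my $class  = shift;
    my $buffer = '';
    return bless \$buffer, $class;    # the object is a ref to the buffer
}

sub PRINT {
    my $self = shift;
    # Honor $, and $\ the way the built-in print() would.
    my $separator  = defined $, ? $, : '';
    my $terminator = defined $\ ? $\ : '';
    $$self .= join($separator, @_) . $terminator;
    return 1;
}

sub PRINTF {
    my $self   = shift;
    my $format = shift;
    $$self .= sprintf $format, @_;
    return 1;
}

# Ordinary method, reached through the object returned by tied().
sub contents {
    my $self = shift;
    return $$self;
}

package main;

tie *CAPTURE, 'Capture';

print  CAPTURE "hello, ", "world\n";    # routed to Capture::PRINT
printf CAPTURE "%03d|%s", 7, "ok";      # routed to Capture::PRINTF

my $captured = tied(*CAPTURE)->contents;
print $captured, "\n";
```

The same class could be tied to STDOUT or STDERR instead of a private glob, which is the embedding case described above.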
799 | In our example we're going to create a shouting handle. | |
800 | ||
801 | package Shout; | |
802 | ||
803 | =over 4 | |
804 | ||
805 | =item TIEHANDLE classname, LIST | |
806 | ||
807 | This is the constructor for the class. That means it is expected to | |
808 | return a blessed reference of some sort. The reference can be used to | |
809 | hold some internal information. | |
810 | ||
811 | sub TIEHANDLE { print "<shout>\n"; my $i; bless \$i, shift } | |
812 | ||
813 | =item WRITE this, LIST | |
814 | ||
815 | This method will be called when the handle is written to via the | |
816 | C<syswrite> function. | |
817 | ||
818 | sub WRITE { | |
819 | $r = shift; | |
820 | my($buf,$len,$offset) = @_; | |
821 | print "WRITE called, \$buf=$buf, \$len=$len, \$offset=$offset"; | |
822 | } | |
823 | ||
824 | =item PRINT this, LIST | |
825 | ||
826 | This method will be triggered every time the tied handle is printed to | |
827 | with the C<print()> function. | |
828 | Beyond its self reference it also expects the list that was passed to | |
829 | the print function. | |
830 | ||
831 | sub PRINT { my $r = shift; $$r++; print join($,,map(uc($_),@_)),$\ } | |
832 | ||
833 | =item PRINTF this, LIST | |
834 | ||
835 | This method will be triggered every time the tied handle is printed to | |
836 | with the C<printf()> function. | |
837 | Beyond its self reference it also expects the format and list that were | |
838 | passed to the printf function. | |
839 | ||
840 | sub PRINTF { | |
841 | shift; | |
842 | my $fmt = shift; | |
843 | print sprintf($fmt, @_)."\n"; | |
844 | } | |
845 | ||
846 | =item READ this, LIST | |
847 | ||
848 | This method will be called when the handle is read from via the C<read> | |
849 | or C<sysread> functions. | |
850 | ||
851 | sub READ { | |
852 | my $self = shift; | |
853 | my $bufref = \$_[0]; | |
854 | my(undef,$len,$offset) = @_; | |
855 | print "READ called, \$buf=$bufref, \$len=$len, \$offset=$offset"; | |
856 | # add to $$bufref, set $len to number of characters read | |
857 | $len; | |
858 | } | |
859 | ||
860 | =item READLINE this | |
861 | ||
862 | This method will be called when the handle is read from via C<< <HANDLE> >>. | |
863 | The method should return undef when there is no more data. | |
864 | ||
865 | sub READLINE { my $r = shift; return "READLINE called $$r times\n"; } | |
866 | ||
867 | =item GETC this | |
868 | ||
869 | This method will be called when the C<getc> function is called. | |
870 | ||
871 | sub GETC { print "Don't GETC, Get Perl"; return "a"; } | |
872 | ||
873 | =item CLOSE this | |
874 | ||
875 | This method will be called when the handle is closed via the C<close> | |
876 | function. | |
877 | ||
878 | sub CLOSE { print "CLOSE called.\n" } | |
879 | ||
880 | =item UNTIE this | |
881 | ||
882 | As with the other types of ties, this method will be called when C<untie> happens. | |
883 | It may be appropriate to "auto CLOSE" when this occurs. See | |
884 | L<The C<untie> Gotcha> below. | |
885 | ||
886 | =item DESTROY this | |
887 | ||
888 | As with the other types of ties, this method will be called when the | |
889 | tied handle is about to be destroyed. This is useful for debugging and | |
890 | possibly cleaning up. | |
891 | ||
892 | sub DESTROY { print "</shout>\n" } | |
893 | ||
894 | =back | |
895 | ||
896 | Here's how to use our little example: | |
897 | ||
898 | tie(*FOO,'Shout'); | |
899 | print FOO "hello\n"; | |
900 | $a = 4; $b = 6; | |
901 | print FOO $a, " plus ", $b, " equals ", $a + $b, "\n"; | |
902 | print <FOO>; | |
903 | ||
904 | =head2 UNTIE this | |
905 | ||
906 | You can define for all tie types an UNTIE method that will be called | |
907 | at untie(). See L<The C<untie> Gotcha> below. | |
908 | ||
909 | =head2 The C<untie> Gotcha | |
910 | ||
911 | If you intend to make use of the object returned from either tie() or | |
912 | tied(), and if the tie's target class defines a destructor, there is a | |
913 | subtle gotcha you I<must> guard against. | |
914 | ||
915 | As setup, consider this (admittedly rather contrived) example of a | |
916 | tie; all it does is use a file to keep a log of the values assigned to | |
917 | a scalar. | |
918 | ||
919 | package Remember; | |
920 | ||
921 | use strict; | |
922 | use warnings; | |
923 | use IO::File; | |
924 | ||
925 | sub TIESCALAR { | |
926 | my $class = shift; | |
927 | my $filename = shift; | |
928 | my $handle = IO::File->new("> $filename") | |
929 | or die "Cannot open $filename: $!\n"; | |
930 | ||
931 | print $handle "The Start\n"; | |
932 | bless {FH => $handle, Value => 0}, $class; | |
933 | } | |
934 | ||
935 | sub FETCH { | |
936 | my $self = shift; | |
937 | return $self->{Value}; | |
938 | } | |
939 | ||
940 | sub STORE { | |
941 | my $self = shift; | |
942 | my $value = shift; | |
943 | my $handle = $self->{FH}; | |
944 | print $handle "$value\n"; | |
945 | $self->{Value} = $value; | |
946 | } | |
947 | ||
948 | sub DESTROY { | |
949 | my $self = shift; | |
950 | my $handle = $self->{FH}; | |
951 | print $handle "The End\n"; | |
952 | close $handle; | |
953 | } | |
954 | ||
955 | 1; | |
956 | ||
957 | Here is an example that makes use of this tie: | |
958 | ||
959 | use strict; | |
960 | use Remember; | |
961 | ||
962 | my $fred; | |
963 | tie $fred, 'Remember', 'myfile.txt'; | |
964 | $fred = 1; | |
965 | $fred = 4; | |
966 | $fred = 5; | |
967 | untie $fred; | |
968 | system "cat myfile.txt"; | |
969 | ||
970 | This is the output when it is executed: | |
971 | ||
972 | The Start | |
973 | 1 | |
974 | 4 | |
975 | 5 | |
976 | The End | |
977 | ||
978 | So far so good. Those of you who have been paying attention will have | |
979 | spotted that the tied object hasn't been used so far. So let's add an | |
980 | extra method to the Remember class to allow comments to be included in | |
981 | the file -- say, something like this: | |
982 | ||
983 | sub comment { | |
984 | my $self = shift; | |
985 | my $text = shift; | |
986 | my $handle = $self->{FH}; | |
987 | print $handle $text, "\n"; | |
988 | } | |
989 | ||
990 | And here is the previous example modified to use the C<comment> method | |
991 | (which requires the tied object): | |
992 | ||
993 | use strict; | |
994 | use Remember; | |
995 | ||
996 | my ($fred, $x); | |
997 | $x = tie $fred, 'Remember', 'myfile.txt'; | |
998 | $fred = 1; | |
999 | $fred = 4; | |
1000 | $x->comment("changing..."); | |
1001 | $fred = 5; | |
1002 | untie $fred; | |
1003 | system "cat myfile.txt"; | |
1004 | ||
1005 | When this code is executed there is no output. Here's why: | |
1006 | ||
1007 | When a variable is tied, it is associated with the object which is the | |
1008 | return value of the TIESCALAR, TIEARRAY, or TIEHASH function. This | |
1009 | object normally has only one reference, namely, the implicit reference | |
1010 | from the tied variable. When untie() is called, that reference is | |
1011 | destroyed. Then, as in the first example above, the object's | |
1012 | destructor (DESTROY) is called, which is normal for objects that have | |
1013 | no more valid references; and thus the file is closed. | |
1014 | ||
1015 | In the second example, however, we have stored another reference to | |
1016 | the tied object in $x. That means that when untie() gets called | |
1017 | there will still be a valid reference to the object in existence, so | |
1018 | the destructor is not called at that time, and thus the file is not | |
1019 | closed. There is no output because the file buffers have not yet been | |
1020 | flushed to disk. | |
1021 | ||
1022 | Now that you know what the problem is, what can you do to avoid it? | |
1023 | Prior to the introduction of the optional UNTIE method, the only way | |
1024 | was the good old C<-w> flag, which will spot any instances where you call | |
1025 | untie() while there are still valid references to the tied object. If | |
1026 | the second script above had C<use warnings 'untie'> near the top | |
1027 | or was run with the C<-w> flag, Perl would print this | |
1028 | warning message: | |
1029 | ||
1030 | untie attempted while 1 inner references still exist | |
1031 | ||
1032 | To get the script to work properly and silence the warning, make sure | |
1033 | there are no valid references to the tied object I<before> untie() is | |
1034 | called: | |
1035 | ||
1036 | undef $x; | |
1037 | untie $fred; | |
1038 | ||
1039 | Now that UNTIE exists the class designer can decide which parts of the | |
1040 | class functionality are really associated with C<untie> and which with | |
1041 | the object being destroyed. What makes sense for a given class depends | |
1042 | on whether the inner references are being kept so that non-tie-related | |
1043 | methods can be called on the object. But in most cases it probably makes | |
1044 | sense to move the functionality that would have been in DESTROY to the UNTIE | |
1045 | method. | |
1046 | ||
1047 | If the UNTIE method exists, then the warning above does not occur. Instead, the | |
1048 | UNTIE method is passed the count of "extra" references and can issue its own | |
1049 | warning if appropriate. For example, to replicate the no-UNTIE case, this | |
1050 | method can be used (C<carp> comes from the L<Carp> module, which must be loaded): | |
1051 | ||
1052 | sub UNTIE | |
1053 | { | |
1054 | my ($obj,$count) = @_; | |
1055 | carp "untie attempted while $count inner references still exist" if $count; | |
1056 | } | |
1057 | ||
1058 | =head1 SEE ALSO | |
1059 | ||
1060 | See L<DB_File> or L<Config> for some interesting tie() implementations. | |
1061 | A good starting point for many tie() implementations is with one of the | |
1062 | modules L<Tie::Scalar>, L<Tie::Array>, L<Tie::Hash>, or L<Tie::Handle>. | |
1063 | ||
1064 | =head1 BUGS | |
1065 | ||
1066 | You cannot easily tie a multilevel data structure (such as a hash of | |
1067 | hashes) to a dbm file. The first problem is that all but GDBM and | |
1068 | Berkeley DB have size limitations, but beyond that, you also have problems | |
1069 | with how references are to be represented on disk. One experimental | |
1070 | module that does attempt to address this need partially is the MLDBM | |
1071 | module. Check your nearest CPAN site as described in L<perlmodlib> for | |
1072 | source code to MLDBM. | |
1073 | ||
1074 | Tied filehandles are still incomplete. sysopen(), truncate(), | |
1075 | flock(), fcntl(), stat() and -X can't currently be trapped. | |
1076 | ||
1077 | =head1 AUTHOR | |
1078 | ||
1079 | Tom Christiansen | |
1080 | ||
1081 | TIEHANDLE by Sven Verdoolaege <F<skimo@dns.ufsia.ac.be>> and Doug MacEachern <F<dougm@osf.org>> | |
1082 | ||
1083 | UNTIE by Nick Ing-Simmons <F<nick@ing-simmons.net>> | |
1084 | ||
1085 | Tying Arrays by Casey West <F<casey@geeknest.com>> |