Commit | Line | Data |
---|---|---|
920dae64 AT |
1 | =head1 NAME |
2 | ||
3 | perlhack - How to hack at the Perl internals | |
4 | ||
5 | =head1 DESCRIPTION | |
6 | ||
7 | This document attempts to explain how Perl development takes place, | |
8 | and ends with some suggestions for people wanting to become bona fide | |
9 | porters. | |
10 | ||
11 | The perl5-porters mailing list is where the Perl standard distribution | |
12 | is maintained and developed. The list can get anywhere from 10 to 150 | |
13 | messages a day, depending on the heatedness of the debate. Most days | |
14 | there are two or three patches, extensions, features, or bugs being | |
15 | discussed at a time. | |
16 | ||
17 | A searchable archive of the list is at either: | |
18 | ||
19 | http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/ | |
20 | ||
21 | or | |
22 | ||
23 | http://archive.develooper.com/perl5-porters@perl.org/ | |
24 | ||
25 | List subscribers (the porters themselves) come in several flavours. | |
26 | Some are quiet curious lurkers, who rarely pitch in and instead watch | |
27 | the ongoing development to ensure they're forewarned of new changes or | |
28 | features in Perl. Some are representatives of vendors, who are there | |
29 | to make sure that Perl continues to compile and work on their | |
30 | platforms. Some patch any reported bug that they know how to fix; | |
31 | some are actively patching their pet area (threads, Win32, the regexp | |
32 | engine), while others seem to do nothing but complain. In other | |
33 | words, it's your usual mix of technical people. | |
34 | ||
35 | Over this group of porters presides Larry Wall. He has the final word | |
36 | in what does and does not change in the Perl language. Various | |
37 | releases of Perl are shepherded by a "pumpking", a porter | |
38 | responsible for gathering patches, deciding on a patch-by-patch, | |
39 | feature-by-feature basis what will and will not go into the release. | |
40 | For instance, Gurusamy Sarathy was the pumpking for the 5.6 release of | |
41 | Perl, Jarkko Hietaniemi was the pumpking for the 5.8 release, and | |
42 | Rafael Garcia-Suarez holds the pumpking crown for the 5.10 release. | |
43 | ||
44 | In addition, various people are pumpkings for different things. For | |
45 | instance, Andy Dougherty and Jarkko Hietaniemi did a grand job as the | |
46 | I<Configure> pumpkin up till the 5.8 release. For the 5.10 release | |
47 | H.Merijn Brand took over. | |
48 | ||
49 | Larry sees Perl development along the lines of the US government: | |
50 | there's the Legislature (the porters), the Executive branch (the | |
51 | pumpkings), and the Supreme Court (Larry). The legislature can | |
52 | discuss and submit patches to the executive branch all they like, but | |
53 | the executive branch is free to veto them. Rarely, the Supreme Court | |
54 | will side with the executive branch over the legislature, or the | |
55 | legislature over the executive branch. Mostly, however, the | |
56 | legislature and the executive branch are supposed to get along and | |
57 | work out their differences without impeachment or court cases. | |
58 | ||
59 | You might sometimes see reference to Rule 1 and Rule 2. Larry's power | |
60 | as Supreme Court is expressed in The Rules: | |
61 | ||
62 | =over 4 | |
63 | ||
64 | =item 1 | |
65 | ||
66 | Larry is always by definition right about how Perl should behave. | |
67 | This means he has final veto power on the core functionality. | |
68 | ||
69 | =item 2 | |
70 | ||
71 | Larry is allowed to change his mind about any matter at a later date, | |
72 | regardless of whether he previously invoked Rule 1. | |
73 | ||
74 | =back | |
75 | ||
76 | Got that? Larry is always right, even when he was wrong. It's rare | |
77 | to see either Rule exercised, but they are often alluded to. | |
78 | ||
79 | New features and extensions to the language are contentious, because | |
80 | the criteria used by the pumpkings, Larry, and other porters to decide | |
81 | which features should be implemented and incorporated are not codified | |
82 | in a few small design goals as with some other languages. Instead, | |
83 | the heuristics are flexible and often difficult to fathom. Here is | |
84 | one person's list, roughly in decreasing order of importance, of | |
85 | heuristics that new features have to be weighed against: | |
86 | ||
87 | =over 4 | |
88 | ||
89 | =item Does the concept match the general goals of Perl? | |
90 | ||
91 | These haven't been written anywhere in stone, but one approximation | |
92 | is: | |
93 | ||
94 | 1. Keep it fast, simple, and useful. | |
95 | 2. Keep features/concepts as orthogonal as possible. | |
96 | 3. No arbitrary limits (platforms, data sizes, cultures). | |
97 | 4. Keep it open and exciting to use/patch/advocate Perl everywhere. | |
98 | 5. Either assimilate new technologies, or build bridges to them. | |
99 | ||
100 | =item Where is the implementation? | |
101 | ||
102 | All the talk in the world is useless without an implementation. In | |
103 | almost every case, the person or people who argue for a new feature | |
104 | will be expected to be the ones who implement it. Porters capable | |
105 | of coding new features have their own agendas, and are not available | |
106 | to implement your (possibly good) idea. | |
107 | ||
108 | =item Backwards compatibility | |
109 | ||
110 | It's a cardinal sin to break existing Perl programs. New warnings are | |
111 | contentious--some say that a program that emits warnings is not | |
112 | broken, while others say it is. Adding keywords has the potential to | |
113 | break programs; changing the meaning of existing token sequences or | |
114 | functions might break programs too. | |
115 | ||
116 | =item Could it be a module instead? | |
117 | ||
118 | Perl 5 has extension mechanisms, modules and XS, specifically to avoid | |
119 | the need to keep changing the Perl interpreter. You can write modules | |
120 | that export functions, you can give those functions prototypes so they | |
121 | can be called like built-in functions, you can even write XS code to | |
122 | mess with the runtime data structures of the Perl interpreter if you | |
123 | want to implement really complicated things. If it can be done in a | |
124 | module instead of in the core, it's highly unlikely to be added. | |
125 | ||
126 | =item Is the feature generic enough? | |
127 | ||
128 | Is this something that only the submitter wants added to the language, | |
129 | or would it be broadly useful? Sometimes, instead of adding a feature | |
130 | with a tight focus, the porters might decide to wait until someone | |
131 | implements the more generalized feature. For instance, instead of | |
132 | implementing a "delayed evaluation" feature, the porters are waiting | |
133 | for a macro system that would permit delayed evaluation and much more. | |
134 | ||
135 | =item Does it potentially introduce new bugs? | |
136 | ||
137 | Radical rewrites of large chunks of the Perl interpreter have the | |
138 | potential to introduce new bugs. The smaller and more localized the | |
139 | change, the better. | |
140 | ||
141 | =item Does it preclude other desirable features? | |
142 | ||
143 | A patch is likely to be rejected if it closes off future avenues of | |
144 | development. For instance, a patch that placed a true and final | |
145 | interpretation on prototypes is likely to be rejected because there | |
146 | are still options for the future of prototypes that haven't been | |
147 | addressed. | |
148 | ||
149 | =item Is the implementation robust? | |
150 | ||
151 | Good patches (tight code, complete, correct) stand more chance of | |
152 | going in. Sloppy or incorrect patches might be placed on the back | |
153 | burner until the pumpking has time to fix them, or might be discarded | |
154 | altogether without further notice. | |
155 | ||
156 | =item Is the implementation generic enough to be portable? | |
157 | ||
158 | The worst patches make use of system-specific features. It's highly | |
159 | unlikely that nonportable additions to the Perl language will be | |
160 | accepted. | |
161 | ||
162 | =item Is the implementation tested? | |
163 | ||
164 | Patches which change behaviour (fixing bugs or introducing new features) | |
165 | must include regression tests to verify that everything works as expected. | |
166 | Without tests provided by the original author, how can anyone else changing | |
167 | perl in the future be sure that they haven't unwittingly broken the behaviour | |
168 | the patch implements? And without tests, how can the patch's author be | |
169 | confident that his/her hard work put into the patch won't be accidentally | |
170 | thrown away by someone in the future? | |
171 | ||
172 | =item Is there enough documentation? | |
173 | ||
174 | Patches without documentation are probably ill thought out or | |
175 | incomplete. Nothing can be added without documentation, so submitting | |
176 | a patch for the appropriate manpages as well as the source code is | |
177 | always a good idea. | |
178 | ||
179 | =item Is there another way to do it? | |
180 | ||
181 | Larry said "Although the Perl Slogan is I<There's More Than One Way | |
182 | to Do It>, I hesitate to make 10 ways to do something". This is a | |
183 | tricky heuristic to navigate, though--one man's essential addition is | |
184 | another man's pointless cruft. | |
185 | ||
186 | =item Does it create too much work? | |
187 | ||
188 | Work for the pumpking, work for Perl programmers, work for module | |
189 | authors, ... Perl is supposed to be easy. | |
190 | ||
191 | =item Patches speak louder than words | |
192 | ||
193 | Working code is always preferred to pie-in-the-sky ideas. A patch to | |
194 | add a feature stands a much higher chance of making it to the language | |
195 | than does a random feature request, no matter how fervently argued the | |
196 | request might be. This ties into "Will it be useful?", as the fact | |
197 | that someone took the time to make the patch demonstrates a strong | |
198 | desire for the feature. | |
199 | ||
200 | =back | |
201 | ||
202 | If you're on the list, you might hear the word "core" bandied | |
203 | around. It refers to the standard distribution. "Hacking on the | |
204 | core" means you're changing the C source code to the Perl | |
205 | interpreter. "A core module" is one that ships with Perl. | |
206 | ||
207 | =head2 Keeping in sync | |
208 | ||
209 | The source code to the Perl interpreter, in its different versions, is | |
210 | kept in a repository managed by a revision control system (currently | |
211 | the Perforce program; see http://perforce.com/). The | |
212 | pumpkings and a few others have access to the repository to check in | |
213 | changes. Periodically the pumpking for the development version of Perl | |
214 | will release a new version, so the rest of the porters can see what's | |
215 | changed. The current state of the main trunk of the repository, and patches | |
216 | that describe the individual changes that have happened since the last | |
217 | public release, are available at this location: | |
218 | ||
219 | http://public.activestate.com/pub/apc/ | |
220 | ftp://public.activestate.com/pub/apc/ | |
221 | ||
222 | If you're looking for a particular change, or a change that affected | |
223 | a particular set of files, you may find the B<Perl Repository Browser> | |
224 | useful: | |
225 | ||
226 | http://public.activestate.com/cgi-bin/perlbrowse | |
227 | ||
228 | You may also want to subscribe to the perl5-changes mailing list to | |
229 | receive a copy of each patch that gets submitted to the maintenance | |
230 | and development "branches" of the perl repository. See | |
231 | http://lists.perl.org/ for subscription information. | |
232 | ||
233 | If you are a member of the perl5-porters mailing list, it is a good | |
234 | thing to keep in touch with the most recent changes, if only to | |
235 | verify that what you would have posted as a bug report isn't already | |
236 | solved in the most recent available perl development branch, also | |
237 | known as perl-current, bleading edge perl, bleedperl or bleadperl. | |
238 | ||
239 | Needless to say, the source code in perl-current is in a perpetual | |
240 | state of evolution. You should expect it to be very buggy. Do B<not> use | |
241 | it for any purpose other than testing and development. | |
242 | ||
243 | Keeping in sync with the most recent branch can be done in several ways, | |
244 | but the most convenient and reliable way is using B<rsync>, available at | |
245 | ftp://rsync.samba.org/pub/rsync/ . (You can also get the most recent | |
246 | branch by FTP.) | |
247 | ||
248 | If you choose to keep in sync using rsync, there are two approaches | |
249 | to doing so: | |
250 | ||
251 | =over 4 | |
252 | ||
253 | =item rsync'ing the source tree | |
254 | ||
255 | Presuming you are in the directory where your perl source resides | |
256 | and you have rsync installed and available, you can "upgrade" to | |
257 | the bleadperl using: | |
258 | ||
259 | # rsync -avz rsync://public.activestate.com/perl-current/ . | |
260 | ||
261 | This takes care of updating every single item in the source tree to | |
262 | the latest applied patch level, creating files that are new (to your | |
263 | distribution) and setting date/time stamps of existing files to | |
264 | reflect the bleadperl status. | |
265 | ||
266 | Note that this will not delete any files that were in '.' before | |
267 | the rsync. Once you are sure that the rsync is running correctly, | |
268 | run it with the --delete and the --dry-run options like this: | |
269 | ||
270 | # rsync -avz --delete --dry-run rsync://public.activestate.com/perl-current/ . | |
271 | ||
272 | This will I<simulate> an rsync run that also deletes files not | |
273 | present in the bleadperl master copy. Observe the results from | |
274 | this run closely. If you are sure that the actual run would delete | |
275 | no files precious to you, you could remove the '--dry-run' option. | |
276 | ||
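As a rough illustration of how the dry-run output might be reviewed, the sketch below filters it down to the files that would be removed. It assumes that rsync's verbose output marks each removal with a line beginning with the word "deleting"; the helper name C<list_deletions> is hypothetical and is not part of rsync or of the Perl distribution.

```shell
# list_deletions: hypothetical helper, not part of rsync.
# Reads rsync --dry-run output on stdin and prints only the paths
# that rsync reports it would delete.
list_deletions() {
    sed -n 's/^deleting //p'
}

# Example use (needs network access, so it is left commented out):
# rsync -avz --delete --dry-run rsync://public.activestate.com/perl-current/ . | list_deletions
```

If the list contains nothing you care about, the real run without '--dry-run' should be safe with respect to deletions.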
277 | You can then check what patch was the latest that was applied by | |
278 | looking in the file B<.patch>, which will show the number of the | |
279 | latest patch. | |
280 | ||
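A small sketch of that check follows, assuming only what is stated above: that B<.patch> sits at the top of the synced tree and holds the patch number. The function name C<patch_level> is made up for illustration.

```shell
# patch_level: hypothetical helper for illustration.
# Prints the contents of the .patch file in the given source tree
# (default: current directory), or fails if the file is missing.
patch_level() {
    tree=${1:-.}
    if [ ! -r "$tree/.patch" ]; then
        echo "no readable .patch file in $tree" >&2
        return 1
    fi
    cat "$tree/.patch"
}
```

Running C<patch_level> before and after an rsync shows whether the sync actually picked up newer patches.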
281 | If you have more than one machine to keep in sync, and not all of | |
282 | them have access to the WAN (so you are not able to rsync all the | |
283 | source trees to the real source), there are some ways to get around | |
284 | this problem. | |
285 | ||
286 | =over 4 | |
287 | ||
288 | =item Using rsync over the LAN | |
289 | ||
290 | Set up a local rsync server which makes the rsynced source tree | |
291 | available to the LAN and sync the other machines against this | |
292 | directory. | |
293 | ||
294 | From http://rsync.samba.org/README.html : | |
295 | ||
296 | "Rsync uses rsh or ssh for communication. It does not need to be | |
297 | setuid and requires no special privileges for installation. It | |
298 | does not require an inetd entry or a daemon. You must, however, | |
299 | have a working rsh or ssh system. Using ssh is recommended for | |
300 | its security features." | |
301 | ||
302 | =item Pushing over NFS | |
303 | ||
304 | Having the other systems mounted over NFS, you can take an | |
305 | active pushing approach by checking the just updated tree against | |
306 | the other not-yet synced trees. An example would be | |
307 | ||
308 | #!/usr/bin/perl -w | |
309 | ||
310 | use strict; | |
311 | use File::Copy; | |
312 | ||
313 | my %MF = map { | |
314 |     m/(\S+)/; | |
315 |     $1 => [ (stat $1)[2, 7, 9] ]; # mode, size, mtime | |
316 | } `cat MANIFEST`; | |
317 | ||
318 | my %remote = map { $_ => "/$_/pro/3gl/CPAN/perl-5.7.1" } qw(host1 host2); | |
319 | ||
320 | foreach my $host (keys %remote) { | |
321 |     unless (-d $remote{$host}) { | |
322 |         print STDERR "Cannot Xsync for host $host\n"; | |
323 |         next; | |
324 |     } | |
325 |     foreach my $file (keys %MF) { | |
326 |         my $rfile = "$remote{$host}/$file"; | |
327 |         my ($mode, $size, $mtime) = (stat $rfile)[2, 7, 9]; | |
328 |         defined $size or ($mode, $size, $mtime) = (0, 0, 0); | |
329 |         $size == $MF{$file}[1] && $mtime == $MF{$file}[2] and next; # unchanged | |
330 |         printf "%4s %-34s %8d %9d %8d %9d\n", | |
331 |             $host, $file, $MF{$file}[1], $MF{$file}[2], $size, $mtime; | |
332 |         unlink $rfile; | |
333 |         copy($file, $rfile) or do { warn "Cannot copy $file to $rfile: $!\n"; next }; | |
334 |         utime time, $MF{$file}[2], $rfile; # keep the local mtime | |
335 |         chmod $MF{$file}[0], $rfile; | |
336 |     } | |
337 | } | |
338 | ||
339 | though this is not perfect. It could be improved by checking | |
340 | file checksums before updating. Not all NFS systems provide | |
341 | reliable utime support (when used over NFS). | |
342 | ||
343 | =back | |
344 | ||
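The checksum comparison suggested for the NFS push script above could be sketched in shell as follows. This is only an illustration: the helper name C<needs_update> is hypothetical, and it assumes the POSIX C<cksum> utility is available on the machine doing the push.

```shell
# needs_update: hypothetical helper for illustration.
# Succeeds (exit 0) when DST is missing or its checksum differs from SRC,
# i.e. when the file should be copied; fails (exit 1) when they match.
needs_update() {
    src=$1
    dst=$2
    [ -f "$dst" ] || return 0
    # Reading from stdin keeps the file name out of cksum's output,
    # so only the checksum and byte count are compared.
    [ "$(cksum < "$src")" != "$(cksum < "$dst")" ]
}
```

A push loop could then copy a file only when C<needs_update> succeeds for it, which avoids relying on mtimes that NFS may not preserve.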
345 | =item rsync'ing the patches | |
346 | ||
347 | The source tree is maintained by the pumpking who applies patches to | |
348 | the files in the tree. These patches are either created by the | |
349 | pumpking himself using C<diff -c> after updating the file manually or | |
350 | sent in by posters on the perl5-porters list. | |
351 | These patches are also saved and rsync'able, so you can apply them | |
352 | yourself to the source files. | |
353 | ||
354 | Presuming you are in a directory where your patches reside, you can | |
355 | get them in sync with | |
356 | ||
357 | # rsync -avz rsync://public.activestate.com/perl-current-diffs/ . | |
358 | ||
359 | This makes sure the latest available patch is downloaded to your | |
360 | patch directory. | |
361 | ||
362 | It's then up to you to apply these patches, using something like | |
363 | ||
364 | # last=`ls -t *.gz | sed q` | |
365 | # rsync -avz rsync://public.activestate.com/perl-current-diffs/ . | |
366 | # find . -name '*.gz' -newer $last -exec gzcat {} \; >blead.patch | |
367 | # cd ../perl-current | |
368 | # patch -p1 -N <../perl-current-diffs/blead.patch | |
369 | ||
370 | or, since this is only a hint towards how it works, use CPAN-patchaperl | |
371 |