Commit | Line | Data |
---|---|---|
74966f01 C |
1 | .\ Format this file with: |
2 | .\ pic file | tbl | troff -ms | |
3 | ||
4 | .\ PS and PE center pic diagrams. (The corresponding ms-macros may not.) | |
5 | .de PS | |
6 | .nr pE (\\n(.lu-\\$2u)/2u | |
7 | .in +\\n(pEu | |
8 | .ne \\$1u | |
9 | .. | |
10 | .de PE | |
11 | .in -\\n(pEu | |
12 | .. | |
13 | .de D( | |
14 | .DS | |
15 | .nr VS 12pts | |
16 | .vs 12pts | |
17 | .I | |
18 | .. | |
19 | .de D) | |
20 | .DE | |
21 | .nr VS 18pts | |
22 | .vs 18pts | |
23 | .R | |
24 | .. | |
25 | ||
26 | .ND July 1985 | |
27 | .RP | |
28 | .TL | |
29 | RCS \- A System for Version Control | |
30 | ||
31 | .AU | |
32 | Walter F. Tichy | |
33 | .AI | |
34 | Department of Computer Sciences | |
35 | Purdue University | |
36 | West Lafayette, Indiana 47907 | |
37 | ||
38 | .AB | |
39 | An important problem in program development and maintenance is version control, | |
40 | i.e., the task of keeping a software system consisting of many versions and | |
41 | configurations well organized. | |
42 | The Revision Control System (RCS) | |
43 | is a software tool that assists with that task. | |
44 | RCS manages revisions of text documents, in particular source programs, | |
45 | documentation, and test data. | |
46 | It automates the storing, retrieval, logging and identification of revisions, | |
47 | and it provides selection mechanisms for composing configurations. | |
48 | This paper introduces basic version control concepts and | |
49 | discusses the practice of version control | |
50 | using RCS. | |
51 | For conserving space, RCS stores deltas, i.e., differences between | |
52 | successive revisions. Several delta storage methods are discusses. | |
53 | Usage statistics show that RCS's delta storage method is | |
54 | space and time efficient. | |
55 | The paper concludes with a detailed survey of version control tools. | |
56 | ||
57 | \fBKeywords\fR: configuration management, history management, | |
58 | version control, revisions, deltas. | |
59 | .AE | |
60 | .FS | |
61 | A version of this paper was published in Software--Practice and Experience, | |
62 | Vol. 15(7), 637-654 (July 1985). | |
63 | .FE | |
64 | .nr VS 18pts | |
65 | .EQ | |
66 | delim $$ | |
67 | .EN | |
68 | .LP | |
69 | .NH | |
70 | Introduction | |
71 | .PP | |
72 | Version control is the task of keeping software | |
73 | systems consisting of many versions and configurations well organized. | |
74 | The Revision Control System (RCS) is a set of UNIX | |
75 | commands that assist with that task. | |
76 | .PP | |
77 | RCS' primary function is to manage \fIrevision groups\fR. | |
78 | A revision group is a set of text documents, called \fIrevisions\fR, | |
79 | that evolved from each other. A new revision is | |
80 | created by manually editing an existing one. | |
81 | RCS organizes the revisions into an ancestral tree. The initial revision | |
82 | is the root of the tree, and the tree edges indicate | |
83 | from which revision a given one evolved. | |
84 | Besides managing individual revision groups, RCS provides | |
85 | flexible selection functions for composing configurations. | |
86 | RCS may be combined with MAKE\u1\d, | |
87 | resulting in a powerful package for version control. | |
88 | .PP | |
89 | RCS also offers facilities for | |
90 | merging updates with customer modifications, | |
91 | for distributed software development, and | |
92 | for automatic identification. | |
93 | Identification is the `stamping' | |
94 | of revisions and configurations with unique markers. | |
95 | These markers are akin to serial numbers, | |
96 | telling software maintainers unambiguously which configuration | |
97 | is before them. | |
98 | .PP | |
99 | RCS is designed for both production and experimental | |
100 | environments. | |
101 | In production environments, | |
102 | access controls detect update conflicts and prevent overlapping changes. | |
103 | In experimental environments, where strong controls are | |
104 | counterproductive, it is possible to loosen the controls. | |
105 | .PP | |
106 | Although RCS was originally intended for programs, it is useful for any | |
107 | text that is revised frequently and whose previous revisions must be | |
108 | preserved. RCS has been applied successfully to store the source | |
109 | text for drawings, VLSI layouts, documentation, specifications, | |
110 | test data, form letters and articles. | |
111 | .PP | |
112 | This paper discusses the practice of | |
113 | version control using RCS. | |
114 | It also introduces basic version control concepts, | |
115 | useful for clarifying current practice and designing similar systems. | |
116 | Revision groups of individual components are treated in the next three sections, | |
117 | and the extensions to configurations follow. | |
118 | Because of its size, a survey of version control tools | |
119 | appears at the end of the paper. | |
120 | .NH | |
121 | Getting started with RCS | |
122 | .PP | |
123 | Suppose a text file \fIf.c\fR is to be placed under control of RCS. | |
124 | Invoking the check-in command | |
125 | .D( | |
126 | ci f.c | |
127 | .D) | |
128 | creates a new revision group with the contents of | |
129 | \fIf.c\fR as the initial | |
130 | revision (numbered 1.1) | |
131 | and stores the group into the file \fIf.c,v\fR. | |
132 | Unless told otherwise, the command deletes \fIf.c\fR. | |
133 | It also asks for a description of the group. | |
134 | The description should state the common purpose of all revisions in the group, | |
135 | and becomes part of the group's documentation. | |
136 | All later check-in commands will ask for a log entry, | |
137 | which should summarize the changes made. | |
138 | (The first revision is assigned a default log message, | |
139 | which just records the fact that it is the initial revision.) | |
140 | .PP | |
141 | Files ending in \fI,v\fR | |
142 | are called \fIRCS files\fR (\fIv\fR stands for \fIv\fRersions); | |
143 | the others are called working files. | |
144 | To get back the working file \fIf.c\fR in the previous example, | |
145 | execute the check-out command: | |
146 | .D( | |
147 | co f.c | |
148 | .D) | |
149 | .R | |
150 | This command extracts the latest revision from | |
151 | the revision group \fIf.c,v\fR and writes | |
152 | it into \fIf.c\fR. | |
153 | The file \fIf.c\fR can now be edited and, when finished, | |
154 | checked back in with \fIci\fR: | |
155 | .D( | |
156 | ci f.c | |
157 | .D) | |
158 | \fICi\fR assigns number 1.2 to | |
159 | the new revision. | |
160 | If \fIci\fR complains with the message | |
161 | .D( | |
162 | ci error: no lock set by <login> | |
163 | .D) | |
164 | then the system administrator has decided to configure RCS for a | |
165 | production environment by enabling the `strict locking feature'. | |
166 | If this feature is enabled, all RCS files are initialized | |
167 | such that check-in operations require a lock on the previous revision | |
168 | (the one from which the current one evolved). | |
169 | Locking prevents overlapping modifications if several people work on the same file. | |
170 | If locking is required, the revision should | |
171 | have been locked during the check-out by using | |
172 | the option \fI-l\fR: | |
173 | .D( | |
174 | co -l f.c | |
175 | .D) | |
176 | Of course it is too late now for the check-out with locking, because | |
177 | \fIf.c\fR has already been changed; checking out the file again | |
178 | would overwrite the modifications. | |
179 | (To prevent accidental overwrites, \fIco\fR senses the presence | |
180 | of a working file and asks whether the user really intended to overwrite it. | |
181 | The overwriting check-out is sometimes useful for | |
182 | backing up to the previous revision.) | |
183 | To be able to proceed with the check-in in the present case, first execute | |
184 | .D( | |
185 | rcs -l f.c | |
186 | .D) | |
187 | This command retroactively locks the latest revision, unless someone | |
188 | else locked it in the meantime. In this case, the two programmers | |
189 | involved have to negotiate whose | |
190 | modifications should take precedence. | |
191 | .PP | |
192 | If an RCS file is private, i.e., if only the owner of the file is expected | |
193 | to deposit revisions into it, the strict locking feature is unnecessary and | |
194 | may be disabled. | |
195 | If strict locking is disabled, | |
196 | the owner of the RCS file need not have a lock for check-in. | |
197 | For safety reasons, all others | |
198 | still do. Turning strict locking off and on is done with the commands: | |
199 | .D( | |
200 | rcs -U f.c \fRand\fI rcs -L f.c | |
201 | .D) | |
202 | These commands enable or disable the strict locking feature for each RCS file | |
203 | individually. | |
204 | The system administrator only decides whether strict locking is | |
205 | enabled initially. | |
206 | .PP | |
207 | To reduce the clutter in a working directory, all RCS files can be moved | |
208 | to a subdirectory with the name \fIRCS\fR. | |
209 | RCS commands look first into that directory for RCS files. | |
210 | All the commands presented above work | |
211 | with the \fIRCS\fR subdirectory without change.\(dg | |
212 | .FS | |
213 | \(dg Pairs of RCS and working files can actually be specified in 3 ways: | |
214 | a) both are given, b) only the working file is given, c) only the | |
215 | RCS file is given. | |
216 | If a pair is given, both files may have arbitrary path prefixes; | |
217 | RCS commands pair them up intelligently. | |
218 | .FE | |
219 | .PP | |
220 | It may be undesirable that \fIci\fR deletes the working file. | |
221 | For instance, sometimes one would like to save the current revision, | |
222 | but continue editing. | |
223 | Invoking | |
224 | .D( | |
225 | ci -l f.c | |
226 | .D) | |
227 | checks in \fIf.c\fR as usual, but performs an additional | |
228 | check-out with locking afterwards. Thus, the working file does | |
229 | not disappear after the check-in. | |
230 | Similarly, the option | |
231 | \fI-u\fR does a check-in followed by a check-out without | |
232 | locking. This option is useful if the file is needed for compilation after the check-in. | |
233 | Both options update the identification markers in the working file | |
234 | (see below). | |
235 | .PP | |
236 | Besides the operations \fIci\fR and \fIco\fR, RCS provides the following | |
237 | commands: | |
238 | \fIident\fR (extract identification markers), | |
239 | \fIrcs\fR (change RCS file attributes), | |
240 | \fIrcsclean\fR (remove unchanged working files), | |
241 | \fIrcsdiff\fR (compare revisions), | |
242 | \fIrcsfreeze\fR (record a configuration), | |
243 | \fIrcsmerge\fR (merge revisions), | |
244 | and \fIrlog\fR (read log messages and other information in RCS files). | |
245 | A synopsis of these commands appears in the Appendix. | |
246 | .NH 2 | |
247 | Automatic Identification | |
248 | .PP | |
249 | RCS can stamp source and object code with special identification strings, | |
250 | similar to product and serial numbers. | |
251 | To obtain such identification, place the marker | |
252 | .D( | |
253 | $Header$ | |
254 | .D) | |
255 | into the text of a revision, for instance inside a comment. | |
256 | The check-out operation will replace this marker with a string of the form | |
257 | .D( | |
258 | $Header: filename revisionnumber date time author state locker $ | |
259 | .D) | |
260 | This string need never be touched, because \fIco\fR keeps it | |
261 | up to date automatically. | |
262 | To propagate the marker into object code, simply put | |
263 | it into a literal character string. In C, this is done as follows: | |
264 | .D( | |
265 | static char rcsid[] = "$Header$"; | |
266 | .D) | |
267 | The command \fIident\fR extracts such markers from any file, in particular from | |
268 | object code. | |
269 | \fIIdent\fR helps to find out | |
270 | which revisions of which modules were used in a given program. | |
271 | It returns a complete and unambiguous component list, | |
272 | from which a copy of the program can be reconstructed. | |
273 | This facility is invaluable for program maintenance. | |
274 | .PP | |
275 | There are several additional identification markers, one for each component | |
276 | of $Header$. | |
277 | The marker | |
278 | .D( | |
279 | $Log$ | |
280 | .D) | |
281 | has a similar function. It accumulates | |
282 | the log messages that are requested during check-in. | |
283 | Thus, one can maintain the complete history of a revision directly inside it, | |
284 | by enclosing it in a comment. | |
285 | Figure 1 is a partial reproduction of a log contained in revision 4.1 of | |
286 | the file \fIci.c\fR. The log appears at the beginning of the file, | |
287 | and makes it easy to determine what the recent modifications were. | |
288 | .sp | |
289 | .nr VS 12pts | |
290 | .vs 12pts | |
291 | .ne 18 | |
292 | .nf | |
293 | .in +0.5i | |
294 | /* $Log: ci.c,v $ | |
295 | * Revision 4.1 83/05/10 17:03:06 wft | |
296 | * Added option -d and -w, and updated assignment of date, etc. to new delta. | |
297 | * Added handling of default branches. | |
298 | * | |
299 | * Revision 3.9 83/02/15 15:25:44 wft | |
300 | * Added call to fastcopy() to copy remainder of RCS file. | |
301 | * | |
302 | * Revision 3.8 83/01/14 15:34:05 wft | |
303 | * Added ignoring of interrupts while new RCS file is renamed; | |
304 | * avoids deletion of RCS files by interrupts. | |
305 | * | |
306 | * Revision 3.7 82/12/10 16:09:20 wft | |
307 | * Corrected checking of return code from diff. | |
308 | * An RCS file now inherits its mode during the first ci from the working file, | |
309 | * except that write permission is removed. | |
310 | */ | |
311 | .in 0 | |
312 | .ce 1 | |
313 | Figure 1. Log entries produced by the marker $Log$. | |
314 | .fi | |
315 | .nr VS 18pts | |
316 | .vs 18pts | |
317 | .sp 0 | |
318 | .LP | |
319 | Since revisions are stored in the form of differences, | |
320 | each log message is | |
321 | physically stored once, | |
322 | independent of the number of revisions present. | |
323 | Thus, the $Log$ marker incurs negligible space overhead. | |
324 | .NH | |
325 | The RCS Revision Tree | |
326 | .PP | |
327 | RCS arranges revisions in an ancestral tree. | |
328 | The \fIci\fR command builds this tree; the auxiliary command \fIrcs\fR | |
329 | prunes it. | |
330 | The tree has a root revision, normally numbered 1.1, and successive revisions | |
331 | are numbered 1.2, 1.3, etc. The first field of a revision number | |
332 | is called the \fIrelease number\fR and the second one | |
333 | the \fIlevel number\fR. Unless given explicitly, | |
334 | the \fIci\fR command assigns a new revision number | |
335 | by incrementing the level number of the previous revision. | |
336 | The release number must be incremented explicitly, using the | |
337 | \fI-r\fR option of \fIci\fR. | |
338 | Assuming there are revisions 1.1, 1.2, and 1.3 in the RCS file f.c,v, the command | |
339 | .D( | |
340 | ci -r2.1 f.c \fRor\fI ci -r2 f.c | |
341 | .D) | |
342 | assigns the number 2.1 to the new revision. | |
343 | Later check-ins without the \fI-r\fR option will assign the numbers 2.2, 2.3, | |
344 | and so on. | |
345 | The release number should be incremented only at major transition points | |
346 | in the development, for instance when a new release of a software product has | |
347 | been completed. | |
348 | .NH 2 | |
349 | When are branches needed? | |
350 | .PP | |
351 | A young revision tree is slender: | |
352 | It consists of only one branch, called the trunk. | |
353 | As the tree ages, side branches may form. | |
354 | Branches are needed in the following 4 situations. | |
355 | .IP "\fITemporary fixes\fR" | |
356 | .sp 0 | |
357 | Suppose a tree has 5 revisions grouped in 2 releases, | |
358 | as illustrated in Figure 2. | |
359 | Revision 1.3, the last one of release 1, is in operation at customer sites, | |
360 | while release 2 is in active development. | |
361 | .ne 4 | |
362 | .PS 4i | |
363 | .ps -2 | |
364 | box "1.1" | |
365 | arrow | |
366 | box "1.2" | |
367 | arrow | |
368 | box "1.3" | |
369 | arrow | |
370 | box "2.1" | |
371 | arrow | |
372 | box "2.2" | |
373 | arrow dashed | |
374 | .ps +2 | |
375 | .PE | |
376 | .ce 1 | |
377 | Figure 2. A slender revision tree. | |
378 | .sp 0 | |
379 | Now imagine a customer requesting a fix of | |
380 | a problem in revision 1.3, although actual development has moved on | |
381 | to release 2. RCS does not permit an extra | |
382 | revision to be spliced in between 1.3 and 2.1, since that would not reflect | |
383 | the actual development history. Instead, create a branch | |
384 | at revision 1.3, and check in the fix on that branch. | |
385 | The first branch starting at 1.3 has number 1.3.1, and | |
386 | the revisions on that branch are numbered 1.3.1.1, 1.3.1.2, etc. | |
387 | The double numbering is needed to allow for another | |
388 | branch at 1.3, say 1.3.2. | |
389 | Revisions on the second branch would be numbered | |
390 | 1.3.2.1, 1.3.2.2, and so on. | |
391 | The following steps create | |
392 | branch 1.3.1 and add revision 1.3.1.1: | |
393 | .sp 0 | |
394 | .I | |
395 | .nr VS 12pts | |
396 | .vs 12pts | |
397 | .TS | |
398 | tab(%); | |
399 | l l l. | |
400 | %co -r1.3 f.c % -- check out revision 1.3 | |
401 | %edit f.c % -- change it | |
402 | %ci -r1.3.1 f.c % -- check it in on branch 1.3.1 | |
403 | .TE | |
404 | .nr VS 18pts | |
405 | .vs 18pts | |
406 | .R | |
407 | This sequence of commands transforms the tree of Figure 2 into | |
408 | the one in Figure 3. | |
409 | Note that it may be necessary to incorporate the differences | |
410 | between 1.3 and 1.3.1.1 | |
411 | into a revision at level 2. The operation \fIrcsmerge\fR automates this | |
412 | process (see the Appendix). | |
413 | .ne 7 | |
414 | .PS 4i | |
415 | .ps -2 | |
416 | box "1.1" | |
417 | arrow | |
418 | box "1.2" | |
419 | arrow | |
420 | R13: box "1.3" | |
421 | arrow | |
422 | R21: box "2.1" | |
423 | arrow | |
424 | R22: box "2.2" | |
425 | arrow dashed | |
426 | line invis down from R21.s | |
427 | RB1: box "1.3.1.1" | |
428 | arrow dashed right from RB1.e | |
429 | arrow from R13.s to RB1.w | |
430 | .ps +2 | |
431 | .PE | |
432 | .ce 1 | |
433 | Figure 3. A revision tree with one side branch | |
434 | ||
435 | .IP "\fIDistributed development and customer modifications\fR" | |
436 | .sp 0 | |
437 | Assume a situation as in Figure 2, where revision 1.3 is in operation | |
438 | at several customer sites, | |
439 | while release 2 is in development. | |
440 | Customer sites should use RCS to store the distributed software. | |
441 | However, customer modifications should not be placed on the same branch | |
442 | as the distributed source; instead, they should be placed on a side branch. | |
443 | When the next software distribution arrives, | |
444 | it should be appended to the trunk of | |
445 | the customer's RCS file, and the customer | |
446 | can then merge the local modifications back into the new release. | |
447 | In the above example, a | |
448 | customer's RCS file would contain the following tree, assuming | |
449 | that the customer has received revision 1.3, added his local modifications | |
450 | as revision 1.3.1.1, then received revision 2.4, and merged | |
451 | 2.4 and 1.3.1.1, resulting in 2.4.1.1. | |
452 | .ne 7 | |
453 | .PS 4i | |
454 | .ps -2 | |
455 | R13: box "1.3" | |
456 | line invis | |
457 | R21: box invis | |
458 | line invis | |
459 | R22: box invis | |
460 | line invis | |
461 | R24: box "2.4" | |
462 | line invis | |
463 | R25: box invis | |
464 | line invis | |
465 | arrow from R13.e to R24.w | |
466 | line invis down from R21.s | |
467 | RB1: box "1.3.1.1" | |
468 | arrow from R13.s to RB1.w | |
469 | right | |
470 | line invis down from R25.s | |
471 | RB2: box "2.4.1.1" | |
472 | arrow from R24.s to RB2.w | |
473 | .ps +2 | |
474 | .PE | |
475 | .ce 1 | |
476 | Figure 4. A customer's revision tree with local modifications. | |
477 | .sp 1 | |
478 | This approach is actually practiced in the CSNET project, | |
479 | where several universities and a company cooperate | |
480 | in developing a national computer network. | |
481 | .IP "\fIParallel development\fR" | |
482 | .sp 0 | |
483 | Sometimes it is desirable to explore an alternate design or | |
484 | a different implementation technique in parallel with the | |
485 | main line development. Such development | |
486 | should be carried out on a side branch. | |
487 | The experimental changes may later be moved into the main line, or abandoned. | |
488 | .IP "\fIConflicting updates\fR" | |
489 | .sp 0 | |
490 | A common occurrence is that one programmer | |
491 | has checked out a revision, but cannot complete the assignment | |
492 | for some reason. In the meantime, another person | |
493 | must perform another modification | |
494 | immediately. In that case, the second person should check-out the same revision, | |
495 | modify it, and check it in on a side branch, for later merging. | |
496 | .PP | |
497 | Every node in a revision tree consists of the following attributes: | |
498 | a revision number, a check-in date and time, the author's identification, | |
499 | a log entry, a state and the actual text. All these attributes | |
500 | are determined at the time the revision is checked in. | |
501 | The state attribute indicates the status of a revision. | |
502 | It is set automatically to `experimental' during check-in. | |
503 | A revision can later be promoted to a higher status, for example | |
504 | `stable' or `released'. The set of states is user-defined. | |
505 | .NH 2 | |
506 | Revisions are represented as deltas | |
507 | .PP | |
508 | For conserving space, RCS stores revisions in the form | |
509 | of deltas, i.e., as differences between revisions. | |
510 | The user interface completely hides this fact. | |
511 | .PP | |
512 | A delta is a sequence of edit commands that transforms one string | |
513 | into another. The deltas employed by RCS are line-based, which means | |
514 | that the only edit commands allowed are insertion and deletion of lines. | |
515 | If a single character in a line is changed, the | |
516 | edit scripts consider the entire line changed. | |
517 | The program \fIdiff\fR\u2\d | |
518 | produces a small, line-based delta between pairs of text files. | |
519 | A character-based edit script would take much longer to compute, | |
520 | and would not be significantly shorter. | |
521 | .PP | |
522 | Using deltas is a classical space-time tradeoff: deltas reduce the | |
523 | space consumed, but increase access time. | |
524 | However, a version control tool should impose as little delay | |
525 | as possible on programmers. | |
526 | Excessive delays discourage the use of version controls, | |
527 | or induce programmers to take shortcuts that compromise system integrity. | |
528 | To gain reasonably fast access time for both editing and compiling, | |
529 | RCS arranges deltas in the following way. | |
530 | The most recent revision on the trunk is stored intact. | |
531 | All other revisions on the trunk are stored as reverse deltas. | |
532 | A reverse delta describes how to go backward in the development history: | |
533 | it produces the desired revision if applied to the successor of that revision. | |
534 | This implementation has the advantage | |
535 | that extraction of the latest revision is a simple and fast copy | |
536 | operation. | |
537 | Adding a new revision to the trunk is also fast: \fIci\fR simply | |
538 | adds the new revision intact, replaces the previous | |
539 | revision with a reverse delta, and keeps the rest of the old deltas. | |
540 | Thus, \fIci\fR requires the computation | |
541 | of only one new delta. | |
542 | .PP | |
543 | Branches need special treatment. The naive solution would be to | |
544 | store complete copies for the tips of all branches. | |
545 | Clearly, this approach would cost too much space. Instead, | |
546 | RCS uses \fIforward\fR deltas for branches. Regenerating a revision | |
547 | on a side branch proceeds as follows. First, extract the latest revision | |
548 | on the trunk; secondly, apply reverse deltas until the fork revision for | |
549 | the branch is obtained; thirdly, apply forward deltas until the desired | |
550 | branch revision is reached. Figure 5 illustrates a tree with | |
551 | one side branch. Triangles pointing to the left and right represent | |
552 | reverse and forward deltas, respectively. | |
553 | .ne 8 | |
554 | .PS 4i | |
555 | .ps -2 | |
556 | define BD X [line invis $1 right .5; | |
557 | line up .3 then left .5 down .3 then right .5 down .3 then up .3] X | |
558 | ||
559 | define FD X [line invis $1 right .5; | |
560 | line left .5 down .3 then up .6 then right .5 down .3;] X | |
561 | ||
562 | right | |
563 | D11: BD(" 1.1") | |
564 | arrow right from D11.e | |
565 | D12: BD(" 1.2") | |
566 | arrow right from D12.e | |
567 | D13: BD(" 1.3") | |
568 | arrow right from D13.e | |
569 | D21: BD(" 2.1") | |
570 | arrow right from D21.e | |
571 | D22: box "2.2" | |
572 | line invis down from D21.s | |
573 | F1: FD("1.3.1.1 ") | |
574 | arrow from D13.se to F1.w | |
575 | arrow from F1.e right | |
576 | right | |
577 | F2: FD("1.3.1.2 ") | |
578 | .ps +2 | |
579 | .PE | |
580 | .ce 1 | |
581 | Figure 5. A revision tree with reverse and forward deltas. | |
582 | .sp 0 | |
583 | .PP | |
584 | Although implementing fast check-out for the latest trunk revision, | |
585 | this arrangement has the disadvantage that generation of other revisions | |
586 | takes time proportional to the number of deltas applied. For example, | |
587 | regenerating the branch tip in Figure 5 requires application of five | |
588 | deltas (including the initial one). Since usage statistics show that | |
589 | the latest trunk revision is the one that is retrieved in 95 per cent | |
590 | of all cases (see the section on usage statistics), biasing check-out time | |
591 | in favor of that revision results in significant savings. | |
592 | However, careful implementation of the delta application process is | |
593 | necessary to provide low retrieval overhead for other revisions, in | |
594 | particular for branch tips. | |
595 | .PP | |
596 | There are several techniques for delta application. | |
597 | The naive one is to pass each delta to a general-purpose text editor. | |
598 | A prototype of RCS invoked the UNIX editor \fIed\fR both | |
599 | for applying deltas and for expanding the identification markers. | |
600 | Although easy to implement, performance was poor, owing to the | |
601 | high start-up costs and excess generality of \fIed\fR. An intermediate | |
602 | version of RCS used a special-purpose, stream-oriented editor. | |
603 | This technique reduced the cost of applying a delta to the cost of | |
604 | checking out the latest trunk revision. The reason for this behavior | |
605 | is that each delta application involves a complete pass over | |
606 | the preceding revision. | |
607 | .PP | |
608 | However, there is a much better algorithm. Note that the deltas are | |
609 | line oriented and that most of the work of a stream editor involves | |
610 | copying unchanged lines from one revision to the next. A faster | |
611 | algorithm avoids unnecessary copying of character strings by using | |
612 | a \fIpiece table\fR. | |
613 | A piece table is a one-dimensional array, specifying how a given | |
614 | revision is `pieced together' from lines in the RCS file. | |
615 | Suppose piece table \fIPT\dr\u\fR represents revision \fIr\fR. | |
616 | Then \fIPT\dr\u[i]\fR contains the starting position of line \fIi\fR | |
617 | of revision \fIr\fR. | |
618 | Application of the next delta transforms piece table \fIPT\dr\u\fR | |
619 | into \fIPT\dr+1\u\fR. For instance, a delete command removes a | |
620 | series of entries from the piece table. An insertion command inserts | |
621 | new entries, moving the entries following the insertion point further down the | |
622 | array. The inserted entries point to the text lines in the delta. | |
623 | Thus, no I/O is involved except for reading the delta itself. When all | |
624 | deltas have been applied to the piece table, a sequential pass | |
625 | through the table looks up each line in the RCS file and copies it to | |
626 | the output file, updating identification markers at the same time. | |
627 | Of course, the RCS file must permit random access, since the copied | |
628 | lines are scattered throughout that file. Figure 6 illustrates an | |
629 | RCS file with two revisions and the corresponding piece tables. | |
630 | .ne 13 | |
631 | .sp 12 | |
632 | .ce 1 | |
633 | Figure 6. An RCS file and its piece tables | |
634 | .sp 0 | |
635 | .PP | |
636 | The piece table approach has the property that the time for applying a single | |
637 | delta is roughly determined by the size of the delta, and not by the | |
638 | size of the revision. For example, if a delta is | |
639 | 10 per cent of the size of a revision, then applying it takes only | |
640 | 10 per cent of the time to generate the latest trunk revision. (The stream | |
641 | editor would take 100 per cent.) | |
642 | .PP | |
643 | There is an important alternative for representing deltas that affects | |
644 | performance. SCCS\u3\d, | |
645 | a precursor of RCS, uses \fIinterleaved\fR deltas. | |
646 | A file containing interleaved deltas is partitioned into blocks of lines. | |
647 | Each block has a header that specifies to which revision(s) the block | |
648 | belongs. The blocks are sorted out in such a way that a single | |
649 | pass over the file can pick up all the lines belonging to a given | |
650 | revision. Thus, the regeneration time for all revisions is the same: | |
651 | all headers must be inspected, and the associated blocks either copied | |
652 | or skipped. As the number of revisions increases, the cost of retrieving | |
653 | any revision is much higher than the cost of checking out the | |
654 | latest trunk revision with reverse deltas. A detailed comparison | |
655 | of SCCS's interleaved deltas and RCS's reverse deltas can be found | |
656 | in Reference 4. | |
657 | This reference considers the version of RCS with the | |
658 | stream editor only. The piece table method improves performance | |
659 | further, so that RCS is always faster than SCCS, except if 10 | |
660 | or more deltas are applied. | |
661 | .PP | |
662 | Additional speed-up for both delta methods can be obtained by caching | |
663 | the most recently generated revision, as has been implemented in DSEE.\u5\d | |
664 | With caching, access time to frequently used revisions can approach normal file | |
665 | access time, at the cost of some additional space. | |
666 | .NH | |
667 | Locking: A Controversial Issue | |
668 | .PP | |
669 | The locking mechanism for RCS was difficult to design. | |
670 | The problem and its solution are first presented in their `pure' form, | |
671 | followed by a discussion of the complications | |
672 | caused by `real-world' considerations. | |
673 | .PP | |
674 | RCS must prevent two or more persons from depositing competing changes of the | |
675 | same revision. | |
676 | Suppose two programmers check out revision 2.4 and | |
677 | modify it. Programmer A checks in a revision before programmer B. | |
678 | Unfortunately, programmer B has not seen A's | |
679 | changes, so the effect is that A's changes are covered up by B's deposit. | |
680 | A's changes are not lost since all revisions | |
681 | are saved, but they are confined to a single revision\(dd. | |
682 | .FS | |
683 | \(dd Note that this problem is entirely different from the atomicity problem. | |
684 | Atomicity means that | |
685 | concurrent update operations on the same RCS file cannot be permitted, | |
686 | because that may result in inconsistent data. | |
687 | Atomic updates are essential (and implemented in RCS), | |
688 | but do not solve the conflict discussed here. | |
689 | .FE | |
690 | .PP | |
691 | This conflict is prevented in RCS by locking. | |
692 | Whenever someone intends to edit a revision (as opposed | |
693 | to reading or compiling it), the revision should be checked out | |
694 | and locked, | |
695 | using the \fI-l\fR option on \fIco\fR. On subsequent check-in, | |
696 | \fIci\fR tests the lock and then removes it. | |
697 | At most one programmer at a time may | |
698 | lock a particular revision, and only this programmer may check in | |
699 | the succeeding revision. | |
700 | Thus, while a revision is locked, it is the exclusive responsibility | |
701 | of the locker. | |
702 | .PP | |
703 | An important maxim for software tools like RCS is that they must | |
704 | not stand in the way of making progress with a project. | |
705 | This consideration leads to several weakenings of the locking mechanism. | |
706 | First of all, even if a revision is locked, it can | |
707 | still be checked out. This is necessary if other people | |
708 | wish to compile or inspect the locked revision | |
709 | while the next one is in preparation. The only operations they | |
710 | cannot do are to lock the revision or to check in the succeeding one. Secondly, | |
711 | check-in operations on other branches in the RCS file are still possible; the | |
712 | locking of one revision does not affect any other revision. | |
713 | Thirdly, revisions are occasionally locked for a long period of time | |
714 | because a programmer is absent or otherwise unable to complete | |
715 | the assignment. If another programmer has to make a pressing change, | |
716 | there are the following three alternatives for making progress: | |
717 | a) find out who is holding the lock and ask that person to release it; | |
718 | b) check out the locked revision, modify it, check it | |
719 | in on a branch, and merge the changes later; | |
720 | c) break the lock. Breaking a lock leaves a highly visible | |
721 | trace, namely an electronic mail message that is sent automatically to the | |
722 | holder of the lock, recording the breaker and a commentary requested from him. | |
723 | Thus, breaking locks is tolerated under certain circumstances, | |
724 | but will not go unnoticed. | |
725 | Experience has shown that the automatic mail message attaches a high enough | |
726 | stigma to lock breaking, | |
727 | such that programmers break locks only in real emergencies, | |
728 | or when a co-worker resigns and leaves locked revisions behind. | |
729 | .PP | |
730 | If an RCS file is private, i.e., when a programmer owns an RCS file | |
731 | and does not expect anyone else to perform check-in operations, | |
732 | locking is an unnecessary nuisance. | |
733 | In this case, | |
734 | the `strict locking feature' discussed earlier may be disabled, | |
735 | provided that file protection | |
736 | is set such that only the owner may write the RCS file. | |
737 | This has the effect that only the owner can check-in revisions, | |
738 | and that no lock is needed for doing so. | |
739 | .PP | |
740 | As added protection, | |
741 | each RCS file contains an access list that specifies the users | |
742 | who may execute update operations. If an access list is empty, | |
743 | only normal UNIX file protection applies. Thus, the access list is | |
744 | useful for restricting the set of people who would otherwise have update | |
745 | permission. Just as with locking, the access list | |
746 | has no effect on read-only operations such as \fIco\fR. This approach | |
747 | is consistent with the UNIX philosophy of openness, which contributes | |
748 | to a productive software development environment. | |
749 | .NH | |
750 | Configuration Management | |
751 | .PP | |
752 | The preceding sections described how RCS deals with revisions of individual | |
753 | components; this section discusses how to handle configurations. | |
754 | A configuration is a set of revisions, where each revision comes | |
755 | from a different revision group, and the revisions are selected | |
756 | according to a certain criterion. | |
757 | For example, | |
758 | in order to build a functioning compiler, the `right' | |
759 | revisions from the scanner, the parser, the optimizer | |
760 | and the code generator must be combined. | |
761 | RCS, in conjunction with MAKE, | |
762 | provides a number of facilities to effect a smooth selection. | |
763 | .NH 2 | |
764 | RCS Selection Functions | |
765 | .PP | |
766 | .IP "\fIDefault selection\fR" | |
767 | .sp 0 | |
768 | During development, the usual selection criterion is to choose | |
769 | the latest revision of all components. The \fIco\fR command | |
770 | makes this selection by default. For example, the command | |
771 | .D( | |
772 | co *,v | |
773 | .D) | |
774 | retrieves the latest revision on the default branch of each RCS file | |
775 | in the current directory. | |
776 | The default branch is usually the trunk, but may be | |
777 | set to be a side branch. | |
778 | Side branches as defaults are needed in distributed software development, | |
779 | as discussed in the section on the RCS revision tree. | |
780 | ||
781 | .IP "\fIRelease based selection\fR" | |
782 | .sp 0 | |
783 | Specifying a release or branch number selects the latest revision in | |
784 | that release or branch. | |
785 | For instance, | |
786 | .D( | |
787 | co -r2 *,v | |
788 | .D) | |
789 | retrieves the latest revision with release number 2 from each RCS file. | |
790 | This selection is convenient if a release has been completed and | |
791 | development has moved on to the next release. | |
792 | ||
793 | .IP "\fIState and author based selection\fR" | |
794 | .sp 0 | |
795 | If the highest level number within a given release number | |
796 | is not the desired one, | |
797 | the state attribute can help. For example, | |
798 | .D( | |
799 | co -r2 -sReleased *,v | |
800 | .D) | |
801 | retrieves the latest revision with release number 2 whose state attribute | |
802 | is `Released'. | |
803 | Of course, the state attribute has to be set appropriately, using the | |
804 | \fIci\fR or \fIrcs\fR commands. | |
805 | Another alternative is to select a revision by its author, | |
806 | using the \fI-w\fR option. | |
807 | ||
808 | .IP "\fIDate based selection\fR" | |
809 | .sp 0 | |
810 | Revisions may also be selected by date. | |
811 | Suppose a release of an entire system was | |
812 | completed and current on March 4, at 1:00 p.m. Then the command | |
813 | .D( | |
814 | co -d"March 4, 1:00 pm" *,v | |
815 | .D) | |
816 | checks out all the components of that release, independent of the numbering. | |
817 | The \fI-d\fR option specifies a `cutoff date', i.e., | |
818 | the revision selected has a check-in date that | |
819 | is closest to, but not after the date given. | |
820 | 208z | |
821 | .IP "\fIName based selection\fR" | |
822 | .sp 0 | |
823 | The most powerful selection function is based on assigning symbolic | |
824 | names to revisions and branches. | |
825 | In large systems, a single release number or date is not sufficient | |
826 | to collect the appropriate revisions from all groups. | |
827 | For example, suppose one wishes to combine release 2 | |
828 | of one subsystem and release 15 of another. | |
829 | Most likely, the creation dates of those releases differ also. | |
830 | Thus, a single revision number or date passed to the \fIco\fR command | |
831 | will not suffice to select the right revisions. | |
832 | Symbolic revision numbers solve this problem. | |
833 | Each RCS file may contain a set of symbolic names that are mapped | |
834 | to numeric revision numbers. For example, assume | |
835 | the symbol \fIV3\fR is bound to release number 2 in file \fIs,v\fR, and to | |
836 | revision number 15.9 in \fIt,v\fR. | |
837 | Then the single command | |
838 | .D( | |
839 | co -rV3 s,v t,v | |
840 | .D) | |
841 | retrieves the latest revision of release 2 from \fIs,v\fR, | |
842 | and revision 15.9 from \fIt,v\fR. | |
843 | In a large system with many modules, checking out all | |
844 | revisions with one command greatly simplifies configuration management. | |
845 | .PP | |
846 | Judicious use of symbolic revision numbers helps with organizing | |
847 | large configurations. | |
848 | A special command, \fIrcsfreeze\fR, | |
849 | assigns a symbolic revision number to a selected revision | |
850 | in every RCS file. | |
851 | \fIRcsfreeze\fR effectively freezes a configuration. | |
852 | The assigned symbolic revision number selects all components | |
853 | of the configuration. | |
854 | If necessary, symbolic numbers | |
855 | may even be intermixed with numeric ones. Thus, \fIV3.5\fR in the | |
856 | above example | |
857 | would select revision 2.5 in \fIs,v\fR and branch 15.9.5 in \fIt,v\fR. | |
858 | .PP | |
859 | The options \fI-r\fR, \fI-s\fR, \fI-w\fR and \fI-d\fR | |
860 | may be combined. If a branch is given, the latest revision | |
861 | on that branch satisfying all conditions is retrieved; | |
862 | otherwise, the default branch is used. | |
863 | .NH 2 | |
864 | Combining MAKE and RCS | |
865 | .PP | |
866 | MAKE\u1\d | |
867 | is a program that processes configurations. | |
868 | It is driven by configuration specifications | |
869 | recorded in a special file, called a `Makefile'. | |
870 | MAKE avoids redundant processing steps | |
871 | by comparing creation dates of source and processed objects. | |
872 | For example, when instructed to compile all | |
873 | modules of a given system, it only recompiles | |
874 | those source modules that were changed | |
875 | since they were processed last. | |
876 | .PP | |
877 | MAKE has been extended with an auto-checkout feature for RCS. | |
878 | When a certain file to be processed is not present, | |
879 | MAKE attempts a check-out operation. | |
880 | If successful, MAKE performs the required processing, and then deletes | |
881 | the checked out file to conserve space. | |
882 | The selection parameters discussed above can be passed to MAKE | |
883 | either as parameters, or directly embedded in the Makefile. | |
884 | MAKE has also been extended to search the subdirectory named \fIRCS\fR | |
885 | for needed files, rather than just the current working directory. | |
886 | However, if a working file is present, MAKE totally ignores the corresponding | |
887 | RCS file and uses the working file. | |
888 | (In newer versions of MAKE distributed by AT&T and others, | |
889 | auto-checkout can be | |
890 | achieved with the rule DEFAULT, instead of a special extension of MAKE. | |
891 | However, a file checked out by the rule DEFAULT | |
892 | will not be deleted after processing. \fIRcsclean\fR can be | |
893 | used for that purpose.) | |
894 | .PP | |
895 | With auto-checkout, RCS/MAKE can effect a selection rule | |
896 | especially tuned for multi-person software development and maintenance. | |
897 | In these situations, | |
898 | programmers should obtain configurations that consist of | |
899 | the revisions they have personally checked out plus the latest | |
900 | checked in revision of all other revision groups. | |
901 | This schema can be set up as follows. | |
902 | .PP | |
903 | Each programmer chooses a working directory | |
904 | and places into it a symbolic link, named \fIRCS\fR, | |
905 | to the directory containing the relevant RCS files. | |
906 | The symbolic link makes sure that \fIco\fR and \fIci\fR | |
907 | operations need only specify the working files, and that | |
908 | the Makefile need not be changed. | |
909 | The programmer then checks out the needed files and modifies them. | |
910 | If MAKE is invoked, | |
911 | it composes configurations by selecting those | |
912 | revisions that are checked out, and the rest from the | |
913 | subdirectory \fIRCS\fR. | |
914 | The latter selection may be controlled by a symbolic | |
915 | revision number or any of the other selection criteria. | |
916 | If there are several programmers editing in separate working directories, | |
917 | they are insulated from each other's changes until checking in their | |
918 | modifications. | |
919 | .PP | |
920 | Similarly, a maintainer can recreate an older configuration | |
921 | by starting to work in an empty working directory. | |
922 | During the initial MAKE invocation, all revisions are selected from RCS files. | |
923 | As the maintainer checks out files and modifies them, | |
924 | a new configuration is gradually built up. | |
925 | Every time MAKE is invoked, it substitutes the modified revisions | |
926 | into the configuration being manipulated. | |
927 | .PP | |
928 | A final application of RCS is to use it for storing Makefiles. | |
929 | Revision groups of Makefiles represent | |
930 | multiple versions of configurations. | |
931 | Whenever a configuration is baselined or distributed, | |
932 | the best approach is to unambiguously fix | |
933 | the configuration with a symbolic revision number by calling | |
934 | \fIrcsfreeze\fR, | |
935 | to embed that symbol into the Makefile, and to | |
936 | check in the Makefile (using the same symbolic revision number). | |
937 | With this approach, old configurations | |
938 | can be regenerated easily and reliably. | |
939 | .NH | |
940 | Usage Statistics | |
941 | .PP | |
942 | The following usage statistics were collected on two DEC VAX-11/780 | |
943 | computers of the Purdue Computer Science Department. Both machines | |
944 | are mainly used for research purposes. Thus, the data | |
945 | reflect an environment in which the majority of projects | |
946 | involve prototyping and advanced software development, | |
947 | but relatively little long-term maintenance. | |
948 | .PP | |
949 | For the first experiment, | |
950 | the \fIci\fR and \fIco\fR operations were instrumented | |
951 | to log the number of backward and forward deltas applied. | |
952 | The data were collected during a 13 month period | |
953 | from Dec. 1982 to Dec. 1983. | |
954 | Table I summarizes the results. | |
955 | .sp 0 | |
956 | .nr VS 12pts | |
957 | .vs 12pts | |
958 | .TS | |
959 | center,box,tab(#); | |
960 | c|c|c|c|c s|c s | |
961 | c|c|c|c|c s|c s | |
962 | l|n|n|n|n n|n n. | |
963 | Operation#Total#Total deltas#Mean deltas#Operations#Branch | |
964 | #operations #applied#applied#with >1 delta#operations | |
965 | _ | |
966 | co # 7867# 9320#1.18#509#(6%)#203#(3%) | |
967 | ci # 3468# 2207#0.64# 85#(2%)# 75#(2%) | |
968 | ci & co#11335#11527#1.02#594#(5%)#278#(2%) | |
969 | .TE | |
970 | .ce 1 | |
971 | Table I. Statistics for \fIco\fR and \fIci\fR operations. | |
972 | .nr VS 18pts | |
973 | .vs 18pts | |
974 | .PP | |
975 | The first two lines show statistics for check-out and check-in; | |
976 | the third line shows the combination. | |
977 | Recall that \fIci\fR performs an implicit check-out to obtain | |
978 | a revision for computing the delta. | |
979 | In all measures presented, the most recent revision (stored intact) | |
980 | counts as one delta. The number of deltas applied represents | |
981 | the number of passes necessary, where the first `pass' is a copying step. | |
982 | .PP | |
983 | Note that the check-out operation is executed more than | |
984 | twice as frequently as the check-in operation. | |
985 | The fourth column gives the mean number of deltas | |
986 | applied in all three cases. | |
987 | For \fIci\fR, the mean number of deltas applied is less | |
988 | than one. | |
989 | The reasons are that the initial check-in requires no delta at all, and that | |
990 | the only time \fIci\fR requires more than one delta is for branches. | |
991 | Column 5 shows the actual number of operations that applied more than one | |
992 | delta. | |
993 | The last column indicates that branches were not used often. | |
994 | .PP | |
995 | The last three columns demonstrate that the most recent trunk revision | |
996 | is by far the most frequently accessed. | |
997 | For RCS, check-out of | |
998 | this revision is a simple copy operation, which is the absolute minimum | |
999 | given the copy-semantics of \fIco\fR. | |
1000 | Access to older revisions and branches | |
1001 | is more common in non-academic environments, | |
1002 | yet even if access to older deltas were an order | |
1003 | of magnitude more frequent, | |
1004 | the combined average number of deltas applied would still be below 1.2. | |
1005 | Since RCS is faster than SCCS until up to 10 delta applications, | |
1006 | reverse deltas are clearly the method of choice. | |
1007 | .PP | |
1008 | The second experiment, conducted in March of 1984, | |
1009 | involved surveying the existing RCS files | |
1010 | on our two machines. The goal was to determine the mean number of | |
1011 | revisions per RCS file, as well as the space consumed by them. | |
1012 | Table II shows the results. (Tables I and II were produced at different | |
1013 | times and are unrelated.) | |
1014 | .sp 0 | |
1015 | .nr VS 12pts | |
1016 | .vs 12pts | |
1017 | .TS | |
1018 | center,box,tab(#); | |
1019 | c | c | c | c | c | c | c | |
1020 | c | c | c | c | c | c | c | |
1021 | l | n | n | n | n | n | n | |
1022 | . | |
1023 | #Total RCS#Total#Mean#Mean size of#Mean size of#Overhead | |
1024 | #files#revisions#revisions#RCS files#revisions | |
1025 | _ | |
1026 | All files #8033#11133#1.39#6156#5585#1.10 | |
1027 | Files with#1477# 4578#3.10#8074#6041#1.34 | |
1028 | \(>= 2 deltas | |
1029 | .TE | |
1030 | .ce 1 | |
1031 | Table II. Statistics for RCS files. | |
1032 | .nr VS 18pts | |
1033 | .vs 18pts | |
1034 | .PP | |
1035 | The mean number of revisions per RCS file is 1.39. | |
1036 | Columns 5 and 6 show the mean sizes (in bytes) of an RCS file | |
1037 | and of the latest revision of each RCS file, respectively. | |
1038 | The `overhead' column contains the ratio of the mean sizes. | |
1039 | Assuming that all revisions in an RCS file are approximately the same size, | |
1040 | this ratio gives a measure of the space consumed by the extra revisions. | |
1041 | .PP | |
1042 | In our sample, over 80 per cent of the RCS files contained only a single revision. | |
1043 | The reason is that our | |
1044 | systems programmers routinely check in all source files | |
1045 | on the distribution tapes, even though they may never touch them again. | |
1046 | To get a better indication of how much space savings are possible | |
1047 | with deltas, all measures with those files | |
1048 | that contained 2 or more revisions were recomputed. Only for those files | |
1049 | is RCS necessary. | |
1050 | As shown in the second line, the average number of revisions for those files is | |
1051 | 3.10, with an overhead of 1.34. This means that the extra 2.10 deltas | |
1052 | require 34 per cent extra space, or | |
1053 | 16 per cent per extra revision. | |
1054 | Rochkind\u3\d | |
1055 | measured the space consumed by SCCS, and | |
1056 | reported an average of 5 revisions per group | |
1057 | and an overhead of 1.37 (or about 9 per cent per extra revision). | |
1058 | In a later paper, Glasser\u6\d | |
1059 | observed an average of 7 revisions per group in a single, large project, | |
1060 | but provided no overhead figure. | |
1061 | In his paper on DSEE\u5\d, | |
1062 | Leblang reported that delta storage combined with blank compression | |
1063 | results in an overhead of a mere 1\-2 per cent per revision. | |
1064 | Since leading blanks accounted for about 20 per cent of the surveyed Pascal | |
1065 | programs, a revision group with 5\-10 members was smaller | |
1066 | than a single cleartext copy. | |
1067 | .PP | |
1068 | The above observations demonstrate clearly that the space needed | |
1069 | for extra revisions is small. With delta storage, the luxury of | |
1070 | keeping multiple revisions online is certainly affordable. | |
1071 | In fact, introducing a system with delta storage may reduce | |
1072 | storage requirements, because programmers often save back-up copies | |
1073 | anyway. Since back-up copies are stored much more efficiently with deltas, | |
1074 | introducing a system such as RCS may | |
1075 | actually free a considerable amount of space. | |
1076 | .NH | |
1077 | Survey of Version Control Tools | |
1078 | .PP | |
1079 | The need to keep back-up copies of software arose when | |
1080 | programs and data were no longer stored on paper media, but were entered | |
1081 | from terminals and stored on disk. | |
1082 | Back-up copies are desirable for reliability, and many modern editors | |
1083 | automatically save a back-up copy for every file touched. | |
1084 | This strategy | |
1085 | is valuable for short-term back-ups, but not suitable for long-term | |
1086 | version control, since an existing back-up copy is overwritten whenever the | |
1087 | corresponding file is edited. | |
1088 | .PP | |
1089 | Tape archives are suitable for long-term, offline storage. | |
1090 | If all changed files are dumped on a back-up tape once per day, old revisions | |
1091 | remain accessible. However, tape archives are unsatisfactory | |
1092 | for version control in several ways. First, backing up the file | |
1093 | system every 24 hours does not capture intermediate revisions. | |
1094 | Secondly, the old revisions are not online, | |
1095 | and accessing them is tedious and time-consuming. | |
1096 | In particular, it is impractical to | |
1097 | compare several old revisions of a group, | |
1098 | because that may require mounting and searching several tapes. | |
1099 | Tape archives are important fail-safe tools in the | |
1100 | event of catastrophic disk failures or accidental deletions, | |
1101 | but they are ill-suited for version control. | |
1102 | Conversely, version control tools do not obviate the | |
1103 | need for tape archives. | |
1104 | .PP | |
1105 | A natural technique for keeping several old revisions online is | |
1106 | to never delete a file. | |
1107 | Editing a file | |
1108 | simply creates a new file with the same | |
1109 | name, but with a different sequence number. | |
1110 | This technique, available as an option in DEC's VMS operating system, | |
1111 | turns out to be inadequate for version control. | |
1112 | First, it is prohibitively expensive in terms of storage costs, | |
1113 | especially since no data compression techniques are employed. | |
1114 | Secondly, indiscriminately storing every change produces too many | |
1115 | revisions, and programmers have difficulties distinguishing them. | |
1116 | The proliferation of revisions forces programmers to spend much time on | |
1117 | finding and deleting useless files. | |
1118 | Thirdly, most of the support functions like locking, logging, | |
1119 | revision selection, | |
1120 | and identification described in this paper are not available. | |
1121 | .PP | |
1122 | An alternative approach is to separate editing from revision control. | |
1123 | The user may repeatedly edit a given revision, | |
1124 | until freezing it with an explicit command. | |
1125 | Once a revision is frozen, it is stored permanently and can no longer be modified. | |
1126 | (In RCS, freezing a revisions is done with \fIci\fR.) | |
1127 | Editing a frozen revision implicitly creates a new one, which | |
1128 | can again be changed repeatedly until it is frozen itself. | |
1129 | This approach saves exactly those revisions that the user | |
1130 | considers important, and keeps the number of revisions manageable. | |
1131 | IBM's CLEAR/CASTER\u7\d, | |
1132 | AT&T's SCCS\u3\d, | |
1133 | CMU's SDC\u8\d | |
1134 | and DEC's CMS\u9\d, | |
1135 | are examples of version control systems using this approach. | |
1136 | CLEAR/CASTER maintains a data base of programs, specifications, | |
1137 | documentation and messages, using deltas. | |
1138 | Its goal is to provide control over the development process from a | |
1139 | management viewpoint. | |
1140 | SCCS stores multiple revisions of source text in an ancestral tree, | |
1141 | records a log entry for each revision, | |
1142 | provides access control, and has facilities | |
1143 | for uniquely identifying each revision. | |
1144 | An efficient delta technique | |
1145 | reduces the space consumed by each revision group. | |
1146 | SDC is much simpler than SCCS because it stores not more than | |
1147 | two revisions. However, it maintains a complete log for all old | |
1148 | revisions, some of which may be on back-up tape. | |
1149 | CMS, like SCCS, manages tree-structured revision groups, | |
1150 | but offers no identification mechanism. | |
1151 | .PP | |
1152 | Tools for dealing with configurations are still in a state of flux. | |
1153 | SCCS, SDC and CMS can be combined with MAKE or MAKE-like programs. | |
1154 | Since flexible selection rules are missing from all these tools, | |
1155 | it is sometimes difficult | |
1156 | to specify precisely which revision of each group | |
1157 | should be passed to MAKE for building a desired configuration. | |
1158 | The Xerox Cedar system\u10\d | |
1159 | provides a `System Modeller' that can rebuild | |
1160 | a configuration from an arbitrary set of module revisions. | |
1161 | The revisions of a module are only distinguished by creation time, | |
1162 | and there is no tool for managing groups. | |
1163 | Since the selection rules are primitive, | |
1164 | the System Modeller appears to be somewhat tedious to use. | |
1165 | Apollo's DSEE\u5\d | |
1166 | is a sophisticated software engineering environment. | |
1167 | It manages revision groups in a way similar to SCCS and CMS. Configurations | |
1168 | are built using `configuration threads'. | |
1169 | A configuration thread states which revision of each group | |
1170 | named in a configuration should be chosen. | |
1171 | A configuration thread may contain dynamic specifiers | |
1172 | (e.g., `choose the revisions I am currently working on, | |
1173 | and the most recent revisions otherwise'), which are bound | |
1174 | automatically at build time. | |
1175 | It also provides a notification mechanism for alerting | |
1176 | maintainers about the need to rebuild a system after a change. | |
1177 | .PP | |
1178 | RCS is based on a general model for describing | |
1179 | multi-version/multi-configuration systems\u11\d. | |
1180 | The model describes systems using AND/OR graphs, where AND nodes represent | |
1181 | configurations, and OR nodes represent version groups. | |
1182 | The model gives rise to a suit of selection rules for | |
1183 | composing configurations, almost all of which are implemented in RCS. | |
1184 | The revisions selected by RCS are passed to MAKE for configuration building. | |
1185 | Revision group management is modelled after SCCS. | |
1186 | RCS retains SCCS's best features, | |
1187 | but offers a significantly simpler user interface, | |
1188 | flexible selection rules, adequate integration with MAKE | |
1189 | and improved identification. | |
1190 | A detailed comparison of RCS and SCCS appears in Reference 4. | |
1191 | .PP | |
1192 | An important component of all revision control systems | |
1193 | is a program for computing deltas. | |
1194 | SCCS and RCS use the program \fIdiff\fR\u2\d, | |
1195 | which first computes the longest common substring of two | |
1196 | revisions, and then produces the delta from that substring. | |
1197 | The delta is simply an edit script consisting of deletion and | |
1198 | insertion commands that generate one revision from the other. | |
1199 | .PP | |
1200 | A delta based on a longest common substring is not necessarily minimal, | |
1201 | because it does not take advantage of crossing block moves. | |
1202 | Crossing block moves arise if two or more blocks of lines | |
1203 | (e.g., procedures) | |
1204 | appear in a different order in two revisions. | |
1205 | An edit script derived from a longest common substring | |
1206 | first deletes the shorter of the two blocks, and then reinserts it. | |
1207 | Heckel\u12\d | |
1208 | proposed an algorithm for detecting block moves, but | |
1209 | since the algorithm is based on heuristics, | |
1210 | there are conditions | |
1211 | under which the generated delta is far from minimal. | |
1212 | DSEE uses this algorithm combined with blank compression, | |
1213 | apparently with satisfactory overall results. | |
1214 | A new algorithm that is guaranteed to produce a minimal delta based on | |
1215 | block moves appears in Reference 13. | |
1216 | A future release of RCS will use this algorithm. | |
1217 | .PP | |
1218 | \fIAcknowledgements\fR: | |
1219 | Many people have helped make RCS a success by contributed criticisms, suggestions, | |
1220 | corrections, and even whole new commands (including manual pages). | |
1221 | The list of people is too long to be | |
1222 | reproduced here, but my sincere thanks for their help and | |
1223 | goodwill goes to all of them. | |
1224 | ||
1225 | .nr VS 12 pts | |
1226 | .vs 12pts | |
1227 | .SH | |
1228 | Appendix: Synopsis of RCS Operations | |
1229 | .LP | |
1230 | .IP "\fIci \fB\- check in revisions\fR" | |
1231 | .sp 0 | |
1232 | \fICi\fR stores the contents of a working file into the | |
1233 | corresponding RCS file as a new revision. | |
1234 | If the RCS file doesn't exist, \fIci\fR creates it. | |
1235 | \fICi\fR removes the working file, unless one of the options | |
1236 | \fI-u\fR or \fI-l\fR is present. | |
1237 | For each check-in, \fIci\fR asks for a commentary | |
1238 | describing the changes relative to the previous revision. | |
1239 | .sp 1 | |
1240 | \fICi\fR assigns the revision number given by the \fI-r\fR option; | |
1241 | if that option is missing, it derives the number from the | |
1242 | lock held by the user; if there is no lock and locking is not strict, | |
1243 | \fIci\fR increments the number of the latest revision on the trunk. | |
1244 | A side branch can only be started by explicitly specifying its | |
1245 | number with the \fI-r\fR option during check-in. | |
1246 | .sp 1 | |
1247 | \fICi\fR also determines | |
1248 | whether the revision to be checked in is different from the | |
1249 | previous one, and asks whether to proceed if not. | |
1250 | This facility simplifies check-in operations for large systems, | |
1251 | because one need not remember which files were changed. | |
1252 | .sp 1 | |
1253 | The option \fI-k\fR searches the checked in file for identification | |
1254 | markers containing | |
1255 | the attributes | |
1256 | revision number, check-in date, author and state, and assigns these | |
1257 | to the new revision rather than computing them. This option is | |
1258 | useful for software distribution: Recipients of distributed software | |
1259 | using RCS should check in updates with the \fI-k\fR option. | |
1260 | This convention guarantees that revision numbers, check-in dates, | |
1261 | etc., are the same at all sites. | |
1262 | .IP "\fIco \fB\- check out revisions\fR" | |
1263 | .sp 0 | |
1264 | \fICo\fR retrieves revisions according to revision number, | |
1265 | date, author and state attributes. It either places the revision | |
1266 | into the working file, or prints it on the std. output. | |
1267 | \fICo\fR always expands the identification markers. | |
1268 | .IP "\fIident \fB\- extract identification markers\fR" | |
1269 | .sp 0 | |
1270 | \fIIdent\fR extracts the identification markers expanded by \fIco\fR | |
1271 | from any file and prints them. | |
1272 | .IP "\fIrcs \fB\- change RCS file attributes\fR" | |
1273 | .sp 0 | |
1274 | \fIRcs\fR is an administrative operation that changes access lists, | |
1275 | locks, unlocks, breaks locks, toggles the strict-locking feature, | |
1276 | sets state attributes and symbolic revision numbers, changes the | |
1277 | description, and deletes revisions. A revision can | |
1278 | only be deleted if it is not the fork of a side branch. | |
1279 | .IP "\fIrcsclean \fB\- clean working directory\fR" | |
1280 | .sp 0 | |
1281 | \fIRcsclean\fR removes working files that were checked out but never changed. | |
1282 | .IP "\fIrcsdiff \fB\- compare revisions\fR" | |
1283 | .sp 0 | |
1284 | \fIRcsdiff\fR compares two revisions and prints their | |
1285 | difference, using the UNIX tool \fIdiff\fR. | |
1286 | One of the revisions compared may be checked out. | |
1287 | This command is useful for finding out about changes. | |
1288 | .IP "\fIrcsfreeze \fB\- freeze a configuration\fR" | |
1289 | .sp 0 | |
1290 | \fIRcsfreeze\fR assigns the same symbolic revision number | |
1291 | to a given revision in all RCS files. | |
1292 | This command is useful for accurately recording a configuration. | |
1293 | .IP "\fIrcsmerge \fB\- merge revisions\fR" | |
1294 | .sp 0 | |
1295 | \fIRcsmerge\fR merges two revisions, \fIrev1\fR and \fIrev2\fR, | |
1296 | with respect to a common ancestor. | |
1297 | A 3-way file comparison determines the segments of lines that | |
1298 | are (a) the same in all three revisions, or (b) the same in 2 revisions, | |
1299 | or (c) different in all three. For all segments of type (b) where | |
1300 | \fIrev1\fR is the differing revision, | |
1301 | the segment in \fIrev1\fR replaces the corresponding segment of \fIrev2\fR. | |
1302 | Type (c) indicates an overlapping change, is flagged as an error, and requires user | |
1303 | intervention to select the correct alternative. | |
1304 | .IP "\fIrlog \fB\- read log messages\fR" | |
1305 | .sp 0 | |
1306 | \fIRlog\fR prints the log messages and other information in an RCS file. | |
1307 | .bp | |
1308 | .LP | |
1309 | .nr VS 12 pts | |
1310 | .vs 12pts | |
1311 | .]< | |
1312 | .ds [F 1 | |
1313 | .]- | |
1314 | .ds [K FELD02 | |
1315 | .ds [K MakeArticle | |
1316 | .ds [A Feldman, Stuart I. | |
1317 | .ds [D March 1979 | |
1318 | .ds [T Make - A Program for Maintaining Computer Programs | |
1319 | .ds [J Software -- Practice and Experience | |
1320 | .ds [V 9 | |
1321 | .ds [N 3 | |
1322 | .ds [P 255-265 | |
1323 | .nr [P 1 | |
1324 | .nr [T 0 | |
1325 | .nr [A 1 | |
1326 | .nr [O 0 | |
1327 | .][ 1 journal-article | |
1328 | .ds [F 2 | |
1329 | .]- | |
1330 | .ds [K HUNT01 | |
1331 | .ds [T An Algorithm for Differential File Comparison | |
1332 | .ds [A Hunt, James W. | |
1333 | .as [A " and McIlroy, M. D. | |
1334 | .ds [I Computing Science Technical Report, Bell Laboratories | |
1335 | .ds [R 41 | |
1336 | .ds [D June 1976 | |
1337 | .nr [T 0 | |
1338 | .nr [A 1 | |
1339 | .nr [O 0 | |
1340 | .][ 4 tech-report | |
1341 | .ds [F 3 | |
1342 | .]- | |
1343 | .ds [K SCCS | |
1344 | .ds [A Rochkind, Marc J. | |
1345 | .ds [D Dec. 1975 | |
1346 | .ds [T The Source Code Control System | |
1347 | .ds [J IEEE Transactions on Software Engineering | |
1348 | .ds [V SE-1 | |
1349 | .ds [N 4 | |
1350 | .ds [P 364-370 | |
1351 | .nr [P 1 | |
1352 | .nr [T 0 | |
1353 | .nr [A 1 | |
1354 | .nr [O 0 | |
1355 | .][ 1 journal-article | |
1356 | .ds [F 4 | |
1357 | .]- | |
1358 | .ds [K TICH08 | |
1359 | .ds [T Design, Implementation, and Evaluation of a Revision Control System | |
1360 | .ds [A Tichy, Walter F. | |
1361 | .ds [B Proceedings of the 6th International Conference on Software Engineering | |
1362 | .ds [I ACM, IEEE, IPS, NBS | |
1363 | .ds [D September 1982 | |
1364 | .ds [P 58-67 | |
1365 | .nr [P 1 | |
1366 | .nr [T 0 | |
1367 | .nr [A 1 | |
1368 | .nr [O 0 | |
1369 | .][ 3 article-in-book | |
1370 | .ds [F 5 | |
1371 | .]- | |
1372 | .ds [K LEBL01 | |
1373 | .ds [A Leblang, David B. | |
1374 | .as [A " and Chase, Robert P. | |
1375 | .ds [T Computer-Aided Software Engineering in a Distributed Workstation Environment | |
1376 | .ds [O Proceedings of the ACM SIGSOFT/SIGPLAN Software Engineering Symposium | |
1377 | .as [O " on Practical Software Development Environments. | |
1378 | .ds [J SIGPLAN Notices | |
1379 | .ds [V 19 | |
1380 | .ds [N 5 | |
1381 | .ds [D May 1984 | |
1382 | .ds [P 104-112 | |
1383 | .nr [P 1 | |
1384 | .nr [T 0 | |
1385 | .nr [A 1 | |
1386 | .nr [O 0 | |
1387 | .][ 1 journal-article | |
1388 | .ds [F 1 | |
1389 | .ds [F 3 | |
1390 | .ds [F 6 | |
1391 | .]- | |
1392 | .ds [K SCCSEval | |
1393 | .ds [A Glasser, Alan L. | |
1394 | .ds [D Nov. 1978 | |
1395 | .ds [T The Evolution of a Source Code Control System | |
1396 | .ds [J Software Engineering Notes | |
1397 | .ds [V 3 | |
1398 | .ds [N 5 | |
1399 | .ds [P 122-125 | |
1400 | .nr [P 1 | |
1401 | .ds [O Proceedings of the Software Quality and Assurance Workshop. | |
1402 | .nr [T 0 | |
1403 | .nr [A 1 | |
1404 | .nr [O 1 | |
1405 | .][ 1 journal-article | |
1406 | .ds [F 5 | |
1407 | .ds [F 7 | |
1408 | .]- | |
1409 | .ds [K IBMClearCaster | |
1410 | .ds [A Brown, H.B. | |
1411 | .ds [D 1970 | |
1412 | .ds [T The Clear/Caster System | |
1413 | .ds [J Nato Conference on Software Engineering, Rome | |
1414 | .nr [T 0 | |
1415 | .nr [A 1 | |
1416 | .nr [O 0 | |
1417 | .][ 1 journal-article | |
1418 | .ds [F 3 | |
1419 | .ds [F 8 | |
1420 | .]- | |
1421 | .ds [K HabermannSDC | |
1422 | .ds [A Habermann, A. Nico | |
1423 | .ds [D Jan. 1979 | |
1424 | .ds [T A Software Development Control System | |
1425 | .ds [I Technical Report, Carnegie-Mellon University, Department of Computer Science | |
1426 | .nr [T 0 | |
1427 | .nr [A 0 | |
1428 | .nr [O 0 | |
1429 | .][ 2 book | |
1430 | .ds [F 9 | |
1431 | .]- | |
1432 | .ds [K CMS | |
1433 | .ds [A DEC, | |
1434 | .ds [T Code Management System | |
1435 | .ds [I Digital Equipment Corporation | |
1436 | .ds [O Document No. EA-23134-82 | |
1437 | .ds [D 1982 | |
1438 | .nr [T 0 | |
1439 | .nr [A 0 | |
1440 | .nr [O 0 | |
1441 | .][ 2 book | |
1442 | .ds [F 10 | |
1443 | .]- | |
1444 | .ds [K LAMP01 | |
1445 | .ds [A Lampson, Butler W. | |
1446 | .as [A " and Schmidt, Eric E. | |
1447 | .ds [T Practical Use of a Polymorphic Applicative Language | |
1448 | .ds [B Proceedings of the 10th Symposium on Principles of Programming Languages | |
1449 | .ds [I ACM | |
1450 | .ds [P 237-255 | |
1451 | .nr [P 1 | |
1452 | .ds [D January 1983 | |
1453 | .nr [T 0 | |
1454 | .nr [A 1 | |
1455 | .nr [O 0 | |
1456 | .][ 3 article-in-book | |
1457 | .ds [F 5 | |
1458 | .ds [F 11 | |
1459 | .]- | |
1460 | .ds [K TICH07 | |
1461 | .ds [T A Data Model for Programming Support Environments and its Application | |
1462 | .ds [A Tichy, Walter F. | |
1463 | .ds [B Automated Tools for Information System Design and Development | |
1464 | .ds [E Hans-Jochen Schneider and Anthony I. Wasserman | |
1465 | .ds [C Amsterdam | |
1466 | .ds [I North-Holland Publishing Company | |
1467 | .ds [D 1982 | |
1468 | .nr [T 0 | |
1469 | .nr [A 1 | |
1470 | .nr [O 0 | |
1471 | .][ 3 article-in-book | |
1472 | .ds [F 4 | |
1473 | .ds [F 2 | |
1474 | .ds [F 12 | |
1475 | .]- | |
1476 | .ds [K HECK01 | |
1477 | .ds [T A Technique for Isolating Differences Between Files | |
1478 | .ds [A Heckel, Paul | |
1479 | .ds [J Communications of the ACM | |
1480 | .ds [D April 1978 | |
1481 | .ds [V 21 | |
1482 | .ds [N 4 | |
1483 | .ds [P 264-268 | |
1484 | .nr [P 1 | |
1485 | .nr [T 0 | |
1486 | .nr [A 0 | |
1487 | .nr [O 0 | |
1488 | .][ 1 journal-article | |
1489 | .ds [F 13 | |
1490 | .]- | |
1491 | .ds [K TICH11 | |
1492 | .ds [T The String-to-String Correction Problem with Block Moves | |
1493 | .ds [A Tichy, Walter F. | |
1494 | .ds [D Nov. 1984 | |
1495 | .ds [J ACM Transactions on Computer Systems | |
1496 | .ds [V 2 | |
1497 | .ds [N 4 | |
1498 | .ds [P 309-321 | |
1499 | .nr [P 1 | |
1500 | .nr [T 0 | |
1501 | .nr [A 1 | |
1502 | .nr [O 0 | |
1503 | .][ 1 journal-article | |
1504 | .]> |