.RP
.if n .ls 2
.ds RH Nowitz
.ND "August 18, 1978"
.TL
A Dial-Up Network of
UNIX\s6\uTM\d\s0
Systems
.AU
D. A. Nowitz
.AU
M. E. Lesk
.AI
.MH
.AB
.if n .ls 2
A network of over eighty
.UX
computer systems has been established using the
telephone system as its primary communication medium.
The network was designed to meet the growing demands for
software distribution and exchange.
Some advantages of our design are:
.IP -
The startup cost is low.
A system needs only a dial-up port,
but systems with automatic calling units have much more
flexibility.
.IP -
No operating system changes are required to install or use the system.
.IP -
The communication is basically over dial-up lines;
however, hardwired communication lines can be used
to increase speed.
.IP -
The command for sending/receiving files is simple to use.
.sp
Keywords: networks, communications, software distribution, software maintenance
.AE
.NH
Purpose
.PP
The widespread use of the
.UX
system
.[
ritchie thompson bstj 1978
.]
within Bell Laboratories
has produced problems of software distribution and maintenance.
A conventional mechanism was set up to distribute the operating
system and associated programs from a central site to the
various users.
However, this mechanism alone does not meet all software
distribution needs.
Remote sites generate much software and must transmit it to
other sites.
Some
.UX
systems
are themselves central sites for redistribution
of a particular specialized utility,
such as the Switching Control Center System.
Other sites have particular, often long-distance, needs for
software exchange; switching research,
for example, is carried on in
New Jersey, Illinois, Ohio, and Colorado.
In addition, general purpose utility programs are written at
all
.UX
system sites.
The
.UX
system is modified
and enhanced by many people in many places, and
it would be very constricting to deliver new software in a one-way
stream without any way
for the user sites to respond with changes of their own.
.PP
Straightforward software distribution is only part of the problem.
A large project may exceed the capacity of a single computer, and
several machines may be used by one group of people.
It then becomes necessary
for them to pass messages, data and other information back and forth
between computers.
.PP
Several groups with similar problems, both inside and outside of
Bell Laboratories, have constructed networks built of
hardwired connections only.
.[
dolotta mashey 1978 bstj
.]
.[
network unix system chesson
.]
Our network, however, uses both dial-up and hardwired
connections so that service can be provided to as many sites as possible.
.NH
Design Goals
.PP
Although some of our machines are connected directly, others
can only communicate over low-speed dial-up lines.
Since the dial-up lines are often unavailable
and file transfers may take considerable time,
we spool all work and transmit in the background.
We also had to adapt to a community of systems which are independently
operated and resistant to suggestions that they should all
buy particular hardware or install particular operating system
modifications.
Therefore, we make minimal demands on the local sites
in the network.
Our implementation requires no operating system changes;
in fact, the transfer programs look like any other user
entering the system through the normal dial-up login ports,
and obeying all local protection rules.
.PP
We distinguish ``active'' and ``passive'' systems
on the network.
Active systems have an automatic calling unit
or a hardwired line to another system,
and can initiate a connection.
Passive systems do not have the hardware
to initiate a connection.
However, an
active system can be assigned the job of calling passive
systems and executing work found there;
this makes a passive system the functional equivalent of
an active system, except for an additional delay while it waits to be polled.
Also, people frequently log into active systems and
request copying from one passive system to another.
This requires two telephone calls, but even so, it is faster
than mailing tapes.
.PP
Where convenient, we use hardwired communication lines.
These permit much faster transmission and multiplexing
of
the communications link.
Dial-up connections are made at either 300 or 1200 baud;
hardwired connections are asynchronous up to 9600 baud
and might run even faster on special-purpose communications
hardware.
.[
fraser spider 1974 ieee
.]
.[
fraser channel network datamation 1975
.]
Thus, systems typically join our network first as
passive systems and when
they find the service more important, they acquire
automatic calling units and become active
systems; eventually, they may install high-speed
links to particular machines with which they
handle a great deal of traffic.
At no point, however, must users change their
programs or procedures.
.PP
The basic operation of the network is very simple.
Each participating system has a spool directory,
in which work to be done (files to be moved, or commands to be executed
remotely) is stored.
A standard program,
.I uucico ,
performs all transfers.
This program starts by identifying a particular communication channel
to a remote system with which it will hold a conversation.
.I Uucico
then selects a device and establishes the connection,
logs onto the remote machine
and starts the
.I uucico
program on the remote machine.
Once two of these programs are connected, they first agree on a line protocol,
and then start exchanging work.
Each program in turn, beginning with the calling (active system) program,
transmits everything it needs, and then asks the other what it wants done.
Eventually neither has any more work, and both exit.
.PP
In this way, all services are available from all sites; passive sites,
however, must wait until called.
A variety of protocols may be used; this conforms to the real,
non-standard world.
As long as the caller and called programs have a protocol in common,
they can communicate.
Furthermore, each caller knows the hours when each destination system
should be called.
If a destination is unavailable, the data intended for it
remain in the spool directory until the destination machine can be reached.
.PP
The implementation of this
Bell Laboratories network
between independent sites, all of which
store proprietary programs and data,
illustrates the pervasive need for security
and administrative controls over file access.
Each site, in configuring its programs and system files,
limits and monitors transmission.
To access a file, a user needs access permission
for the machine that contains the file and access permission
for the file itself.
This is achieved by first requiring the user to log into
his local machine with his password;
the local machine then logs into the remote machine
whose files are to be accessed.
In addition, records are kept identifying all files
that are moved into and out of the local system,
and how the requestor of such accesses identified
himself.
Some sites may arrange
to permit users only
to call up
and request work to be done;
the calling users are then called back
before the work is actually done.
It is then possible to verify
that the request is legitimate from the standpoint of the
target system, as well as the originating system.
Furthermore, because of the call-back,
no site can masquerade as another
even if it knows all the necessary passwords.
.PP
Each machine can optionally maintain a sequence count for
conversations with other machines and require a verification of the
count at the start of each conversation.
Thus, even if call-back is not in use, a successful masquerade requires
the calling party to present the correct sequence number.
A would-be impersonator must steal not just the correct phone number,
user name, and password, but also the sequence count, and must call in
sufficiently promptly to precede the next legitimate request from either side.
Even a successful masquerade will be detected on the next correct
conversation.
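.PP
The sequence-count check can be illustrated with a minimal sketch.
The class name, the method names, and the rule that the count advances
by one per accepted conversation are assumptions made for illustration;
the text above does not specify how the count is stored or advanced.

```python
class SequenceLedger:
    """Per-peer conversation counts kept by one site (hypothetical structure)."""

    def __init__(self):
        # peer system name -> last sequence number agreed with that peer
        self.counts = {}

    def next_for(self, peer):
        """Number the calling side presents on its next call to `peer`."""
        return self.counts.get(peer, 0) + 1

    def confirm(self, peer, presented):
        """Calling side: record the number once the call has been accepted."""
        self.counts[peer] = presented

    def verify(self, peer, presented):
        """Called side: accept only the expected number, then advance."""
        expected = self.counts.get(peer, 0) + 1
        if presented != expected:
            return False
        self.counts[peer] = expected
        return True


caller = SequenceLedger()   # the legitimate calling site
called = SequenceLedger()   # the site being called

# A legitimate first conversation: both sides now agree on 1.
n = caller.next_for("called-site")
first_ok = called.verify("calling-site", n)
caller.confirm("called-site", n)

# An impersonator who has stolen the count presents 2 and is accepted.
masquerade_ok = called.verify("calling-site", 2)

# The next legitimate call also presents 2, which is now stale.
legitimate_ok = called.verify("calling-site", caller.next_for("called-site"))
```

In this sketch the masquerade succeeds, but the next legitimate call is
rejected, which is how the intrusion comes to light.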
.NH
Processing
.PP
The user has two commands which set up communications,
.I uucp
to set up file copying,
and
.I uux
to set up command execution where some of the required
resources (system and/or files)
are not on the local machine.
Each of these commands will put work and data files
into the spool directory for execution by
.I uucp
daemons.
Figure 1 shows the major blocks of the file transfer process.
.SH
File Copy
.PP
The
.I uucico
program is used to perform all communications between
the two systems.
It performs the following functions:
.RS
.IP - 3
Scan the spool directory for work.
.IP -
Place a call to a remote system.
.IP -
Negotiate a line protocol to be used.
.IP -
Start program
.I uucico
on the remote system.
.IP -
Execute all requests from both systems.
.IP -
Log work requests and work completions.
.RE
.LP
.I Uucico
may be started in several ways:
.RS
.IP a) 5
by a system daemon,
.IP b)
by one of the
.I uucp
or
.I uux
programs,
.IP c)
by a remote system.
.RE
.SH
Scan For Work
.PP
The file names in the spool directory are constructed to allow the
daemon programs
.I "(uucico, uuxqt)"
to determine the files they should look at,
the remote machines they should call
and the order in which the files for a particular
remote machine should be processed.
.SH
Call Remote System
.PP
The call is made using information from several
files which reside in the uucp program directory.
At the start of the call process, a lock is
set on the system being called so that another
call will not be attempted at the same time.
.PP
The system name is found in a
``systems''
file.
The information contained for each system is:
.IP
.RS
.IP [1]
system name,
.IP [2]
times to call the system
(days-of-week and times-of-day),
.IP [3]
device or device type to be used for call,
.IP [4]
line speed,
.IP [5]
phone number,
.IP [6]
login information (multiple fields).
.RE
.PP
The time field is checked against the present time to see
if the call should be made.
The
.I
phone number
.R
may contain abbreviations (e.g. ``nyc'', ``boston'') which get translated into dial
sequences using a
``dial-codes'' file.
This permits the same ``phone number'' to be stored at every site, despite
local variations in telephone services and dialing conventions.
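.PP
As a rough sketch of how a ``systems'' entry and the ``dial-codes''
translation fit together, consider the following.
The whitespace-separated layout, the sample entry, the dial sequence,
and both function names are assumptions made for illustration;
only the six kinds of field and the abbreviation idea come from the text.

```python
def parse_systems_line(line):
    """Split one ``systems'' entry into the six kinds of field listed above.

    Assumes whitespace-separated fields in the order [1]..[6].
    """
    fields = line.split()
    if len(fields) < 5:
        raise ValueError("systems entry needs at least five fields")
    name, times, device, speed, phone = fields[:5]
    return {
        "name": name,
        "times": times,
        "device": device,
        "speed": int(speed),
        "phone": phone,
        "login": fields[5:],      # login information may span several fields
    }


def expand_phone(phone, dial_codes):
    """Replace a leading alphabetic abbreviation with its local dial sequence."""
    i = 0
    while i < len(phone) and phone[i].isalpha():
        i += 1
    prefix, rest = phone[:i], phone[i:]
    if not prefix:
        return phone              # already a plain number
    if prefix not in dial_codes:
        raise KeyError("no dial code for " + prefix)
    return dial_codes[prefix] + rest


# Hypothetical entry and hypothetical local dial-codes table.
entry = parse_systems_line("usg Any ACU 300 nyc5551234 login: uucp")
dial_codes = {"nyc": "1212"}
number = expand_phone(entry["phone"], dial_codes)
```

Because only the dial-codes table differs from site to site, the same
entry can be copied unchanged to every machine.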
.PP
A ``devices''
file is scanned using fields [3] and [4] from the
``systems''
file to find an available device for the connection.
The program will try all devices which satisfy
[3] and [4] until a connection is made, or no more
devices can be tried.
If a non-multiplexable device is successfully opened, a lock file
is created so that another copy of
.I uucico
will not try to use it.
If the connection is complete, the
.I
login information
.R
is used to log into the remote system.
Then
a command is sent to the remote system
to start the
.I uucico
program.
The conversation between the two
.I uucico
programs begins with a handshake started by the called,
.I SLAVE ,
system.
The
.I SLAVE
sends a message to let the
.I MASTER
know it is ready to receive the system
identification and conversation sequence number.
The response from the
.I MASTER
is
verified by the
.I SLAVE
and if acceptable, protocol selection begins.
.SH
Line Protocol Selection
.PP
The remote system sends a message
.IP "" 12
P\fIproto-list\fR
.LP
where
.I proto-list
is a string of characters, each
representing a line protocol.
The calling program checks the proto-list
for a letter corresponding to an available line
protocol and returns a
.I use-protocol
message.
The
.I use-protocol
message is
.IP "" 12
U\fIcode\fR
.LP
where
.I code
is either a one character
protocol letter or an
.I N
which means there is no common protocol.
.PP
Greg Chesson designed and implemented the standard
line protocol used by the uucp transmission program.
Other protocols may be added by individual installations.
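.PP
The exchange of the P and U messages can be sketched as follows.
The function names and the protocol letters ``g'' and ``x'' are
placeholders chosen for illustration; the message shapes
P\fIproto-list\fR and U\fIcode\fR are taken from the description above.

```python
def offer(protocols):
    """Called side: build the P message from its list of protocol letters."""
    return "P" + "".join(protocols)


def choose(offer_msg, supported):
    """Calling side: answer with U plus the first offered letter it supports.

    Answers "UN" when the two sides have no protocol in common.
    """
    if not offer_msg.startswith("P"):
        raise ValueError("expected a P message")
    for letter in offer_msg[1:]:
        if letter in supported:
            return "U" + letter
    return "UN"


reply_common = choose(offer(["g", "x"]), supported={"x"})
reply_none = choose(offer(["g"]), supported={"x"})
```

Since N is reserved for ``no common protocol'', a sketch like this
assumes that no installation uses N as a protocol letter.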
.SH
Work Processing
.PP
During processing, one program is the
.I MASTER
and the other is
.I SLAVE .
Initially, the calling program is the
.I MASTER .
These roles may switch one or more times during
the conversation.
.PP
There are four messages used during the
work processing, each specified by the first
character of the message.
They are
.KS
.TS
center;
c l.
S	send a file,
R	receive a file,
C	copy complete,
H	hangup.
.TE
.KE
.LP
The
.I MASTER
will send
.I R
or
.I S
messages until all work from the spool directory is
complete, at which point an
.I H
message will be sent.
The
.I SLAVE
will reply with
\fISY\fR, \fISN\fR, \fIRY\fR, \fIRN\fR, \fIHY\fR, \fIHN\fR,
corresponding to
.I yes
or
.I no
for each request.
.PP
The send and receive replies are
based on permission to access the
requested file/directory.
After each file is copied into the spool directory
of the receiving system,
a copy-complete message is sent by the receiver of the file.
The message
.I CY
will be sent if the
.UX
.I cp
command, used to copy from the spool directory, is successful.
Otherwise, a
.I CN
message is sent.
The requests and results are logged on both systems,
and, if requested, mail is sent to the user reporting completion
(or the user can request status information from the log program at any time).
.PP
The hangup response is determined by the
.I SLAVE
program by a work scan of the spool directory.
If work for the remote system exists in the
.I SLAVE 's
spool directory, an
.I HN
message is sent and the programs switch roles.
If no work exists, an
.I HY
response is sent.
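.PP
The message flow described above, including the role switch on
.I HN ,
can be simulated with a short sketch.
The function name, the in-memory work queues, and the ``S path'' message
layout are assumptions made for illustration; every copy is assumed to succeed.

```python
def converse(caller_work, called_work, permitted=lambda request: True):
    """Simulate one conversation and return the list of (sender, message).

    Each work item is a (kind, path) pair where kind is "S" or "R".
    """
    queues = {"caller": list(caller_work), "called": list(called_work)}
    master, slave = "caller", "called"     # the caller starts as MASTER
    log = []
    while True:
        for kind, path in queues[master]:
            log.append((master, kind + " " + path))
            if permitted((kind, path)):
                log.append((slave, kind + "Y"))
                # The receiver of the file reports the copy result:
                # the SLAVE for a send, the MASTER for a receive.
                receiver = slave if kind == "S" else master
                log.append((receiver, "CY"))
            else:
                log.append((slave, kind + "N"))
        queues[master] = []
        log.append((master, "H"))
        if queues[slave]:
            log.append((slave, "HN"))       # SLAVE still has work: switch roles
            master, slave = slave, master
        else:
            log.append((slave, "HY"))
            log.append((master, "HY"))      # MASTER echoes HY before shutdown
            return log


log = converse([("S", "a.c")], [("R", "b.c")])
messages = [message for _, message in log]
```

With one item queued on each side, the roles switch exactly once before
both programs agree to hang up.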
.PP
A sample conversation is shown in Figure 2.
.SH
Conversation Termination
.PP
When an
.I HY
message is received by the
.I MASTER
it is echoed back to the
.I SLAVE
and the protocols are turned off.
Each program sends a final ``OO'' message to the
other.
.NH
Present Uses
.PP
One application of this software is remote mail.
Normally, a
.UX
system user
writes ``mail dan'' to send mail to
user ``dan''.
By writing ``mail usg!dan''
the mail is sent to user
``dan''
on system ``usg''.
.PP
The primary uses of our network to date have been in software maintenance.
Relatively few of the bytes passed between systems are intended for
people to read.
Instead, new programs (or new versions of programs)
are sent to users, and potential bugs are returned to authors.
Aaron Cohen has implemented a
``stockroom'' which allows remote users to call in and request software.
He keeps a ``stock list'' of available programs, and new bug
fixes and utilities are added regularly.
In this way, users can always obtain the latest version of anything
without bothering the authors of the programs.
Although the stock list is maintained on a particular system,
the items in the stockroom may be warehoused in many places;
typically each program is distributed from the home site of
its author.
Where necessary, uucp does remote-to-remote copies.
.PP
We also routinely retrieve test cases from other systems
to determine whether errors on remote systems are caused
by local misconfigurations or old versions of software,
or whether they are bugs that must be fixed at the home site.
This helps identify errors rapidly.
For one set of test programs maintained by us,
over 70% of the bugs reported from remote sites
were due to old software, and were fixed
merely by distributing the current version.
.PP
Another application of the network for software maintenance
is to compare files on two different machines.
A very useful utility on one machine has been
Doug McIlroy's ``diff'' program
which compares two text files and indicates the differences,
line by line, between them.
.[
hunt mcilroy file
.]
Only lines which are
not identical are printed.
Similarly,
the program ``uudiff''
compares files (or directories) on two machines.
One of these directories may be on a passive system.
The
``uudiff'' program
is set up to work similarly to the inter-system mail, but it is slightly
more complicated.
.PP
To avoid moving large numbers of usually identical
files,
.I uudiff
computes file checksums
on each side, and only moves files that are different
for detailed comparison.
For large files, this process can be iterated; checksums can be computed
for each line, and only those lines that are different
actually moved.
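.PP
The two-level checksum comparison can be sketched as follows.
CRC-32 stands in for whatever checksum
.I uudiff
actually computes, and the function names, the in-memory file tables,
and the position-by-position line matching are assumptions made for illustration.

```python
import zlib


def file_checksums(files):
    """Map each file name to a checksum of its contents (bytes)."""
    return {name: zlib.crc32(data) for name, data in files.items()}


def files_to_move(local_sums, remote_sums):
    """Names whose checksums differ, or which exist on one side only."""
    names = set(local_sums) | set(remote_sums)
    return sorted(n for n in names if local_sums.get(n) != remote_sums.get(n))


def lines_to_move(local_lines, remote_line_sums):
    """Second pass for a large file: indexes of local lines whose
    checksum differs from the remote line at the same position."""
    differing = []
    for i, line in enumerate(local_lines):
        remote = remote_line_sums[i] if i < len(remote_line_sums) else None
        if zlib.crc32(line) != remote:
            differing.append(i)
    return differing


# Hypothetical contents of the same two files on two machines.
local = {"a.c": b"x\n", "b.c": b"one\ntwo\n"}
remote = {"a.c": b"x\n", "b.c": b"one\n2\n"}

changed = files_to_move(file_checksums(local), file_checksums(remote))
remote_line_sums = [zlib.crc32(l) for l in remote["b.c"].splitlines()]
changed_lines = lines_to_move(local["b.c"].splitlines(), remote_line_sums)
```

Only the checksums cross the telephone line in the first pass, so
identical files cost a few bytes each rather than a full transfer.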
.PP
The ``uux'' command has
been useful for providing remote output.
There are some machines which do not have hard-copy
devices, but which are connected over 9600 baud
communication lines to machines with printers.
The
.I uux
command allows the formatting of the
printout on the local machine and printing on the
remote machine using standard
.UX
command programs.
.br
.NH
Performance
.PP
Throughput, of course, is primarily dependent on transmission speed.
The table below shows the real throughput of characters
on communication links of different speeds.
These numbers represent actual data transferred;
they do not include bytes used by the line protocol for
data validation such as checksums and messages.
At the higher speeds, contention for the processors on both
ends prevents the network from driving the line at full speed.
The range of speeds represents the difference between light and
heavy loads on the two systems.
If desired, operating system modifications can
be installed
that permit full use of even very fast links.
.KS
.TS
center;
c c
n n.
Nominal speed	Characters/sec.
300 baud	27
1200 baud	100-110
9600 baud	200-850
.TE
.KE
In addition to the transfer time, there is some overhead
for making the connection and logging in, ranging from
15 seconds to 1 minute.
Even at 300 baud, however, a typical 5,000 byte source program
can be transferred in
four minutes instead of the 2 days that might be required
to mail a tape.
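.PP
The four-minute figure follows from the table and the connection
overhead, as this small calculation shows.
The function name is hypothetical; the numbers are those quoted above.

```python
def transfer_seconds(nbytes, chars_per_sec, setup_seconds):
    """Connection overhead plus time to move the data at the measured rate."""
    return setup_seconds + nbytes / chars_per_sec


# 5,000 bytes at the measured 27 characters/sec of a 300 baud line,
# with the quoted overhead of 15 seconds to 1 minute.
best = transfer_seconds(5000, 27, 15)
worst = transfer_seconds(5000, 27, 60)
best_minutes = best / 60
worst_minutes = worst / 60
```

The data alone take a little over three minutes, so the total lies
between roughly 3.3 and 4.1 minutes depending on the connection overhead.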
.PP
Traffic between systems is variable.
Between two
closely related systems,
we observed
20 files moved and 5 remote commands executed in a typical day.
More typical traffic out of a single system would be around
a dozen files per day.
.PP
The total number of sites at present
in the main network is
82, which includes most of the Bell Laboratories
full-size machines
which run the
.UX
operating system.
Geographically, the machines range from Andover, Massachusetts to
Denver, Colorado.
.PP
Uucp has also
been used to set up another network
which connects a group of
systems in operational sites with the home site.
The two networks touch at one
Bell Labs computer.
.NH
Further Goals
.PP
Eventually, we would like to develop a full system of remote software
maintenance.
Conventional maintenance (a support group which mails tapes)
has many well-known disadvantages.
.[
brooks mythical man month 1975
.]
There are distribution errors and delays, resulting in old software
running at remote sites and old bugs continually reappearing.
These difficulties are aggravated when
there are 100 different small systems, instead of a few large ones.
.PP
The availability of file transfer on a network of compatible operating
systems
makes it possible just to send programs directly to the end user who wants them.
This avoids the bottleneck of negotiation and packaging in the central support
group.
The ``stockroom'' serves this function for new utilities
and fixes to old utilities.
However, it is still likely that distributions will not be sent
and installed as often as needed.
Users are justifiably suspicious of the ``latest version'' that has just
arrived; all too often it features the ``latest bug.''
What is needed is to address both problems simultaneously:
.IP 1.
Send distributions whenever programs change.
.IP 2.
Have sufficient quality control so that users will install them.
.LP
To do this, we recommend systematic regression testing both on the
distributing and receiving systems.
Acceptance testing on the receiving systems can be automated and
permits the local system to ensure that its essential work can continue
despite the constant installation of changes sent from elsewhere.
The work of writing the test sequences should be recovered in lower
counseling and distribution costs.
.PP
Some slow-speed network services are also being implemented.
We now have inter-system ``mail'' and ``diff,''
plus the many implied commands represented by ``uux.''
However, we still need inter-system ``write'' (real-time inter-user
communication) and ``who'' (list of people logged in
on different systems).
A slow-speed network of this sort may be very useful
for speeding up counseling and education, even
if not fast enough for the distributed data base
applications that attract many users to networks.
Effective use of remote execution over slow-speed lines, however,
must await the general installation of multiplexable channels so
that long file transfers do not lock out short inquiries.
.NH
Lessons
.PP
The following is a summary of the lessons we learned in
building these programs.
.IP 1.
By starting your network in a way that requires no hardware or major operating system
changes, you can get going quickly.
.IP 2.
Support will follow use.
Since the network existed and was being used, system maintainers
were easily persuaded to help keep it operating, including purchasing
additional hardware to speed traffic.
.IP 3.
Make the network commands look like local commands.
Our users have a resistance to learning anything new:
all the inter-system commands look very similar to
standard
.UX
system
commands so that little training cost
is involved.
.IP 4.
An initial error was not coordinating enough
with existing communications projects: thus, the first
version of this network was restricted to dial-up, since
it did not support the various hardware links between systems.
This has been fixed in the current system.
.SH
Acknowledgements
.PP
We thank G. L. Chesson for his design and implementation
of the packet driver and protocol, and A. S. Cohen, J. Lions,
and P. F. Long for their suggestions and assistance.
.[
$LIST$
.]