computer systems has been established using the
telephone system as its primary communication medium.
The network was designed to meet the growing demands for
software distribution and exchange.
Some advantages of our design are:
A system needs only a dial-up port,
but systems with automatic calling units have much more
No operating system changes are required to install or use the system.
Communication is primarily over dial-up lines;
however, hardwired communication lines can be used
The command for sending/receiving files is simple to use.
Keywords: networks, communications, software distribution, software maintenance
The widespread use of the UNIX system [Ritchie and Thompson, BSTJ, 1978]
has produced problems of software distribution and maintenance.
A conventional mechanism was set up to distribute the operating
system and associated programs from a central site to the
However, this mechanism alone does not meet all software
Remote sites generate much software and must transmit it to
are themselves central sites for redistribution
of a particular specialized utility,
such as the Switching Control Center System.
Other sites have particular, often long-distance needs for
software exchange; switching research,
for example, is carried on in
New Jersey, Illinois, Ohio, and Colorado.
In addition, general purpose utility programs are written at
and enhanced by many people in many places and
it would be very constricting to deliver new software in a one-way
stream with no means
for the user sites to respond with changes of their own.
Straightforward software distribution is only part of the problem.
A large project may exceed the capacity of a single computer, and
several machines may be used by one group of people.
It then becomes necessary
for them to pass messages, data and other information back and forth
Several groups with similar problems, both inside and outside of
Bell Laboratories, have constructed networks built of
hardwired connections only.
[Chesson]
Our network, however, uses both dial-up and hardwired
connections so that service can be provided to as many sites as possible.
Although some of our machines are connected directly, others
can only communicate over low-speed dial-up lines.
Since the dial-up lines are often unavailable
and file transfers may take considerable time,
we spool all work and transmit in the background.
We also had to adapt to a community of systems which are independently
operated and resistant to suggestions that they should all
buy particular hardware or install particular operating system
Therefore, we make minimal demands on the local sites
Our implementation requires no operating system changes;
in fact, the transfer programs look like any other user
entering the system through the normal dial-up login ports,
and obeying all local protection rules.
We distinguish ``active'' and ``passive'' systems
Active systems have an automatic calling unit
or a hardwired line to another system,
and can initiate a connection.
Passive systems do not have the hardware
to initiate a connection.
An active system can be assigned the job of calling passive
systems and executing work found there;
this makes a passive system the functional equivalent of
an active system, except for an additional delay while it waits to be polled.
Also, people frequently log into active systems and
request copying from one passive system to another.
This requires two telephone calls, but even so, it is faster
Where convenient, we use hardwired communication lines.
These permit much faster transmission and multiplexing
Dial-up connections are made at either 300 or 1200 baud;
hardwired connections are asynchronous up to 9600 baud
and might run even faster on special-purpose communications
[Fraser, Datamation, 1975]
Thus, systems typically join our network first as passive systems; as
they find the service more important, they acquire
automatic calling units and become active
systems; eventually, they may install high-speed
links to particular machines with which they
handle a great deal of traffic.
At no point, however, must users change their
The basic operation of the network is very simple.
Each participating system has a spool directory,
in which work to be done (files to be moved, or commands to be executed
This program starts by identifying a particular communication channel
to a remote system with which it will hold a conversation.
It then selects a device and establishes the connection,
logs onto the remote machine
program on the remote machine.
Once two of these programs are connected, they first agree on a line protocol,
and then start exchanging work.
Each program in turn, beginning with the calling (active system) program,
transmits everything it needs, and then asks the other what it wants done.
Eventually neither has any more work, and both exit.
In this way, all services are available from all sites; passive sites,
however, must wait until called.
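
The turn-taking described above can be sketched in a few lines. This is a minimal illustration and not the transfer program itself: the function name `converse` and the job names are hypothetical, and each side's spooled work is modelled as a plain list.

```python
def converse(caller_work, called_work):
    """Simulate one conversation; return (sender, job) pairs in the order sent.

    Hypothetical model: each side's spooled work is a list of job names.
    """
    sending = ("caller", list(caller_work))   # the calling side goes first
    waiting = ("called", list(called_work))
    log = []
    while True:
        name, queue = sending
        while queue:                          # transmit everything it needs
            log.append((name, queue.pop(0)))
        if waiting[1]:                        # the other side still has work,
            sending, waiting = waiting, sending   # so the two swap turns
        else:
            return log                        # neither has work; both exit
```

Calling `converse(["file1", "file2"], ["file3"])` sends both of the caller's jobs first and then the called side's job, after which the conversation ends.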
A variety of protocols may be used; this conforms to the real,
As long as the caller and called programs have a protocol in common,
Furthermore, each caller knows the hours when each destination system
If a destination is unavailable, the data intended for it
remain in the spool directory until the destination machine can be reached.
The implementation of this
Bell Laboratories network
between independent sites, all of which
store proprietary programs and data,
illustrates the pervasive need for security
and administrative controls over file access.
Each site, in configuring its programs and system files,
limits and monitors transmission.
In order to access a file a user needs access permission
for the machine that contains the file and access permission
This is achieved by first requiring the user to log into his local machine
with his password; the local machine then logs into the remote machine
whose files are to be accessed.
In addition, records are kept identifying all files
that are moved into and out of the local system,
and how the requestor of such accesses identified
and request work to be done;
the calling users are then called back
before the work is actually done.
It is then possible to verify
that the request is legitimate from the standpoint of the
target system, as well as the originating system.
Furthermore, because of the call-back,
no site can masquerade as another
even if it knows all the necessary passwords.
Each machine can optionally maintain a sequence count for
conversations with other machines and require a verification of the
count at the start of each conversation.
Thus, even if call-back is not in use, a successful masquerade requires
the calling party to present the correct sequence number.
A would-be impersonator must not just steal the correct phone number,
user name, and password, but also the sequence count, and must call in
sufficiently promptly to precede the next legitimate request from either side.
Even a successful masquerade will be detected on the next correct
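
The sequence-count check can be illustrated with a small model. This is a sketch under stated assumptions: the class `Site`, its method names, and the in-memory dictionary are hypothetical, and the real programs keep their counts in files whose format is not described here.

```python
class Site:
    """One machine's record of conversation counts (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.counts = {}   # peer name -> conversations completed so far

    def call(self, other, claimed_name=None, claimed_count=None):
        """Call `other`; return True if the sequence check passes.

        claimed_name and claimed_count let the example model an impersonator
        presenting a stolen identity and a stolen count.
        """
        name = claimed_name or self.name
        if claimed_count is None:
            count = self.counts.get(other.name, 0)
        else:
            count = claimed_count
        if count != other.counts.get(name, 0):
            return False                      # mismatch: conversation refused
        other.counts[name] = count + 1        # called side advances its count
        if claimed_name is None:              # a genuine caller advances too
            self.counts[other.name] = count + 1
        return True
```

In this model, an impersonator who presents the current count is accepted once, but the called side's count then runs ahead of the legitimate caller's, so the next genuine call fails the check and exposes the intrusion.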
The user has two commands which set up communications,
to set up command execution where some of the required
resources (system and/or files)
are not on the local machine.
Each of these commands will put work and data files
into the spool directory for execution by
Figure 1 shows the major blocks of the file transfer process.
program is used to perform all communications between
It performs the following functions:
Scan the spool directory for work.
Place a call to a remote system.
Negotiate a line protocol to be used.
Execute all requests from both systems.
Log work requests and work completions.
may be started in several ways;
The file names in the spool directory are constructed to allow the
to determine the files they should look at,
the remote machines they should call
and the order in which the files for a particular
remote machine should be processed.
The call is made using information from several
files which reside in the uucp program directory.
At the start of the call process, a lock is
set on the system being called so that another
call will not be attempted at the same time.
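
One common way to implement such a lock is an exclusively created file. The sketch below is an assumption about technique and not a description of the actual program: the `LCK.` file-name prefix, the function names, and the choice to store the process id are all hypothetical.

```python
import os

def try_lock(lock_dir, system):
    """Create a lock file for `system`; return its path, or None if held."""
    path = os.path.join(lock_dir, "LCK." + system)   # hypothetical file name
    try:
        # O_EXCL makes creation fail if the file already exists, so only
        # one caller can obtain the lock.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return None
    os.write(fd, str(os.getpid()).encode())          # record the holder
    os.close(fd)
    return path

def unlock(path):
    """Release a lock obtained from try_lock."""
    os.remove(path)
```

A second call to `try_lock` for the same system returns `None` until the first holder calls `unlock`.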
The system name is found in a
The information contained for each system is:
(days-of-week and times-of-day),
device or device type to be used for call,
login information (multiple fields).
The time field is checked against the present time to see
if the call should be made.
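
The time check might look like the following. The representation is assumed for illustration only: the text says the field holds days of the week and times of day but does not give its syntax, so the sketch takes an already-parsed set of weekdays and a start and end time.

```python
from datetime import datetime, time

def call_allowed(days, start, end, now):
    """Return True if `now` falls inside the permitted calling window.

    days: set of weekday numbers (Monday = 0); start, end: datetime.time.
    Simplification: the weekday tested is always that of `now`, even when
    the window wraps past midnight.
    """
    if now.weekday() not in days:
        return False
    t = now.time()
    if start <= end:
        return start <= t <= end
    return t >= start or t <= end     # window wraps past midnight
```

For example, `call_allowed({0, 1, 2, 3, 4}, time(18, 0), time(6, 0), now)` permits calls on weekday evenings and early mornings.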
may contain abbreviations (e.g. ``nyc'', ``boston'') which get translated into dial
This permits the same ``phone number'' to be stored at every site, despite
local variations in telephone services and dialing conventions.
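
The abbreviation scheme can be sketched as a per-site lookup table. Everything concrete here is hypothetical: the function name `expand_phone`, the table contents, and the digit strings are invented for illustration.

```python
import re

def expand_phone(number, dial_codes):
    """Replace a leading alphabetic abbreviation with the local dial sequence.

    dial_codes maps an abbreviation to the digits this site must dial.
    Raises KeyError if the abbreviation is not in the table.
    """
    match = re.match(r"([A-Za-z]+)(.*)", number)
    if not match:
        return number                  # no abbreviation present
    abbrev, rest = match.groups()
    return dial_codes[abbrev] + rest
```

Two sites can then store the same entry, say `nyc5551234`, while each supplies its own table: one might map `nyc` to `212` and another to `91212`.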
file is scanned using fields [3] and [4] from the
file to find an available device for the connection.
The program will try all devices which satisfy
[3] and [4] until a connection is made, or no more
If a non-multiplexable device is successfully opened, a lock file
is created so that another copy of
If the connection is complete, the
is used to log into the remote system.
a command is sent to the remote system
The conversation between the two
programs begins with a handshake started by the called,
sends a message to let the
know it is ready to receive the system
identification and conversation sequence number.
and if acceptable, protocol selection begins.
The remote system sends a message
is a string of characters, each
representing a line protocol.
The calling program checks the proto-list
for a letter corresponding to an available line
where code is either a one character
which means there is no common protocol.
Greg Chesson designed and implemented the standard
line protocol used by the uucp transmission program.
Other protocols may be added by individual installations.
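
Protocol selection amounts to finding a letter both sides know. In this minimal sketch the function name, the example letters, and the rule of taking the first match in the remote system's list are assumptions; the text does not say how ties are broken.

```python
def choose_protocol(proto_list, available):
    """Return the first letter of proto_list found in `available`, else None.

    proto_list: string of one-letter protocol codes offered by the remote side.
    available: set of letters for the protocols the caller implements.
    None stands for "no common protocol".
    """
    for letter in proto_list:
        if letter in available:
            return letter
    return None
```

With an offer of `"xyz"` and a caller that implements `y` and `z`, the result is `y`; with no letter in common the result is `None` and the conversation cannot proceed.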
During processing, one program is the
Initially, the calling program is the
These roles may switch one or more times during
There are four messages used during the
work processing, each specified by the first
character of the message.
messages until all work from the spool directory is
complete, at which point an
SY, SN, RY, RN, HY, HN,
The send and receive replies are
based on permission to access the
requested file/directory.
After each file is copied into the spool directory
a copy-complete message is sent by the receiver of the file.
command, used to copy from the spool directory, is successful.
The requests and results are logged on both systems,
and, if requested, mail is sent to the user reporting completion
(or the user can request status information from the log program at any time).
The hangup response is determined by the
program by a work scan of the spool directory.
If work for the remote system exists in the
message is sent and the programs switch roles.
A sample conversation is shown in Figure 2.
message is received by the
and the protocols are turned off.
Each program sends a final "OO" message to the
One application of this software is remote mail.
writes ``mail dan'' to send mail to
By writing ``mail usg!dan''
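
An address of this form is easy to take apart. The helper below is a hypothetical sketch, not the mail program's own code; it splits on the first `!` only and treats a name with no `!` as local.

```python
def split_address(address):
    """Split 'system!user' into (system, user); system is None if local."""
    system, sep, user = address.partition("!")
    if not sep:
        return None, address          # no "!" present: a local user
    return system, user
```

Here `split_address("usg!dan")` yields `("usg", "dan")` and `split_address("dan")` yields `(None, "dan")`.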
The primary uses of our network to date have been in software maintenance.
Relatively few of the bytes passed between systems are intended for
Instead, new programs (or new versions of programs)
are sent to users, and potential bugs are returned to authors.
Aaron Cohen has implemented a
``stockroom'' which allows remote users to call in and request software.
He keeps a ``stock list'' of available programs, and new bug
fixes and utilities are added regularly.
In this way, users can always obtain the latest version of anything
without bothering the authors of the programs.
Although the stock list is maintained on a particular system,
the items in the stockroom may be warehoused in many places;
typically each program is distributed from the home site of
Where necessary, uucp does remote-to-remote copies.
We also routinely retrieve test cases from other systems
to determine whether errors on remote systems are caused
by local misconfigurations or old versions of software,
or whether they are bugs that must be fixed at the home site.
This helps identify errors rapidly.
For one set of test programs maintained by us,
over 70% of the bugs reported from remote sites
were due to old software, and were fixed
merely by distributing the current version.
Another application of the network for software maintenance
is to compare files on two different machines.
A very useful utility on one machine has been
Doug McIlroy's ``diff'' program
which compares two text files and indicates the differences,
line by line, between them.
not identical are printed.
compares files (or directories) on two machines.
One of these directories may be on a passive system.
is set up to work similarly to the inter-system mail, but it is slightly
To avoid moving large numbers of usually identical
on each side, and only moves files that are different
For large files, this process can be iterated; checksums can be computed
for each line, and only those lines that are different
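
The checksum comparison can be sketched as follows. The choice of CRC-32 and the function names are assumptions made for illustration; the text says only that checksums are used. Files are modelled as a mapping from name to contents so the example is self-contained.

```python
import zlib

def checksums(files):
    """Map each file name to a CRC-32 of its contents (files: name -> bytes)."""
    return {name: zlib.crc32(data) for name, data in files.items()}

def files_to_move(local_sums, remote_sums):
    """Names whose checksums differ, or that exist on only one side."""
    names = set(local_sums) | set(remote_sums)
    return sorted(n for n in names if local_sums.get(n) != remote_sums.get(n))
```

Each side computes its own table, only the small tables cross the line, and only the files named by `files_to_move` need to be transmitted for a full comparison.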
been useful for providing remote output.
There are some machines which do not have hard-copy
devices, but which are connected over 9600 baud
communication lines to machines with printers.
command allows the formatting of the
printout on the local machine and printing on the
remote machine using standard
Throughput, of course, is primarily dependent on transmission speed.
The table below shows the real throughput of characters
on communication links of different speeds.
These numbers represent actual data transferred;
they do not include bytes used by the line protocol for
data validation such as checksums and messages.
At the higher speeds, contention for the processors on both
ends prevents the network from driving the line full speed.
The range of speeds represents the difference between light and
heavy loads on the two systems.
If desired, operating system modifications can
that permit full use of even very fast links.
Nominal speed Characters/sec.
In addition to the transfer time, there is some overhead
for making the connection and logging in ranging from
Even at 300 baud, however, a typical 5,000 byte source program
four minutes instead of the 2 days that might be required
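
As a rough check on the 300 baud figure, the ideal line time can be computed directly. The sketch assumes 10 bits per character (one start bit, eight data bits, one stop bit), a common asynchronous framing that is not stated in the text, and it ignores protocol and connection overhead.

```python
def transfer_seconds(nbytes, baud, bits_per_char=10):
    """Ideal line time in seconds, ignoring protocol and connection overhead."""
    return nbytes * bits_per_char / baud
```

Under these assumptions, 5,000 bytes at 300 baud take about 167 seconds of line time, a little under three minutes, which leaves room for protocol and connection overhead within a figure of a few minutes.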
Traffic between systems is variable. Between two
20 files moved and 5 remote commands executed in a typical day.
More typical traffic out of a single system would be around
The total number of sites at present
82, which includes most of the Bell Laboratories
Geographically, the machines range from Andover, Massachusetts to
been used to set up another network
which connects a group of
systems in operational sites with the home site.
The two networks touch at one
Eventually, we would like to develop a full system of remote software
Conventional maintenance (a support group which mails tapes)
has many well-known disadvantages.
[Brooks, The Mythical Man-Month, 1975]
There are distribution errors and delays, resulting in old software
running at remote sites and old bugs continually reappearing.
These difficulties are aggravated when
there are 100 different small systems, instead of a few large ones.
The availability of file transfer on a network of compatible operating
makes it possible to send programs directly to the end users who want them.
This avoids the bottleneck of negotiation and packaging in the central support
The ``stockroom'' serves this function for new utilities
and fixes to old utilities.
However, it is still likely that distributions will not be sent
and installed as often as needed.
Users are justifiably suspicious of the ``latest version'' that has just
arrived; all too often it features the ``latest bug.''
What is needed is to address both problems simultaneously:
Send distributions whenever programs change.
Have sufficient quality control so that users will install them.
To do this, we recommend systematic regression testing on both the
distributing and receiving systems.
Acceptance testing on the receiving systems can be automated and
permits the local system to ensure that its essential work can continue
despite the constant installation of changes sent from elsewhere.
The work of writing the test sequences should be recovered in lower
counseling and distribution costs.
Some slow-speed network services are also being implemented.
We now have inter-system ``mail'' and ``diff,''
plus the many implied commands represented by ``uux.''
However, we still need inter-system ``write'' (real-time inter-user
communication) and ``who'' (list of people logged in
A slow-speed network of this sort may be very useful
for speeding up counseling and education, even
if not fast enough for the distributed data base
applications that attract many users to networks.
Effective use of remote execution over slow-speed lines, however,
must await the general installation of multiplexable channels so
that long file transfers do not lock out short inquiries.
The following is a summary of the lessons we learned in
By starting your network in a way that requires no hardware or major operating system
changes, you can get going quickly.
Since the network existed and was being used, system maintainers
were easily persuaded to help keep it operating, including purchasing
additional hardware to speed traffic.
Make the network commands look like local commands.
Our users resist learning anything new:
all the inter-system commands look very similar to
commands so that little training cost
An initial error was not coordinating enough
with existing communications projects: thus, the first
version of this network was restricted to dial-up, since
it did not support the various hardware links between systems.
This has been fixed in the current system.
We thank G. L. Chesson for his design and implementation
of the packet driver and protocol, and A. S. Cohen, J. Lions,
and P. F. Long for their suggestions and assistance.