From f0b8d5679051f10d232453f4b30ff463781bbaa9 Mon Sep 17 00:00:00 2001 From: CSRG Date: Sun, 31 Jul 1983 19:06:40 -0800 Subject: [PATCH] BSD 4_2 development Work on file usr/doc/fp/Makefile Work on file usr/doc/ipc/0.t Work on file usr/doc/ipc/2.t Work on file usr/doc/ipc/4.t Work on file usr/doc/ipc/1.t Work on file usr/doc/ipc/3.t Work on file usr/doc/ipc/5.t Synthesized-from: CSRG/cd1/4.2 --- usr/doc/fp/Makefile | 7 + usr/doc/ipc/0.t | 52 +++++ usr/doc/ipc/1.t | 67 +++++++ usr/doc/ipc/2.t | 457 ++++++++++++++++++++++++++++++++++++++++++++ usr/doc/ipc/3.t | 312 ++++++++++++++++++++++++++++++ usr/doc/ipc/4.t | 408 +++++++++++++++++++++++++++++++++++++++ usr/doc/ipc/5.t | 348 +++++++++++++++++++++++++++++++++ 7 files changed, 1651 insertions(+) create mode 100644 usr/doc/fp/Makefile create mode 100644 usr/doc/ipc/0.t create mode 100644 usr/doc/ipc/1.t create mode 100644 usr/doc/ipc/2.t create mode 100644 usr/doc/ipc/3.t create mode 100644 usr/doc/ipc/4.t create mode 100644 usr/doc/ipc/5.t diff --git a/usr/doc/fp/Makefile b/usr/doc/fp/Makefile new file mode 100644 index 0000000000..117a2ec103 --- /dev/null +++ b/usr/doc/fp/Makefile @@ -0,0 +1,7 @@ +TROFF= vtroff +FP_MAN = manDefs.rno manCh0.rno manCh1.rno\ + manCh2.rno manCh3.rno manCh4.rno manCh5.rno\ + manCh6.rno manCh7.rno refs.rno manApp.rno + +man: ${FP_MAN} + @tbl ${FP_MAN} | eqn | ${TROFF} -me diff --git a/usr/doc/ipc/0.t b/usr/doc/ipc/0.t new file mode 100644 index 0000000000..fcccf8dd2f --- /dev/null +++ b/usr/doc/ipc/0.t @@ -0,0 +1,52 @@ +.ds lq `` +.ds rq '' +.de DT +.if t .ta .5i 1.25i 2.5i 3.75i +.\" 3.5i went to 3.8i +.if n .ta .7i 1.75i 3.8i +.. +.bd S B 3 +.TL +A 4.2BSD Interprocess Communication Primer +.br +DRAFT of \*(DY +.AU +Samuel J. Leffler +.AU +Robert S. Fabry +.AU +William N. Joy +.AI +Computer Systems Research Group +Department of Electrical Engineering and Computer Science +University of California, Berkeley +Berkeley, California 94720 +(415) 642-7780 +.de IR +\fI\\$1\fP\\$2 +.. 
+.de UX +UNIX\\$1 +.. +.AB +.PP +.FS +* DEC and VAX are trademarks of +Digital Equipment Corporation. +.FE +.FS +** \s-2UNIX\s0 is a Trademark of Bell Laboratories. +.FE +This document provides an introduction to the interprocess +communication facilities included in the +4.2BSD release of the VAX* +.UX ** +system. +.PP +It discusses the overall model for interprocess communication +and introduces the interprocess communication primitives +which have been added to the system. The majority of the +document considers the use of these primitives in developing +applications. The reader is expected to be familiar with +the C programming language as all examples are written in C. +.AE diff --git a/usr/doc/ipc/1.t b/usr/doc/ipc/1.t new file mode 100644 index 0000000000..b1ed2f1a8f --- /dev/null +++ b/usr/doc/ipc/1.t @@ -0,0 +1,67 @@ +.ds LH "4.2BSD IPC Primer +.ds RH Introduction +.LP +.nr H1 1 +.bp +.ds RF "Leffler/Fabry/Joy +.ds LF "DRAFT of \*(DY +.ds CF " +.LG +.B +.ce +1. INTRODUCTION +.sp 2 +.R +.NL +One of the most important parts of 4.2BSD is the interprocess +communication facilities. These facilities are the result of +more than two years of discussion and research. The facilities +provided in 4.2BSD incorporate many of the ideas from current +research, while trying to maintain the UNIX philosophy of +simplicity and conciseness. It is hoped that +the interprocess communication +facilities included in 4.2BSD will establish a +standard for UNIX. From the response to the design, +it appears many organizations carrying out +work with UNIX are adopting it. +.PP +UNIX has previously been very weak in the area of interprocess +communication. Prior to the 4.2BSD facilities, the only +standard mechanism which allowed two processes to communicate was +pipes (the mpx files which were part of Version 7 were +experimental). Unfortunately, pipes are very restrictive +in that +the two communicating processes must be related through a +common ancestor.
+Further, the semantics of pipes make them almost impossible +to maintain in a distributed environment. +.PP +Earlier attempts at extending the ipc facilities of UNIX have +met with mixed reaction. The majority of the problems have +been related to the fact that these facilities have been tied to +the UNIX file system; either through naming, or implementation. +Consequently, the ipc facilities provided in 4.2BSD have been +designed as a totally independent subsystem. The 4.2BSD ipc +allows processes to rendezvous in many ways. +Processes may rendezvous through a UNIX file system-like +name space (a space where all names are path names) +as well as through a +network name space. In fact, new name spaces may +be added at a future time with only minor changes visible +to users. Further, the communication facilities +have been extended to include more than the simple byte stream +provided by a pipe-like entity. These extensions have resulted +in a completely new part of the system which users will need +time to familiarize themselves with. It is likely that as +more use is made of these facilities they will be refined; +only time will tell. +.PP +The remainder of this document is organized in four sections. +Section 2 introduces the new system calls and the basic model +of communication. Section 3 describes some of the supporting +library routines users may find useful in constructing distributed +applications. Section 4 is concerned with the client/server model +used in developing applications and includes examples of the +two major types of servers. Section 5 delves into advanced topics +which sophisticated users are likely to encounter when using +the ipc facilities. diff --git a/usr/doc/ipc/2.t b/usr/doc/ipc/2.t new file mode 100644 index 0000000000..8f430df24c --- /dev/null +++ b/usr/doc/ipc/2.t @@ -0,0 +1,457 @@ +.ds RH "Basics +.bp +.nr H1 2 +.nr H2 0 +.bp +.LG +.B +.ce +2. BASICS +.sp 2 +.R +.NL +.PP +The basic building block for communication is the \fIsocket\fP.
+A socket is an endpoint of communication to which a name may +be \fIbound\fP. Each socket in use has a \fItype\fP +and one or more associated processes. Sockets exist within +\fIcommunication domains\fP. +A communication domain is an +abstraction introduced to bundle common properties of +processes communicating through sockets. +One such property is the scheme used to name sockets. For +example, in the UNIX communication domain sockets are +named with UNIX path names; e.g. a +socket may be named \*(lq/dev/foo\*(rq. Sockets normally +exchange data only with +sockets in the same domain (it may be possible to cross domain +boundaries, but only if some translation process is +performed). The +4.2BSD ipc supports two separate communication domains: +the UNIX domain, and the Internet domain, which is used by +processes which communicate +using the DARPA standard communication protocols. +The underlying communication +facilities provided by these domains have a significant influence +on the internal system implementation as well as the interface to +socket facilities available to a user. An example of the +latter is that a socket \*(lqoperating\*(rq in the UNIX domain +sees a subset of the error conditions which are possible +when operating in the Internet domain. +.NH 2 +Socket types +.PP +Sockets are +typed according to the communication properties visible to a +user. +Processes are presumed to communicate only between sockets of +the same type, although there is +nothing that prevents communication between sockets of different +types should the underlying communication +protocols support this. +.PP +Three types of sockets currently are available to a user. +A \fIstream\fP socket provides for the bidirectional, reliable, +sequenced, and unduplicated flow of data without record boundaries. +Aside from the bidirectionality of data flow, a pair of connected +stream sockets provides an interface nearly identical to that of pipes*.
+.FS +* In the UNIX domain, in fact, the semantics are identical and, +as one might expect, pipes have been implemented internally +as simply a pair of connected stream sockets. +.FE +.PP +A \fIdatagram\fP socket supports bidirectional flow of data which +is not promised to be sequenced, reliable, or unduplicated. +That is, a process +receiving messages on a datagram socket may find messages duplicated, +and, possibly, +in an order different from the order in which they were sent. +An important characteristic of a datagram +socket is that record boundaries in data are preserved. Datagram +sockets closely model the facilities found in many contemporary +packet switched networks such as the Ethernet. +.PP +A \fIraw\fP socket provides users access to +the underlying communication +protocols which support socket abstractions. +These sockets are normally datagram oriented, though their +exact characteristics are dependent on the interface provided by +the protocol. Raw sockets are not intended for the general user; they +have been provided mainly for those interested in developing new +communication protocols, or for gaining access to some of the more +esoteric facilities of an existing protocol. The use of raw sockets +is considered in section 5. +.PP +Two potential socket types which have interesting properties are +the \fIsequenced packet\fP socket and the \fIreliably delivered +message\fP socket. A sequenced packet socket is identical to +a stream socket +with the exception that record boundaries are preserved. This interface +is very similar to that provided by the Xerox NS Sequenced Packet protocol. +The reliably delivered message socket has +similar properties to a datagram socket, but with +reliable delivery. While these two socket types have been loosely defined, +they are currently unimplemented in 4.2BSD. As such, in this +document we will concern ourselves +only with the three socket types for which support exists.
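The record boundary distinction between stream and datagram sockets can be observed directly. What follows is a minimal sketch rather than one of the primer's own examples. It assumes a current POSIX system (ANSI prototypes, <unistd.h>, the ssize_t type) and uses the socketpair call to obtain two connected UNIX domain datagram sockets; the function name first_datagram_length is hypothetical. Two messages are sent, and a single receive into a buffer large enough for both is expected to deliver only the first.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: send two messages over a connected pair of
 * UNIX domain datagram sockets and return the number of bytes the
 * first receive delivers, or -1 on any failure.
 */
int
first_datagram_length(const char *first, const char *second)
{
	int sv[2];
	char buf[64];
	size_t l1, l2;
	ssize_t n;

	assert(first != NULL && second != NULL);
	l1 = strlen(first);
	l2 = strlen(second);
	/* both messages together must fit in the receive buffer */
	assert(l1 + l2 <= sizeof (buf));

	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
		return (-1);
	/* two separate records, sent back to back */
	if (send(sv[0], first, l1, 0) != (ssize_t)l1 ||
	    send(sv[0], second, l2, 0) != (ssize_t)l2) {
		close(sv[0]);
		close(sv[1]);
		return (-1);
	}
	/* a datagram receive stops at the record boundary */
	n = recv(sv[1], buf, sizeof (buf), 0);
	close(sv[0]);
	close(sv[1]);
	return ((int)n);
}
```

Called as first_datagram_length("abc", "defgh"), the helper reports the length of the first record only. Were the pair created with SOCK_STREAM instead, a single receive would be free to return bytes from both writes, since a stream carries no record boundaries.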
+.NH 2 +Socket creation +.PP +To create a socket the \fIsocket\fP system call is used: +.DS +s = socket(domain, type, protocol); +.DE +This call requests that the system create a socket in the specified +\fIdomain\fP and of the specified \fItype\fP. A particular protocol may +also be requested. If the protocol is left unspecified (a value +of 0), the system will select an appropriate protocol from those +protocols which comprise the communication domain and which +may be used to support the requested socket type. The user is +returned a descriptor (a small integer number) which may be used +in later system calls which operate on sockets. The domain is specified as +one of the manifest constants defined in the file <\fIsys/socket.h\fP>. +For the UNIX domain the constant is AF_UNIX*; for the Internet +.FS +* The manifest constants are named AF_whatever as they indicate +the ``address format'' to use in interpreting names. +.FE +domain AF_INET. The socket types are also defined in this file +and one of SOCK_STREAM, SOCK_DGRAM, or SOCK_RAW must be specified. +To create a stream socket in the Internet domain the following +call might be used: +.DS +s = socket(AF_INET, SOCK_STREAM, 0); +.DE +This call would result in a stream socket being created with the TCP +protocol providing the underlying communication support. To +create a datagram socket for on-machine use a sample call might +be: +.DS +s = socket(AF_UNIX, SOCK_DGRAM, 0); +.DE +.PP +To obtain a particular protocol one selects the protocol number, +as defined within the communication domain. For the Internet +domain the available protocols are defined in <\fInetinet/in.h\fP> +or, better yet, one may use one of the library routines +discussed in section 3, such as \fIgetprotobyname\fP: +.DS +#include <sys/types.h> +#include <sys/socket.h> +#include <netinet/in.h> +#include <netdb.h> + ... +pp = getprotobyname("tcp"); +s = socket(AF_INET, SOCK_STREAM, pp->p_proto); +.DE +.PP +There are several reasons a socket call may fail.
Aside from +the rare occurrence of lack of memory (ENOBUFS), a socket +request may fail due to a request for an unknown protocol +(EPROTONOSUPPORT), or a request for a type of socket for +which there is no supporting protocol (EPROTOTYPE). +.NH 2 +Binding names +.PP +A socket is created without a name. Until a name is bound +to a socket, processes have no way to reference it and, consequently, +no messages may be received on it. The \fIbind\fP call is used to +assign a name to a socket: +.DS +bind(s, name, namelen); +.DE +The bound name is a variable length byte string which is interpreted +by the supporting protocol(s). Its interpretation may vary from +communication domain to communication domain (this is one of +the properties which comprise the \*(lqdomain\*(rq). In the +UNIX domain names are path names while in the Internet domain +names contain an Internet address and port number. +If one wanted to bind the name \*(lq/dev/foo\*(rq to +a UNIX domain socket, the following would be used: +.DS +bind(s, "/dev/foo", sizeof ("/dev/foo") \- 1); +.DE +(Note how the null byte in the name is not counted as part of +the name.) In binding an Internet address things become more +complicated. The actual call is simple, +.DS +#include <sys/types.h> +#include <netinet/in.h> + ... +struct sockaddr_in sin; + ... +bind(s, &sin, sizeof (sin)); +.DE +but the selection of what to place in the address \fIsin\fP +requires some discussion. We will come back to the problem +of formulating Internet addresses in section 3 when +the library routines used in name resolution are discussed. +.NH 2 +Connection establishment +.PP +With a bound socket it is possible to rendezvous with +an unrelated process. This operation is usually asymmetric +with one process a \*(lqclient\*(rq and the other a \*(lqserver\*(rq. +The client requests services from the server by initiating a +\*(lqconnection\*(rq to the server's socket. The server, when +willing to offer its advertised services, passively \*(lqlistens\*(rq +on its socket.
On the client side the \fIconnect\fP call is +used to initiate a connection. Using the UNIX domain, this +might appear as, +.DS +connect(s, "server-name", sizeof ("server-name")); +.DE +while in the Internet domain, +.DS +struct sockaddr_in server; +connect(s, &server, sizeof (server)); +.DE +If the client process's socket is unbound at the time of +the connect call, +the system will automatically select and bind a name to +the socket; c.f. section 5.4. +An error is returned when the connection was unsuccessful +(any name automatically bound by the system, however, remains). +Otherwise, the socket is associated with the server and +data transfer may begin. +.PP +Many errors can be returned when a connection attempt +fails. The most common are: +.IP ETIMEDOUT +.br +After failing to establish a connection for a period of time, +the system decided there was no point in retrying the +connection attempt any more. This usually occurs because +the destination host is down, or because problems in +the network resulted in transmissions being lost. +.IP ECONNREFUSED +.br +The host refused service for some reason. When connecting +to a host running 4.2BSD this is usually +due to a server process +not being present at the requested name. +.IP "ENETDOWN or EHOSTDOWN" +.br +These operational errors are +returned based on status information delivered to +the client host by the underlying communication services. +.IP "ENETUNREACH or EHOSTUNREACH" +.br +These operational errors can occur either because the network +or host is unknown (no route to the network or host is present), +or because of status information returned by intermediate +gateways or switching nodes. Many times the status returned +is not sufficient to distinguish a network being down from a +host being down. In these cases the system is conservative and +indicates the entire network is unreachable. +.PP +For the server to receive a client's connection it must perform +two steps after binding its socket. 
+The first is to indicate a willingness to listen for +incoming connection requests: +.DS +listen(s, 5); +.DE +The second parameter to the \fIlisten\fP call specifies the maximum +number of outstanding connections which may be queued awaiting +acceptance by the server process. Should a connection be +requested while the queue is full, the connection will not be +refused, but rather the individual messages which comprise the +request will be ignored. This gives a harried server time to +make room in its pending connection queue while the client +retries the connection request. Had the connection been returned +with the ECONNREFUSED error, the client would be unable to tell +if the server was up or not. As it is now it is still possible +to get the ETIMEDOUT error back, though this is unlikely. The +backlog figure supplied with the listen call is limited +by the system to a maximum of 5 pending connections on any +one queue. This avoids the problem of processes hogging system +resources by setting an infinite backlog, then ignoring +all connection requests. +.PP +With a socket marked as listening, a server may \fIaccept\fP +a connection: +.DS +fromlen = sizeof (from); +snew = accept(s, &from, &fromlen); +.DE +A new descriptor is returned on receipt of a connection (along with +a new socket). If the server wishes to find out who its client is, +it may supply a buffer for the client socket's name. The value-result +parameter \fIfromlen\fP is initialized by the server to indicate how +much space is associated with \fIfrom\fP, then modified on return +to reflect the true size of the name. If the client's name is not +of interest, the second parameter may be zero. +.PP +Accept normally blocks. That is, the call to accept +will not return until a connection is available or the system call +is interrupted by a signal to the process. Further, there is no +way for a process to indicate it will accept connections from only +a specific individual, or individuals. 
It is up to the user process +to consider who the connection is from and close down the connection +if it does not wish to speak to the process. If the server process +wants to accept connections on more than one socket, or not block +on the accept call there are alternatives; they will be considered +in section 5. +.NH 2 +Data transfer +.PP +With a connection established, data may begin to flow. To send +and receive data there are a number of possible calls. +With the peer entity at each end of a connection +anchored, a user can send or receive a message without specifying +the peer. As one might expect, in this case, then +the normal \fIread\fP and \fIwrite\fP system calls are useable, +.DS +write(s, buf, sizeof (buf)); +read(s, buf, sizeof (buf)); +.DE +In addition to \fIread\fP and \fIwrite\fP, +the new calls \fIsend\fP and \fIrecv\fP +may be used: +.DS +send(s, buf, sizeof (buf), flags); +recv(s, buf, sizeof (buf), flags); +.DE +While \fIsend\fP and \fIrecv\fP are virtually identical to +\fIread\fP and \fIwrite\fP, +the extra \fIflags\fP argument is important. The flags may be +specified as a non-zero value if one or more +of the following is required: +.DS +.TS +l l. +SOF_OOB send/receive out of band data +SOF_PREVIEW look at data without reading +SOF_DONTROUTE send data without routing packets +.TE +.DE +Out of band data is a notion specific to stream sockets, and one +which we will not immediately consider. The option to have data +sent without routing applied to the outgoing packets is currently +used only by the routing table management process, and is +unlikely to be of interest to the casual user. The ability +to preview data is, however, of interest. When SOF_PREVIEW +is specified with a \fIrecv\fP call, any data present is returned +to the user, but treated as still \*(lqunread\*(rq. That +is, the next \fIread\fP or \fIrecv\fP call applied to the socket will +return the data previously previewed. 
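The effect of previewing can be checked with a connected pair of stream sockets. The sketch below is illustrative only and rests on several assumptions: a current POSIX system, where the preview flag is spelled MSG_PEEK rather than the SOF_PREVIEW name used in this draft; the socketpair call to obtain the connected pair; and a hypothetical function name, preview_then_read. A message is sent, received once with the preview flag, and then received again without it.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: write msg on one end of a connected stream
 * socket pair, preview it, then read it.  Returns 1 if the preview
 * and the subsequent read deliver the same bytes, 0 if they differ,
 * and -1 on any failure.
 */
int
preview_then_read(const char *msg)
{
	int sv[2], same = -1;
	char peeked[64], taken[64];
	size_t len;
	ssize_t np, nr;

	assert(msg != NULL);
	len = strlen(msg);
	assert(len > 0 && len <= sizeof (peeked));

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return (-1);
	if (send(sv[0], msg, len, 0) == (ssize_t)len) {
		/* MSG_PEEK (assumed modern spelling) leaves the data queued */
		np = recv(sv[1], peeked, sizeof (peeked), MSG_PEEK);
		/* an ordinary receive then returns the previewed bytes */
		nr = recv(sv[1], taken, sizeof (taken), 0);
		if (np > 0 && nr > 0)
			same = (np == nr &&
			    memcmp(peeked, taken, (size_t)np) == 0);
	}
	close(sv[0]);
	close(sv[1]);
	return (same);
}
```

Previewing is useful when a process must inspect the head of a message, for instance to choose a handler, before committing to consume it.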
+.NH 2 +Discarding sockets +.PP +Once a socket is no longer of interest, it may be discarded +by applying a \fIclose\fP to the descriptor, +.DS +close(s); +.DE +If data is associated with a socket which promises reliable delivery +(e.g. a stream socket) when a close takes place, the system will +continue to attempt to transfer the data. +However, after a fairly long period of +time, if the data is still undelivered, it will be discarded. +Should a user have no use for any pending data, it may +perform a \fIshutdown\fP on the socket prior to closing it. +This call is of the form: +.DS +shutdown(s, how); +.DE +where \fIhow\fP is 0 if the user is no longer interested in reading +data, 1 if no more data will be sent, or 2 if no data is to +be sent or received. Applying shutdown to a socket causes +any data queued to be immediately discarded. +.NH 2 +Connectionless sockets +.PP +To this point we have been concerned mostly with sockets which +follow a connection oriented model. However, there is also +support for connectionless interactions typical of the datagram +facilities found in contemporary packet switched networks. +A datagram socket provides a symmetric interface to data +exchange. While processes are still likely to be client +and server, there is no requirement for connection establishment. +Instead, each message includes the destination address. +.PP +Datagram sockets are created as before, and each should +have a name bound to it in order that the recipient of +a message may identify the sender. To send data, +the \fIsendto\fP primitive is used, +.DS +sendto(s, buf, buflen, flags, &to, tolen); +.DE +The \fIs\fP, \fIbuf\fP, \fIbuflen\fP, and \fIflags\fP +parameters are used as before. +The \fIto\fP and \fItolen\fP +values are used to indicate the intended recipient of the +message. When using an unreliable datagram interface, it is +unlikely any errors will be reported to the sender. 
Where +information is present locally to recognize a message which may +never be delivered (for instance when a network is unreachable), +the call will return \-1 and the global value \fIerrno\fP will +contain an error number. +.PP +To receive messages on an unconnected datagram socket, the +\fIrecvfrom\fP primitive is provided: +.DS +recvfrom(s, buf, buflen, flags, &from, &fromlen); +.DE +Once again, the \fIfromlen\fP parameter is handled in +a value-result fashion, initially containing the size of +the \fIfrom\fP buffer. +.PP +In addition to the two calls mentioned above, datagram +sockets may also use the \fIconnect\fP call to associate +a socket with a specific address. In this case, any +data sent on the socket will automatically be addressed +to the connected peer, and only data received from that +peer will be delivered to the user. Only one connected +address is permitted for each socket (i.e. no multi-casting). +Connect requests on datagram sockets return immediately, +as this simply results in the system recording +the peer's address (as compared to a stream socket where a +connect request initiates establishment of an end to end +connection). +Other, less +important details of datagram sockets are described +in section 5. +.NH 2 +Input/Output multiplexing +.PP +One last facility often used in developing applications +is the ability to multiplex i/o requests among multiple +sockets and/or files. This is done using the \fIselect\fP +call: +.DS +select(nfds, &readfds, &writefds, &exceptfds, &timeout); +.DE +\fISelect\fP takes as arguments three bit masks, one for +the set of file descriptors from which the caller wishes to +be able to read data, one for those descriptors to which +data is to be written, and one for which exceptional conditions +are pending. +Bit masks are created +by or-ing bits of the form \*(lq1 << fd\*(rq. That is, +a descriptor \fIfd\fP is selected if a 1 is present in +the \fIfd\fP'th bit of the mask.
+The parameter \fInfds\fP specifies the range +of file descriptors (i.e. one plus the value of the largest +descriptor) specified in a mask. +.PP +A timeout value may be specified if the selection +is not to last more than a predetermined period of time. If +\fItimeout\fP is set to 0, the selection takes the form of a +\fIpoll\fP, returning immediately. If the last parameter is +a null pointer, the selection will block indefinitely*. +.FS +* To be more specific, a return takes place only when a +descriptor is selectable, or when a signal is received by +the caller, interrupting the system call. +.FE +\fISelect\fP normally returns the number of file descriptors selected. +If the \fIselect\fP call returns due to the timeout expiring, then +a value of 0 is returned; if it is interrupted by a signal, \-1 is returned along with the error number EINTR. +.PP +\fISelect\fP provides a synchronous multiplexing scheme. +Asynchronous notification of output completion, input availability, +and exceptional conditions is possible through use of the +SIGIO and SIGURG signals described in section 5. diff --git a/usr/doc/ipc/3.t b/usr/doc/ipc/3.t new file mode 100644 index 0000000000..b14c361cc7 --- /dev/null +++ b/usr/doc/ipc/3.t @@ -0,0 +1,312 @@ +.ds RH "Network Library Routines +.bp +.nr H1 3 +.nr H2 0 +.bp +.LG +.B +.ce +3. NETWORK LIBRARY ROUTINES +.sp 2 +.R +.NL +.PP +The discussion in section 2 indicated the possible need to +locate and construct network addresses when using the +interprocess communication facilities in a distributed +environment. To aid in this task a number of routines +have been added to the standard C run-time library. +In this section we will consider the new routines provided +to manipulate network addresses. While the 4.2BSD networking +facilities support only the DARPA standard Internet protocols, +these routines have been designed with flexibility in mind.
+As more communication protocols become available, we hope +the same user interface will be maintained in accessing +network-related address data bases. The only difference +should be the values returned to the user. Since these +values are normally supplied by the system, users should +not need to be directly aware of the communication protocol +and/or naming conventions in use. +.PP +Locating a service on a remote host requires many levels of +mapping before client and server may +communicate. A service is assigned a name which is intended +for human consumption; e.g. \*(lqthe \fIlogin server\fP on host +monet\*(rq. +This name, and the name of the peer host, must then be translated +into network \fIaddresses\fP which are not necessarily suitable +for human consumption. Finally, the address must then be used in locating +a physical \fIlocation\fP and \fIroute\fP to the service. The +specifics of these three mappings are likely to vary between +network architectures. For instance, it is desirable for a network +not to require that hosts +be named in such a way that their physical location is known by +the client host. Instead, underlying services in the network +may discover the actual location of the host at the time a client +host wishes to communicate. This ability to have hosts named in +a location independent manner may induce overhead in connection +establishment, as a discovery process must take place, +but allows a host to be physically mobile without requiring it to +notify its clientele of its current location. +.PP +Standard routines are provided for mapping host names +to network addresses, network names to network numbers, +protocol names to protocol numbers, and service names +to port numbers and the appropriate protocol to +use in communicating with the server process. The +file <\fInetdb.h\fP> must be included when using any of these +routines.
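As a small illustration of one of these mappings, the sketch below wraps the protocol name lookup so that a missing entry is reported as \-1 rather than through a null pointer. It is not taken from the primer: the function name protocol_number is hypothetical, ANSI prototypes are assumed, and the result depends on the protocol data base installed on the machine where it runs.

```c
#include <assert.h>
#include <netdb.h>
#include <stddef.h>

/*
 * Hypothetical helper: map a protocol name to its number using the
 * protocol data base.  Returns -1 if the name is unknown or the data
 * base is unavailable on this machine.
 */
int
protocol_number(const char *name)
{
	struct protoent *pp;

	assert(name != NULL);
	pp = getprotobyname(name);
	if (pp == NULL)
		return (-1);
	return (pp->p_proto);
}
```

The value returned for a known name may be passed as the third argument to the socket call. The host, network, and service routines follow the same pattern of returning a pointer to a structure, or a null pointer when no entry matches.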
+.NH 2 +Host names +.PP +A host name to address mapping is represented by +the \fIhostent\fP structure: +.DS +.DT +struct hostent { + char *h_name; /* official name of host */ + char **h_aliases; /* alias list */ + int h_addrtype; /* host address type */ + int h_length; /* length of address */ + char *h_addr; /* address */ +}; +.DE +The official name of the host and its public aliases are +returned, along with a variable length address and address +type. The routine \fIgethostbyname\fP(3N) takes a host name +and returns a \fIhostent\fP structure, +while the routine \fIgethostbyaddr\fP(3N) +maps host addresses into a \fIhostent\fP structure. It is possible +for a host to have many addresses, all having the same name. +\fIGethostbyname\fP returns the first matching entry in the data +base file \fI/etc/hosts\fP; if this is unsuitable, the lower level +routine \fIgethostent\fP(3N) may be used. For example, to +obtain a \fIhostent\fP structure for a +host on a particular network the following routine might be +used (for simplicity, only Internet addresses are considered): +.DS +.if t .ta .5i 1.0i 1.5i 2.0i +.\" 3.5i went to 3.8i +.if n .ta .7i 1.4i 2.1i 2.8i 3.5i 4.2i +#include <sys/types.h> +#include <sys/socket.h> +#include <netinet/in.h> +#include <netdb.h> + ... +struct hostent * +gethostbynameandnet(name, net) + char *name; + int net; +{ + register struct hostent *hp; + register char **cp; + + sethostent(0); + while ((hp = gethostent()) != NULL) { + if (hp->h_addrtype != AF_INET) + continue; + if (strcmp(name, hp->h_name)) { + for (cp = hp->h_aliases; cp && *cp != NULL; cp++) + if (strcmp(name, *cp) == 0) + goto found; + continue; + } + found: + if (in_netof(*(struct in_addr *)hp->h_addr) == net) + break; + } + endhostent(0); + return (hp); +} +.DE +(\fIin_netof\fP(3N) is a standard routine which returns +the network portion of an Internet address.) +.NH 2 +Network names +.PP +As for host names, routines for mapping network names to numbers, +and back, are provided.
These routines return a \fInetent\fP +structure: +.DS +.DT +/* + * Assumption here is that a network number + * fits in 32 bits -- probably a poor one. + */ +struct netent { + char *n_name; /* official name of net */ + char **n_aliases; /* alias list */ + int n_addrtype; /* net address type */ + int n_net; /* network # */ +}; +.DE +The routines \fIgetnetbyname\fP(3N), \fIgetnetbynumber\fP(3N), +and \fIgetnetent\fP(3N) are the network counterparts to the +host routines described above. +.NH 2 +Protocol names +.PP +For protocols the \fIprotoent\fP structure defines the +protocol-name mapping +used with the routines \fIgetprotobyname\fP(3N), +\fIgetprotobynumber\fP(3N), +and \fIgetprotoent\fP(3N): +.DS +.DT +struct protoent { + char *p_name; /* official protocol name */ + char **p_aliases; /* alias list */ + int p_proto; /* protocol # */ +}; +.DE +.NH 2 +Service names +.PP +Information regarding services is a bit more complicated. A service +is expected to reside at a specific \*(lqport\*(rq and employ +a particular communication protocol. This view is consistent with +the Internet domain, but inconsistent with other network architectures. +Further, a service may reside on multiple ports or support multiple +protocols. If either of these occurs, the higher level library routines +will have to be bypassed in favor of homegrown routines similar in +spirit to the \*(lqgethostbynameandnet\*(rq routine described above. +A service mapping is described by the \fIservent\fP structure, +.DS +.DT +struct servent { + char *s_name; /* official service name */ + char **s_aliases; /* alias list */ + int s_port; /* port # */ + char *s_proto; /* protocol to use */ +}; +.DE +The routine \fIgetservbyname\fP(3N) maps service +names to a servent structure by specifying a service name and, +optionally, a qualifying protocol. 
Thus the call +.DS +sp = getservbyname("telnet", (char *)0); +.DE +returns the service specification for a telnet server using +any protocol, while the call +.DS +sp = getservbyname("telnet", "tcp"); +.DE +returns only that telnet server which uses the TCP protocol. +The routines \fIgetservbyport\fP(3N) and \fIgetservent\fP(3N) are +also provided. The \fIgetservbyport\fP routine has an interface similar +to that provided by \fIgetservbyname\fP; an optional protocol name may +be specified to qualify lookups. +.NH 2 +Miscellaneous +.PP +With the support routines described above, an application program +should rarely have to deal directly +with addresses. This allows +services to be developed as much as possible in a network independent +fashion. It is clear, however, that purging all network dependencies +is very difficult. So long as the user is required to supply network +addresses when naming services and sockets there will always be some +network dependency in a program. For example, the normal +code included in client programs, such as the remote login program, +is of the form shown in Figure 1. +.KF +.DS +.if t .ta .5i 1.0i 1.5i 2.0i +.if n .ta .7i 1.4i 2.1i 2.8i +#include <sys/types.h> +#include <sys/socket.h> +#include <netinet/in.h> +#include <stdio.h> +#include <netdb.h> + ... +main(argc, argv) + char *argv[]; +{ + struct sockaddr_in sin; + struct servent *sp; + struct hostent *hp; + int s; + ... + sp = getservbyname("login", "tcp"); + if (sp == NULL) { + fprintf(stderr, "rlogin: tcp/login: unknown service\en"); + exit(1); + } + hp = gethostbyname(argv[1]); + if (hp == NULL) { + fprintf(stderr, "rlogin: %s: unknown host\en", argv[1]); + exit(2); + } + bzero((char *)&sin, sizeof (sin)); + bcopy(hp->h_addr, (char *)&sin.sin_addr, hp->h_length); + sin.sin_family = hp->h_addrtype; + sin.sin_port = sp->s_port; + s = socket(AF_INET, SOCK_STREAM, 0); + if (s < 0) { + perror("rlogin: socket"); + exit(3); + } + ... + if (connect(s, (char *)&sin, sizeof (sin)) < 0) { + perror("rlogin: connect"); + exit(5); + } + ...
+} +.DE +.ce +Figure 1. Remote login client code. +.KE +(This example will be considered in more detail in section 4.) +.PP +If we wanted to make the remote login program independent of the +Internet protocols and addressing scheme we would be forced to add +a layer of routines which masked the network dependent aspects from +the mainstream login code. For the current facilities available in +the system this does not appear to be worthwhile. Perhaps when the +system is adapted to different network architectures the utilities +will be reorganized more cleanly. +.PP +Aside from the address-related data base routines, there are several +other routines available in the run-time library which are of interest +to users. These are intended mostly to simplify manipulation of +names and addresses. Table 1 summarizes the routines +for manipulating variable length byte strings and handling byte +swapping of network addresses and values. +.KF +.DS B +.TS +box; +l | l +l | l. +Call Synopsis +_ +bcmp(s1, s2, n) compare byte-strings; 0 if same, not 0 otherwise +bcopy(s1, s2, n) copy n bytes from s1 to s2 +bzero(base, n) zero-fill n bytes starting at base +htonl(val) convert 32-bit quantity from host to network byte order +htons(val) convert 16-bit quantity from host to network byte order +ntohl(val) convert 32-bit quantity from network to host byte order +ntohs(val) convert 16-bit quantity from network to host byte order +.TE +.DE +.ce +Table 1. C run-time routines. +.KE +.PP +The byte swapping routines are provided because the operating +system expects addresses to be supplied in network order. On a +VAX, or machine with similar architecture, this +is usually reversed. Consequently, +programs are sometimes required to byte swap quantities. The +library routines which return network addresses provide them +in network order so that they may simply be copied into the structures +provided to the system. 
This implies users should encounter the +byte swapping problem only when \fIinterpreting\fP network addresses. +For example, if an Internet port is to be printed out the following +code would be required: +.DS +printf("port number %d\en", ntohs(sp->s_port)); +.DE +On machines other than the VAX these routines are defined as null +macros. diff --git a/usr/doc/ipc/4.t b/usr/doc/ipc/4.t new file mode 100644 index 0000000000..9765059db2 --- /dev/null +++ b/usr/doc/ipc/4.t @@ -0,0 +1,408 @@ +.ds RH "Client/Server Model +.bp +.nr H1 4 +.nr H2 0 +.bp +.LG +.B +.ce +4. CLIENT/SERVER MODEL +.sp 2 +.R +.NL +.PP +The most commonly used paradigm in constructing distributed applications +is the client/server model. In this scheme client applications request +services from a server process. This implies an asymmetry in establishing +communication between the client and server which has been examined +in section 2. In this section we will look more closely at the interactions +between client and server, and consider some of the problems in developing +client and server applications. +.PP +Client and server require a well known set of conventions before +service may be rendered (and accepted). This set of conventions +comprises a protocol which must be implemented at both ends of a +connection. Depending on the situation, the protocol may be symmetric +or asymmetric. In a symmetric protocol, either side may play the +master or slave roles. In an asymmetric protocol, one side is +immutably recognized as the master, with the other the slave. +An example of a symmetric protocol is the TELNET protocol used in +the Internet for remote terminal emulation. An example +of an asymmetric protocol is the Internet file transfer protocol, +FTP. No matter whether the specific protocol used in obtaining +a service is symmetric or asymmetric, when accessing a service there +is a \*(lqclient process\*(rq and a \*(lqserver process\*(rq. 
We +will first consider the properties of server processes, then +client processes. +.PP +A server process normally listens at a well known address for +service requests. Alternative schemes which use a service server +may be used to eliminate a flock of server processes clogging the +system while remaining dormant most of the time. The Xerox +Courier protocol uses the latter scheme. When using Courier, a +Courier client process contacts a Courier server at the remote +host and identifies the service it requires. The Courier server +process then creates the appropriate server process based on a +data base and \*(lqsplices\*(rq the client and server together, +voiding its part in the transaction. This scheme is attractive +in that the Courier server process may provide a single contact +point for all services, as well as carrying out the initial steps +in authentication. However, while this is an attractive possibility +for standardizing access to services, it does introduce a certain +amount of overhead due to the intermediate process involved. +Implementations which provide this type of service within the +system can minimize the cost of client server +rendezvous. The \fIportal\fP notion described +in the \*(lq4.2BSD System Manual\*(rq embodies many of the ideas +found in Courier, with the rendezvous mechanism implemented internal +to the system. +.NH 2 +Servers +.PP +In 4.2BSD most servers are accessed at well known Internet addresses +or UNIX domain names. When a server is started at boot time it +advertises its services by listening at a well known location. For +example, the remote login server's main loop is of the form shown +in Figure 2. +.KF +.if t .ta .5i 1.0i 1.5i 2.0i +.if n .ta .7i 1.4i 2.1i 2.8i +.DS +main(argc, argv) + int argc; + char **argv; +{ + int f; + struct sockaddr_in from; + struct servent *sp; + + sp = getservbyname("login", "tcp"); + if (sp == NULL) { + fprintf(stderr, "rlogind: tcp/login: unknown service\en"); + exit(1); + } + ...
+#ifndef DEBUG + <<disassociate server from controlling terminal>> +#endif + ... + sin.sin_port = sp->s_port; + ... + f = socket(AF_INET, SOCK_STREAM, 0); + ... + if (bind(f, (caddr_t)&sin, sizeof (sin)) < 0) { + ... + } + ... + listen(f, 5); + for (;;) { + int g, len = sizeof (from); + + g = accept(f, &from, &len); + if (g < 0) { + if (errno != EINTR) + perror("rlogind: accept"); + continue; + } + if (fork() == 0) { + close(f); + doit(g, &from); + } + close(g); + } +} +.DE +.ce +Figure 2. Remote login server. +.sp +.KE +.PP +The first step taken by the server is to look up its service +definition: +.sp 1 +.nf +.in +5 +.if t .ta .5i 1.0i 1.5i 2.0i +.if n .ta .7i 1.4i 2.1i 2.8i +sp = getservbyname("login", "tcp"); +if (sp == NULL) { + fprintf(stderr, "rlogind: tcp/login: unknown service\en"); + exit(1); +} +.sp 1 +.in -5 +.fi +This definition is used in later portions of the code to +define the Internet port at which it listens for service +requests (indicated by a connection). +.PP +Step two is to disassociate the server from the controlling +terminal of its invoker. This is important as the server will +likely not want to receive signals delivered to the process +group of the controlling terminal. +.PP +Once a server has established a pristine environment, it +creates a socket and begins accepting service requests. +The \fIbind\fP call is required to insure the server listens +at its expected location. The main body of the loop is +fairly simple: +.DS +.if t .ta .5i 1.0i 1.5i 2.0i +.if n .ta .7i 1.4i 2.1i 2.8i +for (;;) { + int g, len = sizeof (from); + + g = accept(f, &from, &len); + if (g < 0) { + if (errno != EINTR) + perror("rlogind: accept"); + continue; + } + if (fork() == 0) { + close(f); + doit(g, &from); + } + close(g); +} +.DE +An \fIaccept\fP call blocks the server until +a client requests service. This call could return a +failure status if the call is interrupted by a signal +such as SIGCHLD (to be discussed in section 5).
Therefore, +the return value from \fIaccept\fP is checked to insure +a connection has actually been established. With a connection +in hand, the server then forks a child process and invokes +the main body of the remote login protocol processing. Note +how the socket used by the parent for queueing connection +requests is closed in the child, while the socket created as +a result of the accept is closed in the parent. The +address of the client is also handed to the \fIdoit\fP routine +because it requires it in authenticating clients. +.NH 2 +Clients +.PP +The client side of the remote login service was shown +earlier in Figure 1. +One can see the separate, asymmetric roles of the client +and server clearly in the code. The server is a passive entity, +listening for client connections, while the client process is +an active entity, initiating a connection when invoked. +.PP +Let us consider more closely the steps taken +by the client remote login process. As in the server process +the first step is to locate the service definition for a remote +login: +.DS +sp = getservbyname("login", "tcp"); +if (sp == NULL) { + fprintf(stderr, "rlogin: tcp/login: unknown service\en"); + exit(1); +} +.DE +Next the destination host is looked up with a +\fIgethostbyname\fP call: +.DS +hp = gethostbyname(argv[1]); +if (hp == NULL) { + fprintf(stderr, "rlogin: %s: unknown host\en", argv[1]); + exit(2); +} +.DE +With this accomplished, all that is required is to establish a +connection to the server at the requested host and start up the +remote login protocol. The address buffer is cleared, then filled +in with the Internet address of the foreign host and the port +number at which the login process resides: +.DS +bzero((char *)&sin, sizeof (sin)); +bcopy(hp->h_addr, (char *)&sin.sin_addr, hp->h_length); +sin.sin_family = hp->h_addrtype; +sin.sin_port = sp->s_port; +.DE +A socket is created, and a connection initiated.
+.DS +s = socket(hp->h_addrtype, SOCK_STREAM, 0); +if (s < 0) { + perror("rlogin: socket"); + exit(3); +} + ... +if (connect(s, (char *)&sin, sizeof (sin)) < 0) { + perror("rlogin: connect"); + exit(4); +} +.DE +The details of the remote login protocol will not be considered here. +.NH 2 +Connectionless servers +.PP +While connection-based services are the norm, some services +are based on the use of datagram sockets. One, in particular, +is the \*(lqrwho\*(rq service which provides users with status +information for hosts connected to a local area +network. This service, while predicated on the ability to +\fIbroadcast\fP information to all hosts connected to a particular +network, is of interest as an example usage of datagram sockets. +.PP +A user on any machine running the rwho server may find out +the current status of a machine with the \fIruptime\fP(1) program. +The output generated is illustrated in Figure 3. +.KF +.DS B +.TS +l r l l l l l. +arpa up 9:45, 5 users, load 1.15, 1.39, 1.31 +cad up 2+12:04, 8 users, load 4.67, 5.13, 4.59 +calder up 10:10, 0 users, load 0.27, 0.15, 0.14 +dali up 2+06:28, 9 users, load 1.04, 1.20, 1.65 +degas up 25+09:48, 0 users, load 1.49, 1.43, 1.41 +ear up 5+00:05, 0 users, load 1.51, 1.54, 1.56 +ernie down 0:24 +esvax down 17:04 +ingres down 0:26 +kim up 3+09:16, 8 users, load 2.03, 2.46, 3.11 +matisse up 3+06:18, 0 users, load 0.03, 0.03, 0.05 +medea up 3+09:39, 2 users, load 0.35, 0.37, 0.50 +merlin down 19+15:37 +miro up 1+07:20, 7 users, load 4.59, 3.28, 2.12 +monet up 1+00:43, 2 users, load 0.22, 0.09, 0.07 +oz down 16:09 +statvax up 2+15:57, 3 users, load 1.52, 1.81, 1.86 +ucbvax up 9:34, 2 users, load 6.08, 5.16, 3.28 +.TE +.DE +.ce +Figure 3. ruptime output. +.sp +.KE +.PP +Status information for each host is periodically broadcast +by rwho server processes on each machine. The same server +process also receives the status information and uses it +to update a database. 
This database is then interpreted +to generate the status information for each host. Servers +operate autonomously, coupled only by the local network and +its broadcast capabilities. +.PP +The rwho server, in a simplified form, is pictured in Figure +4. There are two separate tasks performed by the server. The +first task is to act as a receiver of status information broadcast +by other hosts on the network. This job is carried out in the +main loop of the program. Packets received at the rwho port +are interrogated to insure they've been sent by another rwho +server process, then are time stamped with their arrival time +and used to update a file indicating the status of the host. +When a host has not been heard from for an extended period of +time, the database interpretation routines assume the host is +down and indicate such on the status reports. This algorithm +is prone to error as a server may be down while a host is actually +up, but serves our current needs. +.KF +.DS +.if t .ta .5i 1.0i 1.5i 2.0i +.if n .ta .7i 1.4i 2.1i 2.8i +main() +{ + ... + sp = getservbyname("who", "udp"); + net = getnetbyname("localnet"); + sin.sin_addr = inet_makeaddr(INADDR_ANY, net); + sin.sin_port = sp->s_port; + ... + s = socket(AF_INET, SOCK_DGRAM, 0); + ... + bind(s, &sin, sizeof (sin)); + ... + sigset(SIGALRM, onalrm); + onalrm(); + for (;;) { + struct whod wd; + int cc, whod, len = sizeof (from); + + cc = recvfrom(s, (char *)&wd, sizeof (struct whod), 0, &from, &len); + if (cc <= 0) { + if (cc < 0 && errno != EINTR) + perror("rwhod: recv"); + continue; + } + if (from.sin_port != sp->s_port) { + fprintf(stderr, "rwhod: %d: bad from port\en", + ntohs(from.sin_port)); + continue; + } + ... + if (!verify(wd.wd_hostname)) { + fprintf(stderr, "rwhod: malformed host name from %x\en", + ntohl(from.sin_addr.s_addr)); + continue; + } + (void) sprintf(path, "%s/whod.%s", RWHODIR, wd.wd_hostname); + whod = open(path, FWRONLY|FCREATE|FTRUNCATE, 0666); + ... 
+ (void) time(&wd.wd_recvtime); + (void) write(whod, (char *)&wd, cc); + (void) close(whod); + } +} +.DE +.ce +Figure 4. rwho server. +.KE +.PP +The second task performed by the server is to supply information +regarding the status of its host. This involves periodically +acquiring system status information, packaging it up in a message +and broadcasting it on the local network for other rwho servers +to hear. The supply function is triggered by a timer and +runs off a signal. Locating the system status +information is somewhat involved, but uninteresting. Deciding +where to transmit the resultant packet does, however, indicate +some problems with the current protocol. +.PP +Status information is broadcast on the local network. +For networks which do not support the notion of broadcast another +scheme must be used to simulate or +replace broadcasting. One possibility is to enumerate the +known neighbors (based on the status received). This, unfortunately, +requires some bootstrapping information, as a +server started up on a quiet network will have no +known neighbors and thus never receive, or send, any status information. +This is the identical problem faced by the routing table management +process in propagating routing status information. The standard +solution, unsatisfactory as it may be, is to inform one or more servers +of known neighbors and request that they always communicate with +these neighbors. If each server has at least one neighbor supplied +it, status information may then propagate through +a neighbor to hosts which +are not (possibly) directly neighbors. If the server is able to +support networks which provide a broadcast capability, as well as +those which do not, then networks with an +arbitrary topology may share status information*. +.FS +* One must, however, be concerned about \*(lqloops\*(rq. +That is, if a host is connected to multiple networks, it +will receive status information from itself.
This can lead +to an endless, wasteful, exchange of information. +.FE +.PP +The second problem with the current scheme is that the rwho process +services only a single local network, and this network is found by +reading a file. It is important that software operating in a distributed +environment not have any site-dependent information compiled into it. +This would require a separate copy of the server at each host and +make maintenance a severe headache. 4.2BSD attempts to isolate +host-specific information from applications by providing system +calls which return the necessary information\(dg. +.FS +\(dg An example of such a system call is the \fIgethostname\fP(2) +call which returns the host's \*(lqofficial\*(rq name. +.FE +Unfortunately, no straightforward mechanism currently +exists for finding the collection +of networks to which a host is directly connected. Thus the +rwho server performs a lookup in a file +to find its local network. A better, though still +unsatisfactory, scheme used by the routing process is to interrogate +the system data structures to locate those directly connected +networks. A mechanism to acquire this information from the system +would be a useful addition. diff --git a/usr/doc/ipc/5.t b/usr/doc/ipc/5.t new file mode 100644 index 0000000000..b5de738b61 --- /dev/null +++ b/usr/doc/ipc/5.t @@ -0,0 +1,348 @@ +.ds RH "Advanced Topics +.bp +.nr H1 5 +.nr H2 0 +.bp +.LG +.B +.ce +5. ADVANCED TOPICS +.sp 2 +.R +.NL +.PP +A number of facilities have yet to be discussed. For most users +of the ipc the mechanisms already +described will suffice in constructing distributed +applications. However, others will find need to utilize some +of the features which we consider in this section. +.NH 2 +Out of band data +.PP +The stream socket abstraction includes the notion of \*(lqout +of band\*(rq data. Out of band data is a logically independent +transmission channel associated with each pair of connected +stream sockets. 
Out of band data is delivered to the user +independently of normal data along with the SIGURG signal. +In addition to the information passed, a logical mark is placed in +the data stream to indicate the point at which the out +of band data was sent. The remote login and remote shell +applications use this facility to propagate signals between +client and server processes. When a signal is expected to +flush any pending output from the remote process(es), all +data up to the mark in the data stream is discarded. +.PP +The +stream abstraction defines that the out of band data facilities +must support the reliable delivery of at least one +out of band message at a time. This message may contain at least one +byte of data, and at least one message may be pending delivery +to the user at any one time. For communications protocols which +support only in-band signaling (i.e. the urgent data is +delivered in sequence with the normal data) the system extracts +the data from the normal data stream and stores it separately. +This allows users to choose between receiving the urgent data +in order and receiving it out of sequence without having to +buffer all the intervening data. +.PP +To send an out of band message the SOF_OOB flag is supplied to +a \fIsend\fP or \fIsendto\fP call, +while to receive out of band data SOF_OOB should be indicated +when performing a \fIrecvfrom\fP or \fIrecv\fP call. +To find out if the read pointer is currently pointing at +the mark in the data stream, the SIOCATMARK ioctl is provided: +.DS +ioctl(s, SIOCATMARK, &yes); +.DE +If \fIyes\fP is a 1 on return, the next read will return data +after the mark. Otherwise (assuming out of band data has arrived), +the next read will provide data sent by the client prior +to transmission of the out of band signal. The routine used +in the remote login process to flush output on receipt of an +interrupt or quit signal is shown in Figure 5.
+.KF +.DS +oob() +{ + int out = 1+1; + char waste[BUFSIZ], mark; + + signal(SIGURG, oob); + /* flush local terminal input and output */ + ioctl(1, TIOCFLUSH, (char *)&out); + for (;;) { + if (ioctl(rem, SIOCATMARK, &mark) < 0) { + perror("ioctl"); + break; + } + if (mark) + break; + (void) read(rem, waste, sizeof (waste)); + } + recv(rem, &mark, 1, SOF_OOB); + ... +} +.DE +.ce +Figure 5. Flushing terminal i/o on receipt of out of band data. +.sp +.KE +.NH 2 +Signals and process groups +.PP +Due to the existence of the SIGURG and SIGIO signals each socket has an +associated process group (just as is done for terminals). +This process group is initialized to the process group of its +creator, but may be redefined at a later time with the SIOCSPGRP +ioctl: +.DS +ioctl(s, SIOCSPGRP, &pgrp); +.DE +A similar ioctl, SIOCGPGRP, is available for determining the +current process group of a socket. +.NH 2 +Pseudo terminals +.PP +Many programs will not function properly without a terminal +for standard input and output. Since a socket is not a terminal, +it is often necessary to have a process communicating over +the network do so through a \fIpseudo terminal\fP. A pseudo +terminal is actually a pair of devices, master and slave, +which allow a process to serve as an active agent in communication +between processes and users. Data written on the slave side +of a pseudo terminal is supplied as input to a process reading +from the master side. Data written on the master side is +given the slave as input. In this way, the process manipulating +the master side of the pseudo terminal has control over the +information read and written on the slave side. The remote +login server uses pseudo terminals for remote login sessions. +A user logging in to a machine across the network is provided +a shell with a slave pseudo terminal as standard input, output, +and error. 
The server process then handles the communication +between the programs invoked by the remote shell and the user's +local client process. When a user sends an interrupt or quit +signal to a process executing on a remote machine, the client +login program traps the signal, sends an out of band message +to the server process which then uses the signal number, sent +as the data value in the out of band message, to perform a +\fIkillpg\fP(2) on the appropriate process group. +.NH 2 +Internet address binding +.PP +Binding addresses to sockets in the Internet domain can be +fairly complex. Communicating processes are bound +by an \fIassociation\fP. An association +is composed of local and foreign +addresses, and local and foreign ports. Port numbers are +allocated out of separate spaces, one for each Internet +protocol. Associations are always unique. That is, there +may never be duplicate <protocol, local address, local port, foreign address, foreign port> tuples. +.PP +The bind system call allows a process to specify half of +an association, <local address, local port>, while the connect +and accept primitives are used to complete a socket's association. +Since the association is created in two steps the association +uniqueness requirement indicated above could be violated unless +care is taken. Further, it is unrealistic to expect user +programs to always know proper values to use for the local address +and local port since a host may reside on multiple networks and +the set of allocated port numbers is not directly accessible +to a user. +.PP +To simplify local address binding the notion of a +\*(lqwildcard\*(rq address has been provided. When an address +is specified as INADDR_ANY (a manifest constant defined in +<netinet/in.h>), the system interprets the address as +\*(lqany valid address\*(rq. For example, to bind a specific +port number to a socket, but leave the local address unspecified, +the following code might be used: +.DS +#include <sys/types.h> +#include <netinet/in.h> + ... +struct sockaddr_in sin; + ...
+s = socket(AF_INET, SOCK_STREAM, 0); +sin.sin_family = AF_INET; +sin.sin_addr.s_addr = INADDR_ANY; +sin.sin_port = MYPORT; +bind(s, (char *)&sin, sizeof (sin)); +.DE +Sockets with wildcarded local addresses may receive messages +directed to the specified port number, and addressed to any +of the possible addresses assigned a host. For example, +if a host is on networks 46 and 10 and a socket is bound as +above, then an accept call is performed, the process will be +able to accept connection requests which arrive either from +network 46 or network 10. +.PP +In a similar fashion, a local port may be left unspecified +(specified as zero), in which case the system will select an +appropriate port number for it. For example: +.DS +sin.sin_addr.s_addr = MYADDRESS; +sin.sin_port = 0; +bind(s, (char *)&sin, sizeof (sin)); +.DE +The system selects the port number based on two criteria. +The first is that ports numbered 0 through 1023 are reserved +for privileged users (i.e. the super user). The second is +that the port number is not currently bound to some other +socket. In order to find a free port number in the privileged +range the following code is used by the remote shell server: +.DS +struct sockaddr_in sin; + ... +lport = IPPORT_RESERVED \- 1; +sin.sin_addr.s_addr = INADDR_ANY; + ... +for (;;) { + sin.sin_port = htons((u_short)lport); + if (bind(s, (caddr_t)&sin, sizeof (sin)) >= 0) + break; + if (errno != EADDRINUSE && errno != EADDRNOTAVAIL) { + perror("socket"); + break; + } + lport--; + if (lport == IPPORT_RESERVED/2) { + fprintf(stderr, "socket: All ports in use\en"); + break; + } +} +.DE +The restriction on allocating ports was done to allow processes +executing in a \*(lqsecure\*(rq environment to perform authentication +based on the originating address and port number. +.PP +In certain cases the algorithm used by the system in selecting +port numbers is unsuitable for an application. This is due to +associations being created in a two step process.
For example, +the Internet file transfer protocol, FTP, specifies that data +connections must always originate from the same local port. However, +duplicate associations are avoided by connecting to different foreign +ports. In this situation the system would disallow binding the +same local address and port number to a socket if a previous data +connection's socket were around. To override the default port +selection algorithm, an option call must be performed prior +to address binding: +.DS +setsockopt(s, SOL_SOCKET, SO_REUSEADDR, (char *)0, 0); +bind(s, (char *)&sin, sizeof (sin)); +.DE +With the above call, local addresses may be bound which +are already in use. This does not violate the uniqueness +requirement as the system still checks at connect time to +be sure any other sockets with the same local address and +port do not have the same foreign address and port (if an +association already exists, the error EADDRINUSE is returned). +.PP +Local address binding by the system is currently +done somewhat haphazardly when a host is on multiple +networks. Logically, one would expect +the system to bind the local address associated with +the network through which a peer was communicating. +For instance, if the local host is connected to networks +46 and 10 and the foreign host is on network 32, and +traffic from network 32 were arriving via network +10, the local address to be bound would be the host's address +on network 10, not network 46. This, unfortunately, is +not always the case. For reasons too complicated to discuss +here, the local address bound may appear to be chosen +at random. This property of local address binding +will normally be invisible to users unless the foreign +host does not understand how to reach the address +selected*.
+.FS +* For example, if network 46 were unknown to the host on +network 32, and the local address were bound to that located +on network 46, then even though a route between the two hosts +existed through network 10, a connection would fail. +.FE +.NH 2 +Broadcasting and datagram sockets +.PP +By using a datagram socket it is possible to send broadcast +packets on many networks supported by the system (the network +itself must support the notion of broadcasting; the system +provides no broadcast simulation in software). Broadcast +messages can place a high load on a network since they force +every host on the network to service them. Consequently, +the ability to send broadcast packets has been limited to +the super user. +.PP +To send a broadcast message, an Internet datagram socket +should be created: +.DS +s = socket(AF_INET, SOCK_DGRAM, 0); +.DE +and at least a port number should be bound to the socket: +.DS +sin.sin_family = AF_INET; +sin.sin_addr.s_addr = INADDR_ANY; +sin.sin_port = MYPORT; +bind(s, (char *)&sin, sizeof (sin)); +.DE +Then the message should be addressed as: +.DS +dst.sin_family = AF_INET; +dst.sin_addr.s_addr = INADDR_ANY; +dst.sin_port = DESTPORT; +.DE +and, finally, a sendto call may be used: +.DS +sendto(s, buf, buflen, 0, &dst, sizeof (dst)); +.DE +.PP +Received broadcast messages contain the sender's address +and port (datagram sockets are anchored before +a message is allowed to go out). +.NH 2 +Signals +.PP +Two new signals have been added to the system which may +be used in conjunction with the interprocess communication +facilities. The SIGURG signal is associated with the existence +of an \*(lqurgent condition\*(rq. The SIGIO signal is used +with \*(lqinterrupt driven i/o\*(rq (not presently implemented). +SIGURG is currently supplied a process when out of band data +is present at a socket. If multiple sockets have out of band +data awaiting delivery, a select call may be used to determine +those sockets with such data.
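The use of select to find the sockets holding out of band data can be sketched as follows. This is a minimal sketch, not code from the system sources: it assumes the later POSIX spellings (fd_set, MSG_OOB) rather than the 4.2BSD integer masks and SOF_OOB flag, and the helper names oob_ready and tcp_pair are hypothetical, introduced only for illustration. Out of band data is reported by select as an exceptional condition, so only the third descriptor set is supplied.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * oob_ready is a hypothetical helper, not a library routine.
 * It sets ready[i] to 1 for each fds[i] that has out of band data
 * pending, waiting at most timeout_sec seconds, and returns the
 * number of such sockets, or -1 if select fails.
 */
int
oob_ready(const int *fds, int n, int *ready, int timeout_sec)
{
	fd_set except;
	struct timeval tv;
	int i, maxfd = -1, count = 0;

	FD_ZERO(&except);
	for (i = 0; i < n; i++) {
		FD_SET(fds[i], &except);
		if (fds[i] > maxfd)
			maxfd = fds[i];
		ready[i] = 0;
	}
	tv.tv_sec = timeout_sec;
	tv.tv_usec = 0;
	/* Out of band data shows up in the exceptional condition set. */
	if (select(maxfd + 1, NULL, NULL, &except, &tv) < 0)
		return (-1);
	for (i = 0; i < n; i++)
		if (FD_ISSET(fds[i], &except)) {
			ready[i] = 1;
			count++;
		}
	return (count);
}

/*
 * tcp_pair is a hypothetical helper used only to exercise oob_ready.
 * It builds a connected pair of stream sockets over the loopback
 * interface: pair[0] is the connecting end, pair[1] the accepted end.
 */
int
tcp_pair(int pair[2])
{
	struct sockaddr_in sin;
	socklen_t len = sizeof (sin);
	int l;

	pair[0] = pair[1] = -1;
	l = socket(AF_INET, SOCK_STREAM, 0);
	if (l < 0)
		return (-1);
	memset(&sin, 0, sizeof (sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	sin.sin_port = 0;		/* let the system choose a port */
	if (bind(l, (struct sockaddr *)&sin, sizeof (sin)) == 0 &&
	    listen(l, 1) == 0 &&
	    getsockname(l, (struct sockaddr *)&sin, &len) == 0 &&
	    (pair[0] = socket(AF_INET, SOCK_STREAM, 0)) >= 0 &&
	    connect(pair[0], (struct sockaddr *)&sin, sizeof (sin)) == 0)
		pair[1] = accept(l, NULL, NULL);
	close(l);
	if (pair[1] < 0) {
		if (pair[0] >= 0)
			close(pair[0]);
		return (-1);
	}
	return (0);
}
```

A SIGURG handler serving several connections might call oob_ready with a zero timeout and then perform an out of band receive on each socket flagged in the ready array. Polling with a zero timeout keeps the handler from blocking when the urgent data has already been consumed.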
+.PP +An old signal which is useful when constructing server processes +is SIGCHLD. This signal is delivered to a process when any +child processes have changed state. Normally servers use +the signal to \*(lqreap\*(rq child processes after they exit. +For example, the remote login server loop shown in Figure 2 +may be augmented as follows: +.DS +int reaper(); + ... +sigset(SIGCHLD, reaper); +listen(f, 10); +for (;;) { + int g, len = sizeof (from); + + g = accept(f, &from, &len); + if (g < 0) { + if (errno != EINTR) + perror("rlogind: accept"); + continue; + } + ... +} + ... +#include <sys/wait.h> +reaper() +{ + union wait status; + + while (wait3(&status, WNOHANG, 0) > 0) + ; +} +.DE +.PP +If the parent server process fails to reap its children, +a large number of \*(lqzombie\*(rq processes may be created. -- 2.20.1