Commit | Line | Data |
---|---|---|
d60e8dff MK |
1 | .\" Copyright (c) 1986 Regents of the University of California. |
2 | .\" All rights reserved. The Berkeley software License Agreement | |
3 | .\" specifies the terms and conditions for redistribution. | |
4 | .\" | |
5 | .\" @(#)5.t 1.2 (Berkeley) %G% | |
6 | .\" | |
200989e9 MK |
7 | .ds RH "Advanced Topics |
8 | .bp | |
9 | .nr H1 5 | |
10 | .nr H2 0 | |
11 | .bp | |
12 | .LG | |
13 | .B | |
14 | .ce | |
15 | 5. ADVANCED TOPICS | |
16 | .sp 2 | |
17 | .R | |
18 | .NL | |
19 | .PP | |
20 | A number of facilities have yet to be discussed. For most users | |
d60e8dff | 21 | of the IPC the mechanisms already |
200989e9 | 22 | described will suffice in constructing distributed |
d60e8dff | 23 | applications. However, others will find the need to utilize some |
200989e9 MK |
24 | of the features which we consider in this section. |
25 | .NH 2 | |
26 | Out of band data | |
27 | .PP | |
28 | The stream socket abstraction includes the notion of \*(lqout | |
29 | of band\*(rq data. Out of band data is a logically independent | |
30 | transmission channel associated with each pair of connected | |
31 | stream sockets. Out of band data is delivered to the user | |
d60e8dff MK |
32 | independently of normal data along with the SIGURG signal |
33 | (if multiple sockets may have out of band data awaiting | |
34 | delivery, a \fIselect\fP call may be used to determine those | |
35 | sockets with such data). A process can set the process group | |
36 | or process id to be informed by the SIGURG signal via the | |
37 | appropriate \fIfcntl\fP call, as described below for | |
38 | SIGIO. | |
39 | .PP | |
200989e9 MK |
40 | In addition to the information passed, a logical mark is placed in |
41 | the data stream to indicate the point at which the out | |
42 | of band data was sent. The remote login and remote shell | |
d60e8dff | 43 | applications use this facility to propagate signals between |
200989e9 MK |
44 | client and server processes. When a signal is expected to |
45 | flush any pending output from the remote process(es), all | |
46 | data up to the mark in the data stream is discarded. | |
47 | .PP | |
48 | The | |
49 | stream abstraction defines that the out of band data facilities | |
50 | must support the reliable delivery of at least one | |
51 | out of band message at a time. This message may contain at least one | |
52 | byte of data, and at least one message may be pending delivery | |
53 | to the user at any one time. For communications protocols which | |
54 | support only in-band signaling (i.e. the urgent data is | |
d60e8dff | 55 | delivered in sequence with the normal data), the system extracts |
200989e9 MK |
56 | the data from the normal data stream and stores it separately. |
57 | This allows users to choose between receiving the urgent data | |
58 | in order and receiving it out of sequence without having to | |
d60e8dff MK |
59 | buffer all the intervening data. It is not possible |
60 | to ``peek'' (via MSG_PEEK) at out of band data. | |
200989e9 | 61 | .PP |
d60e8dff | 62 | To send an out of band message the MSG_OOB flag is supplied to |
200989e9 | 63 | a \fIsend\fP or \fIsendto\fP call, |
d60e8dff | 64 | while to receive out of band data MSG_OOB should be indicated |
200989e9 MK |
65 | when performing a \fIrecvfrom\fP or \fIrecv\fP call. |
66 | To find out if the read pointer is currently pointing at | |
67 | the mark in the data stream, the SIOCATMARK ioctl is provided: | |
68 | .DS | |
69 | ioctl(s, SIOCATMARK, &yes); | |
70 | .DE | |
71 | If \fIyes\fP is a 1 on return, the next read will return data | |
72 | after the mark. Otherwise (assuming out of band data has arrived), | |
73 | the next read will provide data sent by the client prior | |
74 | to transmission of the out of band signal. The routine used | |
75 | in the remote login process to flush output on receipt of an | |
76 | interrupt or quit signal is shown in Figure 5. | |
77 | .KF | |
78 | .DS | |
d60e8dff MK |
79 | #include <sys/ioctl.h> |
80 | #include <sys/file.h> | |
81 | ... | |
200989e9 MK |
82 | oob() |
83 | { | |
d60e8dff | 84 | int out = FWRITE, mark; |
200989e9 MK |
85 | char waste[BUFSIZ]; |
86 | ||
d60e8dff | 87 | /* flush local terminal output */ |
200989e9 MK |
88 | ioctl(1, TIOCFLUSH, (char *)&out); |
89 | for (;;) { | |
90 | if (ioctl(rem, SIOCATMARK, &mark) < 0) { | |
91 | perror("ioctl"); | |
92 | break; | |
93 | } | |
94 | if (mark) | |
95 | break; | |
96 | (void) read(rem, waste, sizeof (waste)); | |
97 | } | |
d60e8dff MK |
98 | if (recv(rem, &mark, 1, MSG_OOB) < 0) { |
99 | perror("recv"); | |
100 | ... | |
101 | } | |
200989e9 MK |
102 | ... |
103 | } | |
104 | .DE | |
105 | .ce | |
106 | Figure 5. Flushing terminal i/o on receipt of out of band data. | |
107 | .sp | |
108 | .KE | |
109 | .NH 2 | |
d60e8dff MK |
110 | Interrupt driven socket i/o |
111 | .PP | |
112 | The SIGIO signal allows a process to be notified | |
113 | via a signal when a socket (or more generally, a file | |
114 | descriptor) has data waiting to be read. Use of | |
115 | the SIGIO facility requires three steps: First, | |
116 | the process must set up a SIGIO signal handler | |
117 | by use of the \fIsignal\fP call. Second, | |
118 | it must set the process id or process group id which is to receive | |
119 | notification of pending input to its own process id, | |
120 | or the process group id of its process group (note that | |
121 | the default process group of a socket is group zero). | |
122 | This is accomplished by use of a \fIfcntl\fP call. | |
123 | Third, it must turn on notification of pending i/o requests | |
124 | with another \fIfcntl\fP call. Sample code to | |
125 | allow a given process to receive information on | |
126 | pending i/o requests as they occur for a socket \fIs\fP | |
127 | is given in Figure 6. With slight changes, this code can also | |
128 | be used to prepare for receipt of SIGURG signals. | |
129 | .KF | |
130 | .DS | |
131 | #include <fcntl.h> | |
132 | ... | |
133 | int io_handler(); | |
134 | ... | |
135 | signal(SIGIO, io_handler); | |
136 | ||
137 | /* Set the process receiving SIGIO/SIGURG signals to us */ | |
138 | ||
139 | if (fcntl(s, F_SETOWN, getpid()) < 0) { | |
140 | perror("fcntl F_SETOWN"); | |
141 | exit(1); | |
142 | } | |
143 | ||
144 | /* Allow receipt of asynchronous i/o signals */ | |
145 | ||
146 | if (fcntl(s, F_SETFL, FASYNC) < 0) { | |
147 | perror("fcntl F_SETFL, FASYNC"); | |
148 | exit(1); | |
149 | } | |
150 | .DE | |
151 | .ce | |
152 | Figure 6. Use of asynchronous notification of i/o requests. | |
153 | .sp | |
154 | .KE | |
155 | .NH 2 | |
200989e9 MK |
156 | Signals and process groups |
157 | .PP | |
158 | Due to the existence of the SIGURG and SIGIO signals each socket has an | |
d60e8dff MK |
159 | associated process number, just as is done for terminals. |
160 | This value is initialized to zero, | |
161 | but may be redefined at a later time with the F_SETOWN | |
162 | \fIfcntl\fP, such as was done in the code above for SIGIO. | |
163 | To set the socket's process id for signals, positive arguments | |
164 | should be given to the \fIfcntl\fP call. To set the socket's | |
165 | process group for signals, negative arguments should be | |
166 | passed to \fIfcntl\fP. Note that the process number indicates | |
167 | either the associated process id or the associated process | |
168 | group; it is impossible to specify both at the same time. | |
169 | A similar \fIfcntl\fP, F_GETOWN, is available for determining the | |
170 | current process number of a socket. | |
171 | .PP | |
172 | An old signal which is useful when constructing server processes | |
173 | is SIGCHLD. This signal is delivered to a process when any | |
174 | child processes have changed state. Normally servers use | |
175 | the signal to \*(lqreap\*(rq child processes after they exit. | |
176 | For example, the remote login server loop shown in Figure 2 | |
177 | may be augmented as shown in Figure 7. | |
178 | .KF | |
200989e9 | 179 | .DS |
d60e8dff MK |
180 | int reaper(); |
181 | ... | |
182 | signal(SIGCHLD, reaper); | |
183 | listen(f, 5); | |
184 | for (;;) { | |
185 | int g, len = sizeof (from); | |
186 | ||
187 | g = accept(f, (struct sockaddr *)&from, &len); | |
188 | if (g < 0) { | |
189 | if (errno != EINTR) | |
190 | syslog(LOG_ERR, "rlogind: accept: %m"); | |
191 | continue; | |
192 | } | |
193 | ... | |
194 | } | |
195 | ... | |
196 | #include <sys/wait.h> | |
197 | reaper() | |
198 | { | |
199 | union wait status; | |
200 | ||
201 | while (wait3(&status, WNOHANG, 0) > 0) | |
202 | ; | |
203 | } | |
200989e9 | 204 | .DE |
d60e8dff MK |
205 | .sp |
206 | .ce | |
207 | Figure 7. Use of the SIGCHLD signal. | |
208 | .sp | |
209 | .KE | |
210 | .PP | |
211 | If the parent server process fails to reap its children, | |
212 | a large number of \*(lqzombie\*(rq processes may be created. | |
200989e9 MK |
213 | .NH 2 |
214 | Pseudo terminals | |
215 | .PP | |
216 | Many programs will not function properly without a terminal | |
217 | for standard input and output. Since a socket is not a terminal, | |
218 | it is often necessary to have a process communicating over | |
219 | the network do so through a \fIpseudo terminal\fP. A pseudo | |
220 | terminal is actually a pair of devices, master and slave, | |
221 | which allow a process to serve as an active agent in communication | |
222 | between processes and users. Data written on the slave side | |
223 | of a pseudo terminal is supplied as input to a process reading | |
d60e8dff MK |
224 | from the master side, while data written on the master side is |
225 | given to the slave as input. In this way, the process manipulating | |
200989e9 | 226 | the master side of the pseudo terminal has control over the |
d60e8dff MK |
227 | information read and written on the slave side. |
228 | The purpose of this abstraction is to | |
229 | preserve terminal semantics over a network connection \(em | |
230 | that is, the slave side looks like a normal terminal to | |
231 | any process reading from or writing to it. | |
232 | .PP | |
233 | For example, the remote | |
200989e9 MK |
234 | login server uses pseudo terminals for remote login sessions. |
235 | A user logging in to a machine across the network is provided | |
236 | a shell with a slave pseudo terminal as standard input, output, | |
237 | and error. The server process then handles the communication | |
238 | between the programs invoked by the remote shell and the user's | |
239 | local client process. When a user sends an interrupt or quit | |
240 | signal to a process executing on a remote machine, the client | |
241 | login program traps the signal, sends an out of band message | |
242 | to the server process, which then uses the signal number, sent | |
243 | as the data value in the out of band message, to perform a | |
244 | \fIkillpg\fP(2) on the appropriate process group. | |
d60e8dff MK |
245 | .PP |
246 | Under 4.3BSD, the slave side of a pseudo terminal is | |
247 | \fI/dev/ttyxy\fP, where \fIx\fP is a single letter | |
248 | starting at `p' and perhaps continuing as far down | |
249 | as `t'. \fIy\fP is a hexadecimal ``digit'' (i.e., a single | |
250 | character in the range 0 through 9 or `a' through `f'). | |
251 | The master side of a pseudo terminal is \fI/dev/ptyxy\fP, | |
252 | where \fIx\fP and \fIy\fP correspond to the same letters | |
253 | in the slave side of the pseudo terminal. | |
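The naming rule above is mechanical enough to express as a small function. This sketch assumes the 4.3BSD \fI/dev/ptyxy\fP and \fI/dev/ttyxy\fP layout just described; \fIpty_names\fP is a hypothetical helper, and systems that allocate pseudo terminals through other interfaces (such as \fIposix_openpt\fP) do not use these names.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build the master and slave device names for
 * bank letter x ('p', 'q', ...) and unit y (0 through 15).  Each
 * buffer must hold at least 11 bytes.  Returns 0 on success, or -1
 * if y is not a valid hexadecimal digit position.
 */
int pty_names(char x, int y, char *master, char *slave)
{
    static const char digits[] = "0123456789abcdef";

    if (y < 0 || y > 15)
        return -1;
    snprintf(master, 11, "/dev/pty%c%c", x, digits[y]);
    snprintf(slave, 11, "/dev/tty%c%c", x, digits[y]);
    return 0;
}
```

With x set to `p' and y set to 10, the helper produces \fI/dev/ptypa\fP for the master side and \fI/dev/ttypa\fP for the slave side.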
254 | .PP | |
255 | In general, the method of obtaining a pair of master and | |
256 | slave pseudo terminals is made up of three components. | |
257 | First, the process must find a pseudo terminal which | |
258 | is not currently in use. Having done so, | |
259 | it then opens both the master and the slave side of | |
260 | the device, taking care to open the master side of the device first. | |
261 | The process then \fIfork\fPs; the child closes | |
262 | the master side of the pseudo terminal, and \fIexec\fPs the | |
263 | appropriate program. Meanwhile, the parent closes the | |
264 | slave side of the pseudo terminal and begins reading and | |
265 | writing from the master side. Sample code making use of | |
266 | pseudo terminals is given in Figure 8; this code assumes | |
267 | that a connection on a socket \fIs\fP exists, connected | |
268 | to a peer who wants a service of some kind, and that the | |
269 | process has disassociated itself from a controlling terminal. | |
270 | .KF | |
271 | .DS | |
272 | gotpty = 0; | |
273 | for (c = 'p'; !gotpty && c <= 's'; c++) { | |
274 | line = "/dev/ptyXX"; | |
275 | line[sizeof("/dev/pty")-1] = c; | |
276 | line[sizeof("/dev/ptyp")-1] = '0'; | |
277 | if (stat(line, &statbuf) < 0) | |
278 | break; | |
279 | for (i = 0; i < 16; i++) { | |
280 | line[sizeof("/dev/ptyp")-1] = "0123456789abcdef"[i]; | |
281 | master = open(line, O_RDWR); | |
282 | if (master > 0) { | |
283 | gotpty = 1; | |
284 | break; | |
285 | } | |
286 | } | |
287 | } | |
288 | if (!gotpty) { | |
289 | syslog(LOG_ERR, "All network ports in use"); | |
290 | exit(1); | |
291 | } | |
292 | ||
293 | line[sizeof("/dev/")-1] = 't'; | |
294 | slave = open(line, O_RDWR); /* \fIslave\fP is now slave side */ | |
295 | if (slave < 0) { | |
296 | syslog(LOG_ERR, "Cannot open slave pty %s", line); | |
297 | exit(1); | |
298 | } | |
299 | ||
300 | ioctl(slave, TIOCGETP, &b); /* Set slave tty modes */ | |
301 | b.sg_flags = CRMOD|XTABS|ANYP; | |
302 | ioctl(slave, TIOCSETP, &b); | |
303 | ||
304 | i = fork(); | |
305 | if (i < 0) { | |
306 | syslog(LOG_ERR, "fork: %m"); | |
307 | exit(1); | |
308 | } else if (i) { /* Parent */ | |
309 | close(slave); | |
310 | ... | |
311 | } else { /* Child */ | |
312 | (void) close(s); | |
313 | (void) close(master); | |
314 | dup2(slave, 0); | |
315 | dup2(slave, 1); | |
316 | dup2(slave, 2); | |
317 | if (slave > 2) | |
318 | (void) close(slave); | |
319 | ... | |
320 | } | |
321 | .DE | |
322 | .ce | |
323 | Figure 8. Creation and use of a pseudo terminal. | |
324 | .sp | |
325 | .KE | |
200989e9 | 326 | .NH 2 |
d60e8dff | 327 | Selecting specific protocols |
200989e9 | 328 | .PP |
d60e8dff MK |
329 | If the third argument to the \fIsocket\fP call is 0, |
330 | \fIsocket\fP will select a default protocol to use with | |
331 | the returned socket of the type requested. This | |
332 | protocol should be correct for almost every situation. | |
333 | Still, it is conceivable that the user may wish to specify | |
334 | a particular protocol for use with a given socket. | |
335 | .PP | |
336 | To obtain a particular protocol one selects the protocol number, | |
337 | as defined within the communication domain. For the Internet | |
338 | domain the available protocols are defined in <\fInetinet/in.h\fP> | |
339 | or, better yet, one may use one of the library routines | |
340 | discussed in section 3, such as \fIgetprotobyname\fP: | |
341 | .DS | |
342 | #include <sys/types.h> | |
343 | #include <sys/socket.h> | |
344 | #include <netinet/in.h> | |
345 | #include <netdb.h> | |
346 | ... | |
347 | pp = getprotobyname("newtcp"); | |
348 | s = socket(AF_INET, SOCK_STREAM, pp->p_proto); | |
349 | .DE | |
350 | This would result in a socket \fIs\fP using a stream | |
351 | based connection, but with protocol type of ``newtcp'' | |
352 | instead of the default ``tcp.'' | |
353 | .PP | |
354 | In the NS domain, the available socket protocols are defined in | |
355 | <\fInetns/ns.h\fP>. To create a raw socket for Xerox Error Protocol | |
356 | messages, one might use: | |
357 | .DS | |
358 | #include <sys/types.h> | |
359 | #include <sys/socket.h> | |
360 | #include <netns/ns.h> | |
361 | ... | |
362 | s = socket(AF_NS, SOCK_RAW, NSPROTO_ERROR); | |
363 | .DE | |
364 | .NH 2 | |
365 | Address binding | |
366 | .PP | |
367 | As was mentioned in section 2, | |
368 | binding addresses to sockets in the Internet and NS domains can be | |
369 | fairly complex. As a brief reminder, these associations | |
370 | are composed of local and foreign | |
200989e9 | 371 | addresses, and local and foreign ports. Port numbers are |
d60e8dff MK |
372 | allocated out of separate spaces, one for each system and one |
373 | for each domain on that system. | |
374 | Through the \fIbind\fP system call, a | |
375 | process may specify half of an association, the | |
376 | <local address, local port> part, while the | |
377 | \fIconnect\fP | |
378 | and \fIaccept\fP | |
379 | primitives are used to complete a socket's association by | |
380 | specifying the <foreign address, foreign port> part. | |
200989e9 | 381 | Since the association is created in two steps the association |
d60e8dff | 382 | uniqueness requirement indicated previously could be violated unless |
200989e9 MK |
383 | care is taken. Further, it is unrealistic to expect user |
384 | programs to always know proper values to use for the local address | |
385 | and local port since a host may reside on multiple networks and | |
386 | the set of allocated port numbers is not directly accessible | |
387 | to a user. | |
388 | .PP | |
d60e8dff | 389 | To simplify local address binding in the Internet domain the notion of a |
200989e9 MK |
390 | \*(lqwildcard\*(rq address has been provided. When an address |
391 | is specified as INADDR_ANY (a manifest constant defined in | |
392 | <netinet/in.h>), the system interprets the address as | |
393 | \*(lqany valid address\*(rq. For example, to bind a specific | |
394 | port number to a socket, but leave the local address unspecified, | |
395 | the following code might be used: | |
396 | .DS | |
397 | #include <sys/types.h> | |
398 | #include <netinet/in.h> | |
399 | ... | |
400 | struct sockaddr_in sin; | |
401 | ... | |
402 | s = socket(AF_INET, SOCK_STREAM, 0); | |
403 | sin.sin_family = AF_INET; | |
d60e8dff MK |
404 | sin.sin_addr.s_addr = htonl(INADDR_ANY); |
405 | sin.sin_port = htons(MYPORT); | |
406 | bind(s, (struct sockaddr *) &sin, sizeof (sin)); | |
200989e9 MK |
407 | .DE |
408 | Sockets with wildcarded local addresses may receive messages | |
409 | directed to the specified port number, and addressed to any | |
d60e8dff MK |
410 | of the possible addresses assigned to a host. For example, |
411 | if a host is on networks 128.32 and 10, a socket is bound as | |
200989e9 MK |
412 | above, and an accept call is then performed, the process will be |
413 | able to accept connection requests which arrive either from | |
d60e8dff MK |
414 | network 128.32 or network 10. |
415 | If a server process wished to only allow hosts on a | |
416 | given network to connect to it, it would bind | |
417 | the address of the host on the appropriate network. Such | |
418 | an address could perhaps be determined by a routine | |
419 | such as \fIgethostbynameandnet\fP, as mentioned in section 3. | |
200989e9 MK |
420 | .PP |
421 | In a similar fashion, a local port may be left unspecified | |
422 | (specified as zero), in which case the system will select an | |
d60e8dff MK |
423 | appropriate port number for it. This shortcut will work |
424 | both in the Internet and NS domains. For example, to | |
425 | bind a specific local address to a socket, but to leave the | |
426 | local port number unspecified: | |
200989e9 | 427 | .DS |
d60e8dff MK |
428 | hp = gethostbyname(hostname); |
429 | if (hp == NULL) { | |
430 | ... | |
431 | } | |
432 | bcopy(hp->h_addr, (char *) &sin.sin_addr, hp->h_length); | |
433 | sin.sin_port = htons(0); | |
434 | bind(s, (struct sockaddr *) &sin, sizeof (sin)); | |
200989e9 | 435 | .DE |
d60e8dff MK |
436 | The system selects the local port number based on two criteria. |
437 | The first is that on 4BSD systems, | |
438 | local ports numbered 0 through 1023 (for the Xerox domain, | |
439 | 0 through 3000) are reserved | |
440 | for privileged users (i.e., the super user). The second is | |
200989e9 | 441 | that the port number is not currently bound to some other |
d60e8dff MK |
442 | socket. In order to find a free Internet port number in the privileged |
443 | range the \fIrresvport\fP library routine may be used as follows | |
444 | to return a stream socket with a privileged port number: | |
200989e9 | 445 | .DS |
d60e8dff MK |
446 | int lport = IPPORT_RESERVED \- 1; |
447 | int s; | |
448 | ... | |
449 | s = rresvport(&lport); | |
450 | if (s < 0) { | |
451 | if (errno == EAGAIN) | |
452 | fprintf(stderr, "socket: all ports in use\en"); | |
453 | else | |
454 | perror("rresvport: socket"); | |
455 | ... | |
200989e9 MK |
456 | } |
457 | .DE | |
458 | The restriction on allocating ports was done to allow processes | |
459 | executing in a \*(lqsecure\*(rq environment to perform authentication | |
d60e8dff MK |
460 | based on the originating address and port number. For example, |
461 | the \fIrlogin\fP(1) command allows users to log in across a network | |
462 | without being asked for a password, if two conditions hold: | |
463 | First, the name of the system the user | |
464 | is logging in from is in the file | |
465 | \fI/etc/hosts.equiv\fP on the system he is logging | |
466 | in to (or the system name and the user name are in | |
467 | the user's \fI.rhosts\fP file in the user's home | |
468 | directory), and second, that the user's rlogin | |
469 | process is coming from a privileged port on the machine he is | |
470 | logging in from. The port number and network address of the | |
471 | machine the user is logging in from can be determined either | |
472 | by the \fIfrom\fP value-result parameter to the \fIaccept\fP call, or | |
473 | from the \fIgetpeername\fP call. | |
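The port half of that test can be sketched as follows. This is an illustration only: \fIfrom_privileged_port\fP is a hypothetical name, the argument is assumed to be the address filled in by \fIaccept\fP or \fIgetpeername\fP, and a real server performs further checks before trusting a connection.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/*
 * Hypothetical helper: return 1 if the peer address is an Internet
 * address whose port lies in the reserved range (below
 * IPPORT_RESERVED), and 0 otherwise.  The port is stored in network
 * byte order, hence the ntohs conversion.
 */
int from_privileged_port(const struct sockaddr_in *from)
{
    return from->sin_family == AF_INET &&
        ntohs(from->sin_port) < IPPORT_RESERVED;
}
```

The check is only meaningful when the originating host itself restricts reserved ports to privileged users, which is the ``secure'' environment the text assumes.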
200989e9 MK |
474 | .PP |
475 | In certain cases the algorithm used by the system in selecting | |
476 | port numbers is unsuitable for an application. This is due to | |
477 | associations being created in a two step process. For example, | |
478 | the Internet file transfer protocol, FTP, specifies that data | |
479 | connections must always originate from the same local port. However, | |
480 | duplicate associations are avoided by connecting to different foreign | |
481 | ports. In this situation the system would disallow binding the | |
482 | same local address and port number to a socket if a previous data | |
483 | connection's socket still existed. To override the default port | |
484 | selection algorithm, an option call must be performed prior | |
485 | to address binding: | |
486 | .DS | |
d60e8dff MK |
487 | ... |
488 | int on = 1; | |
489 | ... | |
490 | setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)); | |
491 | bind(s, (struct sockaddr *) &sin, sizeof (sin)); | |
200989e9 MK |
492 | .DE |
493 | With the above call, local addresses may be bound which | |
494 | are already in use. This does not violate the uniqueness | |
495 | requirement as the system still checks at connect time to | |
496 | be sure any other sockets with the same local address and | |
497 | port do not have the same foreign address and port (if an | |
498 | association already exists, the error EADDRINUSE is returned). | |
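Since the option must be in place before \fIbind\fP is called, it can be useful to confirm that it took effect. The sketch below sets the option and reads it back with \fIgetsockopt\fP; the helper names \fIenable_reuseaddr\fP and \fIreuseaddr_enabled\fP are hypothetical.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: turn on SO_REUSEADDR; returns 0 or -1. */
int enable_reuseaddr(int s)
{
    int on = 1;

    return setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));
}

/*
 * Hypothetical helper: return 1 if SO_REUSEADDR is set on s, 0 if it
 * is clear, or -1 if the option could not be read.
 */
int reuseaddr_enabled(int s)
{
    int value = 0;
    socklen_t len = sizeof(value);

    if (getsockopt(s, SOL_SOCKET, SO_REUSEADDR, &value, &len) < 0)
        return -1;
    return value != 0;
}
```

A server would call \fIenable_reuseaddr\fP immediately after \fIsocket\fP and before \fIbind\fP, in the same order as the fragment above.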
200989e9 MK |
499 | .NH 2 |
500 | Broadcasting and datagram sockets | |
501 | .PP | |
502 | By using a datagram socket it is possible to send broadcast | |
503 | packets on many networks supported by the system (the network | |
504 | itself must support the notion of broadcasting; the system | |
505 | provides no broadcast simulation in software). Broadcast | |
506 | messages can place a high load on a network since they force | |
507 | every host on the network to service them. Consequently, | |
d60e8dff MK |
508 | the ability to send broadcast packets has been limited |
509 | to sockets which are explicitly marked as allowing broadcasting. | |
200989e9 | 510 | .PP |
d60e8dff | 511 | To send a broadcast message, a datagram socket |
200989e9 MK |
512 | should be created: |
513 | .DS | |
514 | s = socket(AF_INET, SOCK_DGRAM, 0); | |
515 | .DE | |
d60e8dff MK |
516 | or |
517 | .DS | |
518 | s = socket(AF_NS, SOCK_DGRAM, 0); | |
519 | .DE | |
520 | The socket is marked as allowing broadcasting, | |
521 | .DS | |
522 | int on = 1; | |
523 | ||
524 | setsockopt(s, SOL_SOCKET, SO_BROADCAST, &on, sizeof (on)); | |
525 | .DE | |
200989e9 MK |
526 | and at least a port number should be bound to the socket: |
527 | .DS | |
528 | sin.sin_family = AF_INET; | |
d60e8dff MK |
529 | sin.sin_addr.s_addr = htonl(INADDR_ANY); |
530 | sin.sin_port = htons(MYPORT); | |
531 | bind(s, (struct sockaddr *) &sin, sizeof (sin)); | |
532 | .DE | |
533 | or, for the NS domain, | |
534 | .DS | |
535 | sns.sns_family = AF_NS; | |
536 | netnum = htonl(net); | |
537 | sns.sns_addr.x_net = *(union ns_net *) &netnum; /* insert net number */ | |
538 | sns.sns_addr.x_port = htons(MYPORT); | |
539 | bind(s, (struct sockaddr *) &sns, sizeof (sns)); | |
540 | .DE | |
541 | The destination address of the message to be broadcast | |
542 | depends on the network the message is to be broadcast | |
543 | on, and therefore requires some knowledge of the networks | |
544 | to which the host is connected. Since this information should | |
545 | be obtained in a host-independent fashion, 4.3BSD provides a method of | |
546 | retrieving this information from the system data structures. | |
547 | The SIOCGIFCONF \fIioctl\fP call returns the interface | |
548 | configuration of a host in the form of a | |
549 | single \fIifconf\fP structure; this structure contains | |
550 | a ``data area'' which is made up of an array of | |
551 | \fIifreq\fP structures, one for each network interface | |
552 | to which the host is connected. | |
553 | These structures are defined in | |
554 | \fI<net/if.h>\fP as follows: | |
555 | .DS | |
556 | .if t .ta .5i 1.0i 1.5i 3.5i | |
557 | .if n .ta .7i 1.4i 2.1i 3.4i | |
558 | struct ifconf { | |
559 | int ifc_len; /* size of associated buffer */ | |
560 | union { | |
561 | caddr_t ifcu_buf; | |
562 | struct ifreq *ifcu_req; | |
563 | } ifc_ifcu; | |
564 | }; | |
565 | ||
566 | #define ifc_buf ifc_ifcu.ifcu_buf /* buffer address */ | |
567 | #define ifc_req ifc_ifcu.ifcu_req /* array of structures returned */ | |
568 | ||
569 | #define IFNAMSIZ 16 | |
570 | ||
571 | struct ifreq { | |
572 | char ifr_name[IFNAMSIZ]; /* if name, e.g. "en0" */ | |
573 | union { | |
574 | struct sockaddr ifru_addr; | |
575 | struct sockaddr ifru_dstaddr; | |
576 | struct sockaddr ifru_broadaddr; | |
577 | short ifru_flags; | |
578 | caddr_t ifru_data; | |
579 | } ifr_ifru; | |
580 | }; | |
581 | ||
582 | .if t .ta \w' #define'u +\w' ifr_broadaddr'u +\w' ifr_ifru.ifru_broadaddr'u | |
583 | #define ifr_addr ifr_ifru.ifru_addr /* address */ | |
584 | #define ifr_dstaddr ifr_ifru.ifru_dstaddr /* other end of p-to-p link */ | |
585 | #define ifr_broadaddr ifr_ifru.ifru_broadaddr /* broadcast address */ | |
586 | #define ifr_flags ifr_ifru.ifru_flags /* flags */ | |
587 | #define ifr_data ifr_ifru.ifru_data /* for use by interface */ | |
588 | .DE | |
589 | The actual call which obtains the | |
590 | interface configuration is | |
591 | .DS | |
592 | struct ifconf ifc; | |
593 | char buf[BUFSIZ]; | |
594 | ||
595 | ifc.ifc_len = sizeof (buf); | |
596 | ifc.ifc_buf = buf; | |
597 | if (ioctl(s, SIOCGIFCONF, (char *) &ifc) < 0) { | |
598 | ... | |
599 | } | |
200989e9 | 600 | .DE |
d60e8dff MK |
601 | After this call \fIbuf\fP will contain one \fIifreq\fP structure for |
602 | each network to which the host is connected, and | |
603 | \fIifc.ifc_len\fP will have been modified to reflect the number | |
604 | of bytes used by the \fIifreq\fP structures. | |
605 | .PP | |
606 | For each structure | |
607 | there exists a set of ``interface flags'' which tell | |
608 | whether the network corresponding to that interface is | |
609 | up or down, point to point or broadcast, etc. The | |
610 | SIOCGIFFLAGS \fIioctl\fP retrieves these | |
611 | flags for an interface specified by an \fIifreq\fP | |
612 | structure as follows: | |
200989e9 | 613 | .DS |
d60e8dff MK |
614 | struct ifreq *ifr; |
615 | ||
616 | ifr = ifc.ifc_req; | |
617 | ||
618 | for (n = ifc.ifc_len / sizeof (struct ifreq); --n >= 0; ifr++) { | |
619 | /* | |
620 | * We must be careful that we don't use an interface | |
621 | * devoted to an address family other than our own; | |
622 | * if we were interested in NS interfaces, the | |
623 | * AF_INET would be AF_NS. | |
624 | */ | |
625 | if (ifr->ifr_addr.sa_family != AF_INET) | |
626 | continue; | |
627 | if (ioctl(s, SIOCGIFFLAGS, (char *) ifr) < 0) { | |
628 | ... | |
629 | } | |
630 | if ((ifr->ifr_flags & IFF_UP) == 0 || /* Skip boring cases */ | |
631 | (ifr->ifr_flags & (IFF_BROADCAST | IFF_POINTOPOINT)) == 0) | |
632 | continue; | |
200989e9 | 633 | .DE |
d60e8dff MK |
634 | .PP |
635 | Once the flags have been obtained, the broadcast address | |
636 | must be obtained. In the case of broadcast networks this is | |
637 | done via the SIOCGIFBRDADDR \fIioctl\fP, while for point-to-point networks | |
638 | the address of the destination host is obtained with SIOCGIFDSTADDR. | |
639 | .DS | |
640 | struct sockaddr dst; | |
641 | ||
642 | if (ifr->ifr_flags & IFF_POINTOPOINT) { | |
643 | if (ioctl(s, SIOCGIFDSTADDR, (char *) ifr) < 0) { | |
644 | ... | |
645 | } | |
646 | bcopy((char *) &ifr->ifr_dstaddr, (char *) &dst, sizeof (ifr->ifr_dstaddr)); | |
647 | } else if (ifr->ifr_flags & IFF_BROADCAST) { | |
648 | if (ioctl(s, SIOCGIFBRDADDR, (char *) ifr) < 0) { | |
649 | ... | |
650 | } | |
651 | bcopy((char *) &ifr->ifr_broadaddr, (char *) &dst, sizeof (ifr->ifr_broadaddr)); | |
652 | } | |
653 | .DE | |
654 | .PP | |
655 | After the appropriate \fIioctl\fPs have obtained the broadcast | |
656 | or destination address (now in \fIdst\fP), the \fIsendto\fP call may be | |
657 | used: | |
200989e9 | 658 | .DS |
d60e8dff MK |
659 | sendto(s, buf, buflen, 0, (struct sockaddr *)&dst, sizeof (dst)); |
660 | } | |
200989e9 | 661 | .DE |
d60e8dff MK |
662 | In the above loop one \fIsendto\fP occurs for every |
663 | interface the host is connected to which supports the notion of | |
664 | broadcast or point-to-point addressing. | |
665 | If a process only wished to send broadcast | |
666 | messages on a given network, code similar to that outlined above | |
667 | would be used, but the loop would need to find the | |
668 | correct destination address. | |
200989e9 MK |
669 | .PP |
670 | Received broadcast messages contain the sender's address | |
d60e8dff MK |
671 | and port, as datagram sockets are bound before |
672 | a message is allowed to go out. | |
200989e9 | 673 | .NH 2 |
d60e8dff | 674 | Socket Options |
200989e9 | 675 | .PP |
d60e8dff MK |
676 | It is possible to set and get a number of options on sockets |
677 | via the \fIsetsockopt\fP and \fIgetsockopt\fP system calls. | |
678 | These options include such things as marking a socket for | |
679 | broadcasting, not to route, to linger on close, etc. | |
680 | The general forms of the calls are: | |
200989e9 | 681 | .DS |
d60e8dff MK |
682 | setsockopt(s, level, optname, optval, optlen); |
683 | .DE | |
684 | and | |
685 | .DS | |
686 | getsockopt(s, level, optname, optval, optlen); | |
687 | .DE | |
688 | .PP | |
689 | The parameters to the calls are as follows: \fIs\fP | |
690 | is the socket on which the option is to be applied. | |
691 | \fILevel\fP specifies the protocol layer on which the | |
692 | option is to be applied; in most cases this is | |
693 | the ``socket level'', indicated by the symbolic constant | |
694 | SOL_SOCKET, defined in \fI<sys/socket.h>.\fP | |
695 | The actual option is specified in \fIoptname\fP, and is | |
696 | a symbolic constant also defined in \fI<sys/socket.h>\fP. | |
697 | \fIOptval\fP points to the value of the | |
698 | option (in most cases, whether the option is to be turned | |
699 | on or off), and \fIoptlen\fP gives the length of | |
700 | that value. | |
701 | For \fIgetsockopt\fP, \fIoptlen\fP is | |
702 | a value-result parameter, initially set to the size of | |
703 | the storage area pointed to by \fIoptval\fP, and modified | |
704 | upon return to indicate the actual amount of storage used. | |
705 | .PP | |
706 | An example should help clarify things. It is sometimes | |
707 | useful to determine the type (e.g., stream, datagram, etc.) | |
708 | of an existing socket; programs | |
709 | under \fIinetd\fP (described below) may need to perform this | |
710 | task. This can be accomplished as follows via the | |
711 | SO_TYPE socket option and the \fIgetsockopt\fP call: | |
712 | .DS | |
713 | #include <sys/types.h> | |
714 | #include <sys/socket.h> | |
200989e9 | 715 | |
d60e8dff MK |
716 | int type, size; |
717 | ||
718 | size = sizeof (int); | |
719 | ||
720 | if (getsockopt(s, SOL_SOCKET, SO_TYPE, (char *) &type, &size) < 0) { | |
200989e9 MK |
721 | ... |
722 | } | |
d60e8dff MK |
723 | .DE |
724 | After the \fIgetsockopt\fP call, \fItype\fP will be set | |
725 | to the value of the socket type, as defined in | |
726 | \fI<sys/socket.h>\fP. If, for example, the socket were | |
727 | a datagram socket, \fItype\fP would have the value | |
728 | corresponding to SOCK_DGRAM. | |
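The two calls described above can be wrapped in a pair of small helpers. The following is only a sketch: the names set_bool_option and get_int_option are hypothetical and not part of any system library, and the socklen_t type is an assumption about current systems (the text above uses a plain int for the length).

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: turn a boolean socket-level option on or off.
 * Returns 0 on success, -1 on failure (errno is set by setsockopt).
 */
int
set_bool_option(int s, int optname, int on)
{
	assert(s >= 0);
	return (setsockopt(s, SOL_SOCKET, optname, (char *) &on, sizeof (on)));
}

/*
 * Hypothetical helper: fetch an integer-valued socket-level option.
 * The length argument is value-result: it starts as the size of the
 * storage and is updated by getsockopt to the amount actually used.
 * Returns 0 on success, -1 on failure.
 */
int
get_int_option(int s, int optname, int *value)
{
	socklen_t size = sizeof (*value);	/* assumed modern length type */

	assert(s >= 0 && value != NULL);
	return (getsockopt(s, SOL_SOCKET, optname, (char *) value, &size));
}
```

A caller might mark a datagram socket for broadcasting with set_bool_option(s, SO_BROADCAST, 1) and later confirm the socket type with get_int_option(s, SO_TYPE, &type).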
729 | .NH 2 | |
730 | NS Packet Sequences | |
731 | .PP | |
732 | The semantics of NS connections demand that | |
733 | the user both be able to look inside the network header associated | |
734 | with any incoming packet and be able to specify what should go | |
735 | in certain fields of an outgoing packet. The header of an | |
736 | IDP-level packet looks like: | |
737 | .DS | |
738 | .if t .ta \w'struct 'u +\w" struct ns_addr"u +2.0i | |
739 | struct idp { | |
740 | u_short idp_sum; /* Checksum */ | |
741 | u_short idp_len; /* Length, in bytes, including header */ | |
742 | u_char idp_tc; /* Transport Control (i.e., hop count) */ | |
743 | u_char idp_pt; /* Packet Type (i.e., level 2 protocol) */ | |
744 | struct ns_addr idp_dna; /* Destination Network Address */ | |
745 | struct ns_addr idp_sna; /* Source Network Address */ | |
746 | }; | |
747 | .DE | |
748 | Most of the fields are filled in automatically; the only | |
749 | field that the user should be concerned with is the | |
750 | \fIpacket type\fP field. The standard values for this | |
751 | field are (as defined in <\fInetns/ns.h\fP>): | |
752 | .DS | |
753 | .if t .ta \w" #define"u +\w" NSPROTO_ERROR"u +1.0i | |
754 | #define NSPROTO_RI 1 /* Routing Information */ | |
755 | #define NSPROTO_ECHO 2 /* Echo Protocol */ | |
756 | #define NSPROTO_ERROR 3 /* Error Protocol */ | |
757 | #define NSPROTO_PE 4 /* Packet Exchange */ | |
758 | #define NSPROTO_SPP 5 /* Sequenced Packet */ | |
759 | .DE | |
760 | For SPP connections, the contents of this field are | |
761 | automatically set to NSPROTO_SPP; for IDP packets, | |
762 | this value defaults to zero, which means ``unknown''. | |
763 | .PP | |
764 | The contents of a SPP header (minus the IDP header) are: | |
765 | .DS | |
766 | .if t .ta \w" #define"u +\w" u_short"u +2.0i | |
767 | struct sphdr { | |
768 | u_char sp_cc; /* connection control */ | |
769 | #define SP_SP 0x80 /* system packet */ | |
770 | #define SP_SA 0x40 /* send acknowledgement */ | |
771 | #define SP_OB 0x20 /* attention (out of band data) */ | |
772 | #define SP_EM 0x10 /* end of message */ | |
773 | u_char sp_dt; /* datastream type */ | |
774 | u_short sp_sid; /* source connection identifier */ | |
775 | u_short sp_did; /* destination connection identifier */ | |
776 | u_short sp_seq; /* sequence number */ | |
777 | u_short sp_ack; /* acknowledge number */ | |
778 | u_short sp_alo; /* allocation number */ | |
779 | }; | |
780 | .DE | |
781 | Here, the items of interest are the \fIdatastream type\fP and | |
782 | the \fIconnection control\fP fields. The semantics of the | |
783 | datastream type are defined by the application(s) in question; | |
784 | the value of this field is, by default, zero, but it can be | |
785 | used to indicate things such as Xerox's Bulk Data Transfer | |
786 | Protocol (in which case it is set to one). The connection control | |
787 | field is a mask of the flags defined above. The user may | |
788 | set or clear the end-of-message bit to indicate | |
789 | that a given message is the last of a given substream type, | |
790 | or may set/clear the attention bit as an alternate way to | |
791 | indicate that a packet should be sent out-of-band. | |
792 | .PP | |
793 | Using different calls to \fIsetsockopt\fP, it is possible | |
794 | to indicate whether prototype headers will be associated by | |
795 | the user with each outgoing packet (SO_HEADERS_ON_OUTPUT), | |
796 | to indicate whether the headers received by the system should be | |
797 | delivered to the user (SO_HEADERS_ON_INPUT), or to indicate | |
798 | default information that should be associated with all | |
799 | outgoing packets on a given socket (SO_DEFAULT_HEADERS). | |
800 | For example, to associate prototype headers with outgoing | |
801 | SPP packets, one might use: | |
802 | .DS | |
803 | #include <sys/types.h> | |
804 | #include <sys/socket.h> | |
805 | #include <netns/ns.h> | |
806 | #include <netns/sp.h> | |
200989e9 | 807 | ... |
d60e8dff MK |
808 | struct sockaddr_ns sns, to; |
809 | int s, on = 1; | |
810 | struct databuf { | |
811 | struct sphdr proto_spp; /* prototype header */ | |
812 | char buf[534]; /* max. possible data by Xerox std. */ | |
813 | } buf; | |
814 | ... | |
815 | s = socket(AF_NS, SOCK_SEQPACKET, 0); | |
816 | ... | |
817 | bind(s, (struct sockaddr *) &sns, sizeof (sns)); | |
818 | setsockopt(s, NSPROTO_SPP, SO_HEADERS_ON_OUTPUT, &on, sizeof(on)); | |
819 | ... | |
820 | buf.proto_spp.sp_dt = 1; /* bulk data */ | |
821 | buf.proto_spp.sp_cc = SP_EM; /* end-of-message */ | |
822 | strcpy(buf.buf, "hello world\en"); | |
823 | sendto(s, (char *) &buf, sizeof(struct sphdr) + strlen("hello world\en"), | |
824 | (struct sockaddr *) &to, sizeof(to)); | |
825 | ... | |
826 | .DE | |
827 | Note that one must be careful when writing headers; if the prototype | |
828 | header is not written with the data with which it is to be associated, | |
829 | the kernel will treat the first few bytes of the data as the | |
830 | header, with unpredictable results. | |
831 | To turn off the above association, and to indicate that packet | |
832 | headers received by the system should be passed up to the user, | |
833 | one might use: | |
834 | .DS | |
835 | #include <sys/types.h> | |
836 | #include <sys/socket.h> | |
837 | #include <netns/ns.h> | |
838 | #include <netns/sp.h> | |
839 | ... | |
840 | struct sockaddr_ns sns; | |
841 | int s, on = 1, off = 0; | |
842 | ... | |
843 | s = socket(AF_NS, SOCK_SEQPACKET, 0); | |
844 | ... | |
845 | bind(s, (struct sockaddr *) &sns, sizeof (sns)); | |
846 | setsockopt(s, NSPROTO_SPP, SO_HEADERS_ON_OUTPUT, &off, sizeof(off)); | |
847 | setsockopt(s, NSPROTO_SPP, SO_HEADERS_ON_INPUT, &on, sizeof(on)); | |
848 | ... | |
849 | .DE | |
850 | To indicate a default prototype header to be associated with | |
851 | the outgoing packets on an IDP datagram socket, one would use: | |
852 | .DS | |
853 | #include <sys/types.h> | |
854 | #include <sys/socket.h> | |
855 | #include <netns/ns.h> | |
856 | #include <netns/idp.h> | |
857 | ... | |
858 | struct sockaddr_ns sns; | |
859 | struct idp proto_idp; /* prototype header */ | |
860 | int s, on = 1; | |
861 | ... | |
862 | s = socket(AF_NS, SOCK_DGRAM, 0); | |
863 | ... | |
864 | bind(s, (struct sockaddr *) &sns, sizeof (sns)); | |
865 | proto_idp.idp_pt = NSPROTO_PE; /* packet exchange */ | |
866 | setsockopt(s, NSPROTO_IDP, SO_DEFAULT_HEADERS, (char *) &proto_idp, | |
867 | sizeof(proto_idp)); | |
868 | ... | |
869 | .DE | |
870 | .NH 2 | |
871 | Three-way Handshake | |
872 | .PP | |
873 | The semantics of SPP connections indicate that a three-way | |
874 | handshake, involving changes in the datastream type, should \(em | |
875 | but is not absolutely required to \(em take place before a SPP | |
876 | connection is closed. Almost all SPP connections are | |
877 | ``well-behaved'' in this manner; when communicating with | |
878 | any process, it is best to assume that the three-way handshake | |
879 | is required unless it is known for certain that it is not | |
880 | required. In a three-way close, the closing process | |
881 | indicates that it wishes to close the connection by sending | |
882 | a zero-length packet with end-of-message set and with | |
883 | datastream type 254. The other side of the connection | |
884 | indicates that it is OK to close by sending a zero-length | |
885 | packet with end-of-message set and datastream type 255. Finally, | |
886 | the closing process replies with a zero-length packet with | |
887 | datastream type 255; at this point, the connection is considered | |
888 | closed. The following code fragments are simplified examples | |
889 | of how one might handle this three-way handshake at the user | |
890 | level; in the future, support for this type of close will | |
891 | probably be provided as part of the C library or as part of | |
892 | the kernel. The first code fragment below illustrates how a process | |
893 | might handle the three-way handshake if it sees that the process it | |
894 | is communicating with wants to close the connection: | |
895 | .DS | |
896 | #include <sys/types.h> | |
897 | #include <sys/socket.h> | |
898 | #include <netns/ns.h> | |
899 | #include <netns/sp.h> | |
900 | ... | |
901 | #ifndef SPPSST_END | |
902 | #define SPPSST_END 254 | |
903 | #define SPPSST_ENDREPLY 255 | |
904 | #endif | |
905 | struct sphdr proto_sp; | |
906 | int s; | |
907 | ... | |
908 | read(s, buf, BUFSIZE); | |
909 | if (((struct sphdr *)buf)->sp_dt == SPPSST_END) { | |
910 | /* | |
911 | * SPPSST_END indicates that the other side wants to | |
912 | * close. | |
913 | */ | |
914 | proto_sp.sp_dt = SPPSST_ENDREPLY; | |
915 | proto_sp.sp_cc = SP_EM; | |
916 | setsockopt(s, NSPROTO_SPP, SO_DEFAULT_HEADERS, (char *)&proto_sp, | |
917 | sizeof(proto_sp)); | |
918 | write(s, buf, 0); | |
919 | /* | |
920 | * Write a zero-length packet with datastream type = SPPSST_ENDREPLY | |
921 | * to indicate that the close is OK with us. The packet that we | |
922 | * don't see (because we don't look for it) is another packet | |
923 | * from the other side of the connection, with SPPSST_ENDREPLY | |
924 | * on it, too. Once that packet is sent, the connection is | |
925 | * considered closed; note that we really ought to retransmit | |
926 | * the close for some time if we do not get a reply. | |
927 | */ | |
928 | close(s); | |
929 | } | |
930 | ... | |
931 | .DE | |
932 | To indicate to another process that we would like to close the | |
933 | connection, the following code would suffice: | |
934 | .DS | |
935 | #include <sys/types.h> | |
936 | #include <sys/socket.h> | |
937 | #include <netns/ns.h> | |
938 | #include <netns/sp.h> | |
939 | ... | |
940 | #ifndef SPPSST_END | |
941 | #define SPPSST_END 254 | |
942 | #define SPPSST_ENDREPLY 255 | |
943 | #endif | |
944 | struct sphdr proto_sp; | |
945 | int s; | |
946 | ... | |
947 | proto_sp.sp_dt = SPPSST_END; | |
948 | proto_sp.sp_cc = SP_EM; | |
949 | setsockopt(s, NSPROTO_SPP, SO_DEFAULT_HEADERS, (char *)&proto_sp, | |
950 | sizeof(proto_sp)); | |
951 | write(s, buf, 0); /* send the end request */ | |
952 | proto_sp.sp_dt = SPPSST_ENDREPLY; | |
953 | setsockopt(s, NSPROTO_SPP, SO_DEFAULT_HEADERS, (char *)&proto_sp, | |
954 | sizeof(proto_sp)); | |
955 | /* | |
956 | * We assume (perhaps unwisely) | |
957 | * that the other side will send the | |
958 | * ENDREPLY, so we'll just send our final ENDREPLY | |
959 | * as if we'd seen theirs already. | |
960 | */ | |
961 | write(s, buf, 0); | |
962 | close(s); | |
963 | ... | |
964 | .DE | |
965 | .NH 2 | |
966 | Packet Exchange | |
967 | .PP | |
968 | The Xerox standard protocols include a protocol that is both | |
969 | reliable and datagram-oriented. This protocol is known as | |
970 | Packet Exchange (PEX or PE) and, like SPP, is layered on top | |
971 | of IDP. PEX is important for a number of things: Courier | |
972 | remote procedure calls may be expedited through the use | |
973 | of PEX, and many Xerox servers are located by doing a PEX | |
974 | ``BroadcastForServers'' operation. Although there is no | |
975 | implementation of PEX in the kernel, | |
976 | it may be simulated at the user level with some clever coding | |
977 | and the use of one peculiar \fIgetsockopt\fP. A PEX packet | |
978 | looks like: | |
979 | .DS | |
980 | .if t .ta \w'struct 'u +\w" struct idp"u +2.0i | |
981 | /* | |
982 | * The packet-exchange header shown here is not defined | |
983 | * as part of any of the system include files. | |
984 | */ | |
985 | struct pex { | |
986 | struct idp p_idp; /* idp header */ | |
987 | u_short ph_id[2]; /* unique transaction ID for pex */ | |
988 | u_short ph_client; /* client type field for pex */ | |
989 | }; | |
990 | .DE | |
991 | The \fIph_id\fP field is used to hold a ``unique id'' that | |
992 | is used in duplicate suppression; the \fIph_client\fP | |
993 | field indicates the PEX client type (similar to the packet | |
994 | type field in the IDP header). PEX reliability stems from the | |
995 | fact that it is an idempotent (``I send a packet to you, you | |
996 | send a packet to me'') protocol. Processes on each side of | |
997 | the connection may use the unique id to determine if they have | |
998 | seen a given packet before (the unique id field differs on each | |
999 | packet sent) so that duplicates may be detected, and to indicate | |
1000 | which message a given packet is in response to. If a packet with | |
1001 | a given unique id is sent and no response is received in a given | |
1002 | amount of time, the packet is retransmitted until it is decided | |
1003 | that no response will ever be received. To simulate PEX, one | |
1004 | must be able to generate unique ids -- something that is hard to | |
1005 | do at the user level with any real guarantee that the id is really | |
1006 | unique. Therefore, a means (via \fIgetsockopt\fP) has been provided | |
1007 | for getting unique ids from the kernel. The following code fragment | |
1008 | indicates how to get a unique id: | |
1009 | .DS | |
1010 | long uniqueid; | |
1011 | int s, idsize = sizeof(uniqueid); | |
1012 | ... | |
1013 | s = socket(AF_NS, SOCK_DGRAM, 0); | |
1014 | ... | |
1015 | /* get id from the kernel -- only on IDP sockets */ | |
1016 | getsockopt(s, NSPROTO_PE, SO_SEQNO, (char *)&uniqueid, &idsize); | |
1017 | ... | |
1018 | .DE | |
1019 | The retransmission and duplicate suppression code required to | |
1020 | simulate PEX fully is left as an exercise for the reader. | |
1021 | .NH 2 | |
1022 | Non-Blocking Sockets | |
1023 | .PP | |
1024 | It is occasionally convenient to make use of sockets | |
1025 | which do not block; that is, i/o requests which | |
1026 | would take time and | |
1027 | would cause the process to wait for their completion are | |
1028 | not executed, and an error code is returned. | |
1029 | Once a socket has been created via | |
1030 | the \fIsocket\fP call, it may be marked as non-blocking | |
1031 | by \fIfcntl\fP as follows: | |
1032 | .DS | |
1033 | #include <fcntl.h> | |
1034 | ... | |
1035 | int s; | |
1036 | ... | |
1037 | s = socket(AF_INET, SOCK_STREAM, 0); | |
1038 | ... | |
1039 | if (fcntl(s, F_SETFL, FNDELAY) < 0) { | |
1040 | perror("fcntl F_SETFL, FNDELAY"); | |
1041 | exit(1); | |
200989e9 | 1042 | } |
d60e8dff | 1043 | ... |
200989e9 MK |
1044 | .DE |
1045 | .PP | |
d60e8dff MK |
1046 | When performing non-blocking i/o on sockets, one must be |
1047 | careful to check for the error EWOULDBLOCK (stored in the | |
1048 | global variable \fIerrno\fP), which occurs when | |
1049 | an operation would normally block, but the socket it | |
1050 | was performed on is marked as non-blocking. | |
1051 | In particular, \fIaccept\fP, \fIconnect\fP, \fIsend\fP, \fIrecv\fP, | |
1052 | \fIread\fP, and \fIwrite\fP can | |
1053 | all return EWOULDBLOCK, and processes should be prepared | |
1054 | to deal with such return codes. | |
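The check for EWOULDBLOCK described above might be packaged as follows. This is a sketch under stated assumptions: set_nonblocking and try_read are hypothetical names, O_NONBLOCK is used as the POSIX spelling of the FNDELAY flag shown earlier, and EAGAIN is also tested because some current systems report it in place of (or with the same value as) EWOULDBLOCK.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: mark descriptor s as non-blocking while
 * preserving any other status flags already set on it.
 * Returns 0 on success, -1 on failure.
 */
int
set_nonblocking(int s)
{
	int flags;

	assert(s >= 0);
	flags = fcntl(s, F_GETFL, 0);
	if (flags < 0)
		return (-1);
	return (fcntl(s, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0);
}

/*
 * Hypothetical helper: attempt a read without waiting.
 * Returns the number of bytes read (zero means end of file),
 * -2 if the read would have blocked, or -1 on any other error.
 */
long
try_read(int s, char *buf, size_t len)
{
	ssize_t n = read(s, buf, len);

	if (n < 0 && (errno == EWOULDBLOCK || errno == EAGAIN))
		return (-2);	/* no data yet; try again later */
	return ((long) n);
}
```

A return of -2 is not a failure; the caller would normally wait (for instance with \fIselect\fP) until the socket is readable and then retry.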
1055 | .NH 2 | |
1056 | Inetd | |
1057 | .PP | |
1058 | One of the daemons provided with 4.3BSD is \fIinetd\fP, the | |
1059 | so-called ``internet super-server.'' \fIInetd\fP is invoked at boot | |
1060 | time, and determines from the file \fI/etc/inetd.conf\fP the | |
1061 | servers for which it is to listen. Once this information has been | |
1062 | read and a pristine environment created, \fIinetd\fP proceeds | |
1063 | to create one socket for each service it is to listen for, | |
1064 | binding the appropriate port number to each socket. | |
1065 | .PP | |
1066 | \fIInetd\fP then performs a \fIselect\fP on all these | |
1067 | sockets for read availability, waiting for somebody wishing | |
1068 | a connection to the service corresponding to | |
1069 | that socket. \fIInetd\fP then performs an \fIaccept\fP on | |
1070 | the socket in question, \fIfork\fPs, \fIdup\fPs the new | |
1071 | socket to file descriptors 0 and 1 (stdin and | |
1072 | stdout), closes other open file | |
1073 | descriptors, and \fIexec\fPs the appropriate server. | |
1074 | .PP | |
1075 | Servers making use of \fIinetd\fP are considerably simplified, | |
1076 | as \fIinetd\fP takes care of the majority of the IPC work | |
1077 | required in establishing a connection. The server invoked | |
1078 | by \fIinetd\fP expects the socket connected to its client | |
1079 | on file descriptors 0 and 1, and may immediately perform | |
1080 | any operations such as \fIread\fP, \fIwrite\fP, \fIsend\fP, | |
1081 | or \fIrecv\fP. Indeed, servers may use | |
1082 | buffered i/o as provided by the ``stdio'' conventions, as | |
1083 | long as they remember to use \fIfflush\fP when appropriate. | |
1084 | .PP | |
1085 | One call which may be of interest to individuals writing | |
1086 | servers under \fIinetd\fP is the \fIgetpeername\fP call, | |
1087 | which returns the address of the peer (process) connected | |
1088 | on the other end of the socket. For example, to log the | |
1089 | Internet address in ``dot notation'' (e.g., ``128.32.0.4'') | |
1090 | of a client connected to a server under | |
1091 | \fIinetd\fP, the following code might be used: | |
1092 | .DS | |
1093 | struct sockaddr_in name; | |
1094 | int namelen = sizeof (name); | |
1095 | ... | |
1096 | if (getpeername(0, (struct sockaddr *)&name, &namelen) < 0) { | |
1097 | syslog(LOG_ERR, "getpeername: %m"); | |
1098 | exit(1); | |
1099 | } else | |
1100 | syslog(LOG_INFO, "Connection from %s", inet_ntoa(name.sin_addr)); | |
1101 | ... | |
1102 | .DE | |
1103 | While the \fIgetpeername\fP call is especially useful when | |
1104 | writing programs to run with \fIinetd\fP, it can be used | |
1105 | at any time. Be warned, however, that \fIgetpeername\fP will | |
1106 | fail on UNIX domain sockets, as their addresses (i.e., pathnames) | |
1107 | are inaccessible. |