Commit | Line | Data |
---|---|---|
15637ed4 RG |
1 | .\" Copyright (c) 1983, 1986 The Regents of the University of California. |
2 | .\" All rights reserved. | |
3 | .\" | |
4 | .\" Redistribution and use in source and binary forms, with or without | |
5 | .\" modification, are permitted provided that the following conditions | |
6 | .\" are met: | |
7 | .\" 1. Redistributions of source code must retain the above copyright | |
8 | .\" notice, this list of conditions and the following disclaimer. | |
9 | .\" 2. Redistributions in binary form must reproduce the above copyright | |
10 | .\" notice, this list of conditions and the following disclaimer in the | |
11 | .\" documentation and/or other materials provided with the distribution. | |
12 | .\" 3. All advertising materials mentioning features or use of this software | |
13 | .\" must display the following acknowledgement: | |
14 | .\" This product includes software developed by the University of | |
15 | .\" California, Berkeley and its contributors. | |
16 | .\" 4. Neither the name of the University nor the names of its contributors | |
17 | .\" may be used to endorse or promote products derived from this software | |
18 | .\" without specific prior written permission. | |
19 | .\" | |
20 | .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND | |
21 | .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE | |
22 | .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE | |
23 | .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE | |
24 | .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL | |
25 | .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS | |
26 | .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) | |
27 | .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT | |
28 | .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY | |
29 | .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF | |
30 | .\" SUCH DAMAGE. | |
31 | .\" | |
32 | .\" @(#)a.t 6.5 (Berkeley) 4/17/91 | |
33 | .\" | |
34 | .nr H2 1 | |
35 | .\".ds RH "Gateways and routing | |
36 | .br | |
37 | .ne 2i | |
38 | .NH | |
39 | \s+2Gateways and routing issues\s0 | |
40 | .PP | |
41 | The system has been designed with the expectation that it will | |
42 | be used in an internetwork environment. The ``canonical'' | |
43 | environment was envisioned to be a collection of local area | |
44 | networks connected at one or more points through hosts with | |
45 | multiple network interfaces (one on each local area network), | |
46 | and possibly a connection to a long haul network (for example, | |
47 | the ARPANET). In such an environment, issues of | |
48 | gatewaying and packet routing become very important. Certain | |
49 | of these issues, such as congestion | |
50 | control, have been handled in a simplistic manner or specifically | |
51 | not addressed. | |
52 | Instead, where possible, the network system | |
53 | attempts to provide simple mechanisms upon which more involved | |
54 | policies may be implemented. As some of these problems become | |
55 | better understood, the solutions developed will be incorporated | |
56 | into the system. | |
57 | .PP | |
58 | This section will describe the facilities provided for packet | |
59 | routing. The simplistic mechanisms provided for congestion | |
60 | control are described in chapter 12. | |
61 | .NH 2 | |
62 | Routing tables | |
63 | .PP | |
64 | The network system maintains a set of routing tables for | |
65 | selecting a network interface to use in delivering a | |
66 | packet to its destination. These tables are of the form: | |
67 | .DS | |
68 | .ta \w'struct 'u +\w'u_long 'u +\w'sockaddr rt_gateway; 'u | |
69 | struct rtentry { | |
70 | u_long rt_hash; /* hash key for lookups */ | |
71 | struct sockaddr rt_dst; /* destination net or host */ | |
72 | struct sockaddr rt_gateway; /* forwarding agent */ | |
73 | short rt_flags; /* see below */ | |
74 | short rt_refcnt; /* no. of references to structure */ | |
75 | u_long rt_use; /* packets sent using route */ | |
76 | struct ifnet *rt_ifp; /* interface to give packet to */ | |
77 | }; | |
78 | .DE | |
79 | .PP | |
80 | The routing information is organized in two separate tables, one | |
81 | for routes to a host and one for routes to a network. The | |
82 | distinction between hosts and networks is necessary so | |
83 | that a single mechanism may be used | |
84 | for both broadcast and multi-drop type networks, and | |
85 | also for networks built from point-to-point links (e.g., | |
86 | DECnet [DEC80]). | |
87 | .PP | |
88 | Each table is organized as a hashed set of linked lists. | |
89 | Two 32-bit hash values are calculated by routines defined for | |
90 | each address family; one based on the destination being | |
91 | a host, and one assuming the target is the network portion | |
92 | of the address. Each hash value is used to | |
93 | locate a hash chain to search (by taking the value modulo the | |
94 | hash table size) and the entire 32-bit value is then | |
95 | used as a key in scanning the list of routes. Lookups are | |
96 | applied first to the routing | |
97 | table for hosts, then to the routing table for networks. | |
98 | If both lookups fail, a final lookup is made for a ``wildcard'' | |
99 | route (by convention, network 0). | |
100 | The first appropriate route discovered is used. | |
101 | By doing this, routes to a specific host on a network may be | |
102 | present as well as routes to the network. This also allows a | |
103 | ``fall back'' network route to be defined to a ``smart'' gateway | |
104 | which may then perform more intelligent routing. | |
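The lookup order described above can be sketched as follows. This is a minimal illustration rather than the kernel's code: addresses are reduced to 32-bit integers, the top 8 bits are assumed to name the network, and every identifier beginning with sk_ (along with SK_HASHSIZ and SK_NETMASK) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_HASHSIZ 8                 /* hypothetical hash table size */
#define SK_NETMASK 0xff000000u       /* assumption: top 8 bits name the network */

struct sk_route {
    uint32_t hash;                   /* full 32-bit hash key */
    uint32_t dst;                    /* destination host or network */
    uint32_t gateway;                /* forwarding agent */
    struct sk_route *next;           /* next route on the same hash chain */
};

struct sk_route *sk_host_tab[SK_HASHSIZ];   /* routes to hosts */
struct sk_route *sk_net_tab[SK_HASHSIZ];    /* routes to networks */

/* Stand-ins for the per-address-family hash routines. */
uint32_t sk_hash_host(uint32_t addr) { return addr; }
uint32_t sk_hash_net(uint32_t addr)  { return addr & SK_NETMASK; }

/* Link a route onto the chain selected by hash modulo the table size. */
void sk_insert(struct sk_route **tab, struct sk_route *rt, uint32_t hash)
{
    assert(rt != NULL);
    rt->hash = hash;
    rt->next = tab[hash % SK_HASHSIZ];
    tab[hash % SK_HASHSIZ] = rt;
}

/* Scan one chain, using the entire 32-bit hash as the first comparison key. */
struct sk_route *sk_scan(struct sk_route **tab, uint32_t hash, uint32_t key)
{
    struct sk_route *rt;

    for (rt = tab[hash % SK_HASHSIZ]; rt != NULL; rt = rt->next)
        if (rt->hash == hash && rt->dst == key)
            return rt;
    return NULL;
}

/* Host table first, then network table, then the wildcard (network 0). */
struct sk_route *sk_lookup(uint32_t dst)
{
    struct sk_route *rt = sk_scan(sk_host_tab, sk_hash_host(dst), dst);

    if (rt == NULL)
        rt = sk_scan(sk_net_tab, sk_hash_net(dst), dst & SK_NETMASK);
    if (rt == NULL)
        rt = sk_scan(sk_net_tab, sk_hash_net(0), 0);
    return rt;
}
```

In this sketch a host route shadows the route to its network, and the wildcard entry is reached only when both earlier scans fail.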
105 | .PP | |
106 | Each routing table entry contains a destination (the desired final destination), | |
107 | a gateway to which to send the packet, | |
108 | and various flags which indicate the route's status and type (host or | |
109 | network). A count | |
110 | of the number of packets sent using the route is kept, along | |
111 | with a count of ``held references'' to the dynamically | |
112 | allocated structure to ensure that memory reclamation | |
113 | occurs only when the route is not in use. Finally, a pointer to | |
114 | a network interface is kept; packets sent using | |
115 | the route should be handed to this interface. | |
116 | .PP | |
117 | Routes are typed in two ways: either as host or network, and as | |
118 | ``direct'' or ``indirect''. The host/network | |
119 | distinction determines how to compare the \fIrt_dst\fP field | |
120 | during lookup. If the route is to a network, only a packet's | |
121 | destination network is compared to the \fIrt_dst\fP entry stored | |
122 | in the table. If the route is to a host, the addresses must | |
123 | match bit for bit. | |
124 | .PP | |
125 | The distinction between ``direct'' and ``indirect'' routes indicates | |
126 | whether the destination is directly connected to the source. | |
127 | This is needed when performing local network encapsulation. If | |
128 | a packet is destined for a peer at a host or network which is | |
129 | not directly connected to the source, the internetwork packet | |
130 | header will | |
131 | contain the address of the eventual destination, while | |
132 | the local network header will address the intervening | |
133 | gateway. Should the destination be directly connected, these addresses | |
134 | are likely to be identical, or a mapping between the two exists. | |
135 | The RTF_GATEWAY flag indicates that the route is to an ``indirect'' | |
136 | gateway agent, and that the local network header should be filled in | |
137 | from the \fIrt_gateway\fP field instead of | |
138 | from the final internetwork destination address. | |
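The effect of the RTF_GATEWAY flag on local network encapsulation might be sketched as below. The structure, the function name sk_link_target, and the flag value are illustrative assumptions; addresses are again reduced to 32-bit integers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_RTF_GATEWAY 0x2           /* illustrative flag value */

struct sk_hop_rt {
    uint32_t dst;                    /* destination net or host */
    uint32_t gateway;                /* forwarding agent */
    short flags;                     /* SK_RTF_GATEWAY set for indirect routes */
};

/*
 * Return the address the local network header should carry.
 * The internetwork header always carries final_dst; only the
 * local network header changes for an indirect route.
 */
uint32_t sk_link_target(const struct sk_hop_rt *rt, uint32_t final_dst)
{
    assert(rt != NULL);
    if (rt->flags & SK_RTF_GATEWAY)
        return rt->gateway;          /* indirect: address the gateway */
    return final_dst;                /* direct: address the destination */
}
```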
139 | .PP | |
140 | It is assumed that multiple routes to the same destination will not | |
141 | be present; only one of multiple routes, the one most recently installed, | |
142 | will be used. | |
143 | .PP | |
144 | Routing redirect control messages are used to dynamically | |
145 | modify existing routing table entries as well as dynamically | |
146 | create new routing table entries. On hosts where exhaustive | |
147 | routing information is too expensive to maintain (e.g. work | |
148 | stations), the | |
149 | combination of wildcard routing entries and routing redirect | |
150 | messages can be used to provide a simple routing management | |
151 | scheme without the use of a higher level policy process. | |
152 | Current connections may be rerouted after notification of the protocols | |
153 | by means of their \fIpr_ctlinput\fP entries. | |
154 | Statistics are kept by the routing table routines | |
155 | on the use of routing redirect messages and their | |
156 | effect on the routing tables. These statistics may be viewed using | |
78ed81a3 | 157 | .I netstat (1). |
15637ed4 RG |
158 | .PP |
159 | Status information other than routing redirect control messages | |
160 | may be used in the future, but at present it is ignored. | |
161 | Likewise, more intelligent ``metrics'' may be used to describe | |
162 | routes in the future, possibly based on bandwidth and monetary | |
163 | costs. | |
164 | .NH 2 | |
165 | Routing table interface | |
166 | .PP | |
167 | A protocol accesses the routing tables through | |
168 | three routines, | |
169 | one to allocate a route, one to free a route, and one | |
170 | to process a routing redirect control message. | |
171 | The routine \fIrtalloc\fP performs route allocation; it is | |
172 | called with a pointer to the following structure containing | |
173 | the desired destination: | |
174 | .DS | |
175 | ._f | |
176 | struct route { | |
177 | struct rtentry *ro_rt; | |
178 | struct sockaddr ro_dst; | |
179 | }; | |
180 | .DE | |
181 | The route returned is assumed ``held'' by the caller until | |
182 | released with an \fIrtfree\fP call. Protocols which implement | |
183 | virtual circuits, such as TCP, hold onto routes for the duration | |
184 | of the circuit's lifetime, while connection-less protocols, | |
185 | such as UDP, allocate and free routes whenever their destination address | |
186 | changes. | |
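The ``held reference'' discipline can be illustrated with a small sketch. It is not the kernel's rtalloc or rtfree: the table is a fixed array, reclamation is represented by setting a flag, and all names beginning with sk_ are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_NROUTES 4                 /* hypothetical table size */

struct sk_rtentry {
    uint32_t dst;                    /* destination */
    short refcnt;                    /* no. of held references */
    int up;                          /* still present in the table */
    int reclaimed;                   /* stand-in for freeing the memory */
};

struct sk_roref {                    /* plays the role of struct route */
    struct sk_rtentry *ro_rt;
    uint32_t ro_dst;
};

struct sk_rtentry sk_routes[SK_NROUTES];

/* Find a route to ro->ro_dst and hold a reference on it. */
void sk_rtalloc(struct sk_roref *ro)
{
    int i;

    assert(ro != NULL);
    ro->ro_rt = NULL;
    for (i = 0; i < SK_NROUTES; i++) {
        if (sk_routes[i].up && sk_routes[i].dst == ro->ro_dst) {
            sk_routes[i].refcnt++;
            ro->ro_rt = &sk_routes[i];
            return;
        }
    }
}

/* Drop a reference; reclaim only if the route was already deleted. */
void sk_rtfree(struct sk_rtentry *rt)
{
    assert(rt != NULL && rt->refcnt > 0);
    rt->refcnt--;
    if (rt->refcnt == 0 && !rt->up)
        rt->reclaimed = 1;
}

/* Remove a route from use; reclamation waits for the last holder. */
void sk_rtdelete(struct sk_rtentry *rt)
{
    assert(rt != NULL);
    rt->up = 0;
    if (rt->refcnt == 0)
        rt->reclaimed = 1;
}
```

Under these assumptions, a virtual-circuit protocol would call sk_rtalloc once when the circuit is established and sk_rtfree when it closes, so a route deleted in the meantime stays valid until the circuit releases it.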
187 | .PP | |
188 | The routine \fIrtredirect\fP is called to process a routing redirect | |
189 | control message. It is called with a destination address, | |
190 | the new gateway to that destination, and the source of the redirect. | |
191 | Redirects are accepted only from the current router for the destination. | |
192 | If a non-wildcard route | |
193 | exists to the destination, the gateway entry in the route is modified | |
194 | to point at the new gateway supplied. Otherwise, a new routing | |
195 | table entry is inserted reflecting the information supplied. Routes | |
196 | to interfaces and routes to gateways which are not directly accessible | |
197 | from the host are ignored. | |
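The redirect rules above can be sketched as follows. This is a simplified model, not the kernel's rtredirect: the table is a flat array, a gateway counts as directly accessible when it lies on the single assumed local network sk_local_net, and the check for routes to interfaces is omitted. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_MAXRT   8                 /* hypothetical table size */
#define SK_NETMASK 0xff000000u       /* assumption: top 8 bits name the network */

struct sk_rd_rt {
    uint32_t dst;                    /* destination (unused for the wildcard) */
    uint32_t gateway;                /* current router for dst */
    int wildcard;                    /* nonzero for the wildcard route */
    int used;                        /* slot holds a route */
};

struct sk_rd_rt sk_tab[SK_MAXRT];
uint32_t sk_local_net = 0x0a000000u; /* assumed directly connected network */

/* Prefer an exact route to dst; otherwise fall back to the wildcard. */
struct sk_rd_rt *sk_find(uint32_t dst)
{
    int i;

    for (i = 0; i < SK_MAXRT; i++)
        if (sk_tab[i].used && !sk_tab[i].wildcard && sk_tab[i].dst == dst)
            return &sk_tab[i];
    for (i = 0; i < SK_MAXRT; i++)
        if (sk_tab[i].used && sk_tab[i].wildcard)
            return &sk_tab[i];
    return NULL;
}

/* Returns 0 if the redirect was applied, -1 if it was ignored. */
int sk_redirect(uint32_t dst, uint32_t gateway, uint32_t src)
{
    struct sk_rd_rt *rt = sk_find(dst);
    int i;

    if (rt == NULL || rt->gateway != src)
        return -1;                   /* not from the current router */
    if ((gateway & SK_NETMASK) != sk_local_net)
        return -1;                   /* new gateway not directly accessible */
    if (!rt->wildcard) {
        rt->gateway = gateway;       /* modify the existing route */
        return 0;
    }
    for (i = 0; i < SK_MAXRT; i++) { /* only the wildcard matched: add a route */
        if (!sk_tab[i].used) {
            sk_tab[i].dst = dst;
            sk_tab[i].gateway = gateway;
            sk_tab[i].wildcard = 0;
            sk_tab[i].used = 1;
            assert(sk_find(dst) == &sk_tab[i]);  /* now shadows the wildcard */
            return 0;
        }
    }
    return -1;                       /* table full */
}
```

The wildcard entry itself is left untouched in this sketch, so destinations that have not been redirected continue to use the original gateway.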
198 | .NH 2 | |
199 | User level routing policies | |
200 | .PP | |
201 | Routing policies implemented in user processes manipulate the | |
202 | kernel routing tables through two \fIioctl\fP calls. The | |
203 | commands SIOCADDRT and SIOCDELRT add and delete routing entries, | |
204 | respectively; the tables are read through the /dev/kmem device. | |
205 | The choice to place policy decisions in a user process implies | |
206 | that routing table updates may lag a bit behind the identification of | |
207 | new routes, or the failure of existing routes, but this period | |
208 | of instability is normally very small with proper implementation | |
209 | of the routing process. Advisory information, such as ICMP | |
210 | error messages and IMP diagnostic messages, may be read from | |
211 | raw sockets (described in the next section). | |
212 | .PP | |
213 | Several routing policy processes have already been implemented. The | |
214 | system standard | |
215 | ``routing daemon'' uses a variant of the Xerox NS Routing Information | |
216 | Protocol [Xerox82] to maintain up-to-date routing tables in our local | |
217 | environment. Interaction with other existing routing protocols, | |
218 | such as the Internet EGP (Exterior Gateway Protocol), has been | |
219 | accomplished using a similar process. |