.\" Copyright (c) 1983, 1986 The Regents of the University of California.
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. All advertising materials mentioning features or use of this software
.\"    must display the following acknowledgement:
.\"	This product includes software developed by the University of
.\"	California, Berkeley and its contributors.
.\" 4. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"	@(#)a.t	6.5 (Berkeley) 4/17/91
.\"
.nr H2 1
.\".ds RH "Gateways and routing
.br
.ne 2i
.NH
\s+2Gateways and routing issues\s0
.PP
The system has been designed with the expectation that it will
be used in an internetwork environment.  The ``canonical''
environment was envisioned to be a collection of local area
networks connected at one or more points through hosts with
multiple network interfaces (one on each local area network),
and possibly a connection to a long haul network (for example,
the ARPANET).  In such an environment, issues of
gatewaying and packet routing become very important.  Certain
of these issues, such as congestion
control, have been handled in a simplistic manner or specifically
not addressed.
Instead, where possible, the network system
attempts to provide simple mechanisms upon which more involved
policies may be implemented.  As some of these problems become
better understood, the solutions developed will be incorporated
into the system.
.PP
This section will describe the facilities provided for packet
routing.  The simplistic mechanisms provided for congestion
control are described in chapter 12.
.NH 2
Routing tables
.PP
The network system maintains a set of routing tables for
selecting a network interface to use in delivering a
packet to its destination.  These tables are of the form:
.DS
.ta \w'struct 'u +\w'u_long 'u +\w'sockaddr rt_gateway; 'u
struct rtentry {
	u_long	rt_hash;	/* hash key for lookups */
	struct	sockaddr rt_dst;	/* destination net or host */
	struct	sockaddr rt_gateway;	/* forwarding agent */
	short	rt_flags;	/* see below */
	short	rt_refcnt;	/* no. of references to structure */
	u_long	rt_use;	/* packets sent using route */
	struct	ifnet *rt_ifp;	/* interface to give packet to */
};
.DE
.PP
The routing information is organized in two separate tables, one
for routes to a host and one for routes to a network.  The
distinction between hosts and networks is necessary so
that a single mechanism may be used
for both broadcast and multi-drop type networks, and
also for networks built from point-to-point links (e.g.,
DECnet [DEC80]).
.PP
Each table is organized as a hashed set of linked lists.
Two 32-bit hash values are calculated by routines defined for
each address family; one based on the destination being
a host, and one assuming the target is the network portion
of the address.  Each hash value is used to
locate a hash chain to search (by taking the value modulo the
hash table size) and the entire 32-bit value is then
used as a key in scanning the list of routes.  Lookups are
applied first to the routing
table for hosts, then to the routing table for networks.
If both lookups fail, a final lookup is made for a ``wildcard''
route (by convention, network 0).
The first appropriate route discovered is used.
By doing this, routes to a specific host on a network may be
present as well as routes to the network.  This also allows a
``fall back'' network route to be defined to a ``smart'' gateway
which may then perform more intelligent routing.
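.PP
The lookup order just described can be sketched as follows.
This is a simplified model, not the kernel code: addresses are plain
32-bit integers instead of \fIsockaddr\fP structures, the network part is
assumed to be the top 24 bits, and every \fIsk_\fP name, the table size
and the flag value are chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_RTHASHSIZ 8          /* hypothetical hash table size */
#define SK_RTF_HOST  0x4        /* illustrative flag value */

struct sk_rtentry {
    uint32_t hash;              /* full 32-bit hash, used as the scan key */
    uint32_t dst;               /* destination host or network */
    int flags;                  /* SK_RTF_HOST for host routes */
    struct sk_rtentry *next;    /* hash chain link */
};

static struct sk_rtentry *sk_rthost[SK_RTHASHSIZ];  /* routes to hosts */
static struct sk_rtentry *sk_rtnet[SK_RTHASHSIZ];   /* routes to networks */

/* Stand-ins for the per-address-family hash routines (assumed). */
static uint32_t sk_netof(uint32_t addr)    { return addr & 0xffffff00u; }
static uint32_t sk_hosthash(uint32_t addr) { return addr; }
static uint32_t sk_nethash(uint32_t addr)  { return sk_netof(addr); }

/* Insert at the head of the chain, so the newest route is found first. */
void sk_rtinsert(struct sk_rtentry *rt)
{
    struct sk_rtentry **table;

    assert(rt != NULL);
    if (rt->flags & SK_RTF_HOST) {
        table = sk_rthost;
        rt->hash = sk_hosthash(rt->dst);
    } else {
        table = sk_rtnet;
        rt->hash = sk_nethash(rt->dst);
    }
    rt->next = table[rt->hash % SK_RTHASHSIZ];
    table[rt->hash % SK_RTHASHSIZ] = rt;
}

/* Pick the chain by hash modulo table size, then compare full hash and key. */
static struct sk_rtentry *sk_scan(struct sk_rtentry **table,
                                  uint32_t hash, uint32_t key)
{
    struct sk_rtentry *rt;

    for (rt = table[hash % SK_RTHASHSIZ]; rt != NULL; rt = rt->next)
        if (rt->hash == hash && rt->dst == key)
            return rt;
    return NULL;
}

/* Host table first, then network table, then the wildcard (network 0). */
struct sk_rtentry *sk_rtlookup(uint32_t dst)
{
    struct sk_rtentry *rt;

    if ((rt = sk_scan(sk_rthost, sk_hosthash(dst), dst)) != NULL)
        return rt;
    if ((rt = sk_scan(sk_rtnet, sk_nethash(dst), sk_netof(dst))) != NULL)
        return rt;
    return sk_scan(sk_rtnet, sk_nethash(0), 0);
}
```

With a host route, a network route and a wildcard installed, a lookup for
the host returns the host route, a lookup for another host on the same
network returns the network route, and any other destination falls back
to the wildcard.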
.PP
Each routing table entry contains a destination (the desired final destination),
a gateway to which to send the packet,
and various flags which indicate the route's status and type (host or
network).  A count
of the number of packets sent using the route is kept, along
with a count of ``held references'' to the dynamically
allocated structure to ensure that memory reclamation
occurs only when the route is not in use.  Finally, a pointer to
a network interface is kept; packets sent using
the route should be handed to this interface.
.PP
Routes are typed in two ways: either as host or network, and as
``direct'' or ``indirect''.  The host/network
distinction determines how to compare the \fIrt_dst\fP field
during lookup.  If the route is to a network, only a packet's
destination network is compared to the \fIrt_dst\fP entry stored
in the table.  If the route is to a host, the addresses must
match bit for bit.
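.PP
A minimal sketch of this comparison, assuming 32-bit addresses whose top
24 bits name the network; \fIsk_rtmatch\fP, \fIsk_netof\fP and the flag
value are hypothetical names used only for illustration:

```c
#include <assert.h>
#include <stdint.h>

#define SK_RTF_HOST 0x4   /* illustrative flag value */

/* Assumed address layout: the top 24 bits name the network. */
static uint32_t sk_netof(uint32_t addr) { return addr & 0xffffff00u; }

/* Return nonzero if a route with destination rt_dst applies to pkt_dst. */
int sk_rtmatch(uint32_t rt_dst, int rt_flags, uint32_t pkt_dst)
{
    if (rt_flags & SK_RTF_HOST)
        return rt_dst == pkt_dst;          /* host route: bit for bit */
    assert(rt_dst == sk_netof(rt_dst));    /* network routes carry no host bits */
    return rt_dst == sk_netof(pkt_dst);    /* network route: network part only */
}
```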
.PP
The distinction between ``direct'' and ``indirect'' routes indicates
whether the destination is directly connected to the source.
This is needed when performing local network encapsulation.  If
a packet is destined for a peer at a host or network which is
not directly connected to the source, the internetwork packet
header will
contain the address of the eventual destination, while
the local network header will address the intervening
gateway.  Should the destination be directly connected, these addresses
are likely to be identical, or a mapping between the two exists.
The RTF_GATEWAY flag indicates that the route is to an ``indirect''
gateway agent, and that the local network header should be filled in
from the \fIrt_gateway\fP field instead of
from the final internetwork destination address.
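.PP
The choice of address for the local network header might be sketched as
below.  The structure is a cut-down stand-in for \fIrtentry\fP with 32-bit
addresses, and \fIsk_first_hop\fP is a hypothetical helper, not a kernel
routine; the flag values are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_RTF_UP      0x1   /* illustrative flag values */
#define SK_RTF_GATEWAY 0x2

struct sk_rtentry {
    uint32_t dst;        /* destination net or host */
    uint32_t gateway;    /* forwarding agent */
    int flags;
};

/*
 * Address to place in the local network header: the gateway for an
 * indirect route, otherwise the final internetwork destination.
 */
uint32_t sk_first_hop(const struct sk_rtentry *rt, uint32_t final_dst)
{
    assert(rt != NULL);
    if (rt->flags & SK_RTF_GATEWAY)
        return rt->gateway;
    return final_dst;
}
```

In both cases the internetwork header still carries the final
destination; only the local network header differs.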
.PP
It is assumed that multiple routes to the same destination will not
be present; if several are installed, only the one most recently installed
will be used.
.PP
Routing redirect control messages are used to dynamically
modify existing routing table entries as well as dynamically
create new routing table entries.  On hosts where exhaustive
routing information is too expensive to maintain (e.g., workstations),
the
combination of wildcard routing entries and routing redirect
messages can be used to provide a simple routing management
scheme without the use of a higher level policy process.
Current connections may be rerouted after notification of the protocols
by means of their \fIpr_ctlinput\fP entries.
Statistics are kept by the routing table routines
on the use of routing redirect messages and their
effect on the routing tables.  These statistics may be viewed using
.I netstat (1).
.PP
Status information other than routing redirect control messages
may be used in the future, but at present it is ignored.
Likewise, more intelligent ``metrics'' may be used to describe
routes in the future, possibly based on bandwidth and monetary
costs.
.NH 2
Routing table interface
.PP
A protocol accesses the routing tables through
three routines,
one to allocate a route, one to free a route, and one
to process a routing redirect control message.
The routine \fIrtalloc\fP performs route allocation; it is
called with a pointer to the following structure containing
the desired destination:
.DS
._f
struct route {
	struct	rtentry *ro_rt;
	struct	sockaddr ro_dst;
};
.DE
The route returned is assumed ``held'' by the caller until
released with an \fIrtfree\fP call.  Protocols which implement
virtual circuits, such as TCP, hold onto routes for the duration
of the circuit's lifetime, while connection-less protocols,
such as UDP, allocate and free routes whenever their destination address
changes.
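.PP
The ``held'' discipline can be illustrated with a small model of the
reference count.  This is a sketch under assumptions, not the kernel's
\fIrtalloc\fP and \fIrtfree\fP: the \fIsk_\fP names are hypothetical,
the \fIup\fP field stands in for a route still being present in the
tables, and reclamation is only counted rather than performed.

```c
#include <assert.h>
#include <stddef.h>

struct sk_rtentry {
    short refcnt;               /* no. of held references */
    int up;                     /* nonzero while the route is in the tables */
};

struct sk_route {
    struct sk_rtentry *ro_rt;   /* held route, or NULL */
};

int sk_reclaimed;               /* entries whose memory would have been freed */

/* Take a reference on rt and remember it in the caller's route structure. */
void sk_rthold(struct sk_route *ro, struct sk_rtentry *rt)
{
    assert(ro != NULL && rt != NULL);
    rt->refcnt++;
    ro->ro_rt = rt;
}

/* Drop the reference; reclaim only when unreferenced and out of the tables. */
void sk_rtrelease(struct sk_route *ro)
{
    struct sk_rtentry *rt;

    assert(ro != NULL);
    rt = ro->ro_rt;
    if (rt == NULL)
        return;
    assert(rt->refcnt > 0);
    ro->ro_rt = NULL;
    if (--rt->refcnt == 0 && !rt->up)
        sk_reclaimed++;         /* a real implementation would free rt here */
}
```

A virtual circuit protocol would call the hold routine once when the
circuit is set up and the release routine when it is torn down; a
connection-less protocol would release and re-acquire whenever the
destination address changes.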
.PP
The routine \fIrtredirect\fP is called to process a routing redirect
control message.  It is called with a destination address,
the new gateway to that destination, and the source of the redirect.
Redirects are accepted only from the current router for the destination.
If a non-wildcard route
exists to the destination, the gateway entry in the route is modified
to point at the new gateway supplied.  Otherwise, a new routing
table entry is inserted reflecting the information supplied.  Routes
to interfaces and routes to gateways which are not directly accessible
from the host are ignored.
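.PP
The rules for accepting a redirect can be sketched as follows.
The model makes several assumptions that are not taken from the kernel
source: a small array replaces the hashed tables, addresses are 32-bit
integers with a 24-bit network part, the host has a single interface on
\fIsk_local_net\fP, and a route created by a redirect is entered as a
host route.  All \fIsk_\fP names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_MAXRT       8
#define SK_RTF_GATEWAY 0x2      /* illustrative flag values */
#define SK_RTF_HOST    0x4

struct sk_rtentry { uint32_t dst; uint32_t gateway; int flags; };

struct sk_rtentry sk_table[SK_MAXRT];
int sk_nroutes;
uint32_t sk_local_net = 0x0a000100u;   /* assumed directly attached network */

static uint32_t sk_netof(uint32_t a) { return a & 0xffffff00u; }

/* Host routes first, then network routes, then (optionally) the wildcard. */
static struct sk_rtentry *sk_find(uint32_t dst, int allow_wildcard)
{
    int i;

    for (i = 0; i < sk_nroutes; i++)
        if ((sk_table[i].flags & SK_RTF_HOST) && sk_table[i].dst == dst)
            return &sk_table[i];
    for (i = 0; i < sk_nroutes; i++)
        if (!(sk_table[i].flags & SK_RTF_HOST) && sk_table[i].dst != 0 &&
            sk_table[i].dst == sk_netof(dst))
            return &sk_table[i];
    for (i = 0; allow_wildcard && i < sk_nroutes; i++)
        if (!(sk_table[i].flags & SK_RTF_HOST) && sk_table[i].dst == 0)
            return &sk_table[i];
    return NULL;
}

/* Returns 0 if the redirect was applied, -1 if it was ignored. */
int sk_rtredirect(uint32_t dst, uint32_t new_gw, uint32_t src)
{
    struct sk_rtentry *cur, *rt;

    assert(sk_nroutes >= 0 && sk_nroutes <= SK_MAXRT);
    cur = sk_find(dst, 1);
    if (cur == NULL || cur->gateway != src)
        return -1;              /* not from the current router */
    if (!(cur->flags & SK_RTF_GATEWAY))
        return -1;              /* route to an interface */
    if (sk_netof(new_gw) != sk_local_net)
        return -1;              /* new gateway not directly accessible */
    rt = sk_find(dst, 0);
    if (rt != NULL) {
        rt->gateway = new_gw;   /* modify the existing non-wildcard route */
        return 0;
    }
    if (sk_nroutes == SK_MAXRT)
        return -1;              /* table full in this toy model */
    rt = &sk_table[sk_nroutes++];
    rt->dst = dst;
    rt->gateway = new_gw;
    rt->flags = SK_RTF_GATEWAY | SK_RTF_HOST;
    return 0;
}
```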
.NH 2
User level routing policies
.PP
Routing policies implemented in user processes manipulate the
kernel routing tables through two \fIioctl\fP calls.  The
commands SIOCADDRT and SIOCDELRT add and delete routing entries,
respectively; the tables are read through the /dev/kmem device.
The decision to place policy decisions in a user process implies
that routing table updates may lag a bit behind the identification of
new routes, or the failure of existing routes, but this period
of instability is normally very small with proper implementation
of the routing process.  Advisory information, such as ICMP
error messages and IMP diagnostic messages, may be read from
raw sockets (described in the next section).
.PP
Several routing policy processes have already been implemented.  The
system standard
``routing daemon'' uses a variant of the Xerox NS Routing Information
Protocol [Xerox82] to maintain up-to-date routing tables in our local
environment.  Interaction with other existing routing protocols,
such as the Internet EGP (Exterior Gateway Protocol), has been
accomplished using a similar process.