.TH ENET 4 "17 January 1986" Stanford .SH "NAME" enet \- ethernet packet filter .SH SYNOPSIS .B "pseudo-device enetfilter 64" .SH "DESCRIPTION" The packet filter provides a raw interface to Ethernets and similar network data link layers. Packets received that are not used by the kernel (i.e., to support IP, ARP, and on some systems XNS, protocols) are available through this mechanism. The packet filter appears as a set of character special files, one per hardware interface. Each .I enet file may be opened multiple times, allowing each interface to be used by many processes. The total number of open ethernet files is limited to the value given in the kernel configuration; the example given in the SYNOPSIS above sets the limit to 64. .PP The minor device numbers are associated with interfaces when the system is booted. Minor device 0 is associated with the first Ethernet interface ``attached'', minor device 1 with the second, and so forth. (These character special files are, for historical reasons, given the names .IR /dev/enet0 , .IR /dev/eneta0 , .IR /dev/enetb0 , etc.) .PP Associated with each open instance of an .I enet file is a user-settable packet filter which is used to deliver incoming ethernet packets to the appropriate process. Whenever a packet is received from the net, successive packet filters from the list of filters for all open .I enet files are applied to the packet. When a filter accepts the packet, it is placed on the packet input queue of the associated file. If no filters accept the packet, it is discarded. The format of a packet filter is described below. .PP Reads from these files return the next packet from a queue of packets that have matched the filter. If insufficient buffer space to store the entire packet is specified in the read, the packet will be truncated and the trailing contents lost. Writes to these devices transmit packets on the network, with each write generating exactly one packet. .PP The packet filter currently supports a variety of different ``Ethernet'' data-link levels: .IP "3mb Ethernet" 1.5i packets consist of 4 or more bytes with the first byte specifying the source ethernet address, the second byte specifying the destination ethernet address, and the next two bytes specifying the packet type. (Actually, on the network the source and destination addresses are in the opposite order.) .IP "byte-swapping 3mb Ethernet" 1.5i packets consist of 4 or more bytes with the first byte specifying the source ethernet address, the second byte specifying the destination ethernet address, and the next two bytes specifying the packet type. \fBEach short word (pair of bytes) is swapped from the network byte order\fR; this device type is only provided as a concession to backwards-compatibility. .IP "10mb Ethernet" 1.5i packets consist of 14 or more bytes with the first six bytes specifying the destination ethernet address, the next six bytes the source ethernet address, and the next two bytes specifying the packet type. .PP The remaining words are interpreted according to the packet type. Note that 16-bit and 32-bit quantities may have to be byteswapped (and possible short-swapped) to be intelligible on a Vax. .PP The packet filter mechanism does not know anything about the data portion of the packets it sends and receives. The user must supply the headers for transmitted packets (although the system makes sure that the source address is correct) and the headers of received packets are delivered to the user. The packet filters treat the entire packet, including headers, as uninterpreted data. .SH "IOCTL CALLS" In addition to FIONREAD, ten special .I ioctl calls may be applied to an open .I enet file. The first two set and fetch parameters for the file and are of the form: .sp .nf .RS .B #include .B #include .B ioctl(fildes, code, param) .B struct eniocb *param; .RE .fi .sp where .I param is defined in .I as: .br .sp .nf .RS .ta \w'struct 'u +\w'u_char 'u .ft B struct eniocb { u_char en_addr; u_char en_maxfilters; u_char en_maxwaiting; u_char en_maxpriority; long en_rtout; }; .DT .RE .fi .ft R .sp with the applicable codes being: .TP EIOCGETP Fetch the parameters for this file. .TP EIOCSETP Set the parameters for this file. .i0 .DT .PP The maximum filter length parameter en_maxfilters indicates the maximum possible packet filter command list length (see EIOCSETF below). The maximum input wait queue size parameter en_maxwaitingindicates the maximum number of packets which may be queued for an ethernet file at one time (see EIOCSETW below). The maximum priority parameter en_maxpriority indicates the highest filter priority which may be set for the file (see EIOCSETF below). The en_addr field is no longer maintained by the driver; see EIOCDEVP below. .PP The read timeout parameter en_rtout specifies the number of clock ticks to wait before timing out on a read request and returning an EOF. This parameter is initialized to zero by .IR open (2), indicating no timeout. If it is negative, then read requests will return an EOF immediately if there are no packets in the input queue. (Note that all parameters except for the read timeout are read-only and are ignored when changed.) .PP A different .I ioctl is used to get device parameters of the ethernet underlying the minor device. It is of the form: .sp .nf .RS .B #include .br .B #include .B ioctl(fildes, EIOCDEVP, param) .RE .fi .sp where .I param is defined in .I as: .br .sp .nf .RS .ta \w'struct 'u +\w'u_short 'u .ft B struct endevp { u_char end_dev_type; u_char end_addr_len; u_short end_hdr_len; u_short end_MTU; u_char end_addr[EN_MAX_ADDR_LEN]; u_char end_broadaddr[EN_MAX_ADDR_LEN]; }; .DT .RE .fi .ft R .sp The fields are: .IP end_dev_type 1.5i Specifies the device type; currently one of ENDT_3MB, ENDT_BS3MB or ENDT_10MB. .IP end_addr_len 1.5i Specifies the address length in bytes (e.g., 1 or 6). .IP end_hdr_len 1.5i Specifies the total header length in bytes (e.g., 4 or 14). .IP end_MTU 1.5i Specifies the maximum packet size, including header, in bytes. .IP end_addr 1.5i The address of this interface; aligned so that the low order byte of the address is the first byte in the array. .IP end_broadaddr 1.5i The hardware destination address for broadcasts on this network. .PP The next two calls enable and disable the input packet signal mechanism for the file and are of the form: .sp .nf .RS .B #include .B #include .B ioctl(fildes, code, signp) .B u_int *signp; .RE .fi .sp where .I signp is a pointer to a word containing the number of the signal to be sent when an input packet arrives and with the applicable codes being: .TP EIOCENBS Enable the specified signal when an input packet is received for this file. If the ENHOLDSIG flag (see EIOCMBIS below) is not set, further signals are automatically disabled whenever a signal is sent to prevent nesting and hence must be specifically re-enabled after processing. When a signal number of 0 is supplied, this call is equivalent to EIOCINHS. .TP EIOCINHS Disable any signal when an input packet is received for this file (the .I signp parameter is ignored). This is the default when the file is first opened. .i0 .DT .PP .sp The next two calls set and clear ``mode bits'' for the for the file and are of the form: .sp .nf .RS .B #include .B #include .B ioctl(fildes, code, bits) .B u_short *bits; .RE .fi .sp where .I bits is a short work bit-mask specifying which bits to set or clear. Currently, the only bit mask recognized is ENHOLDSIG, which (if .IR clear ) means that the driver should disable the effect of EIOCENBS once it has delivered a signal. Setting this bit means that you need use EIOCENBS only once. (For historical reasons, the default is that ENHOLDSIG is set.) The applicable codes are: .TP EIOCMBIS Sets the specified mode bits .TP EIOCMBIC Clears the specified mode bits .PP Another .I ioctl call is used to set the maximum size of the packet input queue for an open .I enet file. It is of the form: .sp .nf .RS .B #include .B #include .B ioctl(fildes, EIOCSETW, maxwaitingp) .B u_int *maxwaitingp; .RE .fi .sp where .I maxwaitingp is a pointer to a word containing the input queue size to be set. If this is greater than maximum allowable size (see EIOCGETP above), it is set to the maximum, and if it is zero, it is set to a default value. .sp Another .I ioctl call flushes the queue of incoming packets. It is of the form: .sp .nf .RS .B #include .B #include .B ioctl(fildes, EIOCFLUSH, 0) .RE .fi .sp The final .I ioctl call is used to set the packet filter for an open .I enet file. It is of the form: .sp .nf .RS .B #include .B #include .B ioctl(fildes, EIOCSETF, filter) .B struct enfilter *filter .RE .fi .sp where enfilter is defined in .I as: .sp .nf .RS .ft B .ta \w'struct 'u \w'struct u_short 'u struct enfilter { u_char enf_Priority; u_char enf_FilterLen; u_short enf_Filter[ENMAXFILTERS]; }; .DT .ft R .RE .fi .PP A packet filter consists of a priority, the filter command list length (in shortwords), and the filter command list itself. Each filter command list specifies a sequence of actions which operate on an internal stack. Each shortword of the command list specifies an action from the set { .B ENF_PUSHLIT, .B ENF_PUSHZERO, .B ENF_PUSHWORD+N } which respectively push the next shortword of the command list, zero, or shortword .B N of the incoming packet on the stack, and a binary operator from the set { .B ENF_EQ, .B ENF_NEQ, .B ENF_LT, .B ENF_LE, .B ENF_GT, .B ENF_GE, .B ENF_AND, .B ENF_OR, .B ENF_XOR } which then operates on the top two elements of the stack and replaces them with its result. When both an action and operator are specified in the same shortword, the action is performed followed by the operation. .PP The binary operator can also be from the set { .B ENF_COR, .B ENF_CAND, .B ENF_CNOR, .B ENF_CNAND }. These are ``short-circuit'' operators, in that they terminate the execution of the filter immediately if the condition they are checking for is found, and continue otherwise. All pop two elements from the stack and compare them for equality; .B ENF_CAND returns false if the result is false; .B ENF_COR returns true if the result is true; .B ENF_CNAND returns true if the result is false; .B ENF_CNOR returns false if the result is true. Unlike the other binary operators, these four do not leave a result on the stack, even if they continue. .PP The short-circuit operators should be used when possible, to reduce the amount of time spent evaluating filters. When they are used, you should also arrange the order of the tests so that the filter will succeed or fail as soon as possible; for example, checking the Socket field of a Pup packet is more likely to indicate failure than the packet type field. .PP The special action .B ENF_NOPUSH and the special operator .B ENF_NOP can be used to only perform the binary operation or to only push a value on the stack. Since both are (conveniently) defined to be zero, indicating only an action actually specifies the action followed by .BR ENF_NOP , and indicating only an operation actually specifies .B ENF_NOPUSH followed by the operation. .PP After executing the filter command list, a non-zero value (true) left on top of the stack (or an empty stack) causes the incoming packet to be accepted for the corresponding .I enet file and a zero value (false) causes the packet to be passed through the next packet filter. (If the filter exits as the result of a short-circuit operator, the top-of-stack value is ignored.) Specifying an undefined operation or action in the command list or performing an illegal operation or action (such as pushing a shortword offset past the end of the packet or executing a binary operator with fewer than two shortwords on the stack) causes a filter to reject the packet. .sp In an attempt to deal with the problem of overlapping and/or conflicting packet filters, the filters for each open .I enet file are ordered by the driver according to their priority (lowest priority is 0, highest is 255). When processing incoming ethernet packets, filters are applied according to their priority (from highest to lowest) and for identical priority values according to their relative ``busyness'' (the filter that has previously matched the most packets is checked first) until one or more filters accept the packet or all filters reject it and it is discarded. .PP Filters at a priority of 2 or higher are called "high priority" filters. Once a packet is delivered to one of these "high priority" .I enet files, no further filters are examined, i.e. the packet is delivered only to the first .I enet file with a "high priority" filter which accepts the packet. A packet may be delivered to more than one filter with a priority below 2; this might be useful, for example, in building replicated programs. However, the use of low-priority filters imposes an additional cost on the system, as these filters each must be checked against all packets not accepted by a high-priority filter. .PP The packet filter for an .I enet file is initialized with length 0 at priority 0 by .IR open (2), and hence by default accepts all packets which no "high priority" filter is interested in. .PP Priorities should be assigned so that, in general, the more packets a filter is expected to match, the higher its priority. This will prevent a lot of needless checking of packets against filters that aren't likely to match them. .SH "FILTER EXAMPLES" The following filter would accept all incoming .I Pup packets on a 3mb ethernet with Pup types in the range 1-0100: .sp .nf .ft B .ta \w'stru'u \w'struct ENF_PUSHWORD+8, ENF_PUSHLIT, 2, 'u struct enfilter f = { 10, 19, /* priority and length */ ENF_PUSHWORD+1, ENF_PUSHLIT, 2, ENF_EQ, /* packet type == PUP */ ENF_PUSHWORD+3, ENF_PUSHLIT, 0xFF00, ENF_AND, /* mask high byte */ ENF_PUSHZERO, ENF_GT, /* PupType > 0 */ ENF_PUSHWORD+3, ENF_PUSHLIT, 0xFF00, ENF_AND, /* mask high byte */ ENF_PUSHLIT, 0100, ENF_LE, /* PupType <= 0100 */ ENF_AND, /* 0 < PupType <= 0100 */ ENF_AND /* && packet type == PUP */ }; .DT .ft R .fi .sp Note that shortwords, such as the packet type field, are byte-swapped and so the literals you compare them to must be byte-swapped. Also, although for this example the word offsets are constants, code that must run with either 3mb or 10mb ethernets must use offsets that depend on the device type. .PP By taking advantage of the ability to specify both an action and operation in each word of the command list, the filter could be abbreviated to: .sp .nf .ta \w'stru'u \w'struct ENF_PUSHWORD+3, ENF_PUSHLIT | ENF_AND, 'u .ft B struct enfilter f = { 10, 14, /* priority and length */ ENF_PUSHWORD+1, ENF_PUSHLIT | ENF_EQ, 2, /* packet type == PUP */ ENF_PUSHWORD+3, ENF_PUSHLIT | ENF_AND, 0xFF00, /* mask high byte */ ENF_PUSHZERO | ENF_GT, /* PupType > 0 */ ENF_PUSHWORD+3, ENF_PUSHLIT | ENF_AND, 0xFF00, /* mask high byte */ ENF_PUSHLIT | ENF_LE, 0100, /* PupType <= 0100 */ ENF_AND, /* 0 < PupType <= 0100 */ ENF_AND /* && packet type == PUP */ }; .ft R .DT .fi .sp A different example shows the use of "short-circuit" operators to create a more efficient filter. This one accepts Pup packets (on a 3Mbit ethernet) with a Socket field of 12345. Note that we check the Socket field before the packet type field, since in most packets the Socket is not likely to match. .sp .nf .ta \w'stru'u \w'struct ENF_PUSHWORD+3, ENF_PUSHLIT | ENF_CAND, 'u .ft B struct enfilter f = { 10, 9, /* priority and length */ ENF_PUSHWORD+7, ENF_PUSHLIT | ENF_CAND, 0, /* High word of socket */ ENF_PUSHWORD+8, ENF_PUSHLIT | ENF_CAND, 12345, /* Low word of socket */ ENF_PUSHWORD+1, ENF_PUSHLIT | ENF_CAND, 2 /* packet type == Pup */ }; .ft R .DT .fi .SH "SEE ALSO" de(4), ec(4), en(4), il(4), enstat(8) .SH "FILES" /dev/enet{,a,b,c,...}0 .SH BUGS The current implementation can only filter on words within the first "mbuf" of the packet; this is around 100 bytes (or 50 words). .PP Because packets are streams of bytes, yet the filters operate on short words, and standard network byte order is usually opposite from Vax byte order, the relational operators .B ENF_LT, ENF_LE, .B ENF_GT, and .B ENF_GE are not all that useful. Fortunately, they were not often used when the packets were treated as streams of shorts, so this is probably not a severe problem. If this becomes a severe problem, a byte-swapping operator could be added. .PP Many of the "features" of this driver are there for historical reasons; the manual page could be a lot cleaner if these were left out. .SH "HISTORY" .TP 8-Oct-85 Jeffrey Mogul at Stanford University Revised to describe 4.3BSD version of driver. .TP 18-Oct-84 Jeffrey Mogul at Stanford University Added short-circuit operators, changed discussion of priorities to reflect new arrangement. .TP 18-Jan-84 Jeffrey Mogul at Stanford University Updated for 4.2BSD (device-independent) version, including documentation of all non-kernel ioctls. .TP 17-Nov-81 Mike Accetta (mja) at Carnegie-Mellon University Added mention of to include examples. .TP 29-Sep-81 Mike Accetta (mja) at Carnegie-Mellon University Changed to describe new EIOCSETW and EIOCFLUSH ioctl calls and the new multiple packet queuing features. .TP 12-Nov-80 Mike Accetta (mja) at Carnegie-Mellon University Added description of signal mechanism for input packets. .TP 07-Mar-80 Mike Accetta (mja) at Carnegie-Mellon University Created.