+ /*
+ * Advance to a longword boundary so that the memory
+ * operations below are longword aligned. Byte adjustment
+ * is too hard, so only word (16 bit) adjustment is done.
+ */
+ if (((int)w&0x2) && mlen >= 2) {
+ sum += *w++;
+ mlen -= 2;
+ }
+ /*
+ * Do as much of the checksum as possible 32 bits at a time.
+ * In fact, this loop is unrolled to make overhead from
+ * branches etc. small.
+ *
+ * We can do a 16 bit ones complement sum 32 bits at a time
+ * because the 32 bit register is acting as two 16 bit
+ * registers for adding, with carries from the low added
+ * into the high (by normal carry-chaining) and carries
+ * from the high carried into the low on the next word
+ * by use of the adwc instruction. This lets us run
+ * this loop at almost memory speed.
+ *
+ * Here there is the danger of high order carry out, and
+ * we carefully use adwc.
+ */