Commit | Line | Data |
---|---|---|
f9628029 KM |
1 | *************************************************************************** |
2 | * * | |
3 | * Copyright (c) 1985 Regents of the University of California. * | |
4 | * * | |
5 | * Use and reproduction of this software are granted in accordance with * | |
6 | * the terms and conditions specified in the Berkeley Software License * | |
7 | * Agreement (in particular, this entails acknowledgement of the programs' * | |
8 | * source, and inclusion of this notice) with the additional understanding * | |
9 | * that all recipients should regard themselves as participants in an * | |
10 | * ongoing research project and hence should feel obligated to report * | |
11 | * their experiences (good or bad) with these elementary function codes, * | |
12 | * using "sendbug 4bsd-bugs@BERKELEY", to the authors. * | |
13 | * * | |
14 | * K.C. Ng, with Z-S. Alex Liu, S. McDonald, P. Tang, W. Kahan. * | |
9c5c2d2a | 15 | * Revised on 5/10/85, 5/13/85, 6/14/85, 8/20/85, 8/27/85, 9/11/85. * |
f9628029 KM |
16 | * * |
17 | *************************************************************************** | |
89f89af7 | 18 | |
95f51977 | 19 | /* @(#)README 1.6 (Berkeley) 9/12/85 */ |
89f89af7 | 20 | |
f9628029 KM |
21 | NB. The machine-independent Version 7 math library found in 4.2BSD |
22 | is now /usr/lib/libom.a. To compile with those routines use -lom. | |
35d276a5 MAN |
23 | |
24 | ****************************************************************************** | |
25 | * This is a description of the upgraded elementary functions (listed in 1). * | |
e4b02bf2 | 26 | * Bessel functions (j0, j1, jn, y0, y1, yn), floor, and fabs passed over * |
f9628029 | 27 | * from 4.2BSD without change except perhaps for the way floating point * |
9c5c2d2a KM |
28 | * exception is signaled on a VAX. Three lines that contain "errno" in erf.c* |
29 | * (error functions erf, erfc) have been deleted to prevent overriding the * | |
f9628029 | 30 | * system "errno". * |
35d276a5 MAN |
31 | ****************************************************************************** |
32 | ||
e4b02bf2 MAN |
33 | 0. Total number of files: 40 |
34 | ||
f9628029 KM |
35 | IEEE/Makefile VAX/Makefile VAX/support.s erf.c lgamma.c |
36 | IEEE/atan2.c VAX/argred.s VAX/tan.s exp.c log.c | |
37 | IEEE/cabs.c VAX/atan2.s acosh.c exp__E.c log10.c | |
38 | IEEE/cbrt.c VAX/cabs.s asincos.c expm1.c log1p.c | |
39 | IEEE/support.c VAX/cbrt.s asinh.c floor.c log__L.c | |
40 | IEEE/trig.c VAX/infnan.s atan.c j0.c pow.c | |
41 | Makefile VAX/sincos.s atanh.c j1.c sinh.c | |
42 | README VAX/sqrt.s cosh.c jn.c tanh.c | |
e4b02bf2 MAN |
43 | |
44 | 1. Functions implemented : | |
45 | (A). Standard elementary functions (total 22) : | |
f9628029 KM |
46 | acos(x) ...in file asincos.c |
47 | asin(x) ...in file asincos.c | |
48 | atan(x) ...in file atan.c | |
49 | atan2(x,y) ...in files IEEE/atan2.c, VAX/atan2.s | |
9c5c2d2a KM |
50 | sin(x) ...in files IEEE/trig.c, VAX/sincos.s |
51 | cos(x) ...in files IEEE/trig.c, VAX/sincos.s | |
52 | tan(x) ...in files IEEE/trig.c, VAX/tan.s | |
53 | cabs(x,y) ...in files IEEE/cabs.c, VAX/cabs.s | |
54 | hypot(x,y) ...in files IEEE/cabs.c, VAX/cabs.s | |
55 | cbrt(x) ...in files IEEE/cbrt.c, VAX/cbrt.s | |
f9628029 KM |
56 | exp(x) ...in file exp.c |
57 | expm1(x):=exp(x)-1 ...in file expm1.c | |
58 | log(x) ...in file log.c | |
59 | log10(x) ...in file log10.c | |
60 | log1p(x):=log(1+x) ...in file log1p.c | |
61 | pow(x,y) ...in file pow.c | |
62 | sinh(x) ...in file sinh.c | |
63 | cosh(x) ...in file cosh.c | |
9c5c2d2a | 64 | tanh(x) ...in file tanh.c |
f9628029 KM |
65 | asinh(x) ...in file asinh.c |
66 | acosh(x) ...in file acosh.c | |
67 | atanh(x) ...in file atanh.c | |
68 | ||
35d276a5 | 69 | (B). Kernel functions : |
f9628029 KM |
70 | exp__E(x,c) ...in file exp__E.c, used by expm1/exp/pow/cosh |
71 | log__L(s) ...in file log__L.c, used by log1p/log/pow | |
9c5c2d2a | 72 | libm$argred ...in file VAX/argred.s, used by VAX version of sin/cos/tan |
35d276a5 MAN |
73 | |
74 | (C). System supported functions : | |
9c5c2d2a KM |
75 | sqrt() ...in files IEEE/support.c, VAX/sqrt.s |
76 | drem() ...in files IEEE/support.c, VAX/support.s | |
77 | finite() ...in files IEEE/support.c, VAX/support.s | |
78 | logb() ...in files IEEE/support.c, VAX/support.s | |
79 | scalb() ...in files IEEE/support.c, VAX/support.s | |
80 | copysign() ...in files IEEE/support.c, VAX/support.s | |
f9628029 | 81 | rint() ...in file floor.c |
35d276a5 MAN |
82 | |
83 | ||
84 | Notes: | |
9c5c2d2a | 85 | i. The codes in files ending with ".s" are written in VAX assembly |
f9628029 | 86 | language. They are intended for VAX computers. |
35d276a5 | 87 | |
9c5c2d2a | 88 | Files that end with ".c" are written in C. They are intended |
f9628029 | 89 | for either a VAX or a machine that conforms to the IEEE |
9c5c2d2a | 90 | standard 754 for double precision floating-point arithmetic. |
35d276a5 MAN |
91 | |
92 | ii. On other than VAX or IEEE machines, run the original math | |
9c5c2d2a KM |
93 | library, formerly "/usr/lib/libm.a", now "/usr/lib/libom.a", if |
94 | nothing better is available. | |
35d276a5 | 95 | |
e4b02bf2 | 96 | iii. The trigonometric functions sin/cos/tan/atan2 in files "VAX/sincos.s", |
f9628029 KM |
97 | "VAX/tan.s" and "VAX/atan2.s" are different from those in |
98 | "IEEE/trig.c" and "IEEE/atan2.c". The VAX assembler code uses the | |
99 | true value of pi to perform argument reduction, while the C code uses | |
100 | a machine value of PI (see "IEEE/trig.c"). | |
35d276a5 MAN |
101 | |
102 | ||
103 | 2. A computer system that conforms to IEEE standard 754 should provide | |
f9628029 KM |
104 | sqrt(x), |
105 | drem(x,p), (double precision remainder function) | |
106 | copysign(x,y), | |
107 | finite(x), | |
108 | scalb(x,N), | |
109 | logb(x) and | |
110 | rint(x). | |
9c5c2d2a | 111 | These functions are either required or recommended by the standard. |
35d276a5 | 112 | For convenience, a (slow) C implementation of these functions is |
9c5c2d2a | 113 | provided in the file "IEEE/support.c". |
35d276a5 MAN |
114 | |
115 | Warning: The functions in IEEE/support.c are somewhat machine dependent. | |
116 | Some modifications may be necessary to run them on a different machine. | |
9c5c2d2a KM |
117 | Currently, if compiled with a suitable flag, "IEEE/support.c" will work |
118 | on a National 32000, a Zilog 8000, a VAX, and a SUN (cf. the "Makefile" | |
119 | in this directory). Invoke the C compiler thus: | |
35d276a5 | 120 | |
f9628029 | 121 | cc -c -DVAX IEEE/support.c ... on a VAX, D-format |
9c5c2d2a | 122 | cc -c -DNATIONAL IEEE/support.c ... on a National 32000 |
f9628029 KM |
123 | cc -c IEEE/support.c ... on other IEEE machines, |
124 | we hope. | |
35d276a5 MAN |
125 | |
126 | Notes: | |
127 | 1. Faster versions of "drem" and "sqrt" for IEEE double precision | |
f9628029 | 128 | (coded in C but intended for assembly language) are given at the |
9c5c2d2a | 129 | end of "IEEE/support.c" but commented out since they require certain |
f9628029 | 130 | machine-dependent functions. |
35d276a5 | 131 | |
e4b02bf2 | 132 | 2. A fast VAX assembler version of the system supported functions |
f9628029 | 133 | copysign(), logb(), scalb(), finite(), and drem() appears in file |
9c5c2d2a KM |
134 | "VAX/support.s". A fast VAX assembler version of sqrt() is in |
135 | file "VAX/sqrt.s". | |
35d276a5 MAN |
136 | |
137 | 3. Two formats are supported by all the standard elementary functions: | |
9c5c2d2a KM |
138 | the VAX D-format (56-bit precision), and the IEEE double format |
139 | (53-bit precision). The cbrt() in "IEEE/cbrt.c" is for IEEE machines | |
35d276a5 | 140 | only. The functions in files that end with ".s" are for VAX computers |
9c5c2d2a KM |
141 | only. The functions in files that end with ".c" (except "IEEE/cbrt.c") |
142 | are for VAX and IEEE machines. To use the VAX D-format, compile the code | |
35d276a5 | 143 | with -DVAX; to use IEEE double format on various IEEE machines, see |
9c5c2d2a | 144 | "Makefile" in this directory). |
35d276a5 MAN |
145 | |
146 | Example: | |
f9628029 | 147 | cc -c -DVAX sin.c ... for VAX D-format |
35d276a5 | 148 | |
e4b02bf2 | 149 | Warning: The values of floating-point constants used in the code are |
f9628029 | 150 | given in both hexadecimal and decimal. The hexadecimal values |
9c5c2d2a | 151 | are the intended ones. The decimal values may be used provided |
f9628029 KM |
152 | that the compiler converts from decimal to binary accurately |
153 | enough to produce the hexadecimal values shown. If the | |
154 | conversion is inaccurate, then one must know the exact machine | |
9c5c2d2a KM |
155 | representation of the constants and alter the assembly |
156 | language output from the compiler, or play tricks like | |
f9628029 | 157 | the following in a C program. |
35d276a5 | 158 | |
f9628029 | 159 | Example: to store the floating-point constant |
35d276a5 | 160 | |
f9628029 | 161 | p1= 2^-6 * .F83ABE67E1066A (Hexadecimal) |
35d276a5 | 162 | |
9c5c2d2a | 163 | on a VAX in C, we use two longwords to store its |
f9628029 | 164 | machine value and define p1 to be the double constant |
9c5c2d2a | 165 | at the location of these two longwords: |
35d276a5 | 166 | |
f9628029 KM |
167 | static long p1x[] = { 0x3abe3d78, 0x066a67e1}; |
168 | #define p1 (*(double*)p1x) | |
35d276a5 | 169 | |
9c5c2d2a KM |
170 | Note: On a VAX, some functions have two codes. For example, cabs() has |
171 | one implementation in "IEEE/cabs.c", and another in "VAX/cabs.s". | |
172 | In this case, the assembly language version is preferred. | |
35d276a5 MAN |
173 | |
174 | ||
175 | 4. Accuracy. | |
176 | ||
e4b02bf2 | 177 | The errors in expm1(), log1p(), exp(), log(), cabs(), hypot() |
f9628029 | 178 | and cbrt() are below 1 ULP (Unit in the Last Place). |
35d276a5 | 179 | |
f9628029 KM |
180 | The error in pow(x,y) grows with the size of y. Nevertheless, |
181 | for integers x and y, pow(x,y) returns the correct integer value | |
182 | on all tested machines (VAX, SUN, NATIONAL, ZILOG), provided that | |
183 | x to the power of y is representable exactly. | |
35d276a5 | 184 | |
f9628029 KM |
185 | cosh, sinh, acosh, asinh, tanh, atanh and log10 have errors below |
186 | about 3 ULPs. | |
35d276a5 | 187 | |
f9628029 | 188 | For trigonometric and inverse trigonometric functions: |
35d276a5 | 189 | |
f9628029 | 190 | Let [trig(x)] denote the value actually computed for trig(x), |
35d276a5 | 191 | |
f9628029 KM |
192 | 1) Those codes using the machine's value PI (true pi rounded): |
193 | (source codes: IEEE/{trig.c,atan2.c}, asincos.c and atan.c) | |
35d276a5 | 194 | |
f9628029 KM |
195 | The errors in [sin(x)], [cos(x)], and [atan(x)] are below |
196 | 1 ULP compared with sin(x*pi/PI), cos(x*pi/PI), and | |
197 | atan(x)*PI/pi respectively, where PI is the machine's | |
9c5c2d2a | 198 | value of pi rounded. [tan(x)] returns tan(x*pi/PI) within |
f9628029 KM |
199 | about 2 ULPs; [acos(x)], [asin(x)], and [atan2(y,x)] |
200 | return acos(x)*PI/pi, asin(x)*PI/pi, and atan2(y,x)*PI/pi | |
201 | respectively to similar accuracy. | |
35d276a5 MAN |
202 | |
203 | ||
9c5c2d2a | 204 | 2) Those using true pi (for VAX D-format only): |
f9628029 KM |
205 | (source codes: VAX/{sincos.s,tan.s,atan2.s}, asincos.c and |
206 | atan.c) | |
35d276a5 | 207 | |
f9628029 | 208 | The errors in [sin(x)], [cos(x)], and [atan(x)] are below |
9c5c2d2a | 209 | 1 ULP. [tan(x)], [atan2(y,x)], [acos(x)], and [asin(x)] |
f9628029 | 210 | have errors below about 2 ULPs. |
35d276a5 MAN |
211 | |
212 | ||
f9628029 KM |
213 | Here are the results of some test runs to find worst errors on |
214 | the VAX : | |
35d276a5 | 215 | |
f9628029 KM |
216 | tan : 2.09 ULPs ...1,024,000 random arguments (machine PI) |
217 | sin : .861 ULPs ...1,024,000 random arguments (machine PI) | |
218 | cos : .857 ULPs ...1,024,000 random arguments (machine PI) | |
35d276a5 MAN |
219 | (compared with tan, sin, cos of (x*pi/PI)) |
220 | ||
f9628029 KM |
221 | acos : 2.07 ULPs .....200,000 random arguments (machine PI) |
222 | asin : 2.06 ULPs .....200,000 random arguments (machine PI) | |
223 | atan2 : 1.41 ULPs .....356,000 random arguments (machine PI) | |
224 | atan : 0.86 ULPs ...1,536,000 random arguments (machine PI) | |
35d276a5 MAN |
225 | (compared with (PI/pi)*(atan, asin, acos, atan2 of x)) |
226 | ||
f9628029 KM |
227 | tan : 2.15 ULPs ...1,024,000 random arguments (true pi) |
228 | sin : .814 ULPs ...1,024,000 random arguments (true pi) | |
229 | cos : .792 ULPs ...1,024,000 random arguments (true pi) | |
230 | acos : 2.15 ULPs ...1,024,000 random arguments (true pi) | |
231 | asin : 1.99 ULPs ...1,024,000 random arguments (true pi) | |
232 | atan2 : 1.48 ULPs ...1,024,000 random arguments (true pi) | |
233 | atan : .850 ULPs ...1,024,000 random arguments (true pi) | |
234 | ||
235 | acosh : 3.30 ULPs .....512,000 random arguments | |
236 | asinh : 1.58 ULPs .....512,000 random arguments | |
237 | atanh : 1.71 ULPs .....512,000 random arguments | |
238 | cosh : 1.23 ULPs .....768,000 random arguments | |
239 | sinh : 1.93 ULPs ...1,024,000 random arguments | |
240 | tanh : 2.22 ULPs ...1,024,000 random arguments | |
241 | log10 : 1.74 ULPs ...1,536,000 random arguments | |
242 | pow : 1.79 ULPs .....100,000 random arguments, 0 < x, y < 20. | |
243 | ||
244 | exp : .768 ULPs ...1,156,000 random arguments | |
245 | expm1 : .844 ULPs ...1,166,000 random arguments | |
246 | log1p : .846 ULPs ...1,536,000 random arguments | |
247 | log : .826 ULPs ...1,536,000 random arguments | |
248 | cabs : .959 ULPs .....500,000 random arguments | |
35d276a5 MAN |
249 | cbrt : .666 ULPs ...5,120,000 random arguments |
250 | ||
251 | ||
252 | 5. Speed. | |
253 | ||
9c5c2d2a KM |
254 | Some functions coded in VAX assembly language (cabs(), hypot() and |
255 | sqrt()) are significantly faster than the corresponding ones in 4.2BSD. | |
256 | In general, to improve performance, all functions in "IEEE/support.c" | |
257 | should be written in assembly language and, whenever possible, should | |
258 | be called via short subroutine calls. | |
e4b02bf2 MAN |
259 | |
260 | ||
9c5c2d2a | 261 | 6. j0, j1, jn. |
e4b02bf2 | 262 | |
9c5c2d2a | 263 | The modifications to these routines were only in how an invalid |
f9628029 | 264 | floating point operations is signaled. |