Commit | Line | Data |
---|---|---|
27c51c7b GK |
1 | *************************************************************************** |
2 | * * | |
3 | * Copyright (c) 1985 Regents of the University of California. * | |
4 | * * | |
5 | * Use and reproduction of this software are granted in accordance with * | |
6 | * the terms and conditions specified in the Berkeley Software License * | |
7 | * Agreement (in particular, this entails acknowledgement of the programs' * | |
8 | * source, and inclusion of this notice) with the additional understanding * | |
9 | * that all recipients should regard themselves as participants in an * | |
10 | * ongoing research project and hence should feel obligated to report * | |
11 | * their experiences (good or bad) with these elementary function codes, * | |
12 | * using "sendbug 4bsd-bugs@BERKELEY", to the authors. * | |
13 | * * | |
14 | * K.C. Ng, with Z-S. Alex Liu, S. McDonald, P. Tang, W. Kahan. * | |
15 | * Revised on 5/10/85, 5/13/85, 6/14/85, 8/20/85, 8/27/85, 9/11/85. * | |
16 | * * | |
17 | *************************************************************************** | |
fc550d58 | 18 | |
27c51c7b GK |
19 | /* @(#)README 1.6 (Berkeley) 9/12/85; 1.3 (ucb.elefunt) %G% */ |
20 | ||
21 | NB. The machine-independent Version 7 math library found in 4.2BSD | |
22 | is now /usr/lib/libom.a. To compile with those routines use -lom. | |
fc550d58 ZAL |
23 | |
24 | ****************************************************************************** | |
25 | * This is a description of the upgraded elementary functions (listed in 1). * | |
26 | * Bessel functions (j0, j1, jn, y0, y1, yn), floor, and fabs passed over * | |
27 | * from 4.2BSD without change except perhaps for the way floating point * | |
27c51c7b GK |
28 | * exception is signaled on a VAX. Three lines that contain "errno" in erf.c* |
29 | * (error functions erf, erfc) have been deleted to prevent overriding the * | |
fc550d58 ZAL |
30 | * system "errno". * |
31 | ****************************************************************************** | |
32 | ||
33 | 0. Total number of files: 40 | |
34 | ||
27c51c7b | 35 | IEEE/Makefile VAX/Makefile VAX/support.s erf.c lgamma.c |
fc550d58 ZAL |
36 | IEEE/atan2.c VAX/argred.s VAX/tan.s exp.c log.c |
37 | IEEE/cabs.c VAX/atan2.s acosh.c exp__E.c log10.c | |
38 | IEEE/cbrt.c VAX/cabs.s asincos.c expm1.c log1p.c | |
39 | IEEE/support.c VAX/cbrt.s asinh.c floor.c log__L.c | |
40 | IEEE/trig.c VAX/infnan.s atan.c j0.c pow.c | |
41 | Makefile VAX/sincos.s atanh.c j1.c sinh.c | |
42 | README VAX/sqrt.s cosh.c jn.c tanh.c | |
43 | ||
27c51c7b GK |
44 | 1. Functions implemented : |
45 | (A). Standard elementary functions (total 22) : | |
46 | acos(x) ...in file asincos.c | |
47 | asin(x) ...in file asincos.c | |
48 | atan(x) ...in file atan.c | |
49 | atan2(x,y) ...in files IEEE/atan2.c, VAX/atan2.s | |
50 | sin(x) ...in files IEEE/trig.c, VAX/sincos.s | |
51 | cos(x) ...in files IEEE/trig.c, VAX/sincos.s | |
52 | tan(x) ...in files IEEE/trig.c, VAX/tan.s | |
53 | cabs(x,y) ...in files IEEE/cabs.c, VAX/cabs.s | |
54 | hypot(x,y) ...in files IEEE/cabs.c, VAX/cabs.s | |
55 | cbrt(x) ...in files IEEE/cbrt.c, VAX/cbrt.s | |
56 | exp(x) ...in file exp.c | |
57 | expm1(x):=exp(x)-1 ...in file expm1.c | |
58 | log(x) ...in file log.c | |
59 | log10(x) ...in file log10.c | |
60 | log1p(x):=log(1+x) ...in file log1p.c | |
61 | pow(x,y) ...in file pow.c | |
62 | sinh(x) ...in file sinh.c | |
63 | cosh(x) ...in file cosh.c | |
64 | tanh(x) ...in file tanh.c | |
65 | asinh(x) ...in file asinh.c | |
66 | acosh(x) ...in file acosh.c | |
67 | atanh(x) ...in file atanh.c | |
68 | ||
fc550d58 | 69 | (B). Kernel functions : |
27c51c7b GK |
70 | exp__E(x,c) ...in file exp__E.c, used by expm1/exp/pow/cosh |
71 | log__L(s) ...in file log__L.c, used by log1p/log/pow | |
72 | libm$argred ...in file VAX/argred.s, used by VAX version of sin/cos/tan | |
fc550d58 ZAL |
73 | |
74 | (C). System supported functions : | |
27c51c7b GK |
75 | sqrt() ...in files IEEE/support.c, VAX/sqrt.s |
76 | drem() ...in files IEEE/support.c, VAX/support.s | |
77 | finite() ...in files IEEE/support.c, VAX/support.s | |
78 | logb() ...in files IEEE/support.c, VAX/support.s | |
79 | scalb() ...in files IEEE/support.c, VAX/support.s | |
80 | copysign() ...in files IEEE/support.c, VAX/support.s | |
81 | rint() ...in file floor.c | |
fc550d58 ZAL |
82 | |
83 | ||
84 | Notes: | |
55de90f0 | 85 | i. The codes in files ending with ".s" are written in VAX assembly |
fc550d58 ZAL |
86 | language. They are intended for VAX computers. |
87 | ||
55de90f0 | 88 | Files that end with ".c" are written in C. They are intended |
fc550d58 | 89 | for either a VAX or a machine that conforms to the IEEE |
55de90f0 | 90 | standard 754 for double precision floating-point arithmetic. |
fc550d58 ZAL |
91 | |
92 | ii. On other than VAX or IEEE machines, run the original math | |
27c51c7b GK |
93 | library, formerly "/usr/lib/libm.a", now "/usr/lib/libom.a", if |
94 | nothing better is available. | |
fc550d58 | 95 | |
27c51c7b GK |
96 | iii. The trigonometric functions sin/cos/tan/atan2 in files "VAX/sincos.s", |
97 | "VAX/tan.s" and "VAX/atan2.s" are different from those in | |
98 | "IEEE/trig.c" and "IEEE/atan2.c". The VAX assembler code uses the | |
99 | true value of pi to perform argument reduction, while the C code uses | |
100 | a machine value of PI (see "IEEE/trig.c"). | |
fc550d58 ZAL |
101 | |
102 | ||
103 | 2. A computer system that conforms to IEEE standard 754 should provide | |
27c51c7b GK |
104 | sqrt(x), |
105 | drem(x,p), (double precision remainder function) | |
106 | copysign(x,y), | |
107 | finite(x), | |
108 | scalb(x,N), | |
109 | logb(x) and | |
110 | rint(x). | |
55de90f0 | 111 | These functions are either required or recommended by the standard. |
fc550d58 | 112 | For convenience, a (slow) C implementation of these functions is |
55de90f0 | 113 | provided in the file "IEEE/support.c". |
fc550d58 | 114 | |
27c51c7b | 115 | Warning: The functions in IEEE/support.c are somewhat machine dependent. |
fc550d58 | 116 | Some modifications may be necessary to run them on a different machine. |
27c51c7b GK |
117 | Currently, if compiled with a suitable flag, "IEEE/support.c" will work |
118 | on a National 32000, a Zilog 8000, a VAX, and a SUN (cf. the "Makefile" | |
119 | in this directory). Invoke the C compiler thus: | |
fc550d58 ZAL |
120 | |
121 | cc -c -DVAX IEEE/support.c ... on a VAX, D-format | |
55de90f0 | 122 | cc -c -DNATIONAL IEEE/support.c ... on a National 32000 |
fc550d58 ZAL |
123 | cc -c IEEE/support.c ... on other IEEE machines, |
124 | we hope. | |
125 | ||
126 | Notes: | |
127 | 1. Faster versions of "drem" and "sqrt" for IEEE double precision | |
128 | (coded in C but intended for assembly language) are given at the | |
55de90f0 | 129 | end of "IEEE/support.c" but commented out since they require certain |
fc550d58 ZAL |
130 | machine-dependent functions. |
131 | ||
132 | 2. A fast VAX assembler version of the system supported functions | |
133 | copysign(), logb(), scalb(), finite(), and drem() appears in file | |
55de90f0 ZAL |
134 | "VAX/support.s". A fast VAX assembler version of sqrt() is in |
135 | file "VAX/sqrt.s". | |
fc550d58 ZAL |
136 | |
137 | 3. Two formats are supported by all the standard elementary functions: | |
55de90f0 ZAL |
138 | the VAX D-format (56-bit precision), and the IEEE double format |
139 | (53-bit precision). The cbrt() in "IEEE/cbrt.c" is for IEEE machines | |
fc550d58 | 140 | only. The functions in files that end with ".s" are for VAX computers |
27c51c7b GK |
141 | only. The functions in files that end with ".c" (except "IEEE/cbrt.c") |
142 | are for VAX and IEEE machines. To use the VAX D-format, compile the code | |
fc550d58 | 143 | with -DVAX; to use IEEE double format on various IEEE machines, see |
55de90f0 | 144 | "Makefile" in this directory). |
fc550d58 ZAL |
145 | |
146 | Example: | |
147 | cc -c -DVAX sin.c ... for VAX D-format | |
148 | ||
149 | Warning: The values of floating-point constants used in the code are | |
150 | given in both hexadecimal and decimal. The hexadecimal values | |
55de90f0 | 151 | are the intended ones. The decimal values may be used provided |
fc550d58 ZAL |
152 | that the compiler converts from decimal to binary accurately |
153 | enough to produce the hexadecimal values shown. If the | |
154 | conversion is inaccurate, then one must know the exact machine | |
55de90f0 | 155 | representation of the constants and alter the assembly |
27c51c7b | 156 | language output from the compiler, or play tricks like |
fc550d58 ZAL |
157 | the following in a C program. |
158 | ||
159 | Example: to store the floating-point constant | |
160 | ||
161 | p1= 2^-6 * .F83ABE67E1066A (Hexadecimal) | |
162 | ||
55de90f0 | 163 | on a VAX in C, we use two longwords to store its |
fc550d58 | 164 | machine value and define p1 to be the double constant |
55de90f0 | 165 | at the location of these two longwords: |
fc550d58 | 166 | |
27c51c7b | 167 | static long p1x[] = { 0x3abe3d78, 0x066a67e1}; |
fc550d58 ZAL |
168 | #define p1 (*(double*)p1x) |
169 | ||
55de90f0 | 170 | Note: On a VAX, some functions have two codes. For example, cabs() has |
27c51c7b | 171 | one implementation in "IEEE/cabs.c", and another in "VAX/cabs.s". |
55de90f0 | 172 | In this case, the assembly language version is preferred. |
fc550d58 ZAL |
173 | |
174 | ||
175 | 4. Accuracy. | |
176 | ||
177 | The errors in expm1(), log1p(), exp(), log(), cabs(), hypot() | |
178 | and cbrt() are below 1 ULP (Unit in the Last Place). | |
179 | ||
180 | The error in pow(x,y) grows with the size of y. Nevertheless, | |
181 | for integers x and y, pow(x,y) returns the correct integer value | |
182 | on all tested machines (VAX, SUN, NATIONAL, ZILOG), provided that | |
183 | x to the power of y is representable exactly. | |
184 | ||
27c51c7b GK |
185 | cosh, sinh, acosh, asinh, tanh, atanh and log10 have errors below |
186 | about 3 ULPs. | |
fc550d58 | 187 | |
27c51c7b GK |
188 | For trigonometric and inverse trigonometric functions: |
189 | ||
190 | Let [trig(x)] denote the value actually computed for trig(x), | |
fc550d58 ZAL |
191 | |
192 | 1) Those codes using the machine's value PI (true pi rounded): | |
27c51c7b | 193 | (source codes: IEEE/{trig.c,atan2.c}, asincos.c and atan.c) |
fc550d58 ZAL |
194 | |
195 | The errors in [sin(x)], [cos(x)], and [atan(x)] are below | |
196 | 1 ULP compared with sin(x*pi/PI), cos(x*pi/PI), and | |
197 | atan(x)*PI/pi respectively, where PI is the machine's | |
55de90f0 | 198 | value of pi rounded. [tan(x)] returns tan(x*pi/PI) within |
fc550d58 ZAL |
199 | about 2 ULPs; [acos(x)], [asin(x)], and [atan2(y,x)] |
200 | return acos(x)*PI/pi, asin(x)*PI/pi, and atan2(y,x)*PI/pi | |
201 | respectively to similar accuracy. | |
202 | ||
27c51c7b | 203 | |
55de90f0 | 204 | 2) Those using true pi (for VAX D-format only): |
27c51c7b GK |
205 | (source codes: VAX/{sincos.s,tan.s,atan2.s}, asincos.c and |
206 | atan.c) | |
fc550d58 ZAL |
207 | |
208 | The errors in [sin(x)], [cos(x)], and [atan(x)] are below | |
27c51c7b | 209 | 1 ULP. [tan(x)], [atan2(y,x)], [acos(x)], and [asin(x)] |
fc550d58 ZAL |
210 | have errors below about 2 ULPs. |
211 | ||
27c51c7b | 212 | |
fc550d58 ZAL |
213 | Here are the results of some test runs to find worst errors on |
214 | the VAX : | |
215 | ||
216 | tan : 2.09 ULPs ...1,024,000 random arguments (machine PI) | |
217 | sin : .861 ULPs ...1,024,000 random arguments (machine PI) | |
218 | cos : .857 ULPs ...1,024,000 random arguments (machine PI) | |
219 | (compared with tan, sin, cos of (x*pi/PI)) | |
220 | ||
221 | acos : 2.07 ULPs .....200,000 random arguments (machine PI) | |
222 | asin : 2.06 ULPs .....200,000 random arguments (machine PI) | |
223 | atan2 : 1.41 ULPs .....356,000 random arguments (machine PI) | |
224 | atan : 0.86 ULPs ...1,536,000 random arguments (machine PI) | |
225 | (compared with (PI/pi)*(atan, asin, acos, atan2 of x)) | |
226 | ||
227 | tan : 2.15 ULPs ...1,024,000 random arguments (true pi) | |
228 | sin : .814 ULPs ...1,024,000 random arguments (true pi) | |
229 | cos : .792 ULPs ...1,024,000 random arguments (true pi) | |
230 | acos : 2.15 ULPs ...1,024,000 random arguments (true pi) | |
231 | asin : 1.99 ULPs ...1,024,000 random arguments (true pi) | |
232 | atan2 : 1.48 ULPs ...1,024,000 random arguments (true pi) | |
233 | atan : .850 ULPs ...1,024,000 random arguments (true pi) | |
234 | ||
235 | acosh : 3.30 ULPs .....512,000 random arguments | |
236 | asinh : 1.58 ULPs .....512,000 random arguments | |
237 | atanh : 1.71 ULPs .....512,000 random arguments | |
238 | cosh : 1.23 ULPs .....768,000 random arguments | |
239 | sinh : 1.93 ULPs ...1,024,000 random arguments | |
240 | tanh : 2.22 ULPs ...1,024,000 random arguments | |
241 | log10 : 1.74 ULPs ...1,536,000 random arguments | |
242 | pow : 1.79 ULPs .....100,000 random arguments, 0 < x, y < 20. | |
243 | ||
244 | exp : .768 ULPs ...1,156,000 random arguments | |
245 | expm1 : .844 ULPs ...1,166,000 random arguments | |
246 | log1p : .846 ULPs ...1,536,000 random arguments | |
247 | log : .826 ULPs ...1,536,000 random arguments | |
248 | cabs : .959 ULPs .....500,000 random arguments | |
249 | cbrt : .666 ULPs ...5,120,000 random arguments | |
250 | ||
251 | ||
252 | 5. Speed. | |
253 | ||
55de90f0 ZAL |
254 | Some functions coded in VAX assembly language (cabs(), hypot() and |
255 | sqrt()) are significantly faster than the corresponding ones in 4.2BSD. | |
256 | In general, to improve performance, all functions in "IEEE/support.c" | |
257 | should be written in assembly language and, whenever possible, should | |
258 | be called via short subroutine calls. | |
fc550d58 ZAL |
259 | |
260 | ||
27c51c7b | 261 | 6. j0, j1, jn. |
fc550d58 ZAL |
262 | |
263 | The modifications to these routines were only in how an invalid | |
27c51c7b | 264 | floating point operations is signaled. |