I am trying to implement range reduction for trigonometric functions. Rather than writing a full range-reduction routine, I think it might be simpler to just take the incoming value modulo pi/2. What efficient algorithms exist for this operation on 32-bit IEEE 754 floating-point values?
I have to implement this in assembly, so fmod, division, multiplication, etc. are not available to me as single instructions. My processor uses 16-bit words, and I have already implemented 32-bit floating-point addition, subtraction, multiplication, division, square root, cosine, and sine. I just need range reduction (modulus) for the values passed to cosine and sine.
I think the standard library's fmod() will be the best choice in most cases. Here's a link to a discussion of several simple algorithms.
On my machine, fmod() uses optimized inline assembly code (/usr/include/bits/mathinline.h):
#if defined __FAST_MATH__ && !__GNUC_PREREQ (3, 5)
__inline_mathcodeNP2 (fmod, __x, __y, \
register long double __value; \
__asm __volatile__ \
("1: fprem\n\t" \
"fnstsw %%ax\n\t" \
"sahf\n\t" \
"jp 1b" \
: "=t" (__value) : "0" (__x), "u" (__y) : "ax", "cc"); \
return __value)
#endif
So it actually uses a dedicated x87 CPU instruction (fprem) for the calculation, repeating it in a loop until the partial remainder is complete.
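On a processor without such an instruction, one common software approach is Cody-Waite style reduction, which needs only multiplication, subtraction, and a float-to-integer conversion. The sketch below is an illustration under assumptions, not a drop-in routine: the function name reduce_pio2 is hypothetical, a float-to-int conversion is assumed to be available, and the three-part split of pi/2 is only accurate for moderate |x| (very large arguments need a more elaborate scheme such as Payne-Hanek).

```c
/* Hypothetical helper: writes r with x ~= k*(pi/2) + r, |r| <= about pi/4,
   and returns the quadrant k mod 4. Assumes moderate |x|. */
static int reduce_pio2(float x, float *r)
{
    /* pi/2 split into three floats. The high part (201/128) has only
       8 significant bits, so k * PIO2_HI is exact for small k. */
    const float PIO2_HI  = 1.5703125f;
    const float PIO2_MID = 4.837512969970703125e-4f;
    const float PIO2_LO  = 7.54978995489188216e-8f;
    const float TWO_OVER_PI = 0.63661977236758f;

    /* k = nearest integer to x / (pi/2), via multiply and truncation. */
    int k = (int)(x * TWO_OVER_PI + (x >= 0.0f ? 0.5f : -0.5f));
    float fk = (float)k;

    /* Subtract k*(pi/2) piece by piece to limit cancellation error. */
    *r = ((x - fk * PIO2_HI) - fk * PIO2_MID) - fk * PIO2_LO;

    /* Low two bits select the quadrant (also for negative k on
       two's-complement machines). */
    return k & 3;
}
```

With the quadrant q and remainder r, sin(x) would be sin(r), cos(r), -sin(r), or -cos(r) for q = 0, 1, 2, 3 respectively. For x = 10.0f this gives q = 2 and r close to 0.5752.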