Implementing exception checking and rounding direction changing functions in a Cygwin/Intel environment, which appear on standard Unix platforms in ieeefp.h. It should also work in Linux environments that don't provide these functions. Version 3.0 As of Version 3.0, CIieeefp is released under a GNU Lesser General Public Licence. 0 Installation and use THIS CODE IS FOR INTEL PLATFORMS ONLY. It uses assembly language containing instructions from the Intel manual. If your CPU is not Intel or Intel compatible, this software will almost certainly NOT work. THIS SOFTWARE IS A SHAMELESS HACK. Although I have made an effort to do things properly, my knowledge of Intel assembly language is rudimentary at best, and I cannot guarantee that the functions herein work exactly as required. Version 3.0 is rather less of a hack than earlier versions thanks to a contribution from Glenn Burkhardt. To build this software do: make make test env PREFIX=/install/dir make install /install/dir should be a path to an include directory, into which the header files will be put, and a lib directory, into which the library file will be put. So, for example, if you are using this software in conjunction with Swarm, you might want the libraries to be installed alongside your Swarm installation. By default, this is in the directory /Swarm-2.1.1, in which case you would do: env PREFIX=/Swarm-2.1.1 make install Then the library files would be installed in /Swarm-2.1.1/lib and the header files in /Swarm-2.1.1/include. Later versions of Swarm are installed alongside the Cygwin installation, and you would probably be better off using the default installation location of /usr/local. If you're lazy, then you could just use /usr. On some Cygwin systems the root directory is not recognised for some reason (i.e. typing ls / gives a 'file not found' error). In such cases you will have to specify a prefix when doing make install that includes a top level drive letter, e.g.: env PREFIX=//c/install/dir make install If C is not a valid drive letter on your system then replace //c in the above command with //? where ? is a valid drive letter. Later versions of Cygwin don't accept the //c notation for a drive, and you should replace //? for drive ? with /cygdrive/?. The Swarm example above would be: env PREFIX=//c/Swarm-2.1.1 make install The first build creates the CIieeefp library. It requires a version of gcc 3.1 or later to use the more programmer-friendly asm features these versions enable. If you have a version of gcc earlier than this, you should mv x87FPUcmds.c-gcc-pre3.1 x87FPUcmds.c, before doing make. The CIieeefp library should be compilable with -O2 optimisation (LIB_OPTIM variable in the Makefile). Remove the -O2 optimisation if you have problems or prefer to be cautious. The make test command then checks that various functions work correctly, with particular attention on those controlling rounding direction and detecting floating point errors using the sticky bits. If the test fails then you probably don't have the same Intel chip as me, or you don't have an Intel chip at all. The TEST_OPTIM variable in the Makefile allows you to set compiler optimisation flags for the test-CIieeefp program. By default these are cleared, as some of the tests fail with compiler optimisation of the test-CIieeefp program. If you want to compare the behaviour of the floating point unit with that on a Sparc chip, you can do make comparison. This will give you some output indicating where the behaviour is different. Some differences in behaviour between the Sparc and Intel chips should be expected. The IEEE standard is not intended to be sufficiently restrictive that the result of all possible floating point operations is guaranteed to be the same on all platforms. The library contains eight IEEE functions: fpgetsticky(), fpsetsticky(), fpgetround(), fpsetround(), fpgetmask(), fpsetmask(), fpclass(), finite(). These are designed to function as per the POSIX standard. In version 3.0, the functioning of fpgetmask() and fpsetmask() has been corrected to follow the normal pattern of bit set => exception rather than bit clear => exception as in earlier versions of CIieeefp. Four extra functions are also contained within the library: fpgetprecision(), fpsetprecision(), print_fpu_control(), and print_fpu_status(). The first two are designed specifically for Intel processors to force the correct (64 bit) precision to be used in operations, as opposed to the default, which is to use extended (80 bit) precision. The print_fpu_ functions allow you to print the contents of the Floating Point Unit (FPU) status and control words on the chip, if you are interested in that sort of thing. fpgetsticky() returns a variable of type fp_except, which can be checked for various exceptions using a bitwise AND. The exceptions you can check for are defined using macros -- e.g. in C code: { fp_except x = fpgetsticky(); if(x & FP_X_INV) { /* invalid operation since last fpsetsticky(0) */ } if(x & FP_X_DNML) { /* denormalised operand since last fpsetsticky(0) */ } if(x & FP_X_DZ) { /* division by zero since last fpsetsticky(0) */ } if(x & FP_X_OFL) { /* overflow since last fpsetsticky(0) */ } if(x & FP_X_UFL) { /* underflow since last fpsetsticky(0) */ } if(x & FP_X_IMP) { /* imprecision since last fpsetsticky(0) */ } } Calling fpsetsticky() sets the exception flags to the required value, returning the prior settings as per fpgetsticky(). fpgetmask() and fpsetmask() work in a similar way to fpget/setsticky(). They provide an alternative means to handle errors from floating point arithmetic -- a rather less user-friendly one. If a bit corresponding to an exception is set in the exception mask, then the program will simply terminate. On Unix systems, the masks can be used to deliver signals (SIGFPE) that can be caught and dealt with using an appropriate handler. I have not tested whether this works on a Cygwin platform. e.g. { fpsetmask(FP_X_DZ | FP_X_IMP); /* Cause the program to terminate immediately any floating point error EXCEPT division by zero or imprecision occurs */ } fpgetround() returns the current rounding direction setting in the CPU floating point unit as type fp_rnd. Again, macros are defined for you to check this: { fp_rnd r = fpgetround(); if(r == FP_RN) { /* round to nearest even */ } if(r == FP_RM) { /* round to minus infinity */ } if(r == FP_RP) { /* round to plus infinity */ } if(r == FP_RZ) { /* round to zero */ } } You can call fpsetround(arg) with arg set to FP_RN, FP_RM, FP_RP or FP_RZ to set the required rounding direction accordingly. All floating point operations after this call and until the next call will have the given rounding direction, including those in any library function you might call. fpclass(arg) allows you to determine the kind of (double precision) floating point number that arg is. The following gives an example: { double num = 123.456; switch(fpclass(num)) { case FP_SNAN: /* Signalling NaN (not-a-number). You won't be able to check for this. */ break; case FP_QNAN: /* Quiet NaN. e.g. from sqrt(-1) */ break; case FP_NINF: case FP_PINF: /* [N]egative/[P]ositive infinity */ break; case FP_NDENORM: case FP_PDENORM: /* [N]egative/[P]ositive denormalised number. These are small numbers less (in magnitude) than DBL_MIN (float.h) but more than zero, which allow a gradual underflow, but with considerable loss of accuracy. for some reason this implementation of fpclass never detects these classes -- they are always returned as normalised. */ break; case FP_NZERO: /* Negative zero. The IEEE 754 standard stipulates that negative zero be representable. It is equal to positive zero (i.e. -0.0 == 0.0), and is not less than positive zero. The only way to test for negative zero is to use fpclass(). */ break; case FP_PZERO: /* Positive zero. */ break; case FP_NNORM: case FP_PNORM: /* [N]egative/[P]ositive normalised number. These form the majority of floating point numbers in the IEEE standard, and (with the exception of zero) are probably what you are thinking of when you are writing code using floating point numbers). */ break; default: /* PANIC! Not a valid floating point number class */ abort(); } } finite(arg) returns 1 if the double arg is one of the zero, normalised or denormalised classes returned by fpclass(), and 0 otherwise. fpgetprecision() returns a variable of type fp_pctl containing the precision control setting in the CPU floating point unit. Macros are defined for you to check the result: { fp_pctl p = fpgetprecision(); if(r == FP_PC_SGL) { /* Single precision calculations */ } if(r == FP_PC_DBL) { /* Double precision calculations */ } if(r == FP_PC_EXT) { /* Double extended precision calculations (default) */ } if(r == FP_PC_RES) { /* Reserved precision control setting */ } } Note that the default setting is used regardless of the type of the variable. For example: int main(int argc, char **argv) { double x = 1.0, y = 3.0; double z; z = x / y; /* This calculation is performed in extended precision and z contains the truncated result. z is not guaranteed to be the nearest double precision number to the infinitely precise result */ fpsetprecision(FP_PC_DBL); z = x / y; /* Now the calculation is performed in double precision, and z is guaranteed to be the nearest double precision number to the infinitely precise result */ return 0; } fpsetprecision() allows you to set the precision control field, as demonstrated above. The precision control field can be set to FP_PC_SGL, FP_PC_DBL, or FP_PC_EXT. Any other setting will result in a fatal error message. The function returns the previous setting. There are a few calculations where using FP_PC_EXT will not return the closest double precision floating point number to the infinitely precise result, as stipulated by the IEEE standard. One example is 2^53 + (1 + 2^12), which returns 2^53 instead of the (closer) 2^53 + 2 that is returned when FP_PC_DBL is used. 1 Introduction The IEEE 754 standard stipulates that a conforming environment provides functions for checking floating point exceptions and changing rounding directions. The functions to do this in C are provided through the header file ieeefp.h in POSIX conforming platforms. Although this header file appears in the Cygwin environment, the functions are not implemented, and any use of them in C code results in linker errors at compile time. Thus Cygwin is not fully IEEE 754 compliant. (The remainder of) this document describes the process by which a some of the functions provided by ieeefp.h have been (partially) implemented in the Cygwin environment for an Intel CPU. These functions are fpsetsticky(), fpgetsticky(), fpsetround(), fpgetround(), fpsetmask(), fpgetmask(), fpclass(), and finite(). 2 Intel CPU floating point handling instructions The sources for this material are the Intel Pentium IV manuals parts 1 (Basic Architecture, BA) and 2 (Instruction Set Reference, ISR) available (at the time of writing) in PDF format from http://www.intel.com/design/Pentium4/manuals/. Floating point arithmetic in Intel CPUs is complicated somewhat by the fact that there are two floating point units in the CPU: the x87 FPU, and the MMX FPU, which share state registers (BA, p. 8-2). Code using both FPUs therefore has to "manage" the x87 and MMX states (ibid.). This document, and the implemented functions, assume no x87/MMX mix in the code, and that all operations take place on the x87 FPU. The floating point execution environment is contained in a series of registers (BA, p. 8-1): * Eight data registers. These are 80-bit floating point operands with 65 bits of significand (including the hidden bit) and 15 bits of exponent, in IEEE-754 double-extended conforming format. * The status register, which contains, among other things, the exception flags. * The control word register, which contains, among other things, the rounding and precision control. * The tag word register, which contains information about the class of floating point number for each of the data registers (valid, zero, empty or NaN/Inf/denormal). * The last instruction pointer register -- provided for exception handlers. * The last data (operand) pointer register -- provided for exception handlers. * Opcode register -- provided for exception handlers. The registers of interest for the functions implemented are the status register and the control word register. The status register is arranged thus: 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 ## | C3 | ## | ## | ## | C2 | C1 | C0 | ## | ## | PE | UE | OE | ZE | DE | IE ... where ## are reserved/irrelevant, PE is Precision Exception, UE and OE are underflow and overflow respectively, ZE is division by zero, DE denormalised operand, and IE invald operation. These exception flags are 'sticky' in that they are set once their conditions occur, and remain set until an instruction that explicitly clears them is executed. The C3, C2, C1, and C0 bits are Condition Code flags that are used to determine the class of a floating point number following a call to FXAM (ISR, p. 3-306). The FSTSW addr instruction stores the status word at the specified 16-bit location given as argument, after checking for any pending unmasked floating-point exceptions (ISR, p. 3-291). The exception flags are cleared using the FCLEX instruction (ISR, p. 3-215), which also first checks for pending unmasked floating-point exceptions. Somewhat irritatingly, the only way to set the exception flags to a specified value involves saving then loading the whole floating point environment using the FSTENV (ISR, p. 3-288) and FLDENV (ISR, p. 3-253) instructions. The size and layout of the floating point execution environment depends on the operating mode (protected or real-address) and operand-size (32 or 16-bit) attributes of the processor. This information is obtainable, but considerably complicates things, and is therefore not considered for use here. The control word register is arranged thus (BA, p. 8-9): 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 ## | ## | ## | ## | RC | PC | ## | ## | PM | UM | OM | ZM | DM | IM Where RC affects the rounding control as follows (BA, p. 4-20): * RC = 00B -- Round to nearest even. * RC = 01B -- Round to negative infinity. * RC = 10B -- Round to positive infinity. * RC = 11B -- Round to zero. PC affects the precision control as follows (BA, p. 8-10): * PC = 00B -- Single precision (24 bit significand). * PC = 01B -- Reserved. * PC = 10B -- Double precision (53 bit significand). * PC = 11B -- Double extended precision (64 bit significand). The PM, UM, OM, ZM, DM, IM bits correspond to floating point exception masks for precision, underflow, overflow, division by zero, denormalisation, and invalid operation exceptions respectively. The control word is accessed using the FSTCW addr instruction, which stores the control word at the specified 16-bit location after checking for pending unmasked floating-point exceptions. It is set using the FLDCW addr instruction, which puts the contents of the specified 16-bit location into the control word register. This instruction can cause pending floating point exceptions to be generated, which can be avoided by preceding it with an FCLEX instruction. The fact that FLDCW should be preceded by an FCLEX, and that the status word cannot be directly set without using the mode-dependent FLDENV instruction means that using rounding control and exception checking in the same program needs to be handled carefully. It also means that software using these functions cannot be guaranteed safe if used in conjunction with other software libraries that check the floating point exception flags by means other than the functions provided here. The FXAM instruction sets the Condition Code flags according to the class of the number on the top of the floating point stack as follows: Class Sign C3 C2 C1 C0 Not a Number | n/a 0 0 0/1 0 | -ve 0 1 1 0 normalised | | +ve 0 1 0 0 | -ve 0 1 1 1 infinity | | +ve 0 1 0 1 | -ve 1 0 1 0 zero | | +ve 1 0 0 0 | -ve 1 1 1 0 denormalised | | +ve 1 1 0 0 A Signalling NaN is indicated by a 0 as the MSB of the significand, otherwise the number is a Quiet Nan. Currently the way that fpclass is written, a signalling NaN will not be successfully detected, and a quiet NaN will be returned. For some reason, denormalised numbers are also not successfully detected in this implementation, and are instead returned as normalised numbers. This is strange, since other classes using the C3 and C2 bits (normalised, infinity & zero) are all correctly detected. 3 Implementing the ieeefp functions 3.1 Accessing the assembly instructions Implementing the ieeefp functions requires access to assembly language instructions that are not generated by the compiler from C code. Five instructions are required: FSTSW (read the status word), FSTCW (read the control word), FCLEX (clear the sticky exception flags), FLDCW (write the control word), FXAM (examine the number on the top of the stack). Access to these functions is given by x87FPUcmds.c, which from version 3.0 uses inline assembly language to put the instructions in. (Thanks to Glenn Burkhardt.) Once these functions are available, it is then possible to implement the IEEE functions. However, since these all involve accessing particular bits in the control and status words, it is worth having a few utility functions to help with this. The code in x87FPUutils.c contains these functions, which shift out irrelevant bits in the status or control words, returning the bits required in the least significant bits of result. 3.2 fpget/setsticky() The sticky bits need to be saved in a global variable saved_sticky_bits, because those functions requiring a call to FLDCW require a call to FCLEX to clear the exception flags before the control word is set to a new value. fpgetsticky() simply makes a call to access bits 0-5 of the status word, makes sure that saved_sticky_bits is kept up to date, and returns the result. fpsetsticky() actually just clears the exception flags on the chip using FCLEX, saving the required setting in the global variable saved_sticky_bits. 3.3 fpget/setmask() fpgetmask() just returns the result of a call to FSTCW, accessing bits 0-5 of the control word. fpsetmask() sets the mask bits on the control word using FLDCW, but must first save the sticky bits in the status word, as the call to FLDCW requires FCLEX to be executed first, which clears these bits. Note that on Sun machines, the fpsetmask() function clears the sticky bit corresponding to any exception being enabled. This is not done here, and probably should be at some point. 3.4 fpget/setround() fpgetround() accesses the rounding control flags through a call to FSTCW, and checks the result before returning it. fpsetround() sets the rounding control bits on the control word using a call to FLDCW, but must first save the sticky bits as the call to FLDCW requires FCLEX to be executed first. 3.5 fpget/setprecision() fpgetprecision() accesses the precision control flags on the control word through a call to FSTCW, and checks the result before returning it. fpsetprecision() sets the precision control bits on the control word using a call to FLDCW, but must first save the sticky bits as the call to FLDCW requires FCLEX to be executed first. 3.6 fpclass() fpclass() makes a call to FXAM that sets the condition code bits on the status word to values that indicate the floating point class of the number. The x87FPUfxam() function includes a call to FSTSW, and returns the status word. A switch statement is then used to resolve the condition code bit settings into the appropriate number classes. The type of NaN (signalling/quiet) is then determined using a method that is supposed to be independent of whether the architecture is big or little endian, but although this method should work in theory, in practice a signalling NaN would get converted to a quiet NaN by this process. This can happen simply when setting a variable to a value that is a signalling NaN. 3.7 finite() finite() just checks the result of a call to fpclass(), returning 1 if it is one of the finite classes and 0 otherwise. 5 Improvements These functions have been implemented with only the most basic understanding of assembly language and the GNU assembler. The assumption has been made that the GNU assembler understands the Intel instructions for accessing the floating point registers. Furthermore, the assumption has been made that no other process can interfere with the register settings for the process using these functions, and vice versa. There are also potential issues with MMX technology, for which no allowance has been made. The functions make no allowance for the code to include other functions that access the floating point registers. Other functions appearing as standard in ieeefp.h are not implemented here. In summary, this is very much a hacked effort, based on the most rudimentary knowledge, and should be treated with appropriate suspicion. That said, the test software in test-CIieeefp.c does successfully demonstrate that rounding control, checking of sticky bits, and to a certain extent, determination of floating point number class, have all been enabled.