I've recently read up quite a bit on IEEE 754 and the x87 architecture. I was thinking of using NaN as a "missing value" in some numeric calculation code I'm working on, and I was hoping that using signaling NaN would allow me to catch a floating point exception in the cases where I don't want to proceed with "missing values." Conversely, I would use quiet NaN to allow the "missing value" to propagate through a computation. However, signaling NaNs don't work as I thought they would based on the (very limited) documentation that exists on them.
Here is a summary of what I know (all of this using x87 and VC++):
The Standard Library provides a way to access the NaN values:
std::numeric_limits<double>::signaling_NaN();
and
std::numeric_limits<double>::quiet_NaN();
The problem is that I see no use whatsoever for the signaling NaN. If _EM_INVALID is masked it behaves exactly the same as quiet NaN. Since no NaN is comparable to any other NaN, there is no logical difference.
If _EM_INVALID is not masked (exception is enabled), then one cannot even initialize a variable with a signaling NaN:
double dVal = std::numeric_limits<double>::signaling_NaN();
because this throws an exception (the signaling NaN value is loaded into an x87 register to store it to the memory address).
You may think the following as I did:
However, step 2 causes the signaling NaN to be converted to a quiet NaN, so subsequent uses of it will not cause exceptions to be thrown! So WTF?!
Is there any utility or purpose whatsoever to a signaling NaN? I understand one of the original intents was to initialize memory with it so that use of an unitialized floating point value could be caught.
Can someone tell me if I am missing something here?
EDIT:
To further illustrate what I had hoped to do, here is an example:
Consider performing mathematical operations on a vector of data (doubles). For some operations, I want to allow the vector to contain a "missing value" (pretend this corresponds to a spreadsheet column, for example, in which some of the cells do not have a value, but their existence is significant). For some operations, I do not want to allow the vector to contain a "missing value." Perhaps I want to take a different course of action if a "missing value" is present in the set -- perhaps performing a different operation (thus this is not an invalid state to be in).
This original code would look something like this:
const double MISSING_VALUE = 1.3579246e123;
using std::vector;
vector<double> missingAllowed(1000000, MISSING_VALUE);
vector<double> missingNotAllowed(1000000, MISSING_VALUE);
// ... populate missingAllowed and missingNotAllowed with (user) data...
for (vector<double>::iterator it = missingAllowed.begin(); it != missingAllowed.end(); ++it) {
if (*it != MISSING_VALUE) *it = sqrt(*it); // sqrt() could be any operation
}
for (vector<double>::iterator it = missingNotAllowed.begin(); it != missingNotAllowed.end(); ++it) {
if (*it != MISSING_VALUE) *it = sqrt(*it);
else *it = 0;
}
Note that the check for the "missing value" must be performed every loop iteration. While I understand in most cases, the sqrt
function (or any other mathematical operation) will likely overshadow this check, there are cases where the operation is minimal (perhaps just an addition) and the check is costly. Not to mention the fact that the "missing value" takes a legal input value out of play and could cause bugs if a calculation legitimately arrives at that value (unlikely though it may be). Also to be technically correct, the user input data should be checked against that value and an appropriate course of action should be taken. I find this solution inelegant and less-than-optimal performance-wise. This is performance-critical code, and we definitely do not have the luxury of parallel data structures or data element objects of some sort.
The NaN version would look like this:
using std::vector;
vector<double> missingAllowed(1000000, std::numeric_limits<double>::quiet_NaN());
vector<double> missingNotAllowed(1000000, std::numeric_limits<double>::signaling_NaN());
// ... populate missingAllowed and missingNotAllowed with (user) data...
for (vector<double>::iterator it = missingAllowed.begin(); it != missingAllowed.end(); ++it) {
*it = sqrt(*it); // if *it == QNaN then sqrt(*it) == QNaN
}
for (vector<double>::iterator it = missingNotAllowed.begin(); it != missingNotAllowed.end(); ++it) {
try {
*it = sqrt(*it);
} catch (FPInvalidException&) { // assuming _seh_translator set up
*it = 0;
}
}
Now the explicit check is eliminated and performance should be improved. I think this would all work if I could initialize the vector without touching the FPU registers...
Furthermore, I would imagine any self-respecting sqrt
implementation checks for NaN and returns NaN immediately.
As I understand it, the purpose of signaling NaN is to initialize data structures, but, of course runtime initialization in C runs the risk of having the NaN loaded into a float register as part of initialization, thereby triggering the signal because the the compiler isn't aware that this float value needs to be copied using an integer register.
I would hope that you could could initialize a static
value with a signaling NaN, but even that would require some special handling by the compiler to avoid having it converted to a quiet NaN. You could perhaps use a bit of casting magic to avoid having it treated as a float value during initialization.
If you were writing in ASM, this would not be an issue. but in C and especially in C++, I think you will have to subvert the type system in order to initialize a variable with NaN. I suggest using memcpy
.