How to manually parse a floating point number from a string

Thomas · Sep 17, 2008

Of course most languages have library functions for this, but suppose I want to do it myself.

Suppose that the float is given like in a C or Java program (except for the 'f' or 'd' suffix), for example "4.2e1", ".42e2" or simply "42". In general, we have the "integer part" before the decimal point, the "fractional part" after the decimal point, and the "exponent". All three are integers.

It is easy to find and process the individual digits, but how do you compose them into a value of type float or double without losing precision?

I'm thinking of multiplying the integer part by 10^n, where n is the number of digits in the fractional part, then adding the fractional part to the integer part and subtracting n from the exponent. This effectively turns 4.2e1 into 42e0, for example. Then I could use the pow function to compute 10^exponent and multiply the new integer part by the result. The question is, does this method guarantee maximum precision throughout?
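The approach described above can be sketched roughly as follows. This is only an illustration of the idea, not a correctly rounded parser: `naive_parse_double` and `pow10i` are hypothetical names, `pow10i` stands in for `pow(10, e)` so that the sketch does not depend on the math library, and overflow of the digit accumulator or the exponent is deliberately not handled.

```c
#include <ctype.h>

/* Hypothetical helper standing in for pow(10, e), for e >= 0. */
static double pow10i(int e) {
  double p = 1.0;
  while (e-- > 0) p *= 10.0;
  return p;
}

/* Naive parse of: [sign] digits [. digits] [e|E [sign] digits]
   Assumption: a sketch only. No error reporting, no overflow checks,
   and the result is not guaranteed to be correctly rounded. */
double naive_parse_double(const char *s) {
  unsigned long long mantissa = 0; /* integer and fractional digits merged */
  int frac_digits = 0;             /* "n" in the description above */
  int exponent = 0, exp_sign = 1, sign = 1;
  double value;

  if (*s == '+' || *s == '-') sign = (*s++ == '-') ? -1 : 1;
  while (isdigit((unsigned char)*s))
    mantissa = mantissa * 10 + (unsigned long long)(*s++ - '0');
  if (*s == '.') {
    ++s;
    while (isdigit((unsigned char)*s)) {
      mantissa = mantissa * 10 + (unsigned long long)(*s++ - '0');
      ++frac_digits;
    }
  }
  if (*s == 'e' || *s == 'E') {
    ++s;
    if (*s == '+' || *s == '-') exp_sign = (*s++ == '-') ? -1 : 1;
    while (isdigit((unsigned char)*s))
      exponent = exponent * 10 + (*s++ - '0');
  }
  exponent = exp_sign * exponent - frac_digits; /* 4.2e1 becomes 42e0 */

  value = (double)mantissa;          /* may round if more than 53 bits are needed */
  if (exponent >= 0) value *= pow10i(exponent);  /* may round again */
  else               value /= pow10i(-exponent); /* may round again */
  return sign * value;
}
```

Each commented step is a place where rounding can occur, so the final result can carry more than one rounding error for long digit strings or large exponents.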

Any thoughts on this?

Answer

user7116 · Sep 17, 2008

All of the other answers have missed how hard it is to do this properly. You can take a first-cut approach that is accurate to a certain extent, but until you take IEEE rounding modes (among other things) into account, you will never have the right answer. I've written naive implementations before with a rather large amount of error.
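To make "accurate to a certain extent" a little more concrete, here is a minimal sketch of when a multiply-by-power-of-ten approach rounds only once. It assumes IEEE 754 double precision (53-bit significand); all function names are hypothetical. The reasoning is that if both the decimal digits (as an integer) and the power of ten are exactly representable, a single multiplication or division is correctly rounded by the hardware. As far as I know, this is the idea behind the "fast path" in several strtod implementations.

```c
/* 1 if every integer up to m converts to double without rounding.
   Assumption: IEEE 754 binary64, i.e. a 53-bit significand. */
int mantissa_fits(unsigned long long m) {
  return m <= (1ULL << 53);
}

/* 1 if 10^e (e >= 0) is exactly representable as a double.
   10^e = 2^e * 5^e, so only the factor 5^e has to fit in 53 bits. */
int pow10_is_exact(int e) {
  unsigned long long five = 1;
  while (e-- > 0) {
    five *= 5; /* cannot overflow: five was at most 2^53 before this step */
    if (five > (1ULL << 53)) return 0;
  }
  return 1;
}

/* Sufficient (not necessary) condition for m * 10^e, or m / 10^-e,
   to be computed with a single rounding. */
int single_rounding_suffices(unsigned long long m, int e) {
  int magnitude = e < 0 ? -e : e;
  return mantissa_fits(m) && pow10_is_exact(magnitude);
}
```

With these definitions the exact powers of ten stop at 10^22. Inputs outside these bounds are where a naive parser can start returning a neighbouring double instead of the nearest one.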

If you're not scared of math, I highly recommend reading David Goldberg's article "What Every Computer Scientist Should Know About Floating-Point Arithmetic". You'll get a better understanding of what is going on under the hood, and why the bits are laid out as they are.

My best advice is to start with a working atoi implementation and move out from there. You'll rapidly find you're missing things, but a few looks at strtod's source and you'll be on the right path (which is a long, long path). Eventually you'll praise <insert deity here> that there are standard libraries.

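/* use this to start your atof implementation */

/* atoi - [email protected] */
/* PUBLIC DOMAIN */
#include <ctype.h>  /* isspace, isdigit */
#include <errno.h>  /* errno, ERANGE */
#include <limits.h> /* LONG_MAX, LONG_MIN */

/* named my_atoi so it does not clash with the standard library's atoi */
long my_atoi(const char *value) {
  /* limit is the largest allowed magnitude: LONG_MAX, or LONG_MAX + 1 if negative */
  unsigned long ival = 0, limit = LONG_MAX, digit, i = 0;
  int negative = 0;
  unsigned char c;
  while(isspace((unsigned char)value[i])) ++i; /* chomp leading spaces */
  if(value[i] == '-' || value[i] == '+') { /* chomp sign */
    negative = (value[i] == '-');
    i++;
  }
  if(negative) limit = (unsigned long)LONG_MAX + 1UL;
  while((c = (unsigned char)value[i++]) != '\0') { /* parse number */
    if(!isdigit(c)) return 0; /* unlike the standard atoi, reject any non-digit */
    digit = (unsigned long)(c - '0');
    if(ival > (limit - digit) / 10) { /* next mult/accum would exceed limit */
      /* report overflow/underflow */
      errno = ERANGE;
      return (negative ? LONG_MIN : LONG_MAX);
    }
    ival = (ival * 10) + digit; /* mult/accum */
  }
  if(!negative) return (long)ival;
  /* a magnitude of LONG_MAX + 1 is LONG_MIN, which cannot be negated as a long */
  return (ival > (unsigned long)LONG_MAX ? LONG_MIN : -(long)ival);
}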