Why would R use the "L" suffix to denote an integer?

Question 1

r parsing integer semantics

Simon O'Hanlon · Jun 22, 2014 · Viewed 27.1k times · Source

Answer

Why is "L" used as a suffix?

I've never seen it written down, but I theorise that there are two reasons:

Because R handles complex numbers, which may be specified using the suffix "i", and an "I" suffix would be too similar to that.
Because R's integers are 32-bit long integers, so "L" appears to be a sensible shorthand for referring to this data type.
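
As a small illustration of the suffixes mentioned above, the following sketch checks the storage type each literal form produces. The variable name `i_suffix` is only for illustration.

```r
# The suffix on a numeric literal selects its storage type.
typeof(1)    # no suffix: "double"
typeof(1L)   # "L" suffix: "integer"
typeof(1i)   # "i" suffix: "complex"

# An uppercase "I" is not a recognised suffix, so "1I" fails to parse.
i_suffix <- tryCatch(parse(text = "1I"), error = function(e) "parse error")
```

Here `i_suffix` ends up holding the string "parse error", since R reads `1I` as a number followed directly by a symbol.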

The range of values a long integer can take depends on the word size. R does not natively support integers with a word length of 64 bits. Integers in R have a word length of 32 bits and are signed, and therefore have a range of −2,147,483,648 to 2,147,483,647. Larger values are stored as double.

This wiki page has more information on common data types, their conventional names and ranges.

And also from ?integer

Note that current implementations of R use 32-bit integers for integer vectors, so the range of representable integers is restricted to about +/-2*10^9: doubles can hold much larger integers exactly.
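
The range limit described above can be checked directly from an R session. This is a minimal sketch; the variable names are illustrative, and `suppressWarnings()` is used only to keep the overflow warning out of the output.

```r
# Largest value an R integer can hold: 2^31 - 1.
max_int <- .Machine$integer.max
max_int                          # 2147483647

# Integer arithmetic that overflows gives NA (with a warning) rather than wrapping.
overflowed <- suppressWarnings(max_int + 1L)
is.na(overflowed)                # TRUE

# Adding a double promotes the result to double, which can hold the larger value.
promoted <- max_int + 1
typeof(promoted)                 # "double"

# Doubles hold integers exactly only up to 2^53; beyond that, neighbours collide.
exact_limit <- (2^53 + 1) == 2^53
exact_limit                      # TRUE
```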

Why do 1.0L and 1.1L return different types?

1.0L and 1.1L return different data types because returning an integer for 1.1 would lose information, whilst for 1.0 it would not (though you might want to know that you no longer have a floating-point numeric). Buried deep within the lexical analyser (/src/main/gram.c:4463-4485) is this code (part of the function NumericValue()), which actually creates an int data type from a double input that is suffixed by an ASCII "L":

/* Make certain that things are okay. */
if(c == 'L') {
    double a = R_atof(yytext);
    int b = (int) a;
    /* We are asked to create an integer via the L, so we check that the
       double and int values are the same. If not, this is a problem and we
       will not lose information and so use the numeric value.
    */
    if(a != (double) b) {
        if(GenerateCode) {
            if(seendot == 1 && seenexp == 0)
                warning(_("integer literal %s contains decimal; using numeric value"), yytext);
            else {
                /* hide the L for the warning message */
                *(yyp-2) = '\0';
                warning(_("non-integer value %s qualified with L; using numeric value"), yytext);
                *(yyp-2) = (char)c;
            }
        }
        asNumeric = 1;
        seenexp = 1;
    }
}
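
The round-trip check in the C code above can be imitated at the R level. The sketch below is not R's implementation: `classify_L_literal` is a hypothetical helper that converts the literal's text to a double, truncates it to an integer, and keeps the integer only if nothing was lost. It then compares that guess with what the real parser returns for a few literals.

```r
# Hypothetical helper mimicking the parser's decision for an "L"-suffixed literal.
classify_L_literal <- function(text) {
  digits <- sub("L$", "", text)          # drop the trailing "L"
  a <- as.numeric(digits)                # plays the role of R_atof(yytext)
  b <- suppressWarnings(as.integer(a))   # plays the role of (int) a; NA if out of range
  if (is.na(b) || a != b) "double" else "integer"
}

literals <- c("1L", "1.0L", "1.1L", "1e3L", "1e-3L")

# Types predicted by the sketch.
sketch <- vapply(literals, classify_L_literal, character(1), USE.NAMES = FALSE)

# Types produced by the parser itself (parse-time warnings suppressed).
parsed <- vapply(
  literals,
  function(x) typeof(suppressWarnings(eval(parse(text = x)))),
  character(1),
  USE.NAMES = FALSE
)

identical(sketch, parsed)
```

Literals whose value survives the conversion to int (1L, 1.0L, 1e3L) stay integers, while 1.1L and 1e-3L fall back to double.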

Question 2

In R, when we want to ensure we are dealing with an integer, it is convenient to specify it using the "L" suffix, like this:

1L
# [1] 1

If we don't explicitly tell R we want an integer it will assume we meant to use a numeric data type...

str( 1 * 1 )
# num 1
str( 1L * 1L )
# int 1

Why is "L" the preferred suffix, why not "I" for instance? Is there a historical reason?

In addition, why does R allow me to do (with warnings):

str(1.0L)
# int 1
# Warning message:
# integer literal 1.0L contains unnecessary decimal point

But not:

str(1.1L)
# num 1.1
# Warning message:
# integer literal 1.1L contains decimal; using numeric value

I'd expect both to return an error.
