In R, when we want to be sure we are dealing with an integer, it is convenient to specify it using the "L" suffix, like this:
1L
# [1] 1
If we don't explicitly tell R we want an integer, it will assume we meant to use a numeric data type:
str( 1 * 1 )
# num 1
str( 1L * 1L )
# int 1
Why is "L" the preferred suffix, why not "I" for instance? Is there a historical reason?
In addition, why does R allow me to do (with warnings):
str(1.0L)
# int 1
# Warning message:
# integer literal 1.0L contains unnecessary decimal point
But not:
str(1.1L)
# num 1.1
#Warning message:
#integer literal 1.1L contains decimal; using numeric value
I'd expect both to return an error.
I've never seen it written down, but in short I suspect there are two reasons:
Because R handles complex numbers, which may be specified using the suffix "i", and this would be too similar to "I".
Because R's integers are 32-bit long integers and "L" therefore appears to be sensible shorthand for referring to this data type.
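To make the first point concrete, here is a small sketch of how the two suffixes already in use map onto R's basic types. It only uses base R functions (typeof, is.integer, Im); nothing here is taken from R's documentation of why "L" was chosen.

```r
# The "L" suffix gives an integer, the "i" suffix gives a complex number,
# and a bare literal is a double.
typeof(1L)   # "integer"
typeof(1i)   # "complex"
typeof(1)    # "double"

# 1i is the imaginary unit, so its imaginary part is 1
Im(1i)       # 1

is.integer(1L)  # TRUE
is.integer(1)   # FALSE
```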
The value a long integer can take depends on the word size. R does not natively support integers with a word length of 64 bits. Integers in R are signed and have a word length of 32 bits, so they range from −2,147,483,647 to 2,147,483,647 (the remaining bit pattern, −2,147,483,648, is reserved for NA_integer_). Larger values are stored as double.
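The limits can be checked from the console. This is a minimal sketch using base R only; the variable names x and y are just for illustration, and the warnings are suppressed so the snippet runs quietly.

```r
# Largest representable integer on this build of R
.Machine$integer.max
# [1] 2147483647

# Integer arithmetic that overflows yields NA (normally with a warning)
x <- suppressWarnings(.Machine$integer.max + 1L)
is.na(x)
# [1] TRUE

# A literal too large for an integer falls back to double, even with the L suffix.
# parse() is used so that the parser warning is raised inside suppressWarnings().
y <- suppressWarnings(eval(parse(text = "2147483648L")))
typeof(y)
# [1] "double"
```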
This wiki page has more information on common data types, their conventional names and ranges.
And also from ?integer:
Note that current implementations of R use 32-bit integers for integer vectors, so the range of representable integers is restricted to about +/-2*10^9: doubles can hold much larger integers exactly.
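As a rough illustration of that note, the sketch below shows a whole number just past the integer range being held exactly as a double, and where that exactness runs out. It assumes the usual IEEE 754 double with a 53-bit significand, which is what R uses on common platforms.

```r
# One past the integer maximum is stored as a double, and stored exactly
big <- 2^31
typeof(big)             # "double"
big == 2147483648       # TRUE

# Converting it back to integer does not fit, so the result is NA
is.na(suppressWarnings(as.integer(big)))  # TRUE

# Whole numbers are exact up to 2^53 ...
(2^53 - 1) + 1 == 2^53  # TRUE

# ... but beyond that, neighbouring integers can no longer be told apart
2^53 + 1 == 2^53        # TRUE
```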
The reason that 1.0L and 1.1L return different data types is that returning an integer for 1.1 would result in a loss of information, whilst for 1.0 it would not (but you might want to know that you no longer have a floating point numeric). Buried deep within the lexical analyser (/src/main/gram.c:4463-4485) is this code (part of the function NumericValue()), which actually creates an int data type from a double input that is suffixed by an ASCII "L":
/* Make certain that things are okay. */
if(c == 'L') {
    double a = R_atof(yytext);
    int b = (int) a;
    /* We are asked to create an integer via the L, so we check that the
       double and int values are the same. If not, this is a problem and we
       will not lose information and so use the numeric value.
    */
    if(a != (double) b) {
        if(GenerateCode) {
            if(seendot == 1 && seenexp == 0)
                warning(_("integer literal %s contains decimal; using numeric value"), yytext);
            else {
                /* hide the L for the warning message */
                *(yyp-2) = '\0';
                warning(_("non-integer value %s qualified with L; using numeric value"), yytext);
                *(yyp-2) = (char)c;
            }
        }
        asNumeric = 1;
        seenexp = 1;
    }
}
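The check in that C excerpt can be mimicked in R itself. The function below, classify_L_literal, is a hypothetical helper written for illustration only; it is not part of R. It assumes a plain decimal or exponent literal ending in "L", and it uses as.numeric and as.integer as stand-ins for R_atof and the (int) cast.

```r
# Hypothetical helper: predicts the type the parser would give an "L" literal
classify_L_literal <- function(text) {
  stopifnot(grepl("L$", text))
  a <- as.numeric(sub("L$", "", text))   # stand-in for R_atof(yytext)
  b <- suppressWarnings(as.integer(a))   # stand-in for (int) a; NA if out of range
  # Same test as the C code: keep the integer only if nothing is lost
  if (is.na(b) || a != b) "double" else "integer"
}

literals <- c("1L", "1.0L", "1.1L", "1e3L", "1e-3L")

# What the sketch predicts
sketch <- vapply(literals, classify_L_literal, character(1))

# What the real parser does (parser warnings suppressed)
parsed <- vapply(
  literals,
  function(s) typeof(suppressWarnings(eval(parse(text = s)))),
  character(1)
)

rbind(sketch, parsed)
```

Both rows should agree: "1L", "1.0L" and "1e3L" come out as integer, while "1.1L" and "1e-3L" come out as double, because converting them to int would change the value.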