Format numbers with million (M) and billion (B) suffixes

emehex · Jan 26, 2015

I have large numbers, e.g. currency amounts in dollars:

1 6,000,000
2 75,000,400
3 743,450,000
4 340,000
5 4,300,000

I want to format them using suffixes, like M (million) and B (billion):

1 6.0 M
2 75.0 M
3 743.5 M
4 0.3 M
5 4.3 M 

Answer

IRTFM · Jan 27, 2015

You first need to strip the commas from the formatted numbers, and gsub("\\,", ...) is the way to do it. The function below uses findInterval to select the appropriate suffix for labeling and to determine the divisor for a more compact display. It can easily be extended in either direction if you want to go below 1.0 or above 1 trillion:

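comprss <- function(tx) {
  # Strip the thousands separators and convert to numeric once
  x <- as.numeric(gsub("\\,", "", tx))
  # Interval index: 1 = below 1e3, 2 = K, 3 = M, 4 = B, 5 = T
  div <- findInterval(x, c(0, 1e3, 1e6, 1e9, 1e12))  # modify this if negative numbers are possible
  # Scale by the matching power of 1000 and append the suffix
  paste(round(x / 10^(3 * (div - 1)), 2),
        c("", "K", "M", "B", "T")[div])
}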
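The comment about negative numbers could be addressed along the following lines. This is only a sketch: comprss_signed is a hypothetical name, and it assumes the input is already numeric (commas stripped). The idea is to pick the interval from the absolute value and put the sign back afterwards.

```r
# Hypothetical variant of comprss that also accepts negative values.
# Assumes x is already numeric (no commas to strip).
comprss_signed <- function(x) {
  ax  <- abs(x)
  # Choose the suffix from the magnitude, not the signed value
  div <- findInterval(ax, c(0, 1e3, 1e6, 1e9, 1e12))
  # Scale the magnitude, then restore the sign
  paste(sign(x) * round(ax / 10^(3 * (div - 1)), 2),
        c("", "K", "M", "B", "T")[div])
}

comprss_signed(c(-4300000, 340000, -950))
```

Working on the magnitude keeps the break points unchanged, so positive inputs behave the same as in comprss.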

You don't need to remove the as.numeric or gsub calls from comprss if the input is already numeric; they are superfluous in that case, but the function still succeeds. This is the result with Gregor's example:

> comprss (big_x)
 [1] "123 "     "500 "     "999 "     "1.05 K"   "9 K"     
 [6] "49 K"     "105.4 K"  "998 K"    "1.5 M"    "20 M"    
[11] "313.4 M"  "453.12 B"

And with the original input (which was probably a factor variable if read in with read.table or read.csv, or created with data.frame):

> comprss (dat$V2)
[1] "6 M"      "75 M"     "743.45 M" "340 K"    "4.3 M"  
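Note that this output picks K for 340,000, whereas the question asked for every value in millions with one decimal place. If that fixed format is what is wanted, a sketch like the one below might do; to_millions is a hypothetical helper, not part of the answer above. It relies on sprintf, whose rounding works on the binary floating-point value, so exact ties may not always round half up.

```r
# Hypothetical helper: always express values in millions, one decimal place.
to_millions <- function(tx) {
  # as.character() makes the factor case explicit before stripping commas
  x <- as.numeric(gsub(",", "", as.character(tx)))
  sprintf("%.1f M", x / 1e6)
}

to_millions(c("6,000,000", "75,000,400", "743,450,000", "340,000", "4,300,000"))
```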

And of course these can be printed without the quotes, either with an explicit print call using quote=FALSE or with cat.
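For illustration, the options for dropping the quotes look roughly like this. The labels vector is just the result shown above typed in by hand; noquote is a further base R alternative that the answer does not mention.

```r
labels <- c("6 M", "75 M", "743.45 M", "340 K", "4.3 M")

print(labels, quote = FALSE)   # explicit print call with quotes suppressed
print(noquote(labels))         # same effect via the noquote class
cat(labels, sep = "\n")        # one label per line, no [1] index prefix
```

print keeps the vector index prefix, while cat writes only the strings themselves.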