I'm eager to know; I think it might reveal some interesting facts.
Edit: bonus points for animated graphics showing the time-evolution of CRAN packages.
A better way to get the package names than scraping a web page is to call available.packages() and process the result. available.packages() returns a matrix containing details of all available packages (the result is filtered by default; see the Details section of ?available.packages for more).
pkgs <- available.packages(filters = "duplicates")
nameCount <- unname(nchar(pkgs[, "Package"]))
table(nameCount)
> table(nameCount)
nameCount
  2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20  21
 32 311 374 360 434 445 368 277 199 132  99  56  56  43  22  19  18   2  12   8
 22  24  25  31
  5   2   1   1
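To get a few summary figures out of that table without querying CRAN again, here is a minimal sketch that re-enters the counts printed above by hand. The variable names (lens, counts, modeLen and so on) are just illustrative; in a live session you could instead derive them from tab <- table(nameCount), using as.integer(names(tab)) for the lengths.

```r
# Name lengths and counts copied by hand from the table(nameCount) output above
lens   <- c(2:22, 24, 25, 31)
counts <- c(32, 311, 374, 360, 434, 445, 368, 277, 199, 132,
            99, 56, 56, 43, 22, 19, 18, 2, 12, 8,
            5, 2, 1, 1)
stopifnot(length(lens) == length(counts))

total   <- sum(counts)                  # number of packages tabulated
modeLen <- lens[which.max(counts)]      # most common name length
meanLen <- sum(lens * counts) / total   # count-weighted mean name length
medLen  <- median(rep(lens, counts))    # expand back to one value per package

cat("packages:", total, " mode:", modeLen, " median:", medLen,
    " mean:", round(meanLen, 2), "\n")
```

For this snapshot the most common name length is 7 characters, which is also the median. The numbers will differ on a current CRAN mirror.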
Using nameCount, we can select the packages whose names have any given number of characters, without resorting to regular expressions:
> unname(pkgs[which(nameCount == 2), "Package"])
[1] "BB" "bs" "ca" "cg" "dr" "ez" "FD" "ff" "HH" "HI" "iv" "JM" "ks" "M3" "mi"
[16] "np" "oc" "oz" "PK" "PP" "qp" "QT" "RC" "rv" "Rz" "sm" "sn" "sp" "st" "SV"
[31] "tm" "wq"
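The same idea can be wrapped up for reuse. The sketch below runs on a small hand-picked sample of package names rather than the full pkgs matrix, so it works offline. namesOfLength() is a hypothetical helper, not part of base R. Plain logical indexing is enough here, so which() is optional.

```r
# Small illustrative sample; in practice use pkgs[, "Package"]
demo <- c("BB", "sp", "tm", "zoo", "MASS", "ggplot2")

# Hypothetical helper: names with exactly n characters
namesOfLength <- function(pkgNames, n) {
  pkgNames[nchar(pkgNames) == n]
}

twoChar <- namesOfLength(demo, 2)

# Group every name by its length in one step
byLength <- split(demo, nchar(demo))

# The longest name(s) in the sample
longest <- demo[nchar(demo) == max(nchar(demo))]

print(twoChar)
print(byLength)
print(longest)
```

split() returns a list keyed by name length, so byLength[["2"]] gives the same result as namesOfLength(demo, 2).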