Referencing Row Number in R

r row
Michael picture Michael · Jul 18, 2013 · Viewed 97.5k times · Source

How do I reference the row number of an observation? For example, if you have a data.frame called "data" and want to create a variable data$rownumber equal to each observation's row number, how would you do it without using a loop?

Answer

ricardo picture ricardo · Jul 18, 2013

These are present by default as rownames when you create a data.frame.

R> df = data.frame('a' = rnorm(10), 'b' = runif(10), 'c' = letters[1:10])
R> df
            a          b c
1   0.3336944 0.39746731 a
2  -0.2334404 0.12242856 b
3   1.4886706 0.07984085 c
4  -1.4853724 0.83163342 d
5   0.7291344 0.10981827 e
6   0.1786753 0.47401690 f
7  -0.9173701 0.73992239 g
8   0.7805941 0.91925413 h
9   0.2469860 0.87979229 i
10  1.2810961 0.53289335 j

and you can access them via the rownames command.

R> rownames(df)
 [1] "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10"

if you need them as numbers, simply coerce to numeric by adding as.numeric, as in as.numeric(rownames(df)).

You don't need to add them, as if you know what you are looking for (say item df$c == 'i', you can use the which command:

R> which(df$c =='i')
[1] 9

or if you don't know the column

R> which(df == 'i', arr.ind=T)
     row col
[1,]   9   3

you may access the element using df[9, 'c'], or df$c[9].

If you wanted to add them you could use df$rownumber <- as.numeric(rownames(df)), though this may be less robust than df$rownumber <- 1:nrow(df) as there are cases when you might have assigned to rownames so they will no longer be the default index numbers (the which command will continue to return index numbers even if you do assign to rownames).