Print the Nth Row in a List of Data Frames

geoQuant · Aug 4, 2013 · Viewed 19.9k times

I am cleaning several Excel files in R. Unfortunately, they have unequal dimensions (different numbers of rows and columns). Currently I am storing each Excel sheet as a data frame in a list. I know how to print the 4th row of the first data frame in a list by issuing this command:

df.list1[[1]][4,]

Or a range of rows like this:

df.list1[[1]][1:10,]

My question is: How do I print a particular row for every data frame in the list? In other words:

df.list1[[i]][4,]

df.list1 has 30 data frames in it, but my other df.lists have over 140 data frames that I need to extract rows from. I'd like to be able to store particular locations across several data frames in a new list. I'm thinking the solution might involve lapply.

Furthermore, is there a way to extract rows in every data frame in a list based on a condition? For example, for all 30 data frames in the list df.list1, extract the row if the value is equal to "Apartment" or some other string of characters.

I appreciate your help. Please let me know if I can clarify my problem.

Answer

thelatemail · Aug 4, 2013

You could also just directly lapply the extraction function @Justin suggests, e.g.:

# example data of a list containing 10 data frames:
test <- replicate(10,data.frame(a=1:10),simplify=FALSE)

# extract the fourth row of each one - setting drop=FALSE means you get a
# data frame returned even if only one vector/column needs to be returned.
lapply(test,"[",4,,drop=FALSE)

The format is:

lapply(listname,"[",rows.to.return,cols.to.return,drop=FALSE)

# the example returns the fourth row only from each data frame
#[[1]]
#  a
#4 4
# 
#[[2]]
#  a
#4 4
# etc...
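
Since the data frames in the question have unequal dimensions, some of them may have fewer rows than the one requested, in which case plain indexing returns a row of NAs. The following is a minimal sketch of one way to guard against that and store the results in a new list. The example data and the helper name get_row are assumptions made for illustration, not part of the original question.

```r
# Hypothetical example data: three data frames with different shapes
df.list <- list(
  data.frame(a = 1:10),
  data.frame(a = 1:3, b = letters[1:3]),
  data.frame(a = 1:6, b = letters[1:6], c = 6:1)
)

# Hypothetical helper: return row n as a data frame,
# or NULL when the data frame has fewer than n rows
get_row <- function(x, n) {
  if (nrow(x) >= n) x[n, , drop = FALSE] else NULL
}

# One element per input data frame; too-short data frames give NULL
fourth.rows <- lapply(df.list, get_row, n = 4)

# Optionally drop the NULL entries
fourth.rows.present <- Filter(Negate(is.null), fourth.rows)
```

Keeping the NULL entries in fourth.rows preserves the one-to-one correspondence with the original list, which may matter if you need to know which sheet each row came from.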

To generalise this to extraction based on a condition, you would have to change it up a little. The example below extracts all rows where a is greater than 4 in each data frame. In this case, using an anonymous function is probably the clearest method, e.g.:

lapply(test, function(x) with(x,x[a>4,,drop=FALSE]) )

#[[1]]
#    a
#5   5
#6   6
#7   7
#8   8
#9   9
#10 10
# etc...
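
The same pattern can be adapted to the string condition from the question. This is only a sketch: the example data and the column name type are assumptions, since the question does not say which column holds "Apartment".

```r
# Hypothetical example data; the column name `type` is an assumption
df.list <- list(
  data.frame(type = c("Apartment", "House", "Apartment"),
             price = c(100, 250, 120), stringsAsFactors = FALSE),
  data.frame(type = c("House", "Condo"),
             rooms = c(4, 2), stringsAsFactors = FALSE)
)

# Keep rows where `type` equals "Apartment".
# %in% is used instead of == so that NA values count as non-matches.
apartments <- lapply(df.list, function(x) {
  x[x$type %in% "Apartment", , drop = FALSE]
})

# If the column is not known in advance, match the string in any column
apartments.any <- lapply(df.list, function(x) {
  x[rowSums(x == "Apartment", na.rm = TRUE) > 0, , drop = FALSE]
})
```

Data frames with no matching rows come back as zero-row data frames rather than being dropped, so the result still lines up with the original list.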