How to get column name and column index

Anagha · Sep 6, 2017

Hi, I have the data frame below. Because the columns contain NAs, their data type is character. I need to get the names and indices of the columns that contain only string values.

In the example below, I want to get the column name and column index of Zo-A and Zo-B:

 ZONE-1        Zo-A         Zone-3        Zo-B
 58            On             75          NA
 60            NA             NA          High
 NA            Off            68          Low
 70            On             NA          NA

So far I have tried converting all of the columns to numeric, which produced NAs for the Zo-A and Zo-B columns. When I use the code below to get the column index, I get empty or NA results:

a <- which(colnames(df)=="Zo-A" )
integer(0)

match_col <- match(c("Zo-A","Zo-B"), names(df))
NA NA

I need to perform below operations:

  1. I need to first get the column names which consists of String values
  2. I need the column index for the same

Answer

Rui Barradas · Sep 6, 2017

From what I understand of your question, what you need is quite simple.

First, read the data in.

df <- read.table(text = "
ZONE-1        Zo-A         Zone-3        Zo-B
 58            On             75          NA
 60            NA             NA          High
 NA            Off            68          Low
 70            On             NA          NA
", header = TRUE, check.names = FALSE)

str(df)
'data.frame':   4 obs. of  4 variables:
 $ ZONE-1: int  58 60 NA 70
 $ Zo-A  : Factor w/ 2 levels "Off","On": 2 NA 1 2
 $ Zone-3: int  75 NA 68 NA
 $ Zo-B  : Factor w/ 2 levels "High","Low": NA 1 2 NA

df
  ZONE-1 Zo-A Zone-3 Zo-B
1     58   On     75 <NA>
2     60 <NA>     NA High
3     NA  Off     68  Low
4     70   On     NA <NA>
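
A note on check.names = FALSE, as a sketch rather than a diagnosis: if the data in the question were read with the default check.names = TRUE (an assumption, since the question does not show how df was created), read.table would have passed the header through make.names, which replaces the hyphens with dots. That would explain why looking up "Zo-A" returned integer(0) and NA. The vector name header below is only illustrative.

```r
# Header taken from the question's data.
header <- c("ZONE-1", "Zo-A", "Zone-3", "Zo-B")

# read.table() applies make.names() to the header when check.names = TRUE.
mangled <- make.names(header)
mangled
# [1] "ZONE.1" "Zo.A"   "Zone.3" "Zo.B"

# The hyphenated name is no longer present after mangling...
idx_mangled <- which(mangled == "Zo-A")    # integer(0)

# ...but it is found when the original names are kept.
idx_original <- which(header == "Zo-A")    # 2
```

With the names kept as they appear in the file, the lookups in the rest of this answer behave as shown.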

Now, question (1): "first get the column names which consists of String values". Taken literally, all column names are strings, so either names or colnames will return them.

names(df)
[1] "ZONE-1" "Zo-A"   "Zone-3" "Zo-B" 

colnames(df)
[1] "ZONE-1" "Zo-A"   "Zone-3" "Zo-B" 

Now question (2), to get the column index of "the same". (I assume it's of column Zo-A you are asking for.)

a <- which(colnames(df) == "Zo-A")
a
[1] 2

a2 <- grep("Zo-A", colnames(df))
a2
[1] 2
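
If both columns are wanted in one call, match (which the question already tried) returns one index per requested name. The sketch below uses a plain character vector, col_names, as a stand-in for colnames(df). It also shows why grep needs some care: it does pattern matching, so a short pattern can pick up columns you did not intend.

```r
# Stand-in for colnames(df), copied from the output above.
col_names <- c("ZONE-1", "Zo-A", "Zone-3", "Zo-B")
wanted <- c("Zo-A", "Zo-B")

# Exact lookup: one index per requested name, NA if a name is absent.
idx <- match(wanted, col_names)
idx
# [1] 2 4

# grep() matches substrings, so "Zo" also hits "Zone-3".
idx_partial <- grep("Zo", col_names)
idx_partial
# [1] 2 3 4
```

For exact names, == or match is the safer choice. Use grep when you really do want a pattern.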

Data in dput format.

df <-
structure(list(`ZONE-1` = c(58L, 60L, NA, 70L), `Zo-A` = structure(c(2L, 
NA, 1L, 2L), .Label = c("Off", "On"), class = "factor"), `Zone-3` = c(75L, 
NA, 68L, NA), `Zo-B` = structure(c(NA, 1L, 2L, NA), .Label = c("High", 
"Low"), class = "factor")), .Names = c("ZONE-1", "Zo-A", "Zone-3", 
"Zo-B"), class = "data.frame", row.names = c(NA, -4L))

Edit
If you need to get only the column names composed of alphabetic characters and punctuation marks, you can use the following regular expression.

a3 <- grep("^[[:alpha:][:punct:]]*$", colnames(df))
a3
[1] 2 4
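
If the goal is instead to pick columns by what they contain rather than by how they are named, one option is to test each column's type. This is a minimal sketch, assuming every column arrived as character, as the question describes. The data frame df_chr and the helper names are made up for illustration. type.convert re-infers each column's type, after which the text columns are the ones that are still character.

```r
# Hypothetical starting point: every column stored as character.
df_chr <- data.frame(
  `ZONE-1` = c("58", "60", NA, "70"),
  `Zo-A`   = c("On", NA, "Off", "On"),
  `Zone-3` = c("75", NA, "68", NA),
  `Zo-B`   = c(NA, "High", "Low", NA),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Re-infer types column by column; as.is = TRUE keeps text as character.
df_typed <- df_chr
df_typed[] <- lapply(df_chr, type.convert, as.is = TRUE)

# Columns that are still text after conversion.
is_text <- vapply(df_typed,
                  function(col) is.character(col) || is.factor(col),
                  logical(1))
text_idx <- which(is_text)   # named integer vector: index by column name
names(text_idx)
# [1] "Zo-A" "Zo-B"
unname(text_idx)
# [1] 2 4
```

Unlike the regular expression on the column names, this does not depend on how the columns happen to be named.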