Change the Blank Cells to "NA"

r na
S Das · Jun 11, 2014 · Viewed 237.3k times

Here's the link to my data.

My goal is to assign NA to all blank cells, whether the column is categorical or numeric. I am using na.strings="", but it does not assign NA to every blank cell.

## reading the data
dat <- read.csv("data2.csv")
head(dat)
  mon hr        acc   alc sex spd axles door  reg                                 cond1 drug1
1   8 21 No Control  TRUE   F   0     2    2      Physical Impairment (Eyes, Ear, Limb)     A
2   7 20 No Control FALSE   M 900     2    2                                Inattentive     D
3   3  9 No Control FALSE   F 100     2    2 2004                                Normal     D
4   1 15 No Control FALSE   M   0     2    2      Physical Impairment (Eyes, Ear, Limb)     D
5   4 21 No Control FALSE      25    NA   NA                                                D
6   4 20 No Control    NA   F  30     2    4                Drinking Alcohol - Impaired     D
       inj1 PED_STATE st rac1
1     Fatal      <NA>  F <NA>
2  Moderate      <NA>  F <NA>
3  Moderate      <NA>  M <NA>
4 Complaint      <NA>  M <NA>
5 Complaint      <NA>  F <NA>
6  Moderate      <NA>  M <NA>


## using na.strings
dat2 <- read.csv("data2.csv", header=T, na.strings="")
head(dat2)
  mon hr        acc   alc sex spd axles door  reg                                 cond1 drug1
1   8 21 No Control  TRUE   F   0     2    2 <NA> Physical Impairment (Eyes, Ear, Limb)     A
2   7 20 No Control FALSE   M 900     2    2 <NA>                           Inattentive     D
3   3  9 No Control FALSE   F 100     2    2 2004                                Normal     D
4   1 15 No Control FALSE   M   0     2    2 <NA> Physical Impairment (Eyes, Ear, Limb)     D
5   4 21 No Control FALSE      25    NA   NA <NA>                                  <NA>     D
6   4 20 No Control    NA   F  30     2    4 <NA>           Drinking Alcohol - Impaired     D
       inj1 PED_STATE st rac1
1     Fatal        NA  F   NA
2  Moderate        NA  F   NA
3  Moderate        NA  M   NA
4 Complaint        NA  M   NA
5 Complaint        NA  F   NA
6  Moderate        NA  M   NA

Answer

Badoe · Jun 11, 2014

I'm assuming you are talking about row 5, column "sex". The cell in data2.csv may contain a space, in which case R does not consider it empty.
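One way to check for this is sketched below. It does not use the real data2.csv: csv_text is a hypothetical three-row stand-in with a single space in the "sex" column, and the regular expression is just one possible test for whitespace-only cells.

```r
# Hypothetical stand-in for data2.csv; row 2 holds a single space in "sex".
csv_text <- "mon,sex,spd\n8,F,0\n4, ,25\n7,M,900\n"
dat <- read.csv(text = csv_text, na.strings = "", stringsAsFactors = FALSE)

# The cell prints as blank but is not an empty string, so na.strings="" misses it.
is_empty  <- dat$sex[2] == ""
n_chars   <- nchar(dat$sex[2])

# Flag cells that consist only of whitespace.
blank_like <- grepl("^\\s+$", dat$sex)
which(blank_like)
```

If which(blank_like) returns any row numbers, those cells need " " (or whitespace stripping) to be handled when reading the file.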

Also, I noticed that in row 5, columns "axles" and "door", the values read from data2.csv are the string "NA". You probably want to treat those as missing as well. Setting na.strings="" replaces the default na.strings="NA", so both have to be listed:

dat2 <- read.csv("data2.csv", header=TRUE, na.strings=c("","NA"))
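To illustrate why "NA" has to be listed explicitly, here is a small sketch on made-up inline data (csv_text and its two columns are assumptions for illustration, not the contents of data2.csv). It compares the column types produced by the two na.strings settings.

```r
# Hypothetical data: "axles" is numeric with one NA token, "rac1" is all NA tokens.
csv_text <- "axles,rac1\n2,NA\nNA,NA\n2,NA\n"

# With na.strings="", the token NA is kept as ordinary text.
d1 <- read.csv(text = csv_text, na.strings = "", stringsAsFactors = FALSE)

# With both tokens listed, NA is read as a missing value.
d2 <- read.csv(text = csv_text, na.strings = c("", "NA"), stringsAsFactors = FALSE)

class(d1$axles)   # text column, because "NA" is not a number
class(d2$axles)   # numeric type, with a real NA in row 2
sum(is.na(d1))    # count of real missing values under each setting
sum(is.na(d2))
```

This would also explain why the second head() output in the question shows NA rather than <NA> in PED_STATE and rac1: those cells are the text "NA", not missing values.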

EDIT:

I downloaded your data2.csv. Yes, there is a space in row 5, column "sex". So you want:

na.strings=c(""," ","NA")
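As a rough check of this setting, the sketch below applies it to a hypothetical inline sample (csv_text is made up, not the real file). It also shows a possible alternative that is not part of the answer above: strip.white=TRUE, which per the read.table documentation strips whitespace before the na.strings test, so cells with any number of spaces should end up as "".

```r
# Hypothetical sample: row 2 has a space in "sex", NA in "axles", nothing in "cond1".
csv_text <- "mon,sex,axles,cond1\n8,F,2,Inattentive\n4, ,NA,\n7,M,2,Normal\n"

# Setting from the answer: list every blank-like token explicitly.
dat_a <- read.csv(text = csv_text, na.strings = c("", " ", "NA"),
                  stringsAsFactors = FALSE)

# Alternative (an assumption, not from the answer): strip whitespace on read.
dat_b <- read.csv(text = csv_text, na.strings = c("", "NA"),
                  strip.white = TRUE, stringsAsFactors = FALSE)

# Missing values per column under each approach.
colSums(is.na(dat_a))
colSums(is.na(dat_b))
```

The explicit list only catches cells with exactly one space; if the file may contain cells with several spaces or tabs, the strip.white variant is probably the more robust choice.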