I have a large data.frame of character data that I want to convert based on what is commonly called a dictionary in other languages.
Currently I am going about it like so:
foo <- data.frame(snp1 = c("AA", "AG", "AA", "AA"), snp2 = c("AA", "AT", "AG", "AA"), snp3 = c(NA, "GG", "GG", "GC"), stringsAsFactors=FALSE)
foo <- replace(foo, foo == "AA", "0101")
foo <- replace(foo, foo == "AC", "0102")
foo <- replace(foo, foo == "AG", "0103")
This works, but it is not pretty, and it seems silly to repeat the replace() statement for every item I want to replace in the data.frame.
Is there a better way to do this since I have a dictionary of approximately 25 key/value pairs?
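If you're open to using packages, plyr is a very popular one and has a handy mapvalues() function that does this kind of lookup. mapvalues() operates on a vector or factor, so apply it to each column rather than to the whole data.frame:
library(plyr)
foo[] <- lapply(foo, mapvalues, from=c("AA", "AC", "AG"), to=c("0101", "0102", "0103"), warn_missing=FALSE)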
Note that it works for data types of all kinds, not just strings.
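If you would rather avoid a package, a named character vector can act as the dictionary in base R. The following is only a sketch of that idea: `dict` and `recode_column` are names made up for this example, and it assumes that values without a matching key (including NA) should be left unchanged.

```r
# Hypothetical lookup table: names are the keys, elements are the values.
dict <- c(AA = "0101", AC = "0102", AG = "0103")

foo <- data.frame(snp1 = c("AA", "AG", "AA", "AA"),
                  snp2 = c("AA", "AT", "AG", "AA"),
                  snp3 = c(NA, "GG", "GG", "GC"),
                  stringsAsFactors = FALSE)

# Translate one column: replace values that appear as keys in dict and
# leave everything else (unknown keys, NA) as it was.
recode_column <- function(x, dict) {
  hit <- x %in% names(dict)
  x[hit] <- unname(dict[x[hit]])
  x
}

# Assigning into foo[] replaces every column but keeps the data.frame structure.
foo[] <- lapply(foo, recode_column, dict = dict)
```

With roughly 25 key/value pairs, only `dict` needs to grow; the lookup code stays the same.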