How to read data with different separators?

yliueagle · May 9, 2014

I have a file that looks like this:

a 1,2,3,5
b 4,5,6,7
c 5,6,7,8
...

The separator between the 1st and 2nd fields is '\t'; the other separators are commas. How can I read this kind of data set as a data frame with 5 fields?

Answer

Josh O'Brien · May 9, 2014

I'd probably do this.

read.table(text = gsub(",", "\t", readLines("file.txt")))
  V1 V2 V3 V4 V5
1  a  1  2  3  5
2  b  4  5  6  7
3  c  5  6  7  8

Unpacking that just a bit:

  • readLines() reads the file into R as a character vector with one element for each line.
  • gsub(",", "\t", ...) replaces every comma with a tab, so that now we've got lines with just one kind of separating character.
  • The text = argument to read.table() lets it know you are passing it a character vector to be read directly (rather than the name of a file containing your text data).
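
To see those three steps one at a time, here is a minimal self-contained sketch. The temporary file and the variable names (path, lines, tabbed, df) are made up for illustration and stand in for "file.txt"; sep = "\t" and fixed = TRUE are optional additions that just make the intent explicit.

```r
# Build a small sample file: a tab after the first field, commas after that.
# (tempfile() is only a stand-in for the real "file.txt".)
path <- tempfile(fileext = ".txt")
writeLines(c("a\t1,2,3,5", "b\t4,5,6,7", "c\t5,6,7,8"), path)

# Step 1: read the file as a character vector, one element per line.
lines <- readLines(path)

# Step 2: replace every comma with a tab so only one separator remains.
# fixed = TRUE treats "," as a literal string rather than a regular expression.
tabbed <- gsub(",", "\t", lines, fixed = TRUE)

# Step 3: parse the in-memory text instead of a file on disk.
df <- read.table(text = tabbed, sep = "\t", stringsAsFactors = FALSE)

str(df)  # inspect the resulting column types
```

One caveat worth keeping in mind: this replaces every comma, so it is only appropriate when commas never appear inside a field's actual value.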