Preliminary information. OS: Windows XP Professional Version 2002 Service Pack 3; R version: R 2.12.2 (2011-02-25)
I am attempting to read a 30,000 row by 80 column, tab-delimited text file into R using the read.delim() function. The file does have column headers, with the following naming convention: "_". The code that I use to attempt to read the data in is:
cc <- c("integer", "character", "integer", rep("character", 3),
        rep("integer", 73))
example_data <- read.delim(file = 'C:/example.txt', row.names = FALSE,
                           col.names = TRUE, as.is = TRUE, colClasses = cc)
After I submit this command, I receive the following error message:
Error in read.table(file = file, header = header, sep = sep, quote = quote, :
more columns than column names
In addition: Warning message:
In read.table(file = file, header = header, sep = sep, quote = quote, :
header and 'col.names' are of different lengths
Information that may be important - from column 8 until column 80 the count of zeros in each column is as follows:
column 08: 29,000 zeros
column 13: 15,000 zeros
column 19: 500 zeros
column 43: 15,000 zeros
columns 65-80: 29,000 zeros for each column
Can anyone help identify reasons that I am receiving the above error messages? Any help will be greatly appreciated.
The cause of the problem is your use of the col.names=TRUE argument. col.names is meant for specifying the column names of the resulting data frame by hand, so it must be a vector with one name per column in the input. A single TRUE supplies only one name for 80 columns, hence "more columns than column names".
If you want read.delim to take column names from the file, use header=TRUE (which is already the default for read.delim); you may also wish to reconsider row.names=FALSE, as row.names is likewise intended as a specification of the row names rather than an instruction about whether to read them from the file.
More information is available on the help page for read.delim.
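As a rough illustration of the advice above, here is a minimal sketch that reads a small tab-delimited file with its header row. The temporary file, its three columns, and the replacement names are made up for the example and merely stand in for C:/example.txt and its 80 columns.

```r
# Hypothetical stand-in for C:/example.txt: a small tab-delimited file
# whose first line holds the column names.
tmp <- tempfile(fileext = ".txt")
writeLines(c("id_1\tname_2\tcount_3",
             "1\ta\t10",
             "2\tb\t0"), tmp)

# One class per column, in the same spirit as the cc vector in the question.
cc <- c("integer", "character", "integer")

# header = TRUE tells read.delim to take the column names from the first
# line of the file; col.names and row.names are simply left out.
example_data <- read.delim(file = tmp, header = TRUE,
                           as.is = TRUE, colClasses = cc)
str(example_data)

# If you do want to override the names from the file, col.names takes a
# character vector with exactly one entry per column.
renamed <- read.delim(file = tmp, header = TRUE, as.is = TRUE,
                      colClasses = cc,
                      col.names = c("id", "name", "count"))
names(renamed)
```

With the real file, the same call shape should apply, provided the colClasses vector lines up with the actual columns.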