I have a small issue when I'm trying to import data from CSV files with numpy's loadtxt function. Here's a sample of the type of data files I have.
Call it 'datafile1.csv':
# Comment 1
# Comment 2
x,y,z
1,2,3
4,5,6
7,8,9
...
...
# End of File Comment
The script that I thought would work for this situation looks like:
import numpy as np
FH = np.loadtxt('datafile1.csv',comments='#',delimiter=',',skiprows=1)
But I'm getting an error:
ValueError: could not convert string to float: x
This tells me that the kwarg 'skiprows' is not skipping the header; it's skipping the first comment line. I could simply set skiprows=3, but the complication is that I have a very large number of files, and they don't all have the same number of comment lines at the top. How can I make sure that loadtxt reads only the actual data in a situation like this?
P.S. - I'm open to bash solutions, too.
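loadtxt counts comment lines toward skiprows, so skip the comment lines manually with a generator expression. After that, skiprows=1 always lands on the header: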
import numpy as np
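with open('datafile1.csv') as f:
    # Drop comment lines before loadtxt sees them
    lines = (line for line in f if not line.startswith('#'))
    # The first remaining line is the header, so skip exactly one row
    FH = np.loadtxt(lines, delimiter=',', skiprows=1)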
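Since there are many files with a varying number of comment lines, the same idea can be wrapped in a small helper. The following is only a sketch: the function name `load_csv` and the temporary sample file are made up for illustration, and it assumes every file has exactly one header line after the leading comments.

```python
import os
import tempfile

import numpy as np


def load_csv(path, delimiter=","):
    """Load numeric rows, ignoring '#' comment lines and one header line.

    Hypothetical helper; assumes exactly one header line per file.
    """
    with open(path) as f:
        # Filter comments first so skiprows=1 always hits the header,
        # however many comment lines come before it.
        lines = (line for line in f if not line.lstrip().startswith("#"))
        # loadtxt consumes the generator here, while the file is still open.
        return np.loadtxt(lines, delimiter=delimiter, skiprows=1)


# Build a sample file shaped like the one in the question.
sample = (
    "# Comment 1\n"
    "# Comment 2\n"
    "x,y,z\n"
    "1,2,3\n"
    "4,5,6\n"
    "7,8,9\n"
    "# End of File Comment\n"
)
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "datafile1.csv")
    with open(sample_path, "w") as out:
        out.write(sample)
    data = load_csv(sample_path)

print(data)
```

To process a whole directory, the helper could be called in a loop, for example over `glob.glob('*.csv')`.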