I need to import a CSV file in Python on Windows. My file is delimited by ';' and has strings with non-English symbols and commas (',').
I've read this post:
Importing a CSV file into a sqlite3 database table using Python
When I run:
import csv
with open('d:/trade/test.csv', 'r') as f1:
    reader1 = csv.reader(f1)
    your_list1 = list(reader1)
the problem is that each comma is changed to a '-' symbol.
When I try:
df = pandas.read_csv(csvfile)
I get this error:
pandas.io.common.CParserError: Error tokenizing data. C error: Expected 1 fields in line 13, saw 2.
Please help. I would prefer to use pandas as the code is shorter without listing all field names from the CSV file.
I understand there is a possible workaround of temporarily replacing the commas. Still, I would like to solve this with pandas parameters.
Pandas solution: use read_csv with the regex separator [;,]. You also need to add engine='python', because otherwise pandas emits this warning:
ParserWarning: Falling back to the 'python' engine because the 'c' engine does not support regex separators (separators > 1 char and different from '\s+' are interpreted as regex); you can avoid this warning by specifying engine='python'.
import pandas as pd
import io
temp=u"""a;b;c
1;1,8
1;2,1
1;3,6
1;4,3
1;5,7
"""
# after testing, replace io.StringIO(temp) with the filename
df = pd.read_csv(io.StringIO(temp), sep="[;,]", engine='python')
print(df)
   a  b  c
0  1  1  8
1  1  2  1
2  1  3  6
3  1  4  3
4  1  5  7
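Note that the regex separator also splits on every comma, so a value such as 1,8 ends up in two columns. If the commas should stay inside the values (text that contains commas, or decimal commas), a minimal sketch is to pass sep=';' alone. This assumes the file is purely semicolon-delimited. The sample data, the column names and the variable df_semicolon are made up for illustration, and decimal=',' is an optional assumption that the numbers use a decimal comma.

```python
import io

import pandas as pd

# Hypothetical sample: ';' separates fields, ',' appears inside the values.
sample = u"""name;price
Москва, центр;1,8
Zürich, Altstadt;2,1
"""

# sep=';' keeps commas inside each field.
# decimal=',' (an assumption about the data) parses "1,8" as the float 1.8.
df_semicolon = pd.read_csv(io.StringIO(sample), sep=';', decimal=',')
print(df_semicolon)
```

For a real file, replace io.StringIO(sample) with the filename. If the non-English symbols come out garbled, passing encoding= to read_csv may help, but the right value depends on how the file was saved, which the question does not state.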