Python import CSV short code (pandas?) delimited with ';' and ',' in entries

Alexei Martianov · Jun 19, 2016 · Viewed 18.6k times

I need to import a CSV file in Python on Windows. My file is delimited by ';' and has strings with non-English symbols and commas (',').

I've read posts:

Importing a CSV file into a sqlite3 database table using Python

Python import csv to list

When I run:

with open('d:/trade/test.csv', 'r') as f1:
    reader1 = csv.reader(f1)
    your_list1 = list(reader1)

I get an issue: the comma is changed to a '-' symbol.

When I try:

df = pandas.read_csv(csvfile)

I get an error:

pandas.io.common.CParserError: Error tokenizing data. C error: Expected 1 fields in line 13, saw 2.

Please help. I would prefer to use pandas as the code is shorter without listing all field names from the CSV file.

I understand there is a workaround of temporarily replacing the commas. Still, I would like to solve it with parameters to pandas.

Answer

jezrael · Jun 19, 2016

Pandas solution: use read_csv with the regex separator [;,]. You need to add engine='python', because otherwise pandas raises this warning:

ParserWarning: Falling back to the 'python' engine because the 'c' engine does not support regex separators (separators > 1 char and different from '\s+' are interpreted as regex); you can avoid this warning by specifying engine='python'.

import pandas as pd
import io

temp=u"""a;b;c
1;1,8
1;2,1
1;3,6
1;4,3
1;5,7
"""
# after testing, replace io.StringIO(temp) with the filename
df = pd.read_csv(io.StringIO(temp), sep="[;,]", engine='python')
print(df)

   a  b  c
0  1  1  8
1  1  2  1
2  1  3  6
3  1  4  3
4  1  5  7
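
If the commas belong to the field values rather than acting as separators (the question describes a file delimited by ';' with commas inside strings), an alternative is to split on ';' only. Below is a minimal sketch under that assumption. The sample data and the column names id, price and name are hypothetical; sep and decimal are standard read_csv parameters. The file's encoding is not stated in the question, so no encoding argument is shown; one may be needed if non-English characters come out garbled.

```python
import io

import pandas as pd

# Hypothetical sample: ';' separates the fields, while ',' appears inside them
# (as a decimal mark in "price" and as plain text in "name").
sample = u"""id;price;name
1;1,8;Иванов, Иван
2;2,1;Müller, Jörg
3;3,6;Smith, John
"""

# sep=';' leaves commas inside fields untouched;
# decimal=',' makes pandas read "1,8" as the number 1.8.
# After testing, replace io.StringIO(sample) with the filename.
df = pd.read_csv(io.StringIO(sample), sep=';', decimal=',')
print(df)
```

With a single-character separator the default C engine is used, so engine='python' is not required here. Drop decimal=',' if the commas are not decimal marks; the values then stay as strings.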