I am trying to read an excel file this way :
newFile = pd.ExcelFile(PATH\FileName.xlsx)
ParsedData = pd.io.parsers.ExcelFile.parse(newFile)
which throws an error that says two arguments expected, I don't know what the second argument is and also what I am trying to achieve here is to convert an Excel file to a DataFrame, Am I doing it the right way? or is there any other way to do this using pandas?
Close: first you call ExcelFile
, but then you call the .parse
method and pass it the sheet name.
>>> xl = pd.ExcelFile("dummydata.xlsx")
>>> xl.sheet_names
[u'Sheet1', u'Sheet2', u'Sheet3']
>>> df = xl.parse("Sheet1")
>>> df.head()
Tid dummy1 dummy2 dummy3 dummy4 dummy5 \
0 2006-09-01 00:00:00 0 5.894611 0.605211 3.842871 8.265307
1 2006-09-01 01:00:00 0 5.712107 0.605211 3.416617 8.301360
2 2006-09-01 02:00:00 0 5.105300 0.605211 3.090865 8.335395
3 2006-09-01 03:00:00 0 4.098209 0.605211 3.198452 8.170187
4 2006-09-01 04:00:00 0 3.338196 0.605211 2.970015 7.765058
dummy6 dummy7 dummy8 dummy9
0 0.623354 0 2.579108 2.681728
1 0.554211 0 7.210000 3.028614
2 0.567841 0 6.940000 3.644147
3 0.581470 0 6.630000 4.016155
4 0.595100 0 6.350000 3.974442
What you're doing is calling the method which lives on the class itself, rather than the instance, which is okay (although not very idiomatic), but if you're doing that you would also need to pass the sheet name:
>>> parsed = pd.io.parsers.ExcelFile.parse(xl, "Sheet1")
>>> parsed.columns
Index([u'Tid', u'dummy1', u'dummy2', u'dummy3', u'dummy4', u'dummy5', u'dummy6', u'dummy7', u'dummy8', u'dummy9'], dtype=object)