After looking at different ways to read a URL pointing to an .xls file, I decided to go with xlrd.
I am having a difficult time converting an 'xlrd.book.Book' object to a 'pandas.DataFrame'.
I have the following:
import pandas
import xlrd
import urllib2
link ='http://www.econ.yale.edu/~shiller/data/chapt26.xls'
socket = urllib2.urlopen(link)
#this line gets me the excel workbook
xlfile = xlrd.open_workbook(file_contents = socket.read())
#storing the sheets
sheets = xlfile.sheets()
I want to take the last sheet of sheets and import it as a pandas.DataFrame. Any ideas as to how I can accomplish this? I've tried pandas.ExcelFile.parse(), but it wants a path to an Excel file. I can certainly save the file to memory and then parse it (using tempfile or something), but I'm trying to follow Pythonic guidelines and use functionality likely already written into pandas.
Any guidance is greatly appreciated, as always.
You can pass your socket to ExcelFile:
>>> import pandas as pd
>>> import urllib2
>>> link = 'http://www.econ.yale.edu/~shiller/data/chapt26.xls'
>>> socket = urllib2.urlopen(link)
>>> xd = pd.ExcelFile(socket)
NOTE *** Ignoring non-worksheet data named u'PDVPlot' (type 0x02 = Chart)
NOTE *** Ignoring non-worksheet data named u'ConsumptionPlot' (type 0x02 = Chart)
>>> xd.sheet_names
[u'Data', u'Consumption', u'Calculations']
>>> df = xd.parse(xd.sheet_names[-1], header=None)
>>> df
                                   0   1   2   3         4
0        Average Real Interest Rate: NaN NaN NaN  1.028826
1    Geometric Average Stock Return: NaN NaN NaN  0.065533
2              exp(geo. Avg. return) NaN NaN NaN  0.067728
3  Geometric Average Dividend Growth NaN NaN NaN  0.012025
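
If you would rather keep the xlrd workbook you already opened, a sheet can also be turned into a DataFrame by hand. The following is only a minimal sketch: `sheet_to_frame` is a hypothetical helper name (not part of pandas or xlrd), and it assumes the sheet object exposes `nrows` and `row_values(i)`, as xlrd sheets do.

```python
import pandas as pd


def sheet_to_frame(sheet):
    """Build a DataFrame from an xlrd-style sheet.

    Hypothetical helper: `sheet` is assumed to expose `nrows` and
    `row_values(i)`, which is what xlrd sheet objects provide.
    """
    # Collect each row as a plain list of cell values.
    rows = [sheet.row_values(i) for i in range(sheet.nrows)]
    # No header handling here: columns are numbered 0..n-1,
    # much like parse(..., header=None) in the session above.
    return pd.DataFrame(rows)
```

With the variables from the question this would be called as `df = sheet_to_frame(xlfile.sheets()[-1])`. Expect empty cells to come through as empty strings rather than NaN, so some cleanup may be needed compared with `ExcelFile.parse`.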