Is there a way to access all rows in a column of a specific sheet using Python xlrd?
e.g:
workbook = xlrd.open_workbook('ESC data.xlsx', on_demand=True)
sheet = workbook.sheet['sheetname']
arrayofvalues = sheet['columnname']
Or do I have to build a dictionary myself?
The Excel file is pretty big, so I would love to avoid iterating over all the column names and sheets.
Yes, you are looking for the col_values() worksheet method. Instead of
arrayofvalues = sheet['columnname']
you need to do
arrayofvalues = sheet.col_values(columnindex)
where columnindex is the number of the column (counting from zero, so column A is index 0, column B is index 1, etc.). If you have a descriptive heading in the first row (or first few rows), you can give a second parameter that tells which row to start from (again, counting from zero). For example, if you have one header row, and thus want values starting in the second row, you could do
arrayofvalues = sheet.col_values(columnindex, 1)
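If you would rather look a column up by its heading than by its number, here is a minimal sketch. It assumes the headings sit in a single row (row 0 by default); the helper name column_by_header and its header_row parameter are made up for illustration and are not part of xlrd. It relies only on the worksheet methods row_values() and col_values().

```python
def column_by_header(sheet, header, header_row=0):
    """Return the values under `header`, excluding the header row itself.

    `sheet` is expected to be an xlrd worksheet; any object that offers
    row_values() and col_values() will do. This helper is hypothetical,
    not an xlrd function.
    """
    headers = sheet.row_values(header_row)      # cell values of the header row
    try:
        columnindex = headers.index(header)     # zero-based column index
    except ValueError:
        raise KeyError(f"no column named {header!r}") from None
    # Start one row below the header so only data cells are returned.
    return sheet.col_values(columnindex, header_row + 1)


# Typical use with xlrd (file, sheet and column names are placeholders):
#   import xlrd
#   workbook = xlrd.open_workbook('ESC data.xlsx')
#   sheet = workbook.sheet_by_name('sheetname')
#   arrayofvalues = column_by_header(sheet, 'columnname')
```

This reads only the one header row to find the index, so you do not have to walk every column or build a dictionary up front. If you need many columns from the same sheet, building the name-to-index mapping once would avoid repeating the lookup.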
Please check out the tutorial for a reasonably readable discussion of the xlrd package. (The official xlrd documentation is harder to read.)
Also note that (1) while you are free to use the name arrayofvalues, what you are really getting is a Python list, which technically isn't an array, and (2) the on_demand workbook parameter has no effect when working with .xlsx files, which means xlrd will attempt to load the entire workbook into memory regardless. (The on_demand feature works for .xls files.)
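For the .xls case, where on_demand does apply, a rough sketch of reading one column and then releasing the sheet could look like the following. The helper name read_column_then_unload is made up for illustration; it assumes the workbook was opened with on_demand=True and uses the workbook methods sheet_by_name() and unload_sheet().

```python
def read_column_then_unload(workbook, sheetname, columnindex, start_row=0):
    """Read one column from one sheet, then drop that sheet from memory.

    `workbook` is assumed to come from xlrd.open_workbook(..., on_demand=True)
    on an .xls file. This helper is hypothetical, not an xlrd function.
    """
    # With on_demand=True the sheet is parsed on first access, not at open time.
    sheet = workbook.sheet_by_name(sheetname)
    try:
        return sheet.col_values(columnindex, start_row)
    finally:
        # Release the parsed sheet even if reading the column failed.
        workbook.unload_sheet(sheetname)


# Typical use (the file name is a placeholder for an .xls workbook):
#   import xlrd
#   workbook = xlrd.open_workbook('ESC data.xls', on_demand=True)
#   arrayofvalues = read_column_then_unload(workbook, 'sheetname', 0, 1)
#   workbook.release_resources()
```

The returned list is independent of the sheet object, so it stays usable after the sheet has been unloaded.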