View row values in openpyxl

dyao · Jul 6, 2015

Python's csv module provides csv.reader, which returns a reader object that you can iterate over row by row and collect into a container such as a list.

So when that list is assigned to a variable and printed, e.g.:

csv_rows = list(csv.reader(csvfile, delimiter=',', quotechar='|'))
print(csv_rows)
[['First Name', 'Last Name', 'Zodicac', 'Date of birth', 'Sex']]  # example output showing only the header row

So far, I haven't found a similar function in openpyxl. I could be mistaken, so I'm wondering whether anyone can help me out.

Update

@alecxe, your solution works perfectly (except that it returns my date of birth as a datetime object instead of a plain string).

def iter_rows(ws):
    for row in ws.iter_rows():
        yield [cell.value for cell in row]

>>> pprint(list(iter_rows(ws)))
[['First Nam', 'Last Name', 'Zodicac', 'Date of birth', 'Sex'], ['John', 'Smith', 'Snake', datetime.datetime(1989, 9, 4, 0, 0), 'M']]
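openpyxl hands back date-formatted cells as datetime objects, which is why the date of birth above is not plain text. If strings are wanted, one option is to format those values while building each row. This is only a sketch: the helper name iter_rows_as_text and the default date_format are assumptions made for illustration, not part of openpyxl.

```python
import datetime


def iter_rows_as_text(ws, date_format="%Y-%m-%d"):
    """Yield row values, rendering date/datetime cells as formatted strings."""
    for row in ws.iter_rows():
        values = []
        for cell in row:
            value = cell.value
            # datetime.datetime is a subclass of datetime.date, so this covers both
            if isinstance(value, datetime.date):
                value = value.strftime(date_format)
            values.append(value)
        yield values
```

All other cell values (text, numbers, None) pass through unchanged, so only the date column is affected.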

Since I'm a beginner, I wanted to know how this would work with a for loop instead of a list comprehension.

So I used this:

def iter_rows(ws):
    result = []
    for row in ws.iter_rows():
        for cell in row:
            result.append(cell.value)
    yield result

It gives almost the same output, but not quite. As you can see in the output that follows, I get one gigantic list instead of the nested lists in your result.

>>> print(list(iter_rows(ws)))

[['First Nam', 'Last Name', 'Zodicac', 'Date of birth', 'Sex', 'David', 'Yao', 'Snake', datetime.datetime(1989, 9, 4, 0, 0), 'M']]
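The flat result follows from where result is created and yielded: the list is built once before the outer loop, every cell value from every row is appended to it, and the single yield runs only after all rows have been processed. A minimal sketch of a loop-based equivalent of the list comprehension, assuming ws is an openpyxl worksheet (the name iter_rows_loop is made up for illustration), starts a fresh list for each row and yields it inside the outer loop:

```python
def iter_rows_loop(ws):
    """Yield one list of cell values per worksheet row."""
    for row in ws.iter_rows():
        result = []                    # start a new list for every row
        for cell in row:
            result.append(cell.value)  # collect this row's values only
        yield result                   # hand back the row before moving on
```

With this version, list(iter_rows_loop(ws)) should contain one inner list per row, matching the nested shape produced by the list-comprehension version.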

Answer

alecxe · Jul 6, 2015

iter_rows() probably serves a similar purpose:

Returns a squared range based on the range_string parameter, using generators. If no range is passed, will iterate over all cells in the worksheet

>>> from pprint import pprint
>>> from openpyxl import load_workbook
>>> 
>>> wb = load_workbook('test.xlsx')
>>> ws = wb['Sheet1']  # get_sheet_by_name() is deprecated in newer openpyxl releases
>>> 
>>> pprint(list(ws.iter_rows()))
[(<Cell Sheet1.A1>,
  <Cell Sheet1.B1>,
  <Cell Sheet1.C1>,
  <Cell Sheet1.D1>,
  <Cell Sheet1.E1>),
 (<Cell Sheet1.A2>,
  <Cell Sheet1.B2>,
  <Cell Sheet1.C2>,
  <Cell Sheet1.D2>,
  <Cell Sheet1.E2>),
 (<Cell Sheet1.A3>,
  <Cell Sheet1.B3>,
  <Cell Sheet1.C3>,
  <Cell Sheet1.D3>,
  <Cell Sheet1.E3>)]

You can modify it a little bit to yield a list of row values, for example:

def iter_rows(ws):
    for row in ws.iter_rows():
        yield [cell.value for cell in row]

Demo:

>>> pprint(list(iter_rows(ws)))
[[1.0, 1.0, 1.0, None, None],
 [2.0, 2.0, 2.0, None, None],
 [3.0, 3.0, 3.0, None, None]]
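Newer openpyxl releases (2.6 and later, to my knowledge) let iter_rows() return the values directly through a values_only=True argument, so the wrapper is not strictly needed there. A minimal sketch, assuming such a version is installed; rows_as_lists is a hypothetical helper name used only for illustration:

```python
def rows_as_lists(ws):
    """Collect every worksheet row as a list of plain cell values."""
    # Assumption: openpyxl >= 2.6, where iter_rows() accepts values_only=True
    # and yields tuples of values instead of tuples of Cell objects.
    return [list(row) for row in ws.iter_rows(values_only=True)]
```

The list() call only converts each tuple into a list so that the result has the same shape as the output of the generator shown earlier in this answer.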