How to use field name or column header in openpyxl?

stahna picture stahna · Dec 15, 2015 · Viewed 26.3k times · Source

See my code below. This code works very well, but I would like to do two things. One thing is I made if statement with or much shorter than actual for example. I have many columns like this, not all next to each other. I would like it to be shorter. Also, sometimes I may not know exact column letter.

So I want to know if there is a way to know the column name or header. Like the values that would be in very top row. So I can test to see if it is one of those values to always perform function on that cell if it's in the specified column. I can't find openpyxl function to do column name. Not sure if it understands that first row is different than rest. I think maybe if not I can try to do test on first row, but don't understand how to make this.

So is there a way to call column name? or if there is no way to call column name to test, can someone help me with doing check on first row to see if it has value? then do change on correct row I'm in? Does this make sense.

So instead of code saying:

if cellObj.column == 'H' or ...

It would say:

if cellObj.column_header == 'NameOfField or ...

Or if not possible to do that, then:

if this cell has column where first row value is 'NameOfField' ...

Please help with best way to do this. I have looked on stackoverflow and in book and blog site, but does not seem to be a way to call column name (not the letter of column).

for row in sheet.iter_rows():
 for cellObj in row:
    if cellObj.column == 'H' or cellObj.column == 'I' or cellObj.column == 'L' or cellObj.column == 'M':
        print(cellObj.value),
        if cellObj.value.upper() == 'OldValue1':
            cellObj.value = 1
            print(cellObj.value)
        elif cellObj.value.upper() == 'OldValue2':
            cellObj.value = 2
            print(cellObj.value)

Answer

Mike Müller picture Mike Müller · Dec 15, 2015

EDIT

Assuming these are the header names you are looking for:

colnames = ['Header1', 'Header2', 'Header3']

Find the indices for these columns:

col_indices = {n for n, cell in enumerate(sheet.rows[0]) if cell.value in colnames}

Now iterate over the remain rows:

for row in sheet.rows[1:]:
    for index, cell in enumerate(row):
         if index in col_indices:
             if cell.value.upper() == 'OldValue1':
                  cell.value = 1
                  print(cell.value)
             elif cell.value.upper() == 'OldValue2':
                 cell.value = 2
                 print(cell.value)

Use a dictionary instead of a set to keep the column names around:

col_indices = {n: cell.value for n, cell in enumerate(sheet.rows[0]) 
               if cell.value in colnames}

for row in sheet.rows[1:]:
    for index, cell in enumerate(row):
        if index in col_indices:
            print('col: {}, row: {}, content: {}'.format(
                   col_indices[index], index, cell.value))
            if cell.value.upper() == 'OldValue1':
                 cell.value = 1
            elif cell.value.upper() == 'OldValue2':
                 cell.value = 2

Old answer

This makes your if statement shorter:

if cellObj.column in 'HILM':
    print(cellObj.value),

For multi letter column coordinates you need to use a list:

if cellObj.column in ['H', 'AA', 'AB', 'AD']:
    print(cellObj.value),