Python script skip header row in excel file

LauraJ picture LauraJ · May 31, 2013 · Viewed 10.9k times · Source

I wrote a python script that will pull excel files from a folder and write them into a SQL table. I got the code to work, but only if I delete the first line of the excel file which contains the headers. I'm new to Python so this is probably something simple, but I looked at a lot of different techniques and couldn't figure out how to insert it into my code. Any ideas would be greatly appreciated!

# Import arcpy module
from xlrd import open_workbook ,cellname
import arcpy
import pyodbc as p

# Database Connection Info
server = "myServer"
database = "my_Tables"
connStr = ('DRIVER={SQL Server Native Client 10.0};SERVER=' + server + ';DATABASE=' + database + ';' + 'Trusted_Connection=yes')

# Assign path to Excel file
file_to_import = '\\\\Location\\Report_Test.xls'

# Assign column count
column_count=10

# Open entire workbook
book = open_workbook(file_to_import)

# Use first sheet
sheet = book.sheet_by_index(0)

# Open connection to SQL Server Table
conn = p.connect(connStr)

# Get cursor
cursor = conn.cursor()

# Assign the query string without values once, outside the loop
query = "INSERT INTO HED_EMPLOYEE_DATA (Company, Contact, Email, Name, Address, City, CentralCities, EnterpriseZones, NEZ, CDBG) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)"

# Iterate through each row

for row_index in range(sheet.nrows):

    row_num          = row_index
    Company          = sheet.cell(row_index,0).value
    Contact          = sheet.cell(row_index,1).value
    Email            = sheet.cell(row_index,2).value
    Name             = sheet.cell(row_index,3).value
    Address          = sheet.cell(row_index,4).value
    City             = sheet.cell(row_index,5).value
    CentralCities    = sheet.cell(row_index,6).value
    EnterpriseZones  = sheet.cell(row_index,7).value
    NEZ              = sheet.cell(row_index,8).value
    CDBG             = sheet.cell(row_index,9).value

    values = (Company, Contact, Email, Name, Address, City, CentralCities, EnterpriseZones, NEZ, CDBG)

    cursor.execute(query, values)

# Close cursor
cursor.close()

# Commit transaction
conn.commit()

# Close SQL server connection
conn.close()

Answer

Miquel picture Miquel · May 31, 2013

You can initialize the iteration at the second row. Try the following:

for row_index in range(1,sheet.nrows):

Edit: If you need to iterate over a list of .xls files, as you asked in the comments, the basic idea is to perform an external loop over the files. Here it comes some hints:

# You need to import the os library. At the beinning of your code
import os

...
# Part of your code here
...

# Assign path to Excel file
#file_to_import = '\\\\Location\\Report_Test.xls'
folder_to_import = '\\\\Location'
l_files_to_import = os.listdir(folder_to_import)
for file_to_import in l_files_to_import:
    if file_to_import.endswith('.xls'):
        # The rest of your code here. Be careful with the indentation!
        column_count=10
        ...