Insert Data to SQL Server Table using pymssql

Krishnang K Dalal · Dec 21, 2017 · Viewed 14.3k times

I am trying to write a data frame into a SQL Server table. My code:

import pymssql

conn = pymssql.connect(host="Dev02", database="DEVDb")
cur = conn.cursor()
query = "INSERT INTO dbo.SCORE_TABLE VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)"
cur.executemany(query, df_sql)
conn.commit()
cur.close()
conn.close()

The dimension of df_sql is (5860, 20), i.e. the number of columns in the data frame is the same as the number of columns in the SQL Server table. Still, I am getting the following error:

ValueError: more placeholders in sql than params available
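For what it's worth, pymssql's executemany expects a sequence of parameter tuples, and iterating over a DataFrame yields its column names rather than its rows, which would explain the placeholder mismatch. A minimal sketch of the conversion, assuming df_sql holds exactly the 20 table columns:

# executemany wants an iterable of row tuples, not the DataFrame itself
rows = [tuple(r) for r in df_sql.itertuples(index=False, name=None)]
cur.executemany(query, rows)
conn.commit()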

UPDATED BELOW

As per one of the comments, I tried using turbodbc as below:

import turbodbc

conn = turbodbc.connect(driver="{SQL Server}", server="Dev02", database="DEVDb")
conn.use_async_io = True
cur = conn.cursor()
query = "INSERT INTO dbo.STG_CONTACTABILITY_SCORE VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)"
cur.executemany(query, df_sql.values)
conn.commit()
cur.close()
conn.close()

I am getting the following error:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

I don't get it. What is wrong here? I've looked at df_sql.values and I don't find anything wrong.

The first row of the ndarray is shown below:

[nan 'DUSTIN HOPKINS' 'SOUTHEAST MISSOURI STATE UNIVERSITY' 13.0
  '5736512217' None None 'Monday' '8:00AM' '9:00AM' 'Summer' None None None
  None '2017-12-22 10:39:30.626331' 'Completed' None '1-11KUFFZ'
  'Central Time Zone']
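A likely cause here: df_sql.values is a 2-D numpy object array, and turbodbc ends up truth-testing it (or its rows) internally, which raises exactly this ValueError. A sketch that sidesteps this, converting to plain Python tuples with NaN mapped to None so missing values arrive as SQL NULLs:

import pandas as pd

# Map NaN to None (SQL NULL) and hand turbodbc plain tuples
# instead of a 2-D numpy object array
clean = df_sql.astype(object).where(pd.notnull(df_sql), None)
params = [tuple(row) for row in clean.itertuples(index=False, name=None)]
cur.executemany(query, params)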

Answer

virtualdvid · Dec 22, 2017

I think you just need to specify each column name, and don't forget the table must have an id field to hold the data frame index:

conn = pymssql.connect(host="Dev02", database="DEVDb")
cur = conn.cursor()
query = """INSERT INTO dbo.SCORE_TABLE(index, column1, column2, ..., column20)
            VALUES (?, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, 
            %s, %s, %s, %s, %s, %s)"""
cur.executemany(query, df_sql)
conn.commit()
cur.close()
conn.close()
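One caveat: if df_sql contains NaN values, it is worth mapping them to None before calling executemany (as in the sketch further up) so they are inserted as NULLs.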

OK, I have been using pandas, and I exported the final data frame to CSV like this:

df.to_csv('new_file_name.csv', sep=',', encoding='utf-8')
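Note that to_csv writes a header row by default, which is why the BULK INSERT below skips it with FIRSTROW = 2, and it also writes the data frame index as an extra first column unless you pass index=False.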

Then I just used pyodbc and the BULK INSERT Transact-SQL statement like this:

import pyodbc

conn = pyodbc.connect(DRIVER='{SQL Server}', Server='server_name', Database='Database_name', trusted_connection='yes')
cur = conn.cursor()

cur.execute("""BULK INSERT table_name
               FROM 'C:\\Users\\folders path\\new_file_name.csv'
               WITH
               (
                   CODEPAGE = 'ACP',
                   FIRSTROW = 2,
                   FIELDTERMINATOR = ',',
                   ROWTERMINATOR = '\\n'  -- backslash doubled in Python so SQL Server receives the literal \n
               )""")
conn.commit()

cur.close()
conn.close()
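Keep in mind that BULK INSERT opens the file on the SQL Server machine itself, so the CSV path must exist on (or be reachable from) the server, not just on the client running pyodbc.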

It took about a second to load 15,314 rows into SQL Server. I hope this gives you an idea.