using pyodbc on ubuntu to insert a image field on SQL Server

nosklo picture nosklo · Jun 29, 2009 · Viewed 7.1k times · Source

I am using Ubuntu 9.04

I have installed the following package versions:

unixodbc and unixodbc-dev: 2.2.11-16build3
tdsodbc: 0.82-4
libsybdb5: 0.82-4
freetds-common and freetds-dev: 0.82-4
python2.6-dev

I have configured /etc/unixodbc.ini like this:

[FreeTDS]
Description             = TDS driver (Sybase/MS SQL)
Driver          = /usr/lib/odbc/libtdsodbc.so
Setup           = /usr/lib/odbc/libtdsS.so
CPTimeout               = 
CPReuse         = 
UsageCount              = 2

I have configured /etc/freetds/freetds.conf like this:

[global]
    tds version = 8.0
    client charset = UTF-8
    text size = 4294967295

I have grabbed pyodbc revision 31e2fae4adbf1b2af1726e5668a3414cf46b454f from http://github.com/mkleehammer/pyodbc and installed it using "python setup.py install"

I have a windows machine with Microsoft SQL Server 2000 installed on my local network, up and listening on the local ip address 10.32.42.69. I have an empty database created with name "Common". I have the user "sa" with password "secret" with full privileges.

I am using the following python code to setup the connection:

import pyodbc
odbcstring = "SERVER=10.32.42.69;UID=sa;PWD=secret;DATABASE=Common;DRIVER=FreeTDS"
con = pyodbc.connect(odbcstring)
cur = con.cursor()

cur.execute("""
IF EXISTS(SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES
      WHERE TABLE_NAME = 'testing')
   DROP TABLE testing
""")
cur.execute('''
CREATE TABLE testing (
    id INTEGER NOT NULL IDENTITY(1,1), 
    myimage IMAGE NULL, 
    PRIMARY KEY (id)
)
    ''')
con.commit()

Everything WORKS up to this point. I have used SQLServer's Enterprise Manager on the server and the new table is there. Now I want to insert some data on the table.

cur = con.cursor()
# using web data for exact reproduction of the error by all.
# I'm actually reading a local file in my real code.
url = 'http://www.forestwander.com/wp-content/original/2009_02/west-virginia-mountains.jpg'
data = urllib2.urlopen(url).read()

sql = "INSERT INTO testing (myimage) VALUES (?)"

Now here on my original question, I was having trouble using cur.execute(sql, (data,)) but now I've edited the question, because following Vinay Sajip's answer below (THANKS), I have changed it to:

cur.execute(sql, (pyodbc.Binary(data),)) 
con.commit()

And insertion is working perfectly. I can confirm the size of the inserted data using the following test code:

cur.execute('SELECT DATALENGTH(myimage) FROM testing WHERE id = 1')
data_inside = cur.fetchone()[0]
assert data_inside == len(data)

Which passes perfectly!!!

Now the problem is on retrieval of the data back.

I am trying the common approach:

cur.execute('SELECT myimage FROM testing WHERE id = 1')
result = cur.fetchone()
returned_data = str(result[0]) # transforming buffer object
print 'Original: %d; Returned: %d' % (len(data), len(returned_data))
assert data == returned_data

However that fails!!

Original: 4744611; Returned: 4096
Traceback (most recent call last):
  File "/home/nosklo/devel/teste_mssql_pyodbc_unicode.py", line 53, in <module>
    assert data == returned_data
AssertionError

I've put all the code above in a single file here, for easy testing of anyone that wants to help.

Now for the question:

I want python code to insert an image file into mssql. I want to query the image back and show it to the user.

I don't care about the column type in mssql. I am using the "IMAGE" column type on the example, but any binary/blob type would do, as long as I get the binary data for the file I inserted back unspoiled. Vinay Sajip said below that this is the preferred data type for this in SQL SERVER 2000.

The data is now being inserted without errors, however when I retrieve the data, only 4k are returned. (Data is truncated on 4096).

How can I make that work?


EDITS: Vinay Sajip's answer below gave me a hint to use pyodbc.Binary on the field. I have updated the question accordingly. Thanks Vinay Sajip!

Alex Martelli's comment gave me the idea of using the DATALENGTH MS SQL function to test if the data is fully loaded on the column. Thanks Alex Martelli !

Answer

nosklo picture nosklo · Jul 2, 2009

Huh, just after offering the bounty, I've found out the solution.

You have to use SET TEXTSIZE 2147483647 on the query, in addition of text size configuration option in /etc/freetds/freetds.conf.

I have used

cur.execute('SET TEXTSIZE 2147483647 SELECT myimage FROM testing WHERE id = 1')

And everything worked fine.

Strange is what FreeTDS documentation says about the text size configuration option:

default value of TEXTSIZE, in bytes. For text and image datatypes, sets the maximum width of any returned column. Cf. set TEXTSIZE in the T-SQL documentation for your server.

The configuration also says that the maximum value (and the default) is 4,294,967,295. However when trying to use that value in the query I get an error, the max number I could use in the query is 2,147,483,647 (half).

From that explanation I thought that only setting this configuration option would be enough. It turns out that I was wrong, setting TEXTSIZE in the query fixed the issue.

Below is the complete working code:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import pyodbc
import urllib2

odbcstring = "SERVER=10.32.42.69;UID=sa;PWD=secret;DATABASE=Common;DRIVER=FreeTDS"
con = pyodbc.connect(odbcstring)
cur = con.cursor()

cur.execute("""
IF EXISTS(SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES
      WHERE TABLE_NAME = 'testing')
   DROP TABLE testing
""")

cur.execute('''
CREATE TABLE testing (
    id INTEGER NOT NULL IDENTITY(1,1), 
    myimage IMAGE NULL,
    PRIMARY KEY (id)
)
    ''')

con.commit()
cur = con.cursor()
url = 'http://www.forestwander.com/wp-content/original/2009_02/west-virginia-mountains.jpg'
data = urllib2.urlopen(url).read()

sql = "INSERT INTO testing (myimage) VALUES (?)"
cur.execute(sql, (pyodbc.Binary(data),))
con.commit()

cur.execute('SELECT DATALENGTH(myimage) FROM testing WHERE id = 1')
data_inside = cur.fetchone()[0]
assert data_inside == len(data)

cur.execute('SET TEXTSIZE 2147483647 SELECT myimage FROM testing WHERE id = 1')
result = cur.fetchone()
returned_data = str(result[0])
print 'Original: %d; Returned; %d' % (len(data), len(returned_data))
assert data == returned_data