I am working on a Python
script to replicate some Postgresql
tables from one environment to another (which does a little more than pg_dump
). It works except when I am copying a table that has bytea
data type.
I read the source table data in memory, then I dump the memory in the target database with concatenated inserts.
Here is my method that produces an insert statement:
def generateInsert(self, argCachedRow):
colOrd = 0;
valClauseList = []
hasBinary = False
for colData in argCachedRow:
colOrd += 1
colName = self.colOrdLookup.get(colOrd)
col = self.colLookup.get(colName)
dataType = col.dataType
insVal = None
if colData is not None:
strVal = str(colData)
if dataType.useQuote:
if "'" in strVal:
strVal = strVal.replace("'", "''")
insVal = "'%s'" % strVal
else:
if dataType.binary:
hasBinary = True
#insVal = psycopg2.Binary(colData)
#else:
insVal = strVal
else:
insVal = "NULL"
valClauseList.append(insVal)
valClause = ", ".join(valClauseList)
if hasBinary:
valClause = psycopg2.Binary(valClause)
result = "INSERT INTO %s VALUES (%s)" % (self.name, valClause)
return result
which works with every table that doesn't have binary data.
I also tried (intuitively) to wrap just the binary column data in psycopg2.Binary
, which is the commented out line and then not do it to the whole row value list but that didn't work either.
Here is my simple DataType wrapper, which is loaded by reading Postgres' information_schema tables:
class DataType(object):
def __init__(self, argDispName, argSqlName, argUseQuote, argBin):
self.dispName = argDispName
self.sqlName = argSqlName
self.useQuote = argUseQuote
self.binary = argBin
How do I read and insert bytea columns using psycopg2
?
If you have this database structure:
CREATE TABLE test (a bytea,
b int,
c text)
then inserting binary data into the request can be done like so, without any wrappers:
bin_data = b'bytes object'
db = psycopg2.connect(*args) # DB-API 2.0
c = db.cursor()
c.execute('''INSERT INTO test VALUES (%s, %s, %s)''', (bin_data, 1337, 'foo'))
c.execute('''UPDATE test SET a = %s''', (bin_data + b'1',))
Then, when you query it:
c.execute('''SELECT a FROM test''')
You'll receive a memoryview
, which is easily converted back to bytes
:
mview = c.fetchone()
new_bin_data = bytes(mview)
print(new_bin_data)
Output: b'bytes object1'
Also, I'd suggest you not to assemble queries by string formatting. psycopg2
's built-in parameter substitution is much more convenient and you don't have to worry about validating data to protect from SQL injections.