One of our software-projects uses a PostgreSQL-table with a column 'guid' of type bytea.
This is used with hibernate 3.3.2.GA with PostgreSQL 8.4, which serializes the java UUID type using java object serialization. The result is a value like the following escape
format bytea literal:
'\254\355\000\005sr\000\016java.util.UUID\274\231\003\367\230m\205/\002\000\002J\000\014leastSigBitsJ\000\013mostSigBitsxp\273\222)\360*r\322\262u\274\310\020\342\004M '
... which we cannot easily use in a query as select or condition to retrieve relevant rows.
Does anyone have a way to read or use the bytea-column in the select- or where-parts of a query (e.g. via psql or pgadmin3), without setting up some hibernate-query?
Update: See edit to question, this answer applies to the commonplace 16-byte serializations of uuid; the question was amended to reflect java serialization.
Interesting problem. I landed up writing a simple C extension to do it efficiently, but it's probably more sensible to use the PL/Python version below.
Because uuid
is a fixed sized type and bytea
is varlena
you can't just create cast ... as implicit
to binary-coerce them, because the variable length field header would get in the way.
There's no built-in function for bytea input to return a uuid. It'd be a handy thing to have, but I don't think anyone's done it yet.
Update: There's actually a simple way to do this. bytea
in hex form is actually a valid uuid literal once the \x
is stripped off, because uuid_in
accepts plain undecorated hex without -
or {}
. So just:
regress=> SET bytea_output = 'hex';
SET
regress=> SELECT CAST( substring(CAST (BYTEA '\x0FCC6350118D11E4A5597DE5338EB025' AS text) from 3) AS uuid);
substring
--------------------------------------
0fcc6350-118d-11e4-a559-7de5338eb025
(1 row)
It involves a couple of string copies and a hex encode/decode cycle, but it'll be tons faster than any of the PL answers I suggested earlier, though slower than C.
Personally I recommend using PL/Perl or pl/pythonu. I'll follow up with an example.
Assuming your uuid is the hex-format bytea literal:
'\x0FCC6350118D11E4A5597DE5338EB025'
you could turn it into a uuid
type with:
create language plperlu;
create or replace function to_uuid(bytea) returns uuid language plperlu immutable as $$
use Data::UUID;
my $ug = new Data::UUID;
my $uuid = $ug->from_hexstring(substr($_[0],2));
return $ug->to_string($uuid);
$$
SET bytea_output = hex;
SELECT to_uuid(BYTEA '\x0FCC6350118D11E4A5597DE5338EB025');
It's probably faster and cleaner in Python because the PL/Python interface passes bytea
as raw bytes not as hex strings:
CREATE LANGUAGE plpythonu;
CREATE or replace function to_uuid(uuidbytes bytea)
RETURNS uuid LANGUAGE plpythonu IMMUTABLE
AS $$
import uuid
return uuid.UUID(bytes=uuidbytes)
$$;
SELECT to_uuid(BYTEA '\x0FCC6350118D11E4A5597DE5338EB025');
You can see the C extension module here.
But really, I mean it about it being ugly. If you want it done properly in C, it's best to actually patch PostgreSQL rather than use an extension.