PostgreSQL: query an array of objects in a JSONB field

user3761100 · Feb 12, 2015 · Viewed 37.1k times

I have a table in a PostgreSQL 9.4 database with a jsonb column called receivers. Some example rows:

[{"id": "145119603", "name": "145119603", "type": 2}]
[{"id": "1884595530", "name": "1884595530", "type": 1}]
[{"id": "363058213", "name": "363058213", "type": 1}]
[{"id": "1427965764", "name": "1427965764", "type": 1}]
[{"id": "193623800", "name": "193623800", "type": 0}, {"id": "419955814", "name": "419955814", "type": 0}]
[{"id": "624635532", "name": "624635532", "type": 0}, {"id": "1884595530", "name": "1884595530", "type": 1}]
[{"id": "791712670", "name": "791712670", "type": 0}]
[{"id": "895207852", "name": "895207852", "type": 0}]
[{"id": "144695994", "name": "144695994", "type": 0}, {"id": "384217055", "name": "384217055", "type": 0}]
[{"id": "1079725696", "name": "1079725696", "type": 0}]

I have a list of id values and want to select every row whose jsonb array contains an object with any of those ids.

Is that possible? Is there a GIN index I can make that will speed this up?

Answer

pozs · Feb 13, 2015

There is no single operator that does this, but you have a few options (the examples assume a table jsonbtest whose jsonb column is named data):

1. If you have a small (and fixed) number of ids to query, you can combine several containment operators @> with OR; for example:

where data @> '[{"id": "1884595530"}]' or data @> '[{"id": "791712670"}]'

A plain GIN index on the data column can speed this query up.
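
As a rough sketch of such an index, assuming the jsonbtest table and data column used in this answer (the index names are made up for illustration):

```sql
-- Default GIN operator class for jsonb: in 9.4 it supports @>, ?, ?| and ?&.
create index jsonbtest_data_gin on jsonbtest using gin (data);

-- Alternative operator class: supports only @>, but the index is usually smaller.
-- create index jsonbtest_data_path_gin on jsonbtest using gin (data jsonb_path_ops);

-- Check whether the planner actually picks the index for the OR-ed containment tests.
explain
select *
from   jsonbtest
where  data @> '[{"id": "1884595530"}]'
   or  data @> '[{"id": "791712670"}]';
```

On a very small table the planner may still prefer a sequential scan, so the plan is worth checking against realistic data volumes.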

2. If you have a variable number of ids (or a lot of them), you can use json[b]_array_elements() to extract each element of the array, build a list of ids, and then test that list with the "exists any" operator ?|, which is true when any of the given strings is an element of the jsonb array:

select *
from   jsonbtest
where  to_json(array(select jsonb_array_elements(data) ->> 'id'))::jsonb ?|
         array['1884595530', '791712670'];
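
To see what the expression in the where clause does, here is a small stand-alone sketch that uses a literal array instead of the table (the ids a, b and z are placeholders, not values from the question):

```sql
-- Step 1: pull the id out of every array element and collect the ids
-- into a jsonb array of strings.
select to_json(array(
         select jsonb_array_elements('[{"id": "a"}, {"id": "b"}]'::jsonb) ->> 'id'
       ))::jsonb as idlist;
-- idlist: ["a", "b"]

-- Step 2: ?| is true when any string on the right-hand side
-- is an element of the jsonb array on the left.
select '["a", "b"]'::jsonb ?| array['b', 'z'] as any_match;
-- any_match: t
```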

Unfortunately, you cannot index an expression that contains a subquery. If you want to index it, you need to wrap the expression in a function:

create function idlist_jsonb(jsonbtest)
  returns jsonb
  language sql
  strict
  immutable
as $func$
  select to_json(array(select jsonb_array_elements($1.data) ->> 'id'))::jsonb
$func$;

create index on jsonbtest using gin (idlist_jsonb(jsonbtest));

After this, you can query ids like this:

select *, jsonbtest.idlist_jsonb
from   jsonbtest
where  jsonbtest.idlist_jsonb ?| array['193623800', '895207852'];

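Note: I used dot notation here, which treats the function as a computed field of the row, but you don't have to; the ordinary call syntax idlist_jsonb(jsonbtest) works as well.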
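
For comparison, the same query written with ordinary function-call syntax might look like this; it assumes the idlist_jsonb function and the expression index defined above:

```sql
-- Equivalent to the dot-notation query: the function is called on the whole row.
select *, idlist_jsonb(jsonbtest)
from   jsonbtest
where  idlist_jsonb(jsonbtest) ?| array['193623800', '895207852'];
```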

3. At this point you no longer have to stick with json[b]: the id list is a simple text array, which PostgreSQL supports natively.

create function idlist_array(jsonbtest)
  returns text[]
  language sql
  strict
  immutable
as $func$
  select array(select jsonb_array_elements($1.data) ->> 'id')
$func$;

create index on jsonbtest using gin (idlist_array(jsonbtest));

Then query this computed field with the array overlap operator &&:

select *, jsonbtest.idlist_array
from   jsonbtest
where  jsonbtest.idlist_array && array['193623800', '895207852'];

Note: In my own testing, the planner estimated a higher cost for this last solution than for the jsonb variant, but in practice it ran slightly faster. If performance really matters to you, you should test both.
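
One way to run that comparison, sketched under the assumption that both functions and both expression indexes from this answer exist, is to look at the estimated cost next to the measured time:

```sql
-- explain (analyze) actually executes the query and reports real timings
-- alongside the planner's cost estimates.
explain (analyze, buffers)
select *
from   jsonbtest
where  idlist_jsonb(jsonbtest) ?| array['193623800', '895207852'];

explain (analyze, buffers)
select *
from   jsonbtest
where  idlist_array(jsonbtest) && array['193623800', '895207852'];
```

Running each statement a few times against realistic data volumes gives a fairer picture than a single run, since caching affects the first execution.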