Querying inside Postgres JSON arrays

Joe Shaw · Sep 16, 2013

How would you go about searching for an element inside an array stored in a json column? (Update: Also see the 9.4 updated answer for jsonb columns.)

If I have a JSON document like this, stored in a json column named blob:

{"name": "Wolf",
 "ids": [185603363281305602,185603363289694211]}

what I'd like to be able to do is something like:

SELECT * from "mytable" WHERE 185603363289694211 = ANY("blob"->'ids');

and get all matching rows out. But this doesn't work because "blob"->'ids' returns a value of type json, not a Postgres array, so ANY cannot be applied to it.

I'd also like to build an index on the individual IDs, if that's possible.

Answer

Joe Shaw · Sep 16, 2013

The following original answer applies only to Postgres 9.3. For a Postgres 9.4 answer, see the Update below.

This builds on Erwin's referenced answers, but is a little more specific to this question.

The IDs in this case are bigints, so create a helper function for converting a JSON array to a Postgres bigint array:

CREATE OR REPLACE FUNCTION json_array_bigint(_j json)
  RETURNS bigint[] AS
$$
SELECT array_agg(elem::text::bigint)
FROM json_array_elements(_j) AS elem
$$
  LANGUAGE sql IMMUTABLE;
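As a quick sanity check, the helper can be called directly on a JSON literal. This is only a sketch using the sample IDs from the question; the literal stands in for "blob"->'ids'.

```sql
-- Convert a JSON array literal to a Postgres bigint[]
SELECT json_array_bigint('[185603363281305602,185603363289694211]'::json);
-- returns {185603363281305602,185603363289694211}

-- An empty JSON array yields NULL rather than an empty array,
-- because array_agg over zero rows returns NULL
SELECT json_array_bigint('[]'::json) IS NULL;
```

Rows whose array is empty therefore never match the containment query, which is usually what you want when searching for a specific ID.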

We could just as easily (and perhaps more reusably) have returned a text array here instead. I suspect indexing on bigint is a lot faster than on text, but I'm having a difficult time finding evidence online to back that up.
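A text-returning variant might look like the sketch below. The function name json_array_text is hypothetical and not part of the original answer. It assumes the array holds numbers; in 9.3, casting a JSON string element to text keeps its surrounding double quotes, so string arrays would need extra handling.

```sql
-- Hypothetical text variant of the helper (name is an assumption)
CREATE OR REPLACE FUNCTION json_array_text(_j json)
  RETURNS text[] AS
$$
-- elem::text gives the JSON representation of each element:
-- plain digits for numbers, but quoted text for JSON strings
SELECT array_agg(elem::text)
FROM json_array_elements(_j) AS elem
$$
  LANGUAGE sql IMMUTABLE;
```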

For building the index:

CREATE INDEX "myindex" ON "mytable" 
  USING GIN (json_array_bigint("blob"->'ids'));

For querying, this works and uses the index:

SELECT * FROM "mytable" 
  WHERE '{185603363289694211}' <@ json_array_bigint("blob"->'ids');
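To check whether the index is actually being used, EXPLAIN can be run on the query. A rough sketch: on a small table the planner may still prefer a sequential scan, so this temporarily discourages sequential scans for the session. That setting is a testing aid only, not something to leave on in production.

```sql
-- Testing aid only: discourage sequential scans for this session
SET enable_seqscan = off;

EXPLAIN
SELECT * FROM "mytable"
  WHERE '{185603363289694211}' <@ json_array_bigint("blob"->'ids');

-- Restore the default planner behaviour
RESET enable_seqscan;
```

A plan that mentions a Bitmap Index Scan on "myindex" indicates the GIN index is being used.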

This also works for querying, but it doesn't use the index, because = ANY(...) is not one of the array operators a GIN index supports:

SELECT * FROM "mytable" 
  WHERE 185603363289694211 = ANY(json_array_bigint("blob"->'ids'));

Update for 9.4

Postgres 9.4 introduced the jsonb type. This is a good SO answer about jsonb and when you should use it over json. In short, if you're ever querying the JSON, you should use jsonb.

If you build your column as jsonb, you can use this query:

SELECT * FROM "mytable"
  WHERE blob @> '{"ids": [185603363289694211]}';

The @> is Postgres' contains operator, documented for jsonb here. Thanks to Alain's answer for bringing this to my attention.
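For the indexing part of the question, a jsonb column can be indexed with GIN directly, with no helper function. The sketch below assumes "blob" is a jsonb column; the index name is made up for illustration. The jsonb_path_ops operator class supports only the @> operator, which is all this query needs; omit it to get the default operator class, which also supports the key-existence operators.

```sql
-- Assumes "blob" is jsonb (Postgres 9.4+); index name is hypothetical
CREATE INDEX "myindex_jsonb" ON "mytable"
  USING GIN ("blob" jsonb_path_ops);

-- The containment query from above can then use the index
SELECT * FROM "mytable"
  WHERE blob @> '{"ids": [185603363289694211]}';
```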