I use PostgreSQL 9.5 and Rails 5. I want to query the jsonb
column shown below that holds an array of JSON objects to return all the JSON array element containing {"kind":"person"}
and also perform a count.
The SQL I use is shown below the json data. Running the query just returns an empty array.
I have tried the queries suggested here and here.
This is what my jsonb
data looks like:
'[
{"kind":"person", "filter_term":"56","selected_attr":"customer"},
{"kind":"email", "filter_term":"marketer","selected_attr":"job_title"}
]'
I want one of the sql query to return:
data
----------------------------------------------------------------------
'{"kind":"person", "filter_term":"56","selected_attr":"customer"}'
(1 row)
and another query to return array back so that I can call count on it in my app and also loop over it to create forms:
data
----------------------------------------------------------------------
'[{"kind":"person", "filter_term":"56","selected_attr":"customer"}]'
(1 row)
I tried this SQL query:
"SELECT * FROM \"segments\" WHERE (payload @> '[{\"kind\":\"person\"}]')"
I also tried this query:
"SELECT payload FROM segments WHERE payload @> '[{\"kind\":\"person\"}]'::jsonb;"
Here is a 3rd query:
"SELECT * FROM segments s WHERE s.payload->'\"#{a}\"' @> '[{\"kind\":\"person\"}]';"
The model:
class Segment < ApplicationRecord
store_accessor :payload,:kind, :filter_term, :selected_model_name, :selected_attr, :limit, :selected_operator
end
The migration:
create_table "segments", force: :cascade do |t|
t.jsonb "payload", default: "[]", null: false
t.index ["payload"], name: "index_segments_on_payload", using: :gin
end
Assuming this table definition:
CREATE TABLE segments (segments_id serial PRIMARY KEY, payload jsonb);
With JSON values like this:
INSERT INTO segments (payload)
VALUES ('[
{
"kind": "person",
"limit": "1",
"filter_term": "56",
"selected_attr": "customer",
"selected_operator": "less_than"
},
{
"kind": "email",
"filter_term": "marketer",
"selected_attr": "job_title",
"selected_operator": "equals"
}
]'
);
"kind":"person"
(not a nested JSON object {"kind":"person"}
) - and count array elements as well as table rows (there may be multiple matching array elements per row).To get the count of rows containing a qualifying jsonb
value in column segments
:
SELECT count(*)
FROM segments s
WHERE s.payload @> '[{"kind":"person"}]';
To get all qualifying JSON array elements (being JSON objects themselves) - plus the total count of elements (may be greater than above count at the same time:
SELECT j.*
FROM segments s
JOIN LATERAL jsonb_array_elements(s.payload) j(elem) ON j.elem @> '{"kind":"person"}'
WHERE s.payload @> '[{"kind":"person"}]';
Returns:
elem ------------------------------------------------------------ {"kind": "person", "limit": "1", "filter_term": "56", ... }
To get all at once:
SELECT j.*, count(*) OVER () AS ct_elem, s.ct_rows
FROM (
SELECT payload, count(*) OVER () AS ct_rows
FROM segments
WHERE payload @> '[{"kind":"person"}]'
) s
JOIN LATERAL jsonb_array_elements(s.payload) j(elem) ON j.elem @> '{"kind":"person"}';
Returns (for a table with more entries):
elem | ct_elem | ct_rows --------------------------+---------+--------- {"kind": "person", ... } | 4 | 3 {"kind": "person", ... } | 4 | 3 ...
But I think you really want this:
SELECT a.*
, sum(ct_elem_row) OVER () AS ct_elem_total
, count(*) OVER () AS ct_rows
FROM segments s
JOIN LATERAL (
SELECT json_agg(j.elem) AS filtered_payload, count(*) AS ct_elem_row
FROM jsonb_array_elements(s.payload) j(elem)
WHERE j.elem @> '{"kind":"person"}'
) a ON ct_elem_row > 0
WHERE s.payload @> '[{"kind":"person"}]';
Returns (for a table with more entries):
filtered_payload | ct_elem_row | ct_elem_total | ct_rows -----------------------------------------------------+-------------+---------------+--------- [{"kind": "person", ... }] | 1 | 4 | 3 [{"kind": "person", ... }] | 1 | 4 | 3 [{"kind": "person", ... }, {"kind": "person", ... }] | 2 | 4 | 3
This identifies matching rows, then select matching array elements and builds an array per row with only those. Plus counts.
For best performance you would have a jsonb_path_ops
GIN index like:
CREATE INDEX segments_path_ops_gin_idx ON segments
USING gin (payload jsonb_path_ops);
(But a more generic index to serve more different queries may be a better choice.)
Related:
Terminology
We are dealing with a JSON object containing a JSON array, saved as Postgres jsonb
data type - a "JSON array" for short, but not an "array of JSON".