I'm trying to extract two keys from every JSON object in an array of JSON objects (using legacy SQL). Currently I am using the json_extract function:
json_extract(json_column , '$[1].X') AS X,
json_extract(json_column , '$[1].Y') AS Y,
How can I make it run on every JSON object in the JSON array column, and not just [1] (for example)?
An example JSON:
[
{"blabla":000,"X":1,"blabla":000,"blabla":000,"blabla":000,"Y":"2"},
{"blabla":000,"X":3,"blabla":000,"blabla":000,"blabla":000,"Y":"4"}
]
Thanks in advance!
BigQuery now supports JSON_EXTRACT_ARRAY() in standard SQL.
For example, to solve a similar problem (extracting every commit author's email from a JSON array):
SELECT id
  , ARRAY(
      SELECT JSON_EXTRACT_SCALAR(x, '$.author.email')
      FROM UNNEST(JSON_EXTRACT_ARRAY(payload, "$.commits")) x
    ) emails
FROM `githubarchive.day.20180830`
WHERE type='PushEvent'
AND id='8188163772'
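The same pattern can be adapted to the X and Y keys from the question. This is only a sketch in standard SQL: the table name `my_dataset.my_table` and the `id` column are hypothetical placeholders, and it assumes `json_column` holds the whole JSON array as a string. Each array element becomes its own output row, and JSON_EXTRACT_SCALAR returns both values as strings.

```sql
-- Sketch only: `my_dataset.my_table` and `id` are hypothetical names.
-- Assumes json_column is a STRING containing a JSON array of objects.
SELECT
  id,
  JSON_EXTRACT_SCALAR(item, '$.X') AS X,  -- returned as STRING
  JSON_EXTRACT_SCALAR(item, '$.Y') AS Y   -- returned as STRING
FROM `my_dataset.my_table`,
  -- Split the JSON array into one JSON string per element, one row each.
  UNNEST(JSON_EXTRACT_ARRAY(json_column, '$')) AS item
```

If numeric values are needed, the extracted strings could be wrapped in a cast such as SAFE_CAST(... AS INT64).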
Let's start with a similar problem. Hard-coding each index is not a very convenient way to extract all emails from a JSON array:
SELECT id
  , [ JSON_EXTRACT_SCALAR(JSON_EXTRACT(payload, '$.commits'), '$[0].author.email')
    , JSON_EXTRACT_SCALAR(JSON_EXTRACT(payload, '$.commits'), '$[1].author.email')
    , JSON_EXTRACT_SCALAR(JSON_EXTRACT(payload, '$.commits'), '$[2].author.email')
    , JSON_EXTRACT_SCALAR(JSON_EXTRACT(payload, '$.commits'), '$[3].author.email')
    ] emails
FROM `githubarchive.day.20180830`
WHERE type='PushEvent'
AND id='8188163772'
Before JSON_EXTRACT_ARRAY() was available, the best way to deal with this was to use some JavaScript in a UDF to split a JSON array into a SQL array:
CREATE TEMP FUNCTION json2array(json STRING)
RETURNS ARRAY<STRING>
LANGUAGE js AS """
  return JSON.parse(json).map(x=>JSON.stringify(x));
""";
SELECT * EXCEPT(array_commits),
  ARRAY(SELECT JSON_EXTRACT_SCALAR(x, '$.author.email') FROM UNNEST(array_commits) x) emails
FROM (
  SELECT id
    , json2array(JSON_EXTRACT(payload, '$.commits')) array_commits
  FROM `githubarchive.day.20180830`
  WHERE type='PushEvent'
  AND id='8188163772'
)