Google Analytics BigQuery tables are structured like this (Legacy SQL notations - only relevant fields are shown):
visitId: INTEGER
hits: RECORD/REPEATED
hits.hour: INTEGER
On one such table, the following query works well:
SELECT
visitId,
MIN(h.hour) AS firstHitHour
FROM
`my-table.ga_sessions_20161122`, UNNEST(hits) AS h
GROUP BY
visitId
But using this alternative syntax:
SELECT
visitId,
(SELECT MIN(hour) FROM UNNEST(hits)) as firstHitHour
FROM
`my-table.ga_sessions_20161122`
GROUP BY
visitId
Triggers the following error:
Error: UNNEST expression references column hits which is neither grouped nor aggregated
I understand that UNNEST(hits)
must be somehow grouped or aggregated, but since this column is an array (repeated)
, what does it mean exactly?
If I try to "group the column hits
", as requested, like this:
(SELECT MIN(hour) FROM UNNEST(hits) as h GROUP BY h) as firstHitHour
Then I get a Grouping by expressions of type STRUCT is not allowed
error.
How can this alternative syntax be corrected to produce the same result as the first one?
My first Answer is for original version of this question.
When I answered, I realized you have changed it to quite different one :o)
So below answer is for most recent version of your question:
I think that in "alternative" version you just do not need GROUP BY at all, because you operate on original (un-flattened) row by row and for each row (visitId) you calculate firstHitHour
SELECT
visitId,
(SELECT MIN(hour) FROM UNNEST(hits)) as firstHitHour
FROM
`my-table.ga_sessions_20161122`
In your initial query - you kind of flattening all records for each row - so that's why you need then to group them back