Google BigQuery: prefix for all columns of joined table with duplicated names

Question 1

Google BigQuery: prefix for all columns of joined table with duplicated names

join google-bigquery alias prefix bigquery-standard-sql

Deviling Master · Nov 24, 2017 · Viewed 7.2k times · Source

Answer

Answer

The only option I can think of so far is as below

#standardSQL
WITH user AS (
  SELECT "john" AS name, "smith" AS surname, 1 AS parent UNION ALL
  SELECT "maggie" AS name, "smith" AS surname, 2 AS parent
), parent AS (
  SELECT 1 AS id, "john" AS name, "doe" AS surname UNION ALL
  SELECT 2 AS id, "jane" AS name, "smith" AS surname   
)
SELECT user, parent  
FROM user  
JOIN parent 
ON user.parent = parent.id

with result as

Row user.name   user.surname    user.parent parent.id   parent.name parent.surname   
1   john        smith           1           1           john        doe  
2   maggie      smith           2           2           jane        smith

It is not exactly what you expect, but the closest to it, as it wraps each row from respective joined tables into respective STRUCTs - for example:

{
"user": {"name": "john", "surname": "smith","parent": "1"},
"parent": {"id": "1","name": "john","surname": "doe"}
}

Question 2

On Google BigQuery (using #standardSQL), when there is a Join between 2 tables, I need to apply a fixed prefix to all the columns of each table.

Here is the scenario, I have a structure like this:

#standardSQL
WITH user AS (
  SELECT "john" as name, "smith" as surname, 1 as parent
  UNION ALL
  SELECT "maggie" as name, "smith" as surname, 2 as parent
),

parent AS (
  SELECT 1 as id, "john" as name, "doe" as surname
  UNION ALL
  SELECT 2 as id, "jane" as name, "smith" as surname
)

User table

+-----+--------+---------+--------+
| Row |  name  | surname | parent |
+-----+--------+---------+--------+
|   1 | john   | smith   |      1 |
|   2 | maggie | smith   |      2 |
+-----+--------+---------+--------+

Parent table

+-----+----+------+---------+
| Row | id | name | surname |
+-----+----+------+---------+
|   1 |  1 | john | doe     |
|   2 |  2 | jane | smith   |
+-----+----+------+---------+

A query like this

SELECT u.*, p.* FROM user u JOIN parent p ON u.parent = p.id

produces the following error

Error: Duplicate column names in the result are not supported. Found duplicate(s): name, surname

I want to avoid performing a custom aliasing of the table like this

SELECT
  u.name as user_name,
  u.surname as user_surname,
  p.name as parent_name,
  p.surname as parent_surname
FROM user u JOIN parent p ON u.parent = p.id

+-----+-----------+--------------+-------------+----------------+
| Row | user_name | user_surname | parent_name | parent_surname |
+-----+-----------+--------------+-------------+----------------+
|   1 | john      | smith        | john        | doe            |
|   2 | maggie    | smith        | jane        | smith          |
+-----+-----------+--------------+-------------+----------------+

If the table will change on fields, I'll need every time to edit the statement (or the statements) in order to apply the new fields with the given prefix. So this approach using fixed column names is not a suitable way

Is there a way, a query operator, in order to obtain the table as mentioned up there, automatic applying a prefix? Something like:

SELECT u.* AS user_*, p.* AS parent_*
FROM user u JOIN parent p ON u.parent = p.id

Google BigQuery: prefix for all columns of joined table with duplicated names

Answer

Related questions