How can users prove a specific account has access to BigQuery? (AEAD encryption and authorized views)

Felipe Hoffa picture Felipe Hoffa · May 14, 2019 · Viewed 16.8k times · Source

I'm running an event where multiple people need to access BigQuery, and I want them to send me a proof that they have access:

  • I want to collect email addresses of participants that can access BigQuery.
  • I want proof that they have used BigQuery.

How can a user prove that their specific email account has access to BigQuery?

I'm wondering if we could build a solution with authorized views and encryption functions.

Answer

Felipe Hoffa picture Felipe Hoffa · May 14, 2019

We are going to use authorized views and the new AEAD encryption functions.

First, let's create a key that only I can see in one of my private datasets:

CREATE TABLE `secrets.keys`
AS 
SELECT KEYS.NEW_KEYSET('AEAD_AES_GCM_256') key, CURRENT_TIMESTAMP() ts, 'for_a_view' label

Now I can create a view in a public dataset that leverages SESSION_USER() and AEAD.ENCRYPT():

SELECT SESSION_USER() user, ts, key_ts, key_label
  , AEAD.ENCRYPT(key, FORMAT('%s|%t', SESSION_USER(), ts), '') enc_user_ts
  , AEAD.ENCRYPT(key, SESSION_USER(), '') enc_user
  , AEAD.ENCRYPT(key, CAST(ts AS STRING), '') enc_ts

FROM (SELECT CURRENT_TIMESTAMP() ts)
, (SELECT ts key_ts, key, label key_label FROM `fh-bigquery.secrets.keys`  WHERE label='for_a_view')

To keep my view public and the keys secret, I can set in my secrets dataset access to this new authorized view to read the key:

enter image description here

Now anyone can run the following query:

SELECT enc_user_ts
FROM `fh-bigquery.public_dump.encrypting_user_view`

Which returns:

AQElE+8cn+uHouGhZO7895UNjVs3/k05ZJLkEceQ8kszHJjQkbvuB6Vx/miDi3DhFTty7ZifXps1VKWC5OtTrQOkCQqoFFQu+VkDfz9F65R5f3PGPA==

I can ask users to share that value with me, and I can share that value with you - but only I can decrypt it using my secret key.

Decrypting with my secret key:

SELECT AEAD.DECRYPT_STRING(key, FROM_BASE64('AQElE+8cn+uHouGhZO7895UNjVs3/k05ZJLkEceQ8kszHJjQkbvuB6Vx/miDi3DhFTty7ZifXps1VKWC5OtTrQOkCQqoFFQu+VkDfz9F65R5f3PGPA=='), '')
FROM (SELECT key FROM `fh-bigquery.secrets.keys` WHERE label='for_a_view' LIMIT 1)

Which returns:

⬛⬛⬛⬛⬛⬛⬛⬛[email protected]|2019-05-14 03:51:15.047791+00   

Note that's the exact account and timestamp they used to run their query!

And this is how - if I ever need proof that the account you are telling me you are using to use BigQuery is indeed that account, I'll ask you to run the following query and share its result with me:

SELECT enc_user_ts
FROM `fh-bigquery.public_dump.encrypting_user_view`