I am running a query that gives a non-overlapping set of first_party_id's - ids that are associated with one third party but not another. This query does not run in Athena, however, giving the error: Correlated queries not yet supported.
Was looking at prestodb docs, https://prestodb.io/docs/current/sql/select.html (Athena is prestodb under the hood), for an alternative to nested queries. The with statement
example given doesn't seem to translate well for this not in
clause. Wondering what the alternative to a nested query would be - Query below.
SELECT
COUNT(DISTINCT i.third_party_id) AS uniques
FROM
db.ids i
WHERE
i.third_party_type = 'cookie_1'
AND i.first_party_id NOT IN (
SELECT
i.first_party_id
WHERE
i.third_party_id = 'cookie_2'
)
There may be a better way to do this - I would be curious to see it too! One way I can think of would be to use an outer join. (I'm not exactly sure about how your data is structured, so forgive the contrived example, but I hope it would translate ok.) How about this?
with
a as (select *
from (values
(1,'cookie_n',10,'cookie_2'),
(2,'cookie_n',11,'cookie_1'),
(3,'cookie_m',12,'cookie_1'),
(4,'cookie_m',12,'cookie_1'),
(5,'cookie_q',13,'cookie_1'),
(6,'cookie_n',13,'cookie_1'),
(7,'cookie_m',14,'cookie_3')
) as db_ids(first_party_id, first_party_type, third_party_id, third_party_type)
),
b as (select first_party_type
from a where third_party_type = 'cookie_2'),
c as (select a.third_party_id, b.first_party_type as exclude_first_party_type
from a left join b on a.first_party_type = b.first_party_type
where a.third_party_type = 'cookie_1')
select count(distinct third_party_id) from c
where exclude_first_party_type is null;
Hope this helps!