I have a table:
key product_code cost
1 UK 20
1 US 10
1 EU 5
2 UK 3
2 EU 6
I would like to find the sum of all products for each group of "key" and append to each row. For example for key = 1, find the sum of costs of all products (20+10+5=35) and then append result to all rows which correspond to the key = 1. So end result:
key product_code cost total_costs
1 UK 20 35
1 US 10 35
1 EU 5 35
2 UK 3 9
2 EU 6 9
I would prefer to do this without using a sub-join as this would be inefficient. My best idea would be to use the over
function in conjunction with the sum
function but I cant get it to work. My best try:
SELECT key, product_code, sum(costs) over(PARTITION BY key)
FROM test
GROUP BY key, product_code;
Iv had a look at the docs but there so cryptic I have no idea how to work out how to do it. Im using Hive v0.12.0, HDP v2.0.6, HortonWorks Hadoop distribution.