How to calculate median in Hive

Aman picture Aman · Nov 11, 2014 · Viewed 44.1k times · Source

I have a hive table,

name    age     sal
A       45      1222
B       50      4555
c       44      8888
D       78      1222
E       12      7888
F       23      4555

I want to calculate median of age column.

Below is my approach

select min(age) as HMIN,max(age) as HMAX,count(age) as HCount,
IF(count(age)%2=0,'even','Odd') as PCOUNT 
from v_act_subjects_bh;

Appreciate any query suggestion

Answer

Amar picture Amar · Nov 11, 2014

You can use the percentile function to compute the median. Try this:

select percentile(cast(age as BIGINT), 0.5) from table_name