Hive and Presto: integer division truncation problem

loonglee · Oct 31, 2018 · Viewed 9.6k times

Why does dividing two bigint values in Hive not truncate the result to an integer, while the same division in Presto does?

Answer

Archon · Nov 1, 2018

Presto mechanics:

  1. Load data from various data sources via connectors into the Presto JVM (Hive connector, MySQL connector, etc.; see this).

  2. Process the data (scalar functions or aggregate functions) using Java code.

  3. Output the results from the JVM (or from disk if spill is enabled).

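In Java, 1/2 = 0, so Presto behaves the same way when both operands are integers. In Hive, I think it is because the / operator is implemented as a built-in operator (UDF) that returns a double even when both operands are integers; see LanguageManual+UDF.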

To avoid truncation, you just need to 'think in Java' and make at least one operand a floating-point value:

int a = 1;
int b = 2;
// 1.0 * a is a double, so the division is done in floating point: c == 0.5
double c = 1.0 * a / b;
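
The same effect can be checked with long values, which is how Java represents a 64-bit integer such as SQL bigint. This is only a minimal sketch: the class and method names are made up for illustration and are not part of Presto or Hive.

```java
// Hypothetical demo class; names are illustrative, not a Presto or Hive API.
final class DivisionDemo {

    // Both operands are long, so Java performs integer division
    // and truncates the quotient toward zero.
    static long truncatingDivide(long a, long b) {
        return a / b;
    }

    // Casting one operand to double promotes the other operand too,
    // so the division is carried out in floating point.
    static double floatingDivide(long a, long b) {
        return (double) a / b;
    }

    public static void main(String[] args) {
        System.out.println(truncatingDivide(1L, 2L));  // prints 0
        System.out.println(floatingDivide(1L, 2L));    // prints 0.5
        System.out.println(truncatingDivide(-7L, 2L)); // prints -3
    }
}
```

Note that truncation is toward zero, so -7 / 2 gives -3 rather than -4.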

In Presto SQL, cast one operand to double:

-- result: 0.3333333333333333
select cast(1 as double) / 3 from table_name 

See: Migrating From Hive (Presto documentation)