HIVE External Table - Set Empty Strings to NULL

DJElbow picture DJElbow · Oct 22, 2013 · Viewed 13.5k times · Source

Currently I have a HIVE 0.7 instance on Amazon EMR. I am trying to create a duplicate of this instance on a new EMR cluster using Hive 0.11.

In my 0.7 instance I have an external table that will set empty strings to NULL. Here is how I create the table:

CREATE EXTERNAL TABLE IF NOT EXISTS tablename
(column1 string,
column2 string)
PARTITIONED BY (year STRING, month STRING, day STRING)
ROW FORMAT DELIMITED 
FIELDS TERMINATED BY '\t' 
LINES TERMINATED BY '\n'
TBLPROPERTIES ('serialization.null.format' = '');

Data is added to the table like this:

ALTER TABLE tablename 
ADD PARTITION (year = '2013', month = '10', day='01')
LOCATION '/location_in_hdfs';

This works great in 0.7 but in 0.11 it doesn't seem to be evaluating my empty strings as NULLS. Interestingly, creating a normal table with the same data and table definition seems to evaluate empty strings as NULLs as expected.

Is there are different way to do this with an external table in 0.11?

Answer

Srikanth picture Srikanth · Aug 12, 2014

Hive default partition properties overriding the table properties. Include SERDE properties in your alter statement:

ALTER TABLE tablename  ADD PARTITION (year = '2013', month = '10', day='01') SET  
SERDEPROPERTIES ('serialization.null.format' = '');