getting null values while loading the data from flat files into hive tables

user1823697 picture user1823697 · Nov 14, 2012 · Viewed 48.8k times · Source

I am getting the null values while loading the data from flat files into hive tables.
my tables structure is like this:

hive> create table test_hive (id int,value string);

and my flat file is like this: input.txt

1   a
2   b
3   c
4   d
5   e
6   F
7   G
8   j

when I am running the below commands I am getting null values:

hive> LOAD DATA LOCAL INPATH '/home/hduser/input.txt' OVERWRITE INTO TABLE test_hive;
hive> select * from test_hive;
OK<br>
NULL    NULL
NULL    NULL
NULL    NULL
NULL    NULL
NULL    NULL
NULL    NULL
NULL    NULL
NULL    NULL

screen shot:

hive> create table test_hive (id int,value string);
OK
Time taken: 4.97 seconds
hive> show tables;
OK
test_hive
Time taken: 0.124 seconds
hive> LOAD DATA LOCAL INPATH '/home/hduser/input2.txt' OVERWRITE INTO TABLE test_hive;
Copying data from file:/home/hduser/input2.txt
Copying file: file:/home/hduser/input2.txt
Loading data to table default.test_hive
Deleted hdfs://hydhtc227141d:54310/app/hive/warehouse/test_hive
OK
Time taken: 0.572 seconds
hive> select * from test_hive;
OK
NULL    NULL
NULL    NULL
NULL    NULL
NULL    NULL
NULL    NULL
NULL    NULL
NULL    NULL
NULL    NULL
Time taken: 0.182 seconds

Answer

Mark Grover picture Mark Grover · Nov 15, 2012

The default field terminator in Hive is ^A. You need to explicitly mention in your create table statement that you are using a different field separator.

Similar to what Lorand Bending pointed in the comment, use:

CREATE TABLE test_hive(id INT, value STRING) 
ROW FORMAT DELIMITED FIELDS TERMINATED BY ' ';

You don't need to specify a location since you are creating a managed table (and not an external table).