Uploading CSV for Impala

LonelySoul · Aug 23, 2013 · Viewed 12.7k times

I am trying to upload a CSV file to HDFS for Impala and it keeps failing. I am not sure what is wrong, as I have followed the guide and the CSV is already on HDFS.

    CREATE EXTERNAL TABLE gc_imp (
      asd INT,
      full_name STRING,
      sd_fd_date STRING,
      ret INT,
      ftyu INT,
      qwer INT
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    LOCATION '/user/hadoop/Gc_4';

This is the error I am getting (I am using Hue):

> TExecuteStatementResp(status=TStatus(errorCode=None,
> errorMessage='MetaException: hdfs://nameservice1/user/hadoop/Gc_4 is
> not a directory or unable to create one', sqlState='HY000',
> infoMessages=None, statusCode=3), operationHandle=None)

Any leads?

Answer

zsxwing · Aug 23, 2013

/user/hadoop/Gc_4 must be a directory, but right now it is a file. Create a directory (for example, /user/hadoop/Gc_4), then upload your Gc_4 file into it, so the file path becomes /user/hadoop/Gc_4/Gc_4. After that, you can use LOCATION to point at the directory path /user/hadoop/Gc_4.

LOCATION must always name a directory; this requirement is the same in both Hive and Impala.
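The steps above can be sketched with the HDFS shell. This assumes the CSV currently sits at /user/hadoop/Gc_4 as a plain file, as the error message suggests; adjust the paths to match your cluster:

```shell
# /user/hadoop/Gc_4 currently exists as a file, so move it aside first
hdfs dfs -mv /user/hadoop/Gc_4 /user/hadoop/Gc_4.csv

# Create the directory that LOCATION will point to
hdfs dfs -mkdir -p /user/hadoop/Gc_4

# Move the data file into that directory
hdfs dfs -mv /user/hadoop/Gc_4.csv /user/hadoop/Gc_4/

# Verify that LOCATION now names a directory containing the file
hdfs dfs -ls /user/hadoop/Gc_4
```

Once the path is a directory, rerun the CREATE EXTERNAL TABLE statement and Impala will read every file inside it.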