Export as csv in beeline hive

Defcon picture Defcon · Sep 19, 2016 · Viewed 37.8k times · Source

I am trying to export my hive table as a csv in beeline hive. When I run the command !sql select * from database1 > /user/bob/output.csv it gives me syntax error.

I have successfully connected to the database at this point using the below command. The query outputs the correct results on console.

beeline -u 'jdbc:hive2://[databaseaddress]' --outputformat=csv

Also, not very clear where the file ends up. It should be the file path in hdfs correct?

Answer

ozw1z5rd picture ozw1z5rd · Sep 23, 2016

When hive version is at least 0.11.0 you can execute:

INSERT OVERWRITE LOCAL DIRECTORY '/tmp/directoryWhereToStoreData' 
ROW FORMAT DELIMITED 
FIELDS TERMINATED BY ','  
LINES TERMINATED BY "\n"
SELECT * FROM yourTable;

from hive/beeline to store the table into a directory on the local filesystem.


Alternatively, with beeline, save your SELECT query in yourSQLFile.sql and run:

beeline -u 'jdbc:hive2://[databaseaddress]' --outputformat=csv2 -f yourSQlFile.sql > theFileWhereToStoreTheData.csv 

Also this will store the result into a file in the local file system.


From hive, to store the data somewhere into HDFS:

CREATE EXTERNAL TABLE output 
LIKE yourTable 
ROW FORMAT DELIMITED 
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
LOCATION 'hfds://WhereDoYou/Like';

INSERT OVERWRITE TABLE output SELECT * from yourTable;

then you can collect the data to a local file using:

hdfs dfs -getmerge /WhereDoYou/Like

This is another option to get the data using beeline only:

env HADOOP_CLIENT_OPTS="-Ddisable.quoting.for.sv=false" beeline -u "jdbc:hive2://your.hive.server.address:10000/" --incremental=true --outputformat=csv2 -e "select * from youdatabase.yourtable" 

Working on:

Connected to: Apache Hive (version 1.1.0-cdh5.10.1)
Driver: Hive JDBC (version 1.1.0-cdh5.10.1)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 1.1.0-cdh5.10.1 by Apache Hive