How to use Presto to query Hive data

Rui Li · Nov 13, 2013 · Viewed 12.3k times

I just installed Presto, and when I use the presto-cli to query Hive data, I get the following error:

$ ./presto --server node6:8080 --catalog hive --schema default
presto:default> show tables;
Query 20131113_150006_00002_u8uyp failed: Table hive.information_schema.tables does not exist

The config.properties is:

coordinator=true
datasources=jmx,hive
http-server.http.port=8080
presto-metastore.db.type=h2
presto-metastore.db.filename=/root/h2
task.max-memory=1GB
discovery-server.enabled=true
discovery.uri=http://node6:8080

And the hive.properties is:

connector.name=hive-cdh4
hive.metastore.uri=thrift://node6:9083

The Hadoop distribution I use is CDH 4.4. I believe it is properly installed, and Hive can process queries successfully on its own.

Can anyone help me work it out? Any ideas will be appreciated.

Answer

itzg · Nov 15, 2013

As recommended by the Getting Started guide, I created a coordinator (jmx only) and a separate worker (jmx,hive), each on its own machine.

What finally solved this for me was passing the worker's hostname and http-server.http.port as the --server argument to the presto CLI. When I specified the coordinator instead, it didn't work.
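
For illustration, the two-node layout described above might look roughly like the sketch below. It reuses the property names and values from the question's config.properties. The worker hostname worker1 is hypothetical, and the sketch assumes node6 is the node with coordinator=true and that both nodes keep the question's port and metastore settings. Treat it as an approximation of the setup, not the exact files used in this answer.

```properties
# --- config.properties on the node with coordinator=true (assumed to be node6) ---
coordinator=true
# jmx only: no hive datasource on this node
datasources=jmx
http-server.http.port=8080
presto-metastore.db.type=h2
presto-metastore.db.filename=/root/h2
task.max-memory=1GB
discovery-server.enabled=true
discovery.uri=http://node6:8080

# --- config.properties on the worker (hypothetical host: worker1) ---
coordinator=false
# the hive datasource is configured here
datasources=jmx,hive
http-server.http.port=8080
presto-metastore.db.type=h2
presto-metastore.db.filename=/root/h2
task.max-memory=1GB
# points at the discovery server running on node6
discovery.uri=http://node6:8080
```

Under those assumptions, the CLI call from the question would target the worker instead of node6:

$ ./presto --server worker1:8080 --catalog hive --schema default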

This all makes sense, but I am still wondering what will happen when I have two Presto-Hive workers...